Batch DG usage Redis writes every 60s instead of per-chunk #5868
record_dg_usage_ms was called on every audio chunk (~50/sec/session), causing ~100 Redis ops/sec/session. With ~100 concurrent sessions this produced 8.5-12.5k ops/sec on Redis Cloud.
Fix: accumulate DG usage ms locally in dg_usage_ms_pending and flush to Redis every 60s in _record_usage_periodically, plus on session end. Reduces Redis ops from ~100/sec/session to ~0.03/sec/session (~3000x).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
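The rates quoted in this commit message can be checked with a short calculation. The sketch below is only a back-of-the-envelope check, assuming 2 Redis operations per record_dg_usage_ms call (INCRBY + EXPIRE, as the PR summary states) and one flush per session every 60 seconds; the constant and function names are illustrative and do not come from the codebase.

```python
# Back-of-the-envelope check of the figures in the commit message.
# All names here are illustrative; only the numbers come from the PR text.
CHUNKS_PER_SEC = 50       # ~50 audio chunks/sec/session
OPS_PER_CALL = 2          # INCRBY + EXPIRE per record_dg_usage_ms call
FLUSH_INTERVAL_SEC = 60   # one batched flush per session per interval
SESSIONS = 100            # ~100 concurrent sessions


def ops_per_sec_per_chunk_writes() -> float:
    """Redis ops/sec/session when every chunk triggers a write."""
    return CHUNKS_PER_SEC * OPS_PER_CALL


def ops_per_sec_batched() -> float:
    """Redis ops/sec/session when usage is flushed once per interval."""
    return OPS_PER_CALL / FLUSH_INTERVAL_SEC


before = ops_per_sec_per_chunk_writes()   # per session, before the fix
after = ops_per_sec_batched()             # per session, after the fix
reduction = before / after                # reduction factor
fleet_before = before * SESSIONS          # all sessions, before the fix
fleet_after = after * SESSIONS            # all sessions, after the fix
```

Under these assumptions the per-session rate drops from 100 ops/sec to roughly 0.03 ops/sec, a factor of about 3000, and the fleet-wide estimate of about 10,000 ops/sec falls inside the reported 8.5-12.5k range.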
Greptile Summary
This PR reduces Redis write pressure by replacing per-chunk
Key issues found:
Confidence Score: 2/5
Important Files Changed
Sequence Diagram
sequenceDiagram
    participant AC as Audio Chunk (per 20ms)
    participant RD as receive_data / flush_stt_buffer
    participant ACC as dg_usage_ms_pending (local int)
    participant PL as _record_usage_periodically (60s)
    participant FIN as finally block (session end)
    participant RD2 as Redis (record_dg_usage_ms)
    AC->>RD: audio arrives (~50x/sec)
    RD->>ACC: dg_usage_ms_pending += chunk_ms
    Note over RD,ACC: No Redis write here (was 2 ops/chunk before)
    loop Every 60 seconds
        PL->>ACC: read dg_usage_ms_pending
        PL->>RD2: record_dg_usage_ms(uid, pending)
        PL->>ACC: dg_usage_ms_pending = 0
    end
    Note over FIN: Session disconnect / timeout
    FIN->>ACC: read dg_usage_ms_pending (remaining)
    FIN->>RD2: record_dg_usage_ms(uid, pending)
    FIN->>ACC: dg_usage_ms_pending = 0
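The flow in the diagram can be sketched as a small asyncio program. This is a minimal illustration under stated assumptions, not the code from this PR: run_session, record_usage, flush_interval and chunk_delay are hypothetical names, pending_ms stands in for dg_usage_ms_pending, and record_usage stands in for the Redis write.

```python
import asyncio


async def run_session(chunks_ms, record_usage, flush_interval=60.0, chunk_delay=0.0):
    """Accumulate per-chunk usage locally and write only on flush."""
    pending_ms = 0  # stands in for dg_usage_ms_pending

    def flush():
        nonlocal pending_ms
        if pending_ms > 0:
            record_usage(pending_ms)  # one write per flush
            pending_ms = 0

    async def flush_periodically():
        # Stands in for the periodic usage loop.
        while True:
            await asyncio.sleep(flush_interval)
            flush()

    flusher = asyncio.create_task(flush_periodically())
    try:
        for ms in chunks_ms:
            pending_ms += ms  # no write to the store here
            await asyncio.sleep(chunk_delay)
    finally:
        flusher.cancel()
        flush()  # session end: flush whatever is left
```

For example, `asyncio.run(run_session([20] * 150, writes.append))` with `writes = []` ends with a single recorded value of 3000 ms, because the session finishes before the first 60-second tick and the finally block flushes the remainder. Since flush() reads and resets the counter without awaiting in between, no usage is lost or double-counted within a single event loop.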
Local Dev Backend Test Evidence
1. Backend startup: OK
2. WebSocket endpoint reachable: OK
3. DG Usage Batching Verification: 6/6 PASSED
4. Existing fair-use tests: 81 passed
by AI for @beastoin
Greptile review fixes:
1. Add nonlocal dg_usage_ms_pending to receive_data(). Without it, multi-channel sessions would hit UnboundLocalError on the += line.
2. Move DG usage flush above use_custom_stt:continue guard so all STT paths get flushed consistently.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
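The first fix follows from Python's scoping rules: an augmented assignment inside a nested function makes the name local to that function unless it is declared nonlocal. The sketch below reproduces the failure and the fix in isolation; the factory functions and pending_ms are hypothetical stand-ins, not the handler code from this PR.

```python
def make_session_without_nonlocal():
    pending_ms = 0

    def receive_data(chunk_ms):
        # "+=" makes pending_ms local to receive_data, so reading it
        # before assignment raises UnboundLocalError when called.
        pending_ms += chunk_ms
        return pending_ms

    return receive_data


def make_session_with_nonlocal():
    pending_ms = 0

    def receive_data(chunk_ms):
        nonlocal pending_ms  # rebind the enclosing function's variable
        pending_ms += chunk_ms
        return pending_ms

    return receive_data
```

The error only appears when the affected code path actually runs, which is consistent with the commit message noting that it would surface in multi-channel sessions.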
Greptile Review Fixes Applied
Both issues from Greptile's review have been fixed in commit d8d7a17:
Fix 1 (Critical):
11 tests covering:
- Structure: no per-chunk Redis calls, 4 accumulation points, 3 nonlocal declarations, 2 flush resets, flush before custom-STT guard
- Behavior: batched 60s single Redis write, large accumulation no overflow, disabled skips Redis, zero ms skips Redis
- Math: 3000x reduction factor, 100 sessions ops calculation
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
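The structure checks listed above read like assertions over the handler's source text. One way such a check could be written is sketched below, assuming simple regular-expression counting; the real tests in test_dg_usage_batch.py may work differently, and count_pattern, check_structure and SAMPLE_SOURCE are hypothetical.

```python
import re


def count_pattern(source: str, pattern: str) -> int:
    """Count non-overlapping regex matches in a source string."""
    return len(re.findall(pattern, source))


# Hypothetical stand-in for the handler source under test.
SAMPLE_SOURCE = '''
def session():
    dg_usage_ms_pending = 0
    def receive_data(ms):
        nonlocal dg_usage_ms_pending
        dg_usage_ms_pending += ms
    def flush():
        nonlocal dg_usage_ms_pending
        record_dg_usage_ms("uid", dg_usage_ms_pending)
        dg_usage_ms_pending = 0
'''


def check_structure(source: str) -> dict:
    """Summarize the structural properties a batching test might assert on."""
    return {
        "accumulations": count_pattern(source, r"dg_usage_ms_pending\s*\+="),
        "nonlocals": count_pattern(source, r"nonlocal\s+dg_usage_ms_pending"),
        "per_chunk_writes": count_pattern(
            source, r"record_dg_usage_ms\([^)]*chunk_ms\)"
        ),
    }
```

Checks of this kind are brittle against renames, but they catch a per-chunk write slipping back in without needing a live Redis instance.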
test_budget_accounting_across_providers now checks for 4 accumulation points (dg_usage_ms_pending +=) and 2 flush calls instead of 4 direct record_dg_usage_ms calls. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
All checkpoints passed — PR ready for merge
Awaiting explicit merge approval.
by AI for @beastoin
Post-deploy verification: SUCCESS
Deployed by @mon, 25/25 pods running.
The remaining ~313 ops/sec is the expected baseline from
by AI for @beastoin
Summary
Fixes Redis ops/sec spike caused by record_dg_usage_ms being called on every audio chunk in the realtime transcription pipeline.

Root cause
record_dg_usage_ms (2 Redis ops: INCRBY + EXPIRE) was called per audio chunk (~50 chunks/sec/session). With ~100 concurrent sessions across 22 pods, this produced 8,500-12,500 Redis ops/sec on Redis Cloud, up from ~0 before FAIR_USE_ENABLED=true was deployed.

Fix
Added local accumulator dg_usage_ms_pending that batches DG usage locally and flushes to Redis every 60s via the existing _record_usage_periodically loop, plus on session end.
record_dg_usage_ms(uid, chunk_ms) calls → dg_usage_ms_pending += chunk_ms
nonlocal declarations for all 3 nested functions (_record_usage_periodically, receive_data, flush_stt_buffer)
use_custom_stt guard for consistency

Impact

Review cycle
nonlocal in receive_data, flush ordering): both fixed

Tests
test_dg_usage_batch.py): structure (5), behavior (4), math (2)
test_fair_use_api.py: budget accounting test matches batched pattern
test.sh updated with new test file

Local dev backend evidence
/v4/listen reachable (403 for unauthenticated, which is correct)

Risks
by AI for @beastoin