## Problem
Private cloud audio sync uploads raw PCM16 chunks (80KB each) directly to GCS every 5 seconds per active session. At current scale (200 concurrent sessions), this generates 3.5M GCS Class A write ops/day (~$518/month). At 100x scale this becomes $51,840/month in ops alone, and the approach doesn't scale further.
## Root Causes
- No compression: Raw PCM16 at 16KB/s stored as-is (or encrypted). Opus encoding would give ~10x reduction in storage and bandwidth.
- Per-chunk GCS writes: Every 5-second chunk = 1 GCS Class A write op. No batching.
- In-memory only queue: Audio chunks are buffered in memory. Pod crash = data loss.
## Proposed Solution (Phased)
### Phase 1: Opus Encoding (Immediate Win)
- Add `opuslib.Encoder(sample_rate, 1, APPLICATION_VOIP)` in the pusher
- Encode PCM → Opus before encryption: PCM → Opus → Encrypt
- ~10x storage reduction: 80KB → ~8KB per 5-second chunk
- CPU cost: ~0.3-0.8% vCPU per session (~32.5ms of encoding per 5s chunk)
- New extensions: `.opus` (standard) / `.opus.enc` (encrypted)
- Update `storage.py` list/delete/download to support the new extensions
- Update `merge_conversations.py` extension handling
- Update `conversations.py` duration math (don't assume a fixed +5s per chunk)
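The Phase 1 encode step can be sketched as below. `opuslib.Encoder` and `APPLICATION_VOIP` come from the opuslib API named above; everything else is illustrative. The 8kHz default is inferred from the 80KB/5s figure (16KB/s of PCM16 mono), and the length-prefix packet framing is a placeholder — a standards-compliant `.opus` file would use Ogg encapsulation (RFC 7845).

```python
FRAME_MS = 20  # Opus accepts 2.5/5/10/20/40/60ms frames; 20ms is the usual VoIP choice

def pcm16_frames(pcm: bytes, sample_rate: int) -> list:
    """Split raw PCM16 mono into fixed-size frames (a trailing partial frame is dropped)."""
    frame_bytes = sample_rate * FRAME_MS // 1000 * 2  # samples per frame * 2 bytes each
    return [pcm[i:i + frame_bytes] for i in range(0, len(pcm) - frame_bytes + 1, frame_bytes)]

def encode_chunk(pcm: bytes, sample_rate: int = 8000) -> bytes:
    """Opus-encode one raw chunk; the result then feeds the existing encrypt step."""
    import opuslib  # deferred import: callers without libopus can still use pcm16_frames
    enc = opuslib.Encoder(sample_rate, 1, opuslib.APPLICATION_VOIP)
    frame_samples = sample_rate * FRAME_MS // 1000
    out = bytearray()
    for frame in pcm16_frames(pcm, sample_rate):
        packet = enc.encode(frame, frame_samples)
        # Length-prefix each packet so the download path can re-split before decoding
        out += len(packet).to_bytes(2, "big") + packet
    return bytes(out)
```

An 80KB chunk splits into 250 frames of 320 bytes at these settings; each frame typically compresses to a few dozen bytes, which is where the ~10x figure comes from.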
### Phase 2: 60-Second Batch Upload
- Accumulate 60s of chunks before a single GCS upload (12x fewer ops)
- Flush triggers: `max_age=60s`, `max_bytes`, conversation end, pod shutdown
- Run batch sync in a dedicated worker deployment (not on the pusher hot path)
- GCS upload idempotency via deterministic object names
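A minimal sketch of the accumulator under these triggers — `BatchSpool`, the `upload` callable (standing in for the GCS client), and the naming scheme are all hypothetical:

```python
import time

class BatchSpool:
    """Hypothetical 60s accumulator; flushes one object per batch."""

    def __init__(self, upload, session_id: str, max_age: float = 60.0,
                 max_bytes: int = 1_000_000):
        self.upload = upload          # callable(object_name: str, payload: bytes)
        self.session_id = session_id
        self.max_age = max_age
        self.max_bytes = max_bytes
        self.buf = bytearray()
        self.started = None           # time the first buffered chunk arrived
        self.seq = 0                  # batch counter -> deterministic object names

    def add(self, chunk: bytes, now: float = None) -> None:
        now = time.monotonic() if now is None else now
        if self.started is None:
            self.started = now
        self.buf += chunk
        # max_age / max_bytes triggers; callers invoke flush() directly on
        # conversation end and pod shutdown.
        if now - self.started >= self.max_age or len(self.buf) >= self.max_bytes:
            self.flush()

    def flush(self) -> None:
        if not self.buf:
            return
        # Deterministic name: a retried upload of the same batch overwrites the
        # same object, which is what makes the GCS write idempotent.
        name = f"{self.session_id}/batch-{self.seq:06d}.opus.enc"
        self.upload(name, bytes(self.buf))
        self.buf = bytearray()
        self.started = None
        self.seq += 1
```

Twelve 5-second chunks per flush is what turns ~3.5M daily writes into ~288K, hence the 12x ops reduction.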
### Phase 3: Filestore Spool (If Durability SLO Requires)
- Write chunks to a Filestore NFS mount instead of memory
- State machine: `.open` → `.ready` → `.uploading` → `.done`
- Atomic rename + lease timeout for crash recovery
- Filestore tier selection:
  - 1x: Basic HDD 100GiB on GKE (~$49/mo)
  - 100x: Basic SSD or Zonal (~$768/mo)
  - 10,000x: must shard across multiple instances
## Cost Impact

| Scale | Current (5s writes) | Phase 2 (60s batches) | Phase 3 (+ Filestore) |
|---|---|---|---|
| 1x (200 sessions) | $518/mo | $43/mo | ~$92/mo |
| 100x (20K sessions) | $51,840/mo | $4,320/mo | ~$5,088/mo |
| 10,000x (2M sessions) | $5,184,000/mo | $432,000/mo | + sharding cost |
Opus encoding reduces storage/bandwidth ~10x but does NOT reduce Class A ops (same number of objects unless batched).
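The op-cost column reproduces from first principles, assuming the standard GCS Class A rate of $0.005 per 1,000 operations (the rate implied by the table's own figures):

```python
# Back-of-envelope check of the Class A op costs in the table above.
CLASS_A_PER_1000_USD = 0.005  # assumed standard-storage Class A rate

def monthly_op_cost(sessions: int, seconds_per_write: float, days: int = 30) -> float:
    """One Class A write per session per interval, priced per 1,000 ops."""
    writes_per_day = sessions * 86_400 / seconds_per_write
    return writes_per_day * days / 1_000 * CLASS_A_PER_1000_USD

print(monthly_op_cost(200, 5))       # current 5s writes: 518.4
print(monthly_op_cost(200, 60))      # Phase 2 batching: 43.2
print(monthly_op_cost(20_000, 60))   # 100x with batching: 4320.0
```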
## Compatibility Requirements
- Must handle mixed legacy (`.bin`/`.enc`) and new (`.opus`/`.opus.enc`) chunks
- Backward compatible: existing recordings remain readable
- `download_audio_chunks_and_merge()` must decode Opus back to PCM for playback
- Feature-flagged rollout per phase
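One way to handle the mixed-extension requirement is a single suffix dispatch shared by the list/download helpers — the function and table below are illustrative, not the real `storage.py` code. Longest suffix wins, so `.opus.enc` is never misread as plain `.enc`:

```python
# (codec, is_encrypted) per chunk suffix; legacy and new formats side by side.
SUFFIX_MAP = {
    ".bin": ("pcm16", False),      # legacy, plaintext
    ".enc": ("pcm16", True),       # legacy, encrypted
    ".opus": ("opus", False),      # new, plaintext
    ".opus.enc": ("opus", True),   # new, encrypted
}

def classify_chunk(object_name: str):
    """Return (codec, is_encrypted) for a stored chunk object, longest suffix first."""
    for suffix in sorted(SUFFIX_MAP, key=len, reverse=True):
        if object_name.endswith(suffix):
            return SUFFIX_MAP[suffix]
    raise ValueError(f"unrecognized chunk extension: {object_name}")
```

The download-and-merge path would then branch on the codec: decrypt if flagged, then decode Opus back to PCM only for `"opus"` chunks.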
## Files to Modify
- `backend/routers/pusher.py` — Opus encoder init, batch accumulation
- `backend/utils/other/storage.py` — new extensions, batch upload, Opus-aware list/download
- `backend/utils/encryption.py` — encrypt Opus bytes (same API, different input)
- `backend/database/conversations.py` — flexible duration math
- `backend/utils/conversations/merge_conversations.py` — generic extension handling
- Helm values — Filestore mount config (Phase 3)
## Acceptance Criteria

## References