backend: lower precache concurrency cap 4 → 2 to free storage_executor for sync by mdmohsin7 · Pull Request #7526 · BasedHardware/omi

mdmohsin7 · 2026-05-28T19:09:18Z

Summary

Halve _PRECACHE_FILE_SEM from 4 → 2 in backend/utils/other/storage.py.

Why

The storage_executor (96 workers) is shared between the sync v2 pipeline and the playback / precache flows. Under load, precache fan-out can pin up to ~36 workers per instance (4 outer slots × up to 9 each for chunk downloads), starving the sync pipeline's GCS reads.

Recent prod data (48h, all services):

caller=process_conversation: 7,914 events (37% of audio_merge work)
caller=precache_endpoint: 8,430 events (40%)
caller=sync_urls_first: 3,562 events (17%)
caller=sync_urls_bg: 1,285 events (6%)
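
As a quick check on the shares above, the percentages can be recomputed from the raw event counts. This is only a sketch: it assumes the four callers listed account for all audio_merge work in the window, and the dictionary and variable names are illustrative rather than taken from the codebase.

```python
# Event counts copied from the PR description (48h, all services).
events = {
    "process_conversation": 7914,
    "precache_endpoint": 8430,
    "sync_urls_first": 3562,
    "sync_urls_bg": 1285,
}

# Assumption: these four callers cover all audio_merge work in the window.
total = sum(events.values())
shares = {caller: round(100 * count / total) for caller, count in events.items()}

for caller, pct in shares.items():
    print(f"caller={caller}: {events[caller]:,} events ({pct}%)")
```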

Storage pool saturation warnings on backend-sync were averaging 96% utilization with peak queue depth 67. Sync's decode_ms p95 was sitting at 252s, largely queue wait.

Halving the cap drops precache's worst-case storage footprint to ~18 workers per instance, leaving ~78 free for sync.
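
The worker-budget arithmetic can be written out as a small sketch. The pool size (96) and the per-slot figure (up to 9) come from the description above; the constant and function names are hypothetical and do not exist in backend/utils/other/storage.py.

```python
STORAGE_EXECUTOR_WORKERS = 96  # shared pool size stated in this PR
THREADS_PER_SLOT = 9           # up to 9 storage workers per active precache slot


def precache_worst_case(slots: int) -> int:
    """Upper bound on storage_executor workers that precache can pin."""
    return slots * THREADS_PER_SLOT


def headroom_for_sync(slots: int) -> int:
    """Workers left for other callers when precache is at its worst case."""
    return STORAGE_EXECUTOR_WORKERS - precache_worst_case(slots)


for cap in (4, 2):
    print(cap, precache_worst_case(cap), headroom_for_sync(cap))
```

These are worst-case bounds per instance; actual usage depends on how many precache slots are busy and how many chunks each file has.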

Tradeoff

Speculative cache warming takes longer. Users see a one-off slowdown only if they open a brand-new conversation before warming finishes; on-demand /v1/sync/audio/{conv}/urls playback is unchanged.

Test plan

Watch executor_pool_health warnings on backend-sync for 30 min post-deploy — storage pool max_q should drop noticeably
Watch sync_v2 bg complete decode_ms p95 — should improve toward p50
No new 5xx errors on /v2/sync-local-files or /v1/sync/audio/*/precache
Smoke: open a few recently-created conversations and confirm playback still works

…sync

The storage_executor pool (96 workers) is shared between the sync v2 pipeline and the playback/precache flows. Under load, precache fan-out can pin up to ~36 workers per instance (4 outer slots × up to 9 each for chunk downloads), starving the sync pipeline's GCS reads.

Halving the per-process precache concurrency cap halves precache's worst-case storage footprint to ~18 workers per instance, leaving the sync pipeline more headroom on the shared pool. The cost is slower speculative cache warming (a one-off hit of roughly a few seconds when a user opens a brand-new conversation before warming completes); the on-demand /urls playback path is unchanged.

greptile-apps · 2026-05-28T19:12:31Z

Greptile Summary

Halves _PRECACHE_FILE_SEM from 4 → 2 in backend/utils/other/storage.py to reduce the peak storage_executor thread budget consumed by background audio pre-caching, freeing capacity for the sync v2 pipeline that shares the same 96-worker pool.

Each active precache slot occupies 1 blocking _cache_single thread plus up to 8 chunk-download threads via the _CHUNK_WINDOW_SIZE window, so the cap of 2 slots limits precache to ≤ 18 storage_executor workers (down from ≤ 36), leaving ~78 threads free for sync workloads.
The tradeoff is slower speculative cache warming; on-demand playback is unaffected because /v1/sync/audio/{conv}/urls calls get_or_create_merged_audio directly and bypasses _PRECACHE_FILE_SEM.
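
To illustrate where the "1 + 8" figure comes from, here is a minimal sketch of a windowed fan-out: one calling thread blocks while it keeps at most 8 chunk fetches in flight on the pool. It is not the code in storage.py; `download_chunks_windowed`, `fake_fetch`, and the demo pool are hypothetical, and only the window size of 8 is taken from the summary above.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

CHUNK_WINDOW_SIZE = 8  # mirrors the window size described above


def download_chunks_windowed(executor, chunk_ids, fetch, window=CHUNK_WINDOW_SIZE):
    """Fetch chunks in order while keeping at most `window` fetches in flight."""
    slots = threading.Semaphore(window)
    futures = []
    for chunk_id in chunk_ids:
        slots.acquire()  # caller blocks here once the window is full
        future = executor.submit(fetch, chunk_id)
        future.add_done_callback(lambda _f: slots.release())
        futures.append(future)
    return [f.result() for f in futures]


# Demo with a fake fetch that records peak concurrency.
_chunk_lock = threading.Lock()
_in_flight = 0
peak_in_flight = 0


def fake_fetch(chunk_id):
    global _in_flight, peak_in_flight
    with _chunk_lock:
        _in_flight += 1
        peak_in_flight = max(peak_in_flight, _in_flight)
    time.sleep(0.005)  # stand-in for a GCS download plus decode
    with _chunk_lock:
        _in_flight -= 1
    return chunk_id * 2


with ThreadPoolExecutor(max_workers=32) as demo_pool:
    chunk_results = download_chunks_windowed(demo_pool, range(40), fake_fetch)
```

In the real flow the calling thread is itself a storage_executor worker, which is why each precache slot is counted as up to 9 workers rather than 8.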

Confidence Score: 5/5

Safe to merge — the change is a single constant reduction with no logic alterations and no new code paths.

The semaphore value is the only thing that changed. The acquire/release pattern in _precache_all and its done-callback wiring are untouched, so there is no new deadlock surface. The new cap of 2 is mathematically consistent with the stated worker-budget goals (2 × 9 = 18 threads), and the production data cited in the PR description makes the direction of the change unambiguous.
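
The acquire/release pattern referred to here can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of storage.py: the function bodies, the use of `threading.BoundedSemaphore`, and the demo helpers are all hypothetical; only the cap of 2 and the release-in-done-callback wiring come from this PR.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for _PRECACHE_FILE_SEM; this PR lowers the cap from 4 to 2.
PRECACHE_FILE_SEM = threading.BoundedSemaphore(2)


def precache_all(executor, file_ids, cache_single):
    """Submit one cache_single task per file, with at most 2 running at once."""
    futures = []
    for file_id in file_ids:
        PRECACHE_FILE_SEM.acquire()  # wait for a free outer slot
        try:
            future = executor.submit(cache_single, file_id)
        except Exception:
            PRECACHE_FILE_SEM.release()  # submit failed, so no callback will fire
            raise
        # Release the slot when the task finishes, whether it returned or raised.
        future.add_done_callback(lambda _f: PRECACHE_FILE_SEM.release())
        futures.append(future)
    return futures


# Demo: record the peak number of concurrently running tasks.
_slot_lock = threading.Lock()
_running = 0
peak_running = 0


def fake_cache_single(file_id):
    global _running, peak_running
    with _slot_lock:
        _running += 1
        peak_running = max(peak_running, _running)
    time.sleep(0.01)  # stand-in for a blocking merge and cache write
    with _slot_lock:
        _running -= 1
    return file_id


with ThreadPoolExecutor(max_workers=8) as pool:
    cached = [f.result() for f in precache_all(pool, range(10), fake_cache_single)]
```

Because the slot is released from the done-callback rather than inside the task, the cap holds even when a task raises, which is why changing only the semaphore's initial value adds no new deadlock surface.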

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| backend/utils/other/storage.py | Single-line change lowering `_PRECACHE_FILE_SEM` from 4 → 2; reduces worst-case `storage_executor` thread usage for background precache from ~36 to ~18 per instance. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[precache_conversation_audio] -->|postprocess_executor.submit| B[_precache_all]
    B -->|_PRECACHE_FILE_SEM.acquire max=2 was 4| C[storage_executor.submit _cache_single]
    C --> D[get_or_create_merged_audio]
    D -->|cache hit| E[return cached WAV bytes]
    D -->|cache miss| F[download_audio_chunks_and_merge]
    F -->|STORAGE_CHUNK_SEM x8 in-flight per call| G[storage_executor chunk downloads]
    G --> H[GCS download + decode]
    C -->|done_callback: _PRECACHE_FILE_SEM.release| B


mdmohsin7 merged commit bf4a1d6 into main May 28, 2026
2 checks passed

mdmohsin7 deleted the caleb/precache-sem-cap branch May 28, 2026 19:12

mdmohsin7 mentioned this pull request May 28, 2026

Sync infra changes — May 28: storage pool, autoscaling, and app-side rate-limit fixes #7531

Open
