Problem
After PR #6175 migrated desktop CRUD from the Rust backend to the Python backend, desktop traffic is generating 504 timeouts on listing endpoints. The root cause is twofold:
- Aggressive polling — desktop app polls 6+ endpoints every 15-30 seconds, generating ~4,275 requests/user/day across ~500 users (2.15M total req/day peak)
- Slow Firestore queries — p50 latency is sub-second but long tail (users with large collections) exceeds the 2-minute app timeout
The mobile endpoints are unaffected (/v2/messages stayed at ~3 504s/day), proving the backend itself didn't regress — the desktop migration just added massive new traffic that amplified a pre-existing tail latency issue.
Evidence
Before/after PR #6175 (mon's data):
| Endpoint | 504s/day before | 504s/day after | Desktop reqs/day added |
|---|---|---|---|
| /v1/action-items | 2 | 298 | 141K-648K |
| /v1/conversations | 12 | 203 | 97K-347K |
| /v2/desktop/messages | 0 (new) | 206 | 430K |
| /v1/conversations/count | 0 | 36 | 100% desktop |
| /v3/memories | 2 | 103 | 10K-30K |
No load balancer rate limiting — Cloud Armor not configured, zero 429s. All requests pass through.
Root Cause: Desktop Polling Timers
Every polling timer found in the desktop app, ranked by request volume:
| # | Source | File:Line | Endpoint | Interval | Req/user/hr | Guards |
|---|---|---|---|---|---|---|
| 1 | ChatProvider messagePoll | ChatProvider.swift:553 | GET /v1/messages | 15s | 240 | isSignedIn, !sending, !loading, messages not empty |
| 2 | DesktopHomeView refresh | DesktopHomeView.swift:238 | GET /v1/conversations + /v1/conversations/count | 30s | 240 (2 calls/tick) | isSignedIn, !loading |
| 3 | TasksStore auto-refresh | TasksStore.swift:165 | GET /v1/action-items | 30s | 120 | isActive (page visible), isSignedIn |
| 4 | MemoriesPage auto-refresh | MemoriesPage.swift:210 | GET /v3/memories | 30s | 120 | isActive (page visible), isSignedIn |
| 5 | CrispManager | CrispManager.swift:63 | GET /v1/crisp/unread | 120s | 30 | !AuthBackoff |
| 6 | TranscriptionRetryService | TranscriptionRetryService.swift:25 | GET /v1/conversations | 60s | 0-60 | hasPendingSessions |
| 7 | didBecomeActive cascade | DesktopHomeView.swift:200 | GET /v1/conversations + count | on app activate | ~48/hr | fires on every cmd-tab back |
Key issues:
- ChatProvider (15s) and DesktopHomeView (30s) have no page-visibility guard — they run even when the window is hidden. Menu bar apps keep windows alive indefinitely.
- `refreshConversations()` makes 2 separate API calls (`getConversations` + `getConversationsCount`) per tick
- No timers are stopped when the app window is closed/hidden
- ~500 users x ~14,240 req/user/day (if the app runs 24h, as menu bar apps do) = ~7.1M theoretical max
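The per-timer arithmetic behind these totals can be sketched as below. This counts only the unguarded timers from the table (the guarded timers, retry service, and activation cascade make up the rest of the ~14,240/user/day theoretical max); the 24h runtime assumption matches the menu-bar-app behavior noted above.

```python
def requests_per_day(interval_s: int, calls_per_tick: int = 1, hours: int = 24) -> int:
    """Requests one polling timer generates per day at a fixed interval."""
    return (3600 // interval_s) * calls_per_tick * hours

# Timers with no page-visibility guard (they keep firing while hidden):
unguarded = {
    "ChatProvider messagePoll (15s)": requests_per_day(15),
    "DesktopHomeView refresh (30s, 2 calls/tick)": requests_per_day(30, calls_per_tick=2),
    "CrispManager (120s)": requests_per_day(120),
}

per_user = sum(unguarded.values())  # 12,240 req/user/day from these three alone
fleet = 500 * per_user              # ~6.1M req/day across ~500 users
print(per_user, fleet)
```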
APIs Requiring Backend Optimization
These are the Python backend endpoints that 504 for heavy users (large Firestore collections):
| Endpoint | p50 | p99 | Max | 504 rate | Issue |
|---|---|---|---|---|---|
| GET /v1/conversations | 0.68s | 5.8s | 158s | 0.04% | Full doc reads including compressed transcript_segments, no field projection |
| GET /v1/conversations/count | fast | - | - | 0.02% | .count().get() on unindexed filter combos |
| GET /v1/action-items | 0.51s | 6.0s | 111s | 0.05% | Double Firestore query for has_more pagination + Python re-sort |
| GET /v2/desktop/messages | 0.35s | 4.0s | 111s | 0.03% | 3-4 sequential Firestore round-trips on POST (save_message) |
| GET /v3/memories | 0.64s | 24s | 110s | 0.53% | Hardcoded limit=5000 on first page, no pagination |
Specific backend code issues:
- `conversations.py:277` — reads full documents with no `.select()` field projection. Transcript segments are large compressed blobs unnecessary for list views.
- `action_items.py:234-246` — executes a second identical query with offset+limit just to check `has_more`. Should request `limit+1` instead.
- `chat.py:689-720` — `save_message` does 3-4 sequential Firestore round-trips (acquire_session + set + get + update). Should batch writes.
- `memories.py:48` — hardcodes `limit=5000` when `offset=0`. Should cap at 200 and paginate.
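The limit+1 fix for the `has_more` double query can be sketched as below. This is not the actual `action_items.py` code; `fetch_rows` is a hypothetical stand-in for the real Firestore query.

```python
from typing import Callable, List, Tuple

def paginate(fetch_rows: Callable[[int, int], List[dict]],
             offset: int, limit: int) -> Tuple[List[dict], bool]:
    """Return (page, has_more) using a single query of limit+1 rows."""
    rows = fetch_rows(offset, limit + 1)  # one round-trip instead of two
    has_more = len(rows) > limit          # the extra row, if present, signals more pages
    return rows[:limit], has_more

# Usage against an in-memory stand-in for the collection:
data = [{"id": i} for i in range(25)]
fake_fetch = lambda off, lim: data[off:off + lim]
page, more = paginate(fake_fetch, offset=20, limit=10)  # 5 rows left, has_more False
```

The same single-query shape replaces the second identical offset+limit query, halving the Firestore reads per page request.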
Proposed Fix Plan
Phase 1: Reduce polling frequency (desktop client, highest impact)
Phase 2: Optimize backend queries (server-side) — `.select()` field projection for list endpoints (skip transcript_segments)
Phase 3: Infrastructure
Impact
- Phase 1 alone would reduce desktop traffic by 4-8x (from ~2M to ~250-500K req/day)
- Phase 2 would reduce p99 latency and eliminate remaining 504s for heavy users
- Combined: should bring desktop 504s from ~800/day to near-zero
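The 4-8x range follows directly from slowing each poll interval by the same factor; a quick sanity check of the arithmetic, where the concrete slowdown factors are illustrative assumptions rather than decided values:

```python
# Slowing a fixed-interval poll by a factor k cuts that timer's
# traffic by k. The factors below (e.g. 15s -> 60s, 15s -> 120s)
# are illustrative, not final interval choices.
current_req_per_day = 2_000_000

for slowdown in (4, 8):
    print(f"{slowdown}x slower polling: ~{current_req_per_day // slowdown:,} req/day")
```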