TestFlight builds split conversations across prod/staging backends on WS reconnect #5949

@beastoin

Description

TestFlight iOS builds silently switch from the production backend to the staging backend (api.omiapi.com) on WebSocket reconnection, splitting conversation data across environments. This corrupts audio storage, breaks speaker extraction, and makes audio playback fail. Discovered while investigating uid [REDACTED], conversation 5c100e1a, on iOS build 787, where a 45-minute recording had only 72 seconds on prod and the remaining 45 minutes on dev.

Current Behavior

  • TestFlight builds default testFlightApiEnvironment to 'staging' (preferences.dart:630), overriding apiBaseUrl to api.omiapi.com at startup (main.dart:160-171)
  • Initial WS connection may go to prod (cached state or race at startup), but after disconnect (code=1006), the 15s keep-alive reconnection (capture_provider.dart:1382) uses the staging URL
  • Conversation continues on staging backend for the remainder of the session — live transcripts work, but all audio chunks go to the dev GCS bucket
  • Dev GKE pusher writes chunk_timestamps (46 entries) to shared Firestore, but dev Cloud Run cannot read those chunks ("No chunks found" error), and prod backend cannot access them either
  • extract_speaker_samples runs on dev backend where person profiles may not exist → speaker extraction silently fails
  • current_session_segments (per-WS-session dict at transcribe.py:386) resets on each reconnection, so segments from the prod session are invisible to can_assign on the dev session — blocking extraction even if audio were accessible
  • Staging banner never shows: isUsingStagingApi (env.dart:34-41) reads _instance.stagingApiUrl directly (empty string — STAGING_API_URL env var not set), bypassing the stagingApiUrl getter's hardcoded fallback to https://api.omiapi.com/. Returns false even when the app IS running on staging. Introduced in commit 2f2b2e01d3 (Mar 16, beastoin).
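The staging-banner bullet can be modelled with a small sketch. The real code is Dart in env.dart and its body is not shown in this issue, so the class, field, and helper names below are hypothetical stand-ins, and the production URL is a placeholder. Only the fallback URL comes from the issue. The sketch assumes the check is a normalized string comparison, and shows why comparing against the raw (empty) value returns false while comparing against the getter with fallback returns true.

```python
FALLBACK_STAGING_URL = "https://api.omiapi.com/"  # fallback URL named in the issue


def normalize_url(url: str) -> str:
    """Hypothetical stand-in for _normalizeUrl: trim, lowercase, drop trailing slashes."""
    return url.strip().lower().rstrip("/")


class Env:
    """Toy model of env.dart; names are illustrative, not the app's actual API."""

    def __init__(self, api_base_url: str, raw_staging_api_url: str = "") -> None:
        self.api_base_url = api_base_url
        # Models _instance.stagingApiUrl: empty when STAGING_API_URL is not set.
        self.raw_staging_api_url = raw_staging_api_url

    @property
    def staging_api_url(self) -> str:
        # Models the getter: fall back to the hardcoded URL when the raw value is empty.
        return self.raw_staging_api_url or FALLBACK_STAGING_URL

    def is_using_staging_api_buggy(self) -> bool:
        # Compares against the raw value, so an unset env var can never match.
        return normalize_url(self.api_base_url) == normalize_url(self.raw_staging_api_url)

    def is_using_staging_api_fixed(self) -> bool:
        # Compares against the getter, so the fallback is honoured.
        return normalize_url(self.api_base_url) == normalize_url(self.staging_api_url)


on_staging = Env(api_base_url="https://api.omiapi.com/")
on_prod = Env(api_base_url="https://prod.example.invalid/")  # placeholder URL
```

Under these assumptions, `on_staging.is_using_staging_api_buggy()` is false even though the app is on staging, which matches the missing banner described above.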

Expected Behavior

TestFlight builds should connect to a single consistent backend for the entire session. WS reconnections must use the same backend as the initial connection. When the app is running on staging, the staging banner must be visible.
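A minimal sketch of the "single consistent backend" requirement, written in Python as a language-neutral model (the app code is Dart). `RecordingSession`, `resolve_base_url`, and the placeholder production URL are hypothetical names introduced for illustration. The idea is that the base URL is resolved once when the session starts, and every later reconnect reuses that value even if the global setting changes in the meantime.

```python
from typing import Callable, List


class RecordingSession:
    """Hypothetical model of a recording session that pins its backend URL."""

    def __init__(self, resolve_base_url: Callable[[], str]) -> None:
        # Resolve exactly once, at session start.
        self.pinned_base_url: str = resolve_base_url()
        self.connected_urls: List[str] = []

    def connect(self) -> str:
        # Used for the initial connection and for every reconnect:
        # always the pinned URL, never a fresh lookup.
        self.connected_urls.append(self.pinned_base_url)
        return self.pinned_base_url


# Simulated global setting that is overridden mid-session.
settings = {"api_base_url": "https://prod.example.invalid/"}  # placeholder URL

session = RecordingSession(lambda: settings["api_base_url"])
session.connect()                                       # initial connection
settings["api_base_url"] = "https://api.omiapi.com/"    # override lands after start
session.connect()                                       # reconnect after a disconnect
```

In this model both connections go to the URL captured at session start, so a late override cannot split the conversation. A new session would pick up the overridden value.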

Affected Areas

| File | Line | Description |
| --- | --- | --- |
| app/lib/main.dart | 160-171 | TestFlight detection overrides apiBaseUrl to staging |
| app/lib/backend/preferences.dart | 630 | testFlightApiEnvironment defaults to 'staging' |
| app/lib/providers/capture_provider.dart | 1357-1402 | Keep-alive reconnection uses current (overridden) apiBaseUrl |
| app/lib/services/sockets/pure_socket.dart | 72 | buildHeaders called on each connect; picks up overridden URL |
| backend/routers/transcribe.py | 386, 2518-2523 | current_session_segments per-session scope blocks cross-session assignment |
| app/lib/env/env.dart | 34-41 | isUsingStagingApi reads _instance.stagingApiUrl (empty) instead of Env.stagingApiUrl getter; banner never shows |

Solution

  1. Pin WS backend URL at session start: When a recording session begins, capture the resolved apiBaseUrl and use it for all reconnections within that session — do not re-resolve from Env.apiBaseUrl on each reconnect.
  2. Fix current_session_segments scope: Persist segment tracking across WS reconnections for the same conversation (e.g., keyed by conversation_id instead of per-session).
  3. Dev infra: Ensure dev GKE pusher and dev Cloud Run share the same GCS bucket/SA, or disable private cloud sync on staging entirely.
  4. Fix staging banner: isUsingStagingApi should compare against Env.stagingApiUrl (the getter with fallback) instead of _instance.stagingApiUrl (the raw env var). One-line fix in env.dart:38.
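Solution item 2 could look roughly like the sketch below. It assumes that can_assign amounts to checking whether a segment is already known, which this issue implies but does not show. `SegmentRegistry` and its methods are hypothetical and are not taken from transcribe.py. Segment tracking is keyed by conversation_id, so a new WebSocket session for the same conversation still sees earlier segments.

```python
from collections import defaultdict
from typing import Dict


class SegmentRegistry:
    """Hypothetical conversation-scoped replacement for a per-session dict."""

    def __init__(self) -> None:
        # conversation_id -> {segment_id -> segment start time in seconds}
        self._by_conversation: Dict[str, Dict[str, float]] = defaultdict(dict)

    def record(self, conversation_id: str, segment_id: str, start: float) -> None:
        self._by_conversation[conversation_id][segment_id] = start

    def can_assign(self, conversation_id: str, segment_id: str) -> bool:
        # Assumed semantics: a segment is assignable if this conversation has seen it,
        # regardless of which WebSocket session recorded it.
        return segment_id in self._by_conversation.get(conversation_id, {})


registry = SegmentRegistry()

# First WebSocket session records a segment.
registry.record("conv-a", "seg-1", 0.0)

# After a reconnect, a new session for the same conversation reuses the registry.
visible_after_reconnect = registry.can_assign("conv-a", "seg-1")
visible_in_other_conversation = registry.can_assign("conv-b", "seg-1")
```

An in-process registry like this only survives reconnects that land on the same backend process. Sharing state across instances would need external storage, and it would not help while reconnects can still land on a different environment, so it complements item 1 and does not replace it.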

Git Blame — isUsingStagingApi History

| Date | Author | Commit | Change |
| --- | --- | --- | --- |
| Feb 26 | beastoin | 44a961e0dd | Created isUsingStagingApi; compared against stagingApiUrl getter ✓ |
| Feb 27 | Thinh | 51df175b28 | Refactored to multi-line with _normalizeUrl (#5202) ✓ |
| Mar 16 | beastoin | 2f2b2e01d3 | Changed to _instance.stagingApiUrl; bypasses fallback, banner broken |

Files to Modify

  • app/lib/providers/capture_provider.dart — pin backend URL at session start
  • app/lib/services/sockets/transcription_service.dart — pass pinned URL to socket creation
  • backend/routers/transcribe.py — persist current_session_segments across reconnections for same conversation
  • app/lib/main.dart — consider not overriding URL if a recording session is active
  • app/lib/env/env.dart — fix isUsingStagingApi to use Env.stagingApiUrl getter instead of _instance.stagingApiUrl

Impact

Affects all TestFlight users who experience a WS reconnection during recording. Causes silent data corruption: split audio, broken playback, failed speaker extraction. Staging banner never shows despite app running on staging — users have no visual indication they are on the wrong backend. Production App Store builds are not affected (no staging override).


by AI for @beastoin

Labels: bug (Something isn't working), p2 (Priority: Important, score 14-21)