fix: voice chat 408 clock skew tolerance + user notification (#5929)#5934
fix: voice chat 408 clock skew tolerance + user notification (#5929)#5934
Conversation
Greptile SummaryThis PR adds only a coordination manifest ( Key observations:
Confidence Score: 3/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant App as Flutter App (shared.dart)
participant BE as Backend (TimeoutMiddleware)
Note over App,BE: Current state (main — unchanged by this PR)
App->>BE: POST /voice-chat (X-Request-Start-Time header)
BE-->>App: 408 "Request is too old and has been rejected." (plain text)
Note over App: No clock-skew detection, no user notification
Note over App,BE: Planned state (sub-PRs #5932 + #5933 — NOT yet merged)
App->>BE: POST /voice-chat (X-Request-Start-Time header)
BE-->>App: 408 JSON {server_time, client_time, skew_seconds}
App->>App: Parse clock_skew from JSON body
App->>App: Show snackbar: "Device clock is off by N seconds"
Reviews (1): Last reviewed commit: "Add ownership manifest for #5929 collab" | Re-trigger Greptile |
| 12+ unit tests. | ||
|
|
||
| - name: app-clock-skew-notification | ||
| owner: kelvin |
There was a problem hiding this comment.
status: planned contradicts PR description
The PR description states that sub-PR #5933 already implements "Parse 408 JSON clock_skew response in shared.dart HTTP layer. Show snackbar warning with skew magnitude so users know to fix their device clock." However, this scope is marked as status: planned, indicating work has not yet started. If sub-PR #5933 is genuinely done, this status should be completed (or at least in_progress). If the work is not done, the PR description and title are misleading.
| owner: kelvin | |
| status: in_progress |
| issue: 5929 | ||
| title: "Voice chat transcription fails with 408 on retry due to clock skew" | ||
| driver: kelvin | ||
| integration_branch: collab/5929-integration |
There was a problem hiding this comment.
Integration branch diverges from merge target
The integration_branch is listed as collab/5929-integration, but this PR targets main. The actual code changes to backend/utils/other/timeout.py and app/lib/backend/http/shared.dart described in the PR body are not present in this diff — only this coordination YAML has been added. Merging this PR to main will not fix issue #5929; main will still have the old plain-text 408 response in TimeoutMiddleware and no clock-skew notification in the Flutter HTTP layer. The PR title "fix: voice chat 408 clock skew tolerance + user notification" and "Fixes #5929" in the description are premature if the actual sub-PRs (#5932 and #5933) haven't landed yet.
Verification EvidenceBackend Tests (12/12 PASS)E2E Integration (3/3 PASS)408 JSON Response (verified format){
"error": "clock_skew",
"message": "Request rejected — your device clock may be out of sync",
"server_time": 1774240978.08,
"client_time": 1774240078.07,
"skew_seconds": 900.0,
"hint": "Check your device date/time settings and enable automatic time"
}App Static AnalysisMerge Matrix
by AI for @beastoin |
|
All checkpoints passed — PR is ready for merge. Checkpoint summary:
Test totals: 41 unit tests + 3 E2E = 44 tests, all passing. Awaiting explicit merge approval. by AI for @beastoin |
CP9B L2 Evidence: Integrated Backend + App TestSetup
Screenshot: Clock Skew SnackbarSnackbar text: "Your device clock is off by ~15 min. Check your date & time settings." Flutter Logs (clock skew detection)8 API calls detected clock skew, rate-limiter showed only 1 snackbar (45s cooldown). Backend Proxy Logs (20+ endpoints returning 408)Evidence LinksVerified Behaviors
by AI for @beastoin |
|
lgtm |
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add HTTP_CLOCK_SKEW_ALLOWANCE env var (default 5min) for clock drift - Effective stale threshold = max_age + skew_allowance (10min default) - 408 response returns JSON with server_time, client_time, skew_seconds so the app can detect drift and show user-facing warning Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
12 tests: tolerance boundaries, JSON diagnostics fields, env var config, zero-allowance fallback, multipart with skew, 504 timeout. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Parse backend JSON 408 response with error=clock_skew in the HTTP layer. Show snackbar warning with the skew magnitude so users know to fix their device clock settings. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Parse backend's JSON 408 {error:"clock_skew"} response in the HTTP layer.
When detected, show a localized snackbar warning the user their device
clock is out of sync.
Improvements over initial attempt:
- Content-type check before JSON parsing (avoid crash on non-JSON 408)
- Navigator state null guard (no crash if app in background)
- Rate-limited snackbar (45s cooldown, no spam on retries)
- Warning-level logging with server/client time details
- Localized message via l10n (clockSkewWarning key)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add localized string for the clock skew user warning with {minutes}
parameter placeholder. Translations provided for all 33 non-English
locales.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Auto-generated by flutter gen-l10n after adding clockSkewWarning key. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add test_at_threshold_minus_1s_passes (599s should pass with strict > comparison). Rename test_at_exact_boundary_rejected to test_just_beyond_boundary_rejected for clarity. Now 13 tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
25 tests covering: - JSON parsing (valid, malformed, missing fields, wrong error type) - Content-type gating (json, text/plain, text/html) - Type coercion (int, float, string skew_seconds) - Skew minutes calculation (ceiling, minimum 1, negative) - Rate limiter (first call allowed, immediate second blocked) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Import globalNavigatorKey from app_globals.dart instead of MyApp from main.dart. Low-level HTTP utilities should not depend on the app entry point — app_globals.dart exists specifically to break this dependency. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
5aefca4 to
5126716
Compare
Move clock skew parsing, event model, and rate-limiting into a dedicated ClockSkewDetector singleton with a broadcast stream. shared.dart now delegates to the detector instead of handling UI directly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
shared.dart now delegates clock skew detection to ClockSkewDetector. Removes imports of app_globals, AppLocalizations, and AppSnackbar — HTTP transport layer no longer controls UI directly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
AppShell now listens to ClockSkewDetector.onClockSkew stream and shows the localized snackbar with proper BuildContext. This moves UI control to the appropriate layer (widget tree) instead of the HTTP layer. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace MyApp.navigatorKey with globalNavigatorKey — MyApp is not in scope since the file imports app_globals.dart, not main.dart. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Tests now import and test the production ClockSkewDetector.parseResponse and ClockSkewEvent.skewMinutes directly, eliminating duplicated test abstractions that could drift from production code. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Tests ClockSkewDetector.checkResponse behavior: emits on first valid 408, suppresses within 45s cooldown, ignores non-clock-skew and non-JSON responses. Covers the reviewer-noted gap. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Test emission resumes after 45s cooldown expires - Test suppression at 44s (just before cooldown boundary) - Test broadcast stream delivers to multiple subscribers - Test missing content-type header returns null - Test JSON array body returns null - Total: 26 tests (16 parseResponse + 3 skewMinutes + 7 checkResponse) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Test exact 45s boundary: strict < means == cooldown emits
- Test string skew_seconds ('900') parsing via _parseInt
- Total: 28 tests (17 parseResponse + 3 skewMinutes + 8 checkResponse)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
CP9B L2 Evidence (v2): Integrated Backend + App Test (Updated Code)Setup
Screenshot: Clock Skew SnackbarSnackbar text: "Your device clock is off by ~15 min. Check your date & time settings." Flutter Logs (48 detections, 1 snackbar)48 API calls detected clock skew → rate-limiter showed only 1 snackbar (45s cooldown). Backend Proxy Logs (20+ endpoints returning 408)Evidence LinksVerified Behaviors (with refactored architecture)
Test Summary (28 tests)
by AI for @beastoin |
…rdware#5929) (BasedHardware#5934) Fixes BasedHardware#5929 — voice chat transcription fails with 408 when user's device clock is out of sync with server. ## Changes ### Backend (kenji, sub-PR BasedHardware#5932) - Add `HTTP_CLOCK_SKEW_ALLOWANCE` env var (default 5 min) to `TimeoutMiddleware` - Stale request threshold becomes `max_age + skew_allowance` (effective 10 min) - Return structured JSON on 408 with `server_time`, `client_time`, `skew_seconds`, `hint` for client-side detection - 13 unit tests ### App (kelvin, sub-PRs BasedHardware#5937, BasedHardware#5938) - **New: `ClockSkewDetector`** singleton (`backend/http/clock_skew_detector.dart`) — parses 408 JSON, emits typed `ClockSkewEvent` via broadcast stream, rate-limits (45s cooldown) - **`shared.dart` cleaned** — delegates to `ClockSkewDetector.instance.checkResponse()`, zero UI imports (no `app_globals`, `AppLocalizations`, `AppSnackbar`) - **`AppShell` subscribes** to `ClockSkewDetector.onClockSkew` stream — shows localized snackbar with proper `BuildContext` - Content-type check before JSON parsing (ignores HTML/text 408s from proxies) - `clockSkewWarning(minutes)` l10n key in all 34 locales - 28 unit tests — parsing (17), skewMinutes (3), checkResponse cooldown/emission/broadcast (8) ### Architecture (review feedback) - **Separation of concerns**: HTTP transport layer (`shared.dart`) no longer controls UI — it detects and emits, `AppShell` subscribes and renders - **Consistent with 401 pattern**: 401 handling does domain actions (refresh/signout), UI reacts via auth state at higher layers. Clock skew now follows the same boundary. - **Testability**: Tests import and verify real production classes directly instead of duplicating private logic - **Scalable**: Broadcast stream pattern supports future global HTTP signals (rate limit warnings, maintenance mode, etc.) - `.coordination/` added to `.gitignore` - Fixed pre-existing `MyApp.navigatorKey` → `globalNavigatorKey` in `device_provider.dart` - Rebased onto latest `main` ## Test Results - **Backend**: 13/13 unit tests pass (`test_timeout_middleware.py`) - **App**: 28/28 unit tests pass (`clock_skew_detection_test.dart`) — tests real production classes - CP7 reviewer: PR_APPROVED_LGTM - CP8 tester: TESTS_APPROVED ## CP9 L2 Evidence (Backend + App Integrated) **Setup**: Local proxy backend (port 10150) returning 408 `{error: "clock_skew", skew_seconds: 900}` + Flutter app on emulator (commit `a47a06b1d`). **Screenshot** — snackbar visible at bottom:  **Verified behaviors**: 1. `ClockSkewDetector.parseResponse` correctly parses `clock_skew` JSON from 408 responses 2. Content-type check ignores non-JSON 408s (confirmed against prod API which returns `text/html`) 3. `AppShell` stream subscriber shows localized snackbar with correct minutes (900s → ~15 min) 4. Rate limiter: 48 concurrent 408s → 1 snackbar (45s cooldown) 5. Warning logging for all detections **Evidence**: [Screenshot](https://storage.googleapis.com/omi-pr-assets/pr-5934/cp9b_snackbar_v2.webp) · [Flutter logs](https://storage.googleapis.com/omi-pr-assets/pr-5934/cp9b_flutter_logs_v2.txt) · [Proxy logs](https://storage.googleapis.com/omi-pr-assets/pr-5934/cp9b_proxy_logs_v2.txt) ## Deployment Steps ### 1. Backend first (backward-compatible) ```bash # Set env var in Helm values (optional — default is 300s / 5 min) HTTP_CLOCK_SKEW_ALLOWANCE=300 # Deploy backend-listen gh workflow run gcp_backend.yml -f environment=prod -f branch=main # Verify: stale request returns JSON 408 (not plain text) curl -s -H "X-Request-Start-Time: $(echo "$(date +%s) - 900" | bc)" \ https://api.omiapi.com/health | python3 -m json.tool # Expected: {"error": "clock_skew", "server_time": ..., "skew_seconds": ...} ``` ### 2. App second (backward-compatible) - The app change is backward-compatible: it only parses 408 with `content-type: json` AND `error: "clock_skew"` - Old backend 408s (text/html) are safely ignored - Release via normal mobile release pipeline (App Store + Google Play) ### Order matters - Backend **must** deploy first so the structured JSON 408 is available - App can deploy anytime after — it gracefully handles both old (text) and new (JSON) 408 formats ## Changed-Path Coverage | Path | Changed code | Happy | Non-happy | L1 | L2 | |------|-------------|-------|-----------|----|----| | P1 | `timeout.py:dispatch` — skew tolerance + JSON 408 | Fresh → 200 | 15min stale → 408 JSON | PASS | PASS | | P2 | `clock_skew_detector.dart:parseResponse` — JSON parse | Valid 408 → parsed | Non-JSON/malformed → null | PASS (17 tests) | PASS | | P3 | `clock_skew_detector.dart:checkResponse` — rate-limit + stream emit | First → event emitted | Second <45s → suppressed; 45s exact → emits | PASS (8 tests) | PASS | | P4 | `shared.dart:makeApiCall` — delegates to detector | 408 → detector called | Non-408 → skipped | PASS | PASS | | P5 | `shared.dart:makeMultipartApiCall` — delegates to detector | 408 → detector called | Non-408 → skipped | PASS | PASS | | P6 | `app_shell.dart:initState` — stream subscriber | Event → snackbar shown | Unmounted → ignored | PASS | PASS | CP9C skipped (no cluster/infra deps). --- _by AI for @beastoin_


Fixes #5929 — voice chat transcription fails with 408 when user's device clock is out of sync with server.
Changes
Backend (kenji, sub-PR #5932)
HTTP_CLOCK_SKEW_ALLOWANCEenv var (default 5 min) toTimeoutMiddlewaremax_age + skew_allowance(effective 10 min)server_time,client_time,skew_seconds,hintfor client-side detectionApp (kelvin, sub-PRs #5937, #5938)
ClockSkewDetectorsingleton (backend/http/clock_skew_detector.dart) — parses 408 JSON, emits typedClockSkewEventvia broadcast stream, rate-limits (45s cooldown)shared.dartcleaned — delegates toClockSkewDetector.instance.checkResponse(), zero UI imports (noapp_globals,AppLocalizations,AppSnackbar)AppShellsubscribes toClockSkewDetector.onClockSkewstream — shows localized snackbar with properBuildContextclockSkewWarning(minutes)l10n key in all 34 localesArchitecture (review feedback)
shared.dart) no longer controls UI — it detects and emits,AppShellsubscribes and renders.coordination/added to.gitignoreMyApp.navigatorKey→globalNavigatorKeyindevice_provider.dartmainTest Results
test_timeout_middleware.py)clock_skew_detection_test.dart) — tests real production classesCP9 L2 Evidence (Backend + App Integrated)
Setup: Local proxy backend (port 10150) returning 408
{error: "clock_skew", skew_seconds: 900}+ Flutter app on emulator (commita47a06b1d).Screenshot — snackbar visible at bottom:

Verified behaviors:
ClockSkewDetector.parseResponsecorrectly parsesclock_skewJSON from 408 responsestext/html)AppShellstream subscriber shows localized snackbar with correct minutes (900s → ~15 min)Evidence: Screenshot · Flutter logs · Proxy logs
Deployment Steps
1. Backend first (backward-compatible)
2. App second (backward-compatible)
content-type: jsonANDerror: "clock_skew"Order matters
Changed-Path Coverage
timeout.py:dispatch— skew tolerance + JSON 408clock_skew_detector.dart:parseResponse— JSON parseclock_skew_detector.dart:checkResponse— rate-limit + stream emitshared.dart:makeApiCall— delegates to detectorshared.dart:makeMultipartApiCall— delegates to detectorapp_shell.dart:initState— stream subscriberCP9C skipped (no cluster/infra deps).
by AI for @beastoin