
fix: voice chat 408 clock skew tolerance + user notification (#5929) #5934

Merged
beastoin merged 23 commits into main from collab/5929-integration
Mar 23, 2026

@beastoin (Collaborator) commented Mar 23, 2026

Fixes #5929 — voice chat transcription fails with 408 when user's device clock is out of sync with server.

Changes

Backend (kenji, sub-PR #5932)

  • Add HTTP_CLOCK_SKEW_ALLOWANCE env var (default 5 min) to TimeoutMiddleware
  • Stale request threshold becomes max_age + skew_allowance (effective 10 min)
  • Return structured JSON on 408 with server_time, client_time, skew_seconds, hint for client-side detection
  • 13 unit tests
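
The threshold rule in the backend list above can be sketched in Python. This is a minimal illustration, not the code in timeout.py: the helper name `check_request_age` and both constants are hypothetical, the 300 s base limit is inferred from the stated 10 min effective total, and rounding `skew_seconds` to one decimal is an assumption. It rejects only when the request age strictly exceeds max_age + skew_allowance, and it lets missing, malformed, or future-dated headers through, as the backend test names suggest. The `message` field of the real payload is left out for brevity.

```python
import math
from typing import Optional

MAX_AGE_SECONDS = 300.0         # inferred: 10 min effective limit minus the 5 min default allowance
SKEW_ALLOWANCE_SECONDS = 300.0  # default of HTTP_CLOCK_SKEW_ALLOWANCE per the PR description


def check_request_age(
    header_value: Optional[str],
    server_time: float,
    max_age: float = MAX_AGE_SECONDS,
    skew_allowance: float = SKEW_ALLOWANCE_SECONDS,
) -> Optional[dict]:
    """Return None if the request may proceed, else a 408 JSON body (hypothetical helper)."""
    try:
        client_time = float(header_value)
    except (TypeError, ValueError):
        return None  # missing or malformed header: do not reject
    if not math.isfinite(client_time):
        return None  # treat "nan"/"inf" like a malformed header
    age = server_time - client_time
    if age <= max_age + skew_allowance:
        return None  # only age > threshold is rejected; a negative age (future-dated) passes
    return {
        "error": "clock_skew",
        "server_time": server_time,
        "client_time": client_time,
        "skew_seconds": round(age, 1),  # rounding is an assumption
        "hint": "Check your device date/time settings and enable automatic time",
    }
```

With the defaults, a request stamped 599 s or exactly 600 s in the past passes and one stamped 900 s in the past is rejected. Passing `skew_allowance=0` falls back to the base limit alone.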

App (kelvin, sub-PRs #5937, #5938)

  • New: ClockSkewDetector singleton (backend/http/clock_skew_detector.dart) — parses 408 JSON, emits typed ClockSkewEvent via broadcast stream, rate-limits (45s cooldown)
  • shared.dart cleaned — delegates to ClockSkewDetector.instance.checkResponse(), zero UI imports (no app_globals, AppLocalizations, AppSnackbar)
  • AppShell subscribes to ClockSkewDetector.onClockSkew stream — shows localized snackbar with proper BuildContext
  • Content-type check before JSON parsing (ignores HTML/text 408s from proxies)
  • clockSkewWarning(minutes) l10n key in all 34 locales
  • 28 unit tests — parsing (17), skewMinutes (3), checkResponse cooldown/emission/broadcast (8)
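
The parsing rules in the app list above can be illustrated as follows. The production detector is written in Dart; this Python sketch only mirrors the behaviours named in this PR (408 only, JSON content-type, `error` equal to `clock_skew`, numeric or numeric-string `skew_seconds`). The Python names and signatures are illustrative stand-ins, and treating any content-type containing "json" as JSON is an assumption.

```python
import json
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ClockSkewEvent:
    skew_seconds: int


def _parse_int(value: object) -> Optional[int]:
    """Coerce an int, float, or numeric string to a rounded int; anything else gives None."""
    if isinstance(value, bool) or not isinstance(value, (int, float, str)):
        return None
    try:
        return round(float(value))
    except (ValueError, OverflowError):
        return None  # non-numeric string, NaN, or infinity


def parse_response(
    status_code: int, headers: Mapping[str, str], body: str
) -> Optional[ClockSkewEvent]:
    """Return an event only for a JSON 408 whose error field is "clock_skew"."""
    if status_code != 408:
        return None
    content_type = ""
    for name, value in headers.items():
        if name.lower() == "content-type":  # header names are case-insensitive
            content_type = value.lower()
    if "json" not in content_type:
        return None  # HTML or plain-text 408s (for example from proxies) are ignored
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        return None  # empty or malformed body
    if not isinstance(payload, dict) or payload.get("error") != "clock_skew":
        return None  # JSON arrays and other error types are ignored
    skew = _parse_int(payload.get("skew_seconds"))
    return None if skew is None else ClockSkewEvent(skew)
```

Returning `None` for every non-matching response is what keeps the client compatible with older backends that still answer 408 with plain text or HTML.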

Architecture (review feedback)

  • Separation of concerns: HTTP transport layer (shared.dart) no longer controls UI — it detects and emits, AppShell subscribes and renders
  • Consistent with 401 pattern: 401 handling does domain actions (refresh/signout), UI reacts via auth state at higher layers. Clock skew now follows the same boundary.
  • Testability: Tests import and verify real production classes directly instead of duplicating private logic
  • Scalable: Broadcast stream pattern supports future global HTTP signals (rate limit warnings, maintenance mode, etc.)
  • .coordination/ added to .gitignore
  • Fixed pre-existing MyApp.navigatorKey → globalNavigatorKey in device_provider.dart
  • Rebased onto latest main
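
The detect-and-emit boundary described in this section can be sketched as a small publisher with a cooldown. This is a Python stand-in for the Dart singleton and its broadcast stream: plain callbacks replace stream subscriptions, and the names `subscribe` and `report` plus the injectable `clock` parameter are hypothetical. The cooldown check uses a strict `<`, so an event arriving exactly 45 s after the last emission goes through, which is how the boundary test in the commit log reads.

```python
import time
from typing import Callable, List, Optional


class ClockSkewDetector:
    """Emits at most one event per cooldown window to every subscriber (illustrative only)."""

    def __init__(
        self,
        cooldown_seconds: float = 45.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._cooldown = cooldown_seconds
        self._clock = clock  # injectable so tests can control time
        self._last_emit: Optional[float] = None
        self._listeners: List[Callable[[int], None]] = []

    def subscribe(self, listener: Callable[[int], None]) -> None:
        """Register a listener; every listener receives every emitted event."""
        self._listeners.append(listener)

    def report(self, skew_seconds: int) -> bool:
        """Record a detection; return True if it was emitted, False if suppressed."""
        now = self._clock()
        if self._last_emit is not None and now - self._last_emit < self._cooldown:
            return False  # still inside the cooldown window
        self._last_emit = now
        for listener in list(self._listeners):
            listener(skew_seconds)
        return True
```

The transport layer would only call `report`, while a UI layer such as AppShell would call `subscribe` and decide how to render the warning, so the detector never needs a UI import.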

Test Results

  • Backend: 13/13 unit tests pass (test_timeout_middleware.py)
  • App: 28/28 unit tests pass (clock_skew_detection_test.dart) — tests real production classes
  • CP7 reviewer: PR_APPROVED_LGTM
  • CP8 tester: TESTS_APPROVED

CP9 L2 Evidence (Backend + App Integrated)

Setup: Local proxy backend (port 10150) returning 408 {error: "clock_skew", skew_seconds: 900} + Flutter app on emulator (commit a47a06b1d).

Screenshot — snackbar visible at bottom:
Clock skew snackbar

Verified behaviors:

  1. ClockSkewDetector.parseResponse correctly parses clock_skew JSON from 408 responses
  2. Content-type check ignores non-JSON 408s (confirmed against prod API which returns text/html)
  3. AppShell stream subscriber shows localized snackbar with correct minutes (900s → ~15 min)
  4. Rate limiter: 48 concurrent 408s → 1 snackbar (45s cooldown)
  5. Warning logging for all detections

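Evidence: Screenshot (https://storage.googleapis.com/omi-pr-assets/pr-5934/cp9b_snackbar_v2.webp) · Flutter logs (https://storage.googleapis.com/omi-pr-assets/pr-5934/cp9b_flutter_logs_v2.txt) · Proxy logs (https://storage.googleapis.com/omi-pr-assets/pr-5934/cp9b_proxy_logs_v2.txt)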

Deployment Steps

1. Backend first (backward-compatible)

# Set env var in Helm values (optional — default is 300s / 5 min)
HTTP_CLOCK_SKEW_ALLOWANCE=300

# Deploy backend-listen
gh workflow run gcp_backend.yml -f environment=prod -f branch=main

# Verify: stale request returns JSON 408 (not plain text)
curl -s -H "X-Request-Start-Time: $(echo "$(date +%s) - 900" | bc)" \
  https://api.omiapi.com/health | python3 -m json.tool
# Expected: {"error": "clock_skew", "server_time": ..., "skew_seconds": ...}

2. App second (backward-compatible)

  • The app change is backward-compatible: it only parses 408 with content-type: json AND error: "clock_skew"
  • Old backend 408s (text/html) are safely ignored
  • Release via normal mobile release pipeline (App Store + Google Play)

Order matters

  • Backend must deploy first so the structured JSON 408 is available
  • App can deploy anytime after — it gracefully handles both old (text) and new (JSON) 408 formats

Changed-Path Coverage

| Path | Changed code | Happy | Non-happy | L1 | L2 |
|------|--------------|-------|-----------|----|----|
| P1 | timeout.py:dispatch (skew tolerance + JSON 408) | Fresh → 200 | 15min stale → 408 JSON | PASS | PASS |
| P2 | clock_skew_detector.dart:parseResponse (JSON parse) | Valid 408 → parsed | Non-JSON/malformed → null | PASS (17 tests) | PASS |
| P3 | clock_skew_detector.dart:checkResponse (rate-limit + stream emit) | First → event emitted | Second <45s → suppressed; 45s exact → emits | PASS (8 tests) | PASS |
| P4 | shared.dart:makeApiCall (delegates to detector) | 408 → detector called | Non-408 → skipped | PASS | PASS |
| P5 | shared.dart:makeMultipartApiCall (delegates to detector) | 408 → detector called | Non-408 → skipped | PASS | PASS |
| P6 | app_shell.dart:initState (stream subscriber) | Event → snackbar shown | Unmounted → ignored | PASS | PASS |

CP9C skipped (no cluster/infra deps).


by AI for @beastoin

@greptile-apps (Bot, Contributor) commented Mar 23, 2026

Greptile Summary

This PR adds only a coordination manifest (.coordination/issue-5929-ownership.yaml) to track ownership of the two-scope fix for issue #5929 (voice chat 408 clock-skew failures). The actual code changes described in the PR — clock-skew tolerance in backend/utils/other/timeout.py and the Flutter snackbar notification in app/lib/backend/http/shared.dart — are not present in this diff and remain unmodified in main.

Key observations:

  • The PR title "fix: voice chat 408 clock skew tolerance + user notification" and the "Fixes #5929" line in the description are premature; no functional code has changed in main.
  • The app-clock-skew-notification scope is marked status: planned, directly contradicting the PR description, which claims sub-PR #5933 (fix(app): detect clock skew 408 and show user warning) is already implemented.
  • The coordination file correctly separates ownership (kenji owns backend, kelvin owns app) and defines do_not_touch zones, which is a good practice for parallel work.
  • shared_zones is empty ([]), though the HTTP error-handling layer (shared.dart) is touched by both the app scope and indirectly by the backend scope's response contract — a shared zone entry documenting the agreed 408 JSON schema would improve clarity.

Confidence Score: 3/5

  • Safe to merge as a tracking artifact, but the PR title and description overstate what is actually fixed: the underlying issue #5929 (Voice chat transcription fails with 408 on retry due to stale request check + clock skew) remains unresolved in main until the sub-PRs land.
  • The only file changed is a coordination YAML, so there is zero runtime risk from this PR itself. Score is held at 3 because the PR description claims "Fixes #5929" and references completed sub-PRs, while the coordination document contradicts this (app scope is planned, backend scope is in_progress) and neither code change appears in the diff. The mismatch between the stated intent and the actual content warrants attention before merge.
  • .coordination/issue-5929-ownership.yaml — status field for app-clock-skew-notification scope needs to be updated to reflect actual progress.

Important Files Changed

| Filename | Overview |
|----------|----------|
| .coordination/issue-5929-ownership.yaml | New coordination manifest describing the two-scope work split for issue #5929; the app scope is marked planned while the PR description claims both sub-PRs are already implemented, and the actual code fixes are absent from this diff. |

Sequence Diagram

sequenceDiagram
    participant App as Flutter App (shared.dart)
    participant BE as Backend (TimeoutMiddleware)

    Note over App,BE: Current state (main — unchanged by this PR)
    App->>BE: POST /voice-chat (X-Request-Start-Time header)
    BE-->>App: 408 "Request is too old and has been rejected." (plain text)
    Note over App: No clock-skew detection, no user notification

    Note over App,BE: Planned state (sub-PRs #5932 + #5933 — NOT yet merged)
    App->>BE: POST /voice-chat (X-Request-Start-Time header)
    BE-->>App: 408 JSON {server_time, client_time, skew_seconds}
    App->>App: Parse clock_skew from JSON body
    App->>App: Show snackbar: "Device clock is off by N seconds"

Reviews (1): Last reviewed commit: "Add ownership manifest for #5929 collab"

Comment thread .coordination/issue-5929-ownership.yaml Outdated
12+ unit tests.

- name: app-clock-skew-notification
owner: kelvin

P1 status: planned contradicts PR description

The PR description states that sub-PR #5933 already implements "Parse 408 JSON clock_skew response in shared.dart HTTP layer. Show snackbar warning with skew magnitude so users know to fix their device clock." However, this scope is marked as status: planned, indicating work has not yet started. If sub-PR #5933 is genuinely done, this status should be completed (or at least in_progress). If the work is not done, the PR description and title are misleading.

Suggested change
owner: kelvin
status: in_progress

Comment thread .coordination/issue-5929-ownership.yaml Outdated
issue: 5929
title: "Voice chat transcription fails with 408 on retry due to clock skew"
driver: kelvin
integration_branch: collab/5929-integration

P1 Integration branch diverges from merge target

The integration_branch is listed as collab/5929-integration, but this PR targets main. The actual code changes to backend/utils/other/timeout.py and app/lib/backend/http/shared.dart described in the PR body are not present in this diff — only this coordination YAML has been added. Merging this PR to main will not fix issue #5929; main will still have the old plain-text 408 response in TimeoutMiddleware and no clock-skew notification in the Flutter HTTP layer. The PR title "fix: voice chat 408 clock skew tolerance + user notification" and "Fixes #5929" in the description are premature if the actual sub-PRs (#5932 and #5933) haven't landed yet.

@beastoin (Collaborator, Author) commented

Verification Evidence

Backend Tests (12/12 PASS)

tests/unit/test_timeout_middleware.py::test_fresh_request_passes PASSED
tests/unit/test_timeout_middleware.py::test_within_clock_skew_tolerance_passes PASSED
tests/unit/test_timeout_middleware.py::test_beyond_tolerance_rejected_with_clock_skew_json PASSED
tests/unit/test_timeout_middleware.py::test_at_exact_boundary_rejected PASSED
tests/unit/test_timeout_middleware.py::test_just_within_boundary_passes PASSED
tests/unit/test_timeout_middleware.py::test_multipart_upload_with_skew_passes PASSED
tests/unit/test_timeout_middleware.py::test_malformed_header_passes PASSED
tests/unit/test_timeout_middleware.py::test_no_header_passes PASSED
tests/unit/test_timeout_middleware.py::test_future_dated_header_passes PASSED
tests/unit/test_timeout_middleware.py::test_custom_skew_allowance_via_env PASSED
tests/unit/test_timeout_middleware.py::test_zero_skew_allowance_original_behavior PASSED
tests/unit/test_timeout_middleware.py::test_timeout_returns_504 PASSED
12 passed in 11.36s

E2E Integration (3/3 PASS)

✓ 408 JSON format matches Flutter app parser expectations
✓ Fresh request returns 200 (no false rejection)
✓ 7min-skewed multipart upload passes (within 10min tolerance)

408 JSON Response (verified format)

{
  "error": "clock_skew",
  "message": "Request rejected — your device clock may be out of sync",
  "server_time": 1774240978.08,
  "client_time": 1774240078.07,
  "skew_seconds": 900.0,
  "hint": "Check your device date/time settings and enable automatic time"
}
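
As a sanity check on the payload above, `skew_seconds` appears to be `server_time` minus `client_time`, so a positive value would mean the device clock is behind the server. A small sketch under that assumption (the helper name `skew_from_times` is hypothetical, and the exact rounding the backend applies is not shown in this PR):

```python
def skew_from_times(server_time: float, client_time: float) -> float:
    """Positive result: the client timestamp is older than the server's clock."""
    return server_time - client_time


# Values copied from the verified 408 response in this comment.
payload = {
    "server_time": 1774240978.08,
    "client_time": 1774240078.07,
    "skew_seconds": 900.0,
}

derived = skew_from_times(payload["server_time"], payload["client_time"])

# The derived value lands within a small fraction of a second of the reported field.
print(f"derived={derived:.2f}s reported={payload['skew_seconds']}s")
```

The derived difference is about 900.01 s, which is consistent with the reported 900.0 if the backend rounds the value before serializing it.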

App Static Analysis

flutter analyze lib/backend/http/shared.dart — No issues found

Merge Matrix

| Sub-PR | Owner | Scope | CP | Status |
|--------|-------|-------|----|--------|
| #5932 | kenji | backend clock skew tolerance | CP6 | Merged |
| #5933 | kelvin | app clock skew notification | CP6 | Merged |

by AI for @beastoin

@beastoin (Collaborator, Author) commented

All checkpoints passed — PR is ready for merge.

Checkpoint summary:

Test totals: 41 unit tests + 3 E2E = 44 tests, all passing.

Awaiting explicit merge approval.


by AI for @beastoin

@beastoin (Collaborator, Author) commented

CP9B L2 Evidence: Integrated Backend + App Test

Setup

  • Backend: Local proxy server (port 10150) mimicking TimeoutMiddleware's clock_skew JSON 408 response
  • App: Built from collab/5929-kelvin-app-v2 branch, dev flavor, running on emulator-5554
  • Auth: Native Google Sign-In → Firebase Auth (based-hardware-dev), bypassing backend for auth
  • Flow: App → proxy backend → 408 {error: "clock_skew", skew_seconds: 900} → app parses → snackbar shown

Screenshot: Clock Skew Snackbar

Clock skew snackbar

Snackbar text: "Your device clock is off by ~15 min. Check your date & time settings."

Flutter Logs (clock skew detection)

[warning] | 6:07:14 215ms | Clock skew detected: skew_seconds=900, server_time=1774246040.44, client_time=1774245140.44
[warning] | 6:07:14 216ms | Clock skew detected: skew_seconds=900, ...
[warning] | 6:07:14 218ms | Clock skew detected: skew_seconds=900, ...
[warning] | 6:07:14 221ms | Clock skew detected: skew_seconds=900, ...
[warning] | 6:07:14 398ms | Clock skew detected: skew_seconds=900, ...
[warning] | 6:07:14 399ms | Clock skew detected: skew_seconds=900, ...
[warning] | 6:07:14 439ms | Clock skew detected: skew_seconds=900, ...
[warning] | 6:07:16 58ms  | Clock skew detected: skew_seconds=900, ...

8 API calls detected clock skew; the rate limiter showed only 1 snackbar (45s cooldown).

Backend Proxy Logs (20+ endpoints returning 408)

408 clock_skew -> GET /v1/action-items
408 clock_skew -> GET /v1/goals/all
408 clock_skew -> GET /v3/speech-profile
408 clock_skew -> GET /v1/users/profile
408 clock_skew -> GET /v1/users/people
408 clock_skew -> GET /v2/apps
408 clock_skew -> GET /v1/apps/enabled
408 clock_skew -> GET /v1/conversations
408 clock_skew -> GET /v2/messages
408 clock_skew -> GET /v1/users/me/subscription
408 clock_skew -> GET /v3/memories
408 clock_skew -> GET /v1/folders
408 clock_skew -> GET /v1/app/plans
408 clock_skew -> GET /v1/announcements/pending

Evidence Links

Verified Behaviors

  1. 408 JSON parsing: App correctly parses {error: "clock_skew", skew_seconds: 900.0} from HTTP response
  2. Content-type check: Only JSON responses are parsed (text/html 408s from prod API are correctly ignored)
  3. Snackbar display: Shows localized warning with calculated minutes (~15 min from 900 seconds)
  4. Rate limiting: 8 concurrent 408 responses → only 1 snackbar shown (45-second cooldown)
  5. Warning logging: All clock skew detections logged at warning level with server_time/client_time

by AI for @beastoin

@beastoin (Collaborator, Author) commented

lgtm

beastoin and others added 13 commits March 23, 2026 08:00
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add HTTP_CLOCK_SKEW_ALLOWANCE env var (default 5min) for clock drift
- Effective stale threshold = max_age + skew_allowance (10min default)
- 408 response returns JSON with server_time, client_time, skew_seconds
  so the app can detect drift and show user-facing warning

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
12 tests: tolerance boundaries, JSON diagnostics fields, env var
config, zero-allowance fallback, multipart with skew, 504 timeout.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Parse backend JSON 408 response with error=clock_skew in the HTTP layer.
Show snackbar warning with the skew magnitude so users know to fix their
device clock settings.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Parse backend's JSON 408 {error:"clock_skew"} response in the HTTP layer.
When detected, show a localized snackbar warning the user their device
clock is out of sync.

Improvements over initial attempt:
- Content-type check before JSON parsing (avoid crash on non-JSON 408)
- Navigator state null guard (no crash if app in background)
- Rate-limited snackbar (45s cooldown, no spam on retries)
- Warning-level logging with server/client time details
- Localized message via l10n (clockSkewWarning key)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add localized string for the clock skew user warning with {minutes}
parameter placeholder. Translations provided for all 33 non-English
locales.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Auto-generated by flutter gen-l10n after adding clockSkewWarning key.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add test_at_threshold_minus_1s_passes (599s should pass with strict >
comparison). Rename test_at_exact_boundary_rejected to
test_just_beyond_boundary_rejected for clarity. Now 13 tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
25 tests covering:
- JSON parsing (valid, malformed, missing fields, wrong error type)
- Content-type gating (json, text/plain, text/html)
- Type coercion (int, float, string skew_seconds)
- Skew minutes calculation (ceiling, minimum 1, negative)
- Rate limiter (first call allowed, immediate second blocked)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Import globalNavigatorKey from app_globals.dart instead of MyApp from
main.dart. Low-level HTTP utilities should not depend on the app entry
point — app_globals.dart exists specifically to break this dependency.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@beastoin beastoin force-pushed the collab/5929-integration branch from 5aefca4 to 5126716 Compare March 23, 2026 08:02
beastoin and others added 9 commits March 23, 2026 08:17
Move clock skew parsing, event model, and rate-limiting into a dedicated
ClockSkewDetector singleton with a broadcast stream. shared.dart now
delegates to the detector instead of handling UI directly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
shared.dart now delegates clock skew detection to ClockSkewDetector.
Removes imports of app_globals, AppLocalizations, and AppSnackbar —
HTTP transport layer no longer controls UI directly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
AppShell now listens to ClockSkewDetector.onClockSkew stream and shows
the localized snackbar with proper BuildContext. This moves UI control
to the appropriate layer (widget tree) instead of the HTTP layer.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace MyApp.navigatorKey with globalNavigatorKey — MyApp is not in
scope since the file imports app_globals.dart, not main.dart.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Tests now import and test the production ClockSkewDetector.parseResponse
and ClockSkewEvent.skewMinutes directly, eliminating duplicated test
abstractions that could drift from production code.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Tests ClockSkewDetector.checkResponse behavior: emits on first valid
408, suppresses within 45s cooldown, ignores non-clock-skew and
non-JSON responses. Covers the reviewer-noted gap.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Test emission resumes after 45s cooldown expires
- Test suppression at 44s (just before cooldown boundary)
- Test broadcast stream delivers to multiple subscribers
- Test missing content-type header returns null
- Test JSON array body returns null
- Total: 26 tests (16 parseResponse + 3 skewMinutes + 7 checkResponse)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Test exact 45s boundary: strict < means == cooldown emits
- Test string skew_seconds ('900') parsing via _parseInt
- Total: 28 tests (17 parseResponse + 3 skewMinutes + 8 checkResponse)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@beastoin (Collaborator, Author) commented

CP9B L2 Evidence (v2): Integrated Backend + App Test (Updated Code)

Setup

  • Backend: Local proxy server (port 10150) returning 408 {error: "clock_skew", skew_seconds: 900} JSON
  • App: Built from collab/5929-integration branch (commit a47a06b1d), dev flavor, running on emulator-5554
  • Architecture: ClockSkewDetector singleton + AppShell stream subscriber (refactored, no UI in HTTP layer)
  • Tests: 28 unit tests (17 parseResponse + 3 skewMinutes + 8 checkResponse incl. cooldown boundary + broadcast)

Screenshot: Clock Skew Snackbar

Clock skew snackbar

Snackbar text: "Your device clock is off by ~15 min. Check your date & time settings."

Flutter Logs (48 detections, 1 snackbar)

[warning] | Clock skew detected: skew_seconds=900, server_time=1774255594.33, client_time=1774254694.33
[warning] | Clock skew detected: skew_seconds=900, server_time=1774255594.33, ...
... (48 total detections across 2 hot restarts)

48 API calls detected clock skew → rate-limiter showed only 1 snackbar (45s cooldown).

Backend Proxy Logs (20+ endpoints returning 408)

408 clock_skew -> PATCH /v1/users/onboarding
408 clock_skew -> GET /v1/action-items?limit=100&offset=0
408 clock_skew -> GET /v1/goals/all
408 clock_skew -> GET /v3/speech-profile
408 clock_skew -> GET /v1/users/profile
408 clock_skew -> GET /v1/users/people?include_speech_samples=true
408 clock_skew -> GET /v2/apps?offset=0&limit=20
408 clock_skew -> GET /v1/app-categories
408 clock_skew -> GET /v2/messages
408 clock_skew -> GET /v1/users/me/subscription
408 clock_skew -> GET /v3/memories?limit=100&offset=0
408 clock_skew -> GET /v1/app/plans
408 clock_skew -> GET /v1/apps/enabled
408 clock_skew -> GET /v1/task-integrations
408 clock_skew -> GET /v1/conversations
408 clock_skew -> GET /v1/folders
408 clock_skew -> GET /v1/announcements/pending

Evidence Links

Verified Behaviors (with refactored architecture)

  1. ClockSkewDetector.parseResponse: Correctly parses {error: "clock_skew", skew_seconds: 900} from 408 JSON
  2. Content-type check: Only JSON responses parsed; text/html 408s from proxies safely ignored
  3. Broadcast stream: ClockSkewDetector emits ClockSkewEvent via broadcast stream
  4. AppShell subscription: Shows localized snackbar with calculated minutes (900s → ~15 min)
  5. Rate limiting: 48 concurrent 408 responses → 1 snackbar (45s cooldown working correctly)
  6. Separation of concerns: shared.dart has zero UI imports — delegates to detector; AppShell subscribes and renders

Test Summary (28 tests)

$ flutter test test/unit/clock_skew_detection_test.dart
+28: All tests passed!
  • 17 parseResponse: valid JSON, non-408, empty body, non-JSON, HTML, malformed, wrong error, missing error, missing skew, non-numeric string, numeric string, integer, float rounding, case-insensitive content-type, non-408 status, missing content-type, JSON array
  • 3 skewMinutes: ceiling, minimum 1, negative/abs
  • 8 checkResponse: first emission, cooldown suppression, expiry at 46s, boundary at 44s, exact 45s boundary, non-clock-skew, non-JSON 408, broadcast to multiple subscribers
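
The three skewMinutes cases listed above (ceiling, minimum 1, negative/abs) suggest a conversion along these lines. This is a Python stand-in for the Dart getter, and the formula is inferred from the test names rather than taken from the production source:

```python
import math


def skew_minutes(skew_seconds: int) -> int:
    """Convert a skew in seconds to whole minutes for display.

    Takes the absolute value, rounds up, and never returns less than 1
    so that the warning does not read "off by 0 min".
    """
    return max(1, math.ceil(abs(skew_seconds) / 60))
```

Under this reading, the 900 s skew used in the integration test maps to the 15 minutes shown in the snackbar, and a skew of a few seconds would still display as 1 minute.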

by AI for @beastoin

@beastoin beastoin merged commit d25ec6e into main Mar 23, 2026
2 checks passed
@beastoin beastoin deleted the collab/5929-integration branch March 23, 2026 09:03
Glucksberg pushed a commit to Glucksberg/omi-local that referenced this pull request Apr 28, 2026
…rdware#5929) (BasedHardware#5934)



Development

Successfully merging this pull request may close these issues.

Voice chat transcription fails with 408 on retry due to stale request check + clock skew

1 participant