Add clock skew tolerance to stale request check (#5929) #5931

Closed
beastoin wants to merge 9 commits into main from fix/stale-request-clock-skew-5929

Conversation

@beastoin
Collaborator

@beastoin beastoin commented Mar 23, 2026

Summary

  • Add configurable clock skew tolerance to TimeoutMiddleware stale request check
  • Return JSON 408 response with clock skew diagnostics (server_time, client_time, skew_seconds) so the app can detect drift and show a user-facing warning
  • Fixes voice chat transcription 408 failures caused by skew between the client and server clocks

Problem

TimeoutMiddleware compares time.time() against the client-set X-Request-Start-Time header using a 5-minute threshold. When a user's phone clock is ~5 minutes behind server UTC, requests are rejected with 408 "Request is too old" even though they have only just arrived. The plain-text 408 response carried no diagnostic information, so the app could not tell the user what was wrong.

Solution

1. Clock skew tolerance (industry-standard, like AWS SigV4)

  • New HTTP_CLOCK_SKEW_ALLOWANCE env var (default: 5 minutes)
  • Effective stale threshold = max_age (5min) + clock_skew_allowance (5min) = 10 minutes
  • A phone with 5-min clock drift + reasonable transfer delay stays well within tolerance
  • Set HTTP_CLOCK_SKEW_ALLOWANCE=0 to restore original behavior
  • Stale check protects ALL endpoints (no content-type bypass)
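
The threshold arithmetic above can be sketched as a small pure function. This is an illustration only, not the code in backend/utils/other/timeout.py: the name is_stale, the strict ">" comparison at the boundary, and treating future-dated timestamps as fresh are all assumptions.

```python
import time

MAX_AGE_SECONDS = 300.0       # 5-minute stale threshold from the PR description
CLOCK_SKEW_ALLOWANCE = 300.0  # default allowance from the PR description


def is_stale(client_start, now=None, max_age=MAX_AGE_SECONDS,
             skew_allowance=CLOCK_SKEW_ALLOWANCE):
    """Return True when the client timestamp is older than max_age + skew_allowance.

    Hypothetical helper: the strict '>' at the boundary is an assumption.
    """
    if now is None:
        now = time.time()
    age = now - client_start  # negative when the client clock is ahead
    return age > max_age + skew_allowance


now = 1_000_000.0
print(is_stale(now - 7 * 60, now))                    # 7 min old, 10 min threshold
print(is_stale(now - 11 * 60, now))                   # 11 min old, 10 min threshold
print(is_stale(now - 7 * 60, now, skew_allowance=0))  # original 5 min threshold
```

With the default allowance the 7-minute-old request is accepted and the 11-minute-old one is rejected; with an allowance of 0 the 7-minute-old request is rejected again, which matches the "restore original behavior" bullet.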

2. JSON 408 response with clock skew diagnostics

When a request IS rejected, the 408 response now returns:

{
  "error": "clock_skew",
  "message": "Request rejected — your device clock may be out of sync",
  "server_time": 1742691120.5,
  "client_time": 1742690820.5,
  "skew_seconds": 300.0,
  "hint": "Check your device date/time settings and enable automatic time"
}

This enables the app to detect clock drift and show a user-facing warning (Flutter side handled separately by @Kelvin).
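
A minimal sketch of how such a body could be assembled, assuming skew_seconds is simply server_time minus client_time (which agrees with the sample values above). The function name build_clock_skew_body is hypothetical, and the "message" field is left out for brevity.

```python
import json


def build_clock_skew_body(server_time, client_time):
    """Assemble the diagnostic fields named in the PR description.

    Hypothetical helper; skew_seconds = server_time - client_time is an
    assumption inferred from the sample response.
    """
    return {
        "error": "clock_skew",
        "server_time": server_time,
        "client_time": client_time,
        "skew_seconds": server_time - client_time,
        "hint": "Check your device date/time settings and enable automatic time",
    }


body = build_clock_skew_body(1742691120.5, 1742690820.5)
print(json.dumps(body, indent=2))
```

A client could compare skew_seconds against its own tolerance to decide whether to show a warning.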

Changes

backend/utils/other/timeout.py:

  • Add clock_skew_allowance from HTTP_CLOCK_SKEW_ALLOWANCE env var
  • Stale threshold = max_age + clock_skew_allowance
  • Return JSONResponse with diagnostic fields on 408
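
Reading the allowance from the environment might look like the sketch below. The PR does not state the unit or the handling of invalid values, so seconds, a fallback to the default on unparsable input, and clamping negatives to zero are assumptions; read_clock_skew_allowance is a hypothetical name.

```python
import os

DEFAULT_CLOCK_SKEW_ALLOWANCE = 300.0  # 5 minutes, assumed to be in seconds


def read_clock_skew_allowance(environ=os.environ):
    """Parse HTTP_CLOCK_SKEW_ALLOWANCE, falling back to the default (assumed behavior)."""
    raw = environ.get("HTTP_CLOCK_SKEW_ALLOWANCE")
    if raw is None:
        return DEFAULT_CLOCK_SKEW_ALLOWANCE
    try:
        value = float(raw)
    except ValueError:
        # Unparsable value: keep the default rather than failing at startup.
        return DEFAULT_CLOCK_SKEW_ALLOWANCE
    return max(value, 0.0)  # a negative allowance is treated as zero


print(read_clock_skew_allowance({}))
print(read_clock_skew_allowance({"HTTP_CLOCK_SKEW_ALLOWANCE": "0"}))
```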

backend/tests/unit/test_timeout_middleware.py — 12 tests:

  • Fresh request passes
  • Within clock skew tolerance passes (7min old, threshold 10min)
  • Beyond tolerance → 408 with JSON diagnostics (error, server_time, client_time, skew_seconds, hint)
  • Boundary behavior (exact threshold ± 1s)
  • Multipart upload with clock skew passes
  • Malformed/missing/future headers handled correctly
  • Custom skew allowance via env var
  • Zero allowance restores original behavior
  • Timeout → 504

Test plan

  • 12 unit tests pass
  • black formatting clean
  • Codex reviewer approved (PR_APPROVED_LGTM)
  • Codex tester approved (TESTS_APPROVED)
  • CP9A: build and run backend, verify stale check + JSON response
  • CP9B: build and run backend + app integrated

Fixes #5929

🤖 Generated with Claude Code

beastoin and others added 2 commits March 23, 2026 03:40

Client clock skew causes false 408 rejections for voice chat file
uploads when X-Request-Start-Time appears stale on arrival.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

8 tests: stale rejection, multipart bypass, malformed header,
future-dated header, no header, boundary variations, 504 timeout.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@greptile-apps
Contributor

greptile-apps Bot commented Mar 23, 2026

Greptile Summary

This PR fixes a clock-skew-induced 408 failure during large file uploads by skipping the TimeoutMiddleware stale-request check for multipart/form-data requests, while leaving the check in place for all other request types.

Key changes:

  • backend/utils/other/timeout.py: Reads Content-Type on every request and sets is_file_upload = "multipart/form-data" in content_type; the stale check (X-Request-Start-Time age comparison) is skipped when this flag is True.
  • backend/tests/unit/test_timeout_middleware.py: 8 new unit tests covering stale/fresh non-multipart, stale/fresh multipart (with and without boundary param), missing header, malformed header, future-dated header, and the 504 timeout path.

Minor points to consider:

  • The is_file_upload flag is evaluated even when X-Request-Start-Time is absent (no functional impact, just a tiny unnecessary computation).
  • The bypass relies solely on the client-supplied Content-Type header, meaning any request can opt out of the stale check by declaring itself multipart. Since clients could already bypass the check by omitting X-Request-Start-Time entirely, this doesn't introduce a fundamentally new bypass path — but narrowing the condition to POST + multipart/form-data would minimise the surface.
  • One test uses assert response.status_code != 408 instead of the stronger assert response.status_code == 200.

Confidence Score: 4/5

  • Safe to merge with one minor hardening suggestion and one test assertion tweak.
  • The fix is small, well-targeted, and well-tested. The two observations (client-controlled bypass surface and weak test assertion) are non-blocking style/hardening points rather than correctness bugs — the middleware's stale check was already bypassable by omitting the header, so the new bypass path is not significantly worse. No regressions are introduced.
  • backend/utils/other/timeout.py — review the is_file_upload condition if tighter bypass control is desired.

Important Files Changed

| Filename | Overview |
| --- | --- |
| backend/utils/other/timeout.py | 3-line fix that conditionally skips the stale X-Request-Start-Time check for multipart/form-data requests; logic is correct but the bypass relies on a client-controlled header, which slightly widens the existing bypass surface. |
| backend/tests/unit/test_timeout_middleware.py | 8 well-structured unit tests covering the new multipart-skip behaviour; one test uses a weak != 408 assertion that could silently pass on unexpected errors. |

Sequence Diagram

sequenceDiagram
    participant Client
    participant TimeoutMiddleware
    participant RouteHandler

    Client->>TimeoutMiddleware: POST /upload (multipart/form-data, X-Request-Start-Time: stale)
    TimeoutMiddleware->>TimeoutMiddleware: is_file_upload = True → skip stale check
    TimeoutMiddleware->>RouteHandler: forward request (within asyncio.wait_for timeout)
    RouteHandler-->>TimeoutMiddleware: 200 OK
    TimeoutMiddleware-->>Client: 200 OK

    Client->>TimeoutMiddleware: POST /api/json (application/json, X-Request-Start-Time: stale)
    TimeoutMiddleware->>TimeoutMiddleware: is_file_upload = False → stale check runs
    TimeoutMiddleware-->>Client: 408 Request is too old

Reviews (1): Last reviewed commit: "Add unit tests for TimeoutMiddleware sta..."

Comment on lines +79 to +91
# Should not be 408 — multipart skips stale check
assert response.status_code != 408


def test_fresh_non_multipart_passes():
    """Non-multipart request with fresh header passes through."""
    app = _make_app()
    client = TestClient(app)
    fresh_time = str(time.time())
    response = client.get("/ok", headers={"X-Request-Start-Time": fresh_time})
    assert response.status_code == 200


Contributor

P2 Weak assertion hides endpoint errors

The assertion assert response.status_code != 408 would pass for any non-408 response, including a 422 Unprocessable Entity or 500 Internal Server Error. For a test whose goal is to verify the request passes through successfully, asserting == 200 is more accurate and would have caught unexpected failures.

Suggested change
# Should not be 408 — multipart skips stale check
assert response.status_code != 408
def test_fresh_non_multipart_passes():
    """Non-multipart request with fresh header passes through."""
    app = _make_app()
    client = TestClient(app)
    fresh_time = str(time.time())
    response = client.get("/ok", headers={"X-Request-Start-Time": fresh_time})
    assert response.status_code == 200
# Should pass through — multipart skips stale check
assert response.status_code == 200
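
To make the difference between the two assertions concrete, the following sketch evaluates both conditions against a few status codes. The helper names are hypothetical and exist only for illustration.

```python
def weak_check(status_code):
    """Condition used by the original assertion: anything except 408 passes."""
    return status_code != 408


def strict_check(status_code):
    """Condition proposed in the review: only a 200 passes."""
    return status_code == 200


for status in (200, 422, 500, 408):
    print(status, weak_check(status), strict_check(status))
```

Only the strict condition distinguishes a successful pass-through from a 422 or 500 raised further down the stack.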

Comment thread backend/utils/other/timeout.py Outdated
Comment on lines +43 to +46
content_type = request.headers.get("content-type", "")
is_file_upload = "multipart/form-data" in content_type

if request_start_header and not is_file_upload:
Contributor

P2 Stale check bypassed via client-controlled header

The Content-Type header is set by the client, so any caller can trivially bypass the stale check on any endpoint by including multipart/form-data in their Content-Type — even for plain JSON or GET-style requests. This is worth being aware of because TimeoutMiddleware is the only place the X-Request-Start-Time anti-replay protection is enforced.

In the existing design, a client that omits X-Request-Start-Time already bypasses the check, so this doesn't introduce a fundamentally new bypass path. That said, it does extend the surface: previously the bypass required not sending the header; now it also works by asserting a particular content-type. A slightly more robust alternative is to only skip the stale check when the request method is POST and the Content-Type is multipart/form-data, which limits the bypass to the smallest surface needed:

is_file_upload = request.method == "POST" and "multipart/form-data" in content_type

This is a suggestion rather than a blocking issue, but worth considering before merge.
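
A stricter version of this condition could compare the base media type exactly instead of using a substring match, in addition to the POST restriction suggested above. The sketch below is an assumption about how that might look, not the code in the PR; is_multipart_upload is a hypothetical name.

```python
def is_multipart_upload(method, content_type):
    """Return True only for POST requests whose base media type is multipart/form-data.

    Hypothetical helper: parameters such as 'boundary=...' are ignored and the
    comparison is case-insensitive.
    """
    base_type = content_type.split(";", 1)[0].strip().lower()
    return method.upper() == "POST" and base_type == "multipart/form-data"


print(is_multipart_upload("POST", "Multipart/Form-Data; boundary=abc"))
print(is_multipart_upload("POST", "application/json; note=multipart/form-data"))
print(is_multipart_upload("GET", "multipart/form-data"))
```

The second call shows the case a substring match would wrongly accept: the multipart token appears only in a parameter, so the exact base-type comparison rejects it.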

beastoin and others added 5 commits March 23, 2026 03:45

Case-insensitive comparison of base media type instead of substring
match. Prevents bypass via crafted Content-Type headers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Tests for case-insensitive content-type, non-multipart rejection,
and missing content-type header behavior.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Stronger assertions (== 200 instead of != 408), proper substring
bypass regression test with crafted content-type.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Replace multipart bypass with industry-standard clock skew allowance.
Effective threshold = max_age (5min) + skew_allowance (5min) = 10min.
Configurable via HTTP_CLOCK_SKEW_ALLOWANCE env var. Stale check now
protects all endpoints including file uploads.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

12 tests covering: within/beyond tolerance, boundary behavior,
multipart with skew, env var configuration, zero-allowance fallback.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@beastoin beastoin changed the title from "Skip stale request check for multipart uploads (clock skew fix)" to "Add clock skew tolerance to stale request check (#5929)" Mar 23, 2026

beastoin and others added 2 commits March 23, 2026 04:02

408 response now includes server_time, client_time, skew_seconds,
and hint so the app can detect clock drift and show a user warning.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Verify error, server_time, client_time, skew_seconds, hint fields
in the 408 rejection response body.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@beastoin
Collaborator Author

Superseded by sub-PR #5932 targeting collab/5929-integration branch (collab protocol with @Kelvin for app-side clock skew warning).

@beastoin beastoin closed this Mar 23, 2026
@github-actions
Contributor

Hey @beastoin 👋

Thank you so much for taking the time to contribute to Omi! We truly appreciate you putting in the effort to submit this pull request.

After careful review, we've decided not to merge this particular PR. Please don't take this personally — we genuinely try to merge as many contributions as possible, but sometimes we have to make tough calls based on:

  • Project standards — Ensuring consistency across the codebase
  • User needs — Making sure changes align with what our users need
  • Code best practices — Maintaining code quality and maintainability
  • Project direction — Keeping aligned with our roadmap and vision

Your contribution is still valuable to us, and we'd love to see you contribute again in the future! If you'd like feedback on how to improve this PR or want to discuss alternative approaches, please don't hesitate to reach out.

Thank you for being part of the Omi community! 💜

Development

Successfully merging this pull request may close these issues.

Voice chat transcription fails with 408 on retry due to stale request check + clock skew
