Add clock skew tolerance to stale request check (#5929)#5931
Add clock skew tolerance to stale request check (#5929)#5931
Conversation
Client clock skew causes false 408 rejections for voice chat file uploads when X-Request-Start-Time appears stale on arrival. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
8 tests: stale rejection, multipart bypass, malformed header, future-dated header, no header, boundary variations, 504 timeout. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Greptile SummaryThis PR fixes a clock-skew-induced 408 failure during large file uploads by skipping the Key changes:
Minor points to consider:
Confidence Score: 4/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant Client
participant TimeoutMiddleware
participant RouteHandler
Client->>TimeoutMiddleware: POST /upload (multipart/form-data, X-Request-Start-Time: stale)
TimeoutMiddleware->>TimeoutMiddleware: is_file_upload = True → skip stale check
TimeoutMiddleware->>RouteHandler: forward request (within asyncio.wait_for timeout)
RouteHandler-->>TimeoutMiddleware: 200 OK
TimeoutMiddleware-->>Client: 200 OK
Client->>TimeoutMiddleware: POST /api/json (application/json, X-Request-Start-Time: stale)
TimeoutMiddleware->>TimeoutMiddleware: is_file_upload = False → stale check runs
TimeoutMiddleware-->>Client: 408 Request is too old
Reviews (1): Last reviewed commit: "Add unit tests for TimeoutMiddleware sta..." | Re-trigger Greptile |
| # Should not be 408 — multipart skips stale check | ||
| assert response.status_code != 408 | ||
|
|
||
|
|
||
| def test_fresh_non_multipart_passes(): | ||
| """Non-multipart request with fresh header passes through.""" | ||
| app = _make_app() | ||
| client = TestClient(app) | ||
| fresh_time = str(time.time()) | ||
| response = client.get("/ok", headers={"X-Request-Start-Time": fresh_time}) | ||
| assert response.status_code == 200 | ||
|
|
||
|
|
There was a problem hiding this comment.
Weak assertion hides endpoint errors
The assertion assert response.status_code != 408 would pass for any non-408 response, including a 422 Unprocessable Entity or 500 Internal Server Error. For a test whose goal is to verify the request passes through successfully, asserting == 200 is more accurate and would have caught unexpected failures.
| # Should not be 408 — multipart skips stale check | |
| assert response.status_code != 408 | |
| def test_fresh_non_multipart_passes(): | |
| """Non-multipart request with fresh header passes through.""" | |
| app = _make_app() | |
| client = TestClient(app) | |
| fresh_time = str(time.time()) | |
| response = client.get("/ok", headers={"X-Request-Start-Time": fresh_time}) | |
| assert response.status_code == 200 | |
| # Should pass through — multipart skips stale check | |
| assert response.status_code == 200 |
| content_type = request.headers.get("content-type", "") | ||
| is_file_upload = "multipart/form-data" in content_type | ||
|
|
||
| if request_start_header and not is_file_upload: |
There was a problem hiding this comment.
Stale check bypassed via client-controlled header
The Content-Type header is set by the client, so any caller can trivially bypass the stale check on any endpoint by including multipart/form-data in their Content-Type — even for plain JSON or GET-style requests. This is worth being aware of because TimeoutMiddleware is the only place the X-Request-Start-Time anti-replay protection is enforced.
In the existing design, a client that omits X-Request-Start-Time already bypasses the check, so this doesn't introduce a fundamentally new bypass path. That said, it does extend the surface: previously the bypass required not sending the header; now it also works by asserting a particular content-type. A slightly more robust alternative is to only skip the stale check when the request method is POST and the Content-Type is multipart/form-data, which limits the bypass to the smallest surface needed:
is_file_upload = request.method == "POST" and "multipart/form-data" in content_typeThis is a suggestion rather than a blocking issue, but worth considering before merge.
Case-insensitive comparison of base media type instead of substring match. Prevents bypass via crafted Content-Type headers. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Tests for case-insensitive content-type, non-multipart rejection, and missing content-type header behavior. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Stronger assertions (== 200 instead of != 408), proper substring bypass regression test with crafted content-type. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace multipart bypass with industry-standard clock skew allowance. Effective threshold = max_age (5min) + skew_allowance (5min) = 10min. Configurable via HTTP_CLOCK_SKEW_ALLOWANCE env var. Stale check now protects all endpoints including file uploads. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
12 tests covering: within/beyond tolerance, boundary behavior, multipart with skew, env var configuration, zero-allowance fallback. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
408 response now includes server_time, client_time, skew_seconds, and hint so the app can detect clock drift and show a user warning. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Verify error, server_time, client_time, skew_seconds, hint fields in the 408 rejection response body. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Hey @beastoin 👋 Thank you so much for taking the time to contribute to Omi! We truly appreciate you putting in the effort to submit this pull request. After careful review, we've decided not to merge this particular PR. Please don't take this personally — we genuinely try to merge as many contributions as possible, but sometimes we have to make tough calls based on:
Your contribution is still valuable to us, and we'd love to see you contribute again in the future! If you'd like feedback on how to improve this PR or want to discuss alternative approaches, please don't hesitate to reach out. Thank you for being part of the Omi community! 💜 |
Summary
TimeoutMiddlewarestale request checkProblem
TimeoutMiddlewarecomparestime.time()against client-setX-Request-Start-Timeheader with a 5-minute threshold. When a user's phone clock is ~5 minutes behind server UTC, requests get rejected with 408 "Request is too old" even though they just arrived. The plain-text 408 response gave no diagnostic info, so the app couldn't tell the user what was wrong.Solution
1. Clock skew tolerance (industry-standard, like AWS SigV4)
HTTP_CLOCK_SKEW_ALLOWANCEenv var (default: 5 minutes)max_age(5min) +clock_skew_allowance(5min) = 10 minutesHTTP_CLOCK_SKEW_ALLOWANCE=0to restore original behavior2. JSON 408 response with clock skew diagnostics
When a request IS rejected, the 408 response now returns:
{ "error": "clock_skew", "message": "Request rejected — your device clock may be out of sync", "server_time": 1742691120.5, "client_time": 1742690820.5, "skew_seconds": 300.0, "hint": "Check your device date/time settings and enable automatic time" }This enables the app to detect clock drift and show a user-facing warning (Flutter side handled separately by @Kelvin).
Changes
backend/utils/other/timeout.py:clock_skew_allowancefromHTTP_CLOCK_SKEW_ALLOWANCEenv varmax_age + clock_skew_allowanceJSONResponsewith diagnostic fields on 408backend/tests/unit/test_timeout_middleware.py— 12 tests:Test plan
blackformatting cleanFixes #5929
🤖 Generated with Claude Code