fix(audit): improve failover and timeout visibility#239
fix(audit): improve failover and timeout visibility#239SantiagoDePolonia merged 1 commit intomainfrom
Conversation
📝 WalkthroughWalkthroughThis PR introduces failover metadata tracking throughout the system. It adds a new Changes
Sequence DiagramsequenceDiagram
participant Client
participant Handler
participant HTTPClient
participant AuditLog
participant Fallback
Client->>Handler: ChatCompletion Request
Handler->>HTTPClient: Send to Primary Provider
HTTPClient-->>HTTPClient: Timeout Error
HTTPClient->>HTTPClient: Map to 504 Status
HTTPClient-->>Handler: Error Response
Handler->>Fallback: Execute Fallback/Failover
Fallback->>HTTPClient: Send to Fallback Provider
HTTPClient-->>Fallback: Success Response
Fallback-->>Handler: Return with failoverModel
Handler->>AuditLog: EnrichEntryWithFailover(failoverModel)
AuditLog-->>Handler: Audit Entry Updated
Handler-->>Client: Return Response
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Warning Review ran into problems🔥 ProblemsGit: Failed to clone repository. Please run the Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/llmclient/client.go`:
- Around line 676-692: The string-match fallback in isTimeoutError currently
checks for "client.timeout exceeded" and "timeout awaiting response headers" but
misses the body-read phrasing; update isTimeoutError to also check the lowercase
error message for the pattern "client.timeout or context canceled" (or a
substring like "client.timeout or context canceled") so non-standard transports
that surface that wording are treated as timeouts; locate function
isTimeoutError in internal/llmclient/client.go and add the additional
strings.Contains check to the existing fallback logic.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 64459caf-3769-4584-91eb-0cfe8bafecfe
📒 Files selected for processing (19)
internal/admin/dashboard/static/js/modules/dashboard-layout.test.jsinternal/admin/dashboard/static/js/modules/workflows-layout.test.jsinternal/admin/dashboard/static/js/modules/workflows.jsinternal/admin/dashboard/static/js/modules/workflows.test.jsinternal/admin/dashboard/templates/index.htmlinternal/admin/dashboard/templates/workflow-chart.htmlinternal/admin/handler.gointernal/auditlog/auditlog.gointernal/auditlog/auditlog_test.gointernal/auditlog/middleware.gointernal/auditlog/reader_postgresql.gointernal/auditlog/reader_sqlite.gointernal/auditlog/reader_sqlite_boundary_test.gointernal/auditlog/stream_wrapper.gointernal/llmclient/client.gointernal/llmclient/client_test.gointernal/server/fallback_test.gointernal/server/internal_chat_completion_executor.gointernal/server/translated_inference_service.go
| func isTimeoutError(err error) bool { | ||
| if err == nil { | ||
| return false | ||
| } | ||
| if errors.Is(err, context.DeadlineExceeded) { | ||
| return true | ||
| } | ||
|
|
||
| var netErr net.Error | ||
| if errors.As(err, &netErr) && netErr.Timeout() { | ||
| return true | ||
| } | ||
|
|
||
| message := strings.ToLower(err.Error()) | ||
| return strings.Contains(message, "client.timeout exceeded") || | ||
| strings.Contains(message, "timeout awaiting response headers") | ||
| } |
There was a problem hiding this comment.
❓ Verification inconclusive
What exact error message does Go's net/http return when http.Client.Timeout fires during response body read vs while awaiting headers?
When http.Client.Timeout fires while awaiting headers (before receiving the response), the error returned by client.Do/Get/etc. is a *url.Error wrapping the underlying error with the message "(Client.Timeout exceeded while awaiting headers)" appended, e.g., "Get "https://example.com\": net/http: request canceled (Client.Timeout exceeded while awaiting headers)". When the timeout fires during reading the response body (after headers received, during io.ReadAll(resp.Body) or similar), the resp.Body.Read returns an error wrapped by *http.timeoutError with "(Client.Timeout or context canceled)" appended to the underlying read error, e.g., "read tcp ...: i/o timeout (Client.Timeout or context canceled)". This is due to the cancelTimerBody wrapper around the body, which detects the timeout and augments the error. Note that issue #58279 discusses this returning an unexposed *http.httpError type, but it implements net.Error.Timeout=true and can be detected that way.
Citations:
- 1: https://go.dev/src/net/http/client.go
- 2: https://github.com/golang/go/blob/master/src/net/http/client.go
- 3: net/http: Timeout while reading response body gives unexposed http.httpError golang/go#58279
- 4: https://stackoverflow.com/questions/58619341/error-net-http-request-canceled-while-waiting-for-connection-client-timeout-e
- 5: https://www.reddit.com/r/golang/comments/9ad3u2/clienttimeout_exceeded_while_awaiting_headers/
Minor: string-match fallback incompletely covers timeout scenarios.
The errors.Is / net.Error.Timeout() checks reliably catch http.Client.Timeout for both awaiting-headers and body-read phases because Go's internal timeout errors implement net.Error in both cases, so production code will work as-is. However, the string fallback provides a safety net for non-standard transports that might surface timeout errors without the net.Error interface. The fallback currently matches "client.timeout exceeded" (from awaiting-headers) and "timeout awaiting response headers", but not the body-read timeout phrasing "client.timeout or context canceled". Consider adding that pattern for completeness, or document that the fallback is intentionally limited.
♻️ Suggested tweak
message := strings.ToLower(err.Error())
return strings.Contains(message, "client.timeout exceeded") ||
+ strings.Contains(message, "client.timeout or context canceled") ||
strings.Contains(message, "timeout awaiting response headers")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/llmclient/client.go` around lines 676 - 692, The string-match
fallback in isTimeoutError currently checks for "client.timeout exceeded" and
"timeout awaiting response headers" but misses the body-read phrasing; update
isTimeoutError to also check the lowercase error message for the pattern
"client.timeout or context canceled" (or a substring like "client.timeout or
context canceled") so non-standard transports that surface that wording are
treated as timeouts; locate function isTimeoutError in
internal/llmclient/client.go and add the additional strings.Contains check to
the existing fallback logic.
Summary
Testing
Summary by CodeRabbit
New Features
Bug Fixes
Documentation