fix: improve server timeout handling#3980
Conversation
4024011 to
5ff4ead
Compare
📝 WalkthroughWalkthroughAdds a ServerConfig with four HTTP timeout fields, wires those defaults/validation into the global config and examples, applies the timeouts to both API and telemetry http.Servers, and changes graceful shutdown to use dedicated timeout-bound contexts for each server. Changes
Sequence Diagram(s)sequenceDiagram
participant Main as cmd/main
participant Config as Configuration/Viper
participant Telemetry as TelemetryServer (http.Server)
participant API as APIServer (http.Server)
Main->>Config: load config (server + telemetry timeouts)
Main->>Telemetry: NewTelemetryServer(conf.Telemetry.ReadHeaderTimeout)
Main->>API: create http.Server with conf.Server timeouts
Note right of Main: Run servers (goroutines)
Main->>Main: receive shutdown signal
Main->>Telemetry: context.WithTimeout(gracefulShutdownTimeout)
Main->>Telemetry: TelemetryServer.Close() / Shutdown(ctx)
Main->>API: context.WithTimeout(gracefulShutdownTimeout)
Main->>API: APIServer.Shutdown(ctx)
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
app/config/telemetry.go (1)
68-69: Consider adding validation forReadHeaderTimeout.The new
ServerConfig.Validate()checks that timeout durations are non-negative, butTelemetryConfig.Validate()doesn't validateReadHeaderTimeout. For consistency, you might want to add a similar check here.♻️ Suggested validation addition in TelemetryConfig.Validate()
// Add to TelemetryConfig.Validate() if c.ReadHeaderTimeout < 0 { errs = append(errs, errors.New("readHeaderTimeout must be non-negative")) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/config/telemetry.go` around lines 68 - 69, TelemetryConfig.Validate() is missing a check for ReadHeaderTimeout; add a non-negative duration validation similar to ServerConfig.Validate() by checking c.ReadHeaderTimeout and appending an error (e.g., "readHeaderTimeout must be non-negative") to errs when it's less than zero so TelemetryConfig.Validate() enforces the same timeout invariants as ServerConfig.cmd/server/main.go (1)
216-221: Good fix using a fresh context for shutdown.Using
context.Background()with the graceful shutdown timeout ensures the shutdown process gets its full time budget even if the parent context is already cancelled.One tiny edge case: if
GracefulShutdownTimeoutis zero (which validation currently allows), the context would be immediately cancelled. The default is 30s so this is unlikely in practice, but you might want to add validation inTerminationConfig.Validate()to require a positive duration.Also applies to: 257-262
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cmd/server/main.go` around lines 216 - 221, The graceful-shutdown code correctly uses context.WithTimeout(shutdownCtx, conf.Termination.GracefulShutdownTimeout) but you must prevent a zero timeout from creating an immediately-canceled context: add validation in TerminationConfig.Validate() to require conf.Termination.GracefulShutdownTimeout > 0 (return an error if zero/negative), and update any other shutdown sites (the similar block around lines 257-262 calling context.WithTimeout) to rely on this validation; reference the symbols TerminationConfig.Validate, conf.Termination.GracefulShutdownTimeout, and the shutdown helper code that creates shutdownCtx and cancel to locate where to enforce the positive-duration check.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@app/config/telemetry.go`:
- Around line 68-69: TelemetryConfig.Validate() is missing a check for
ReadHeaderTimeout; add a non-negative duration validation similar to
ServerConfig.Validate() by checking c.ReadHeaderTimeout and appending an error
(e.g., "readHeaderTimeout must be non-negative") to errs when it's less than
zero so TelemetryConfig.Validate() enforces the same timeout invariants as
ServerConfig.
In `@cmd/server/main.go`:
- Around line 216-221: The graceful-shutdown code correctly uses
context.WithTimeout(shutdownCtx, conf.Termination.GracefulShutdownTimeout) but
you must prevent a zero timeout from creating an immediately-canceled context:
add validation in TerminationConfig.Validate() to require
conf.Termination.GracefulShutdownTimeout > 0 (return an error if zero/negative),
and update any other shutdown sites (the similar block around lines 257-262
calling context.WithTimeout) to rely on this validation; reference the symbols
TerminationConfig.Validate, conf.Termination.GracefulShutdownTimeout, and the
shutdown helper code that creates shutdownCtx and cancel to locate where to
enforce the positive-duration check.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: c192378c-ee91-41c2-921f-1ac4a5f8832b
📒 Files selected for processing (6)
app/common/telemetry.goapp/config/config.goapp/config/server.goapp/config/telemetry.gocmd/server/main.goconfig.example.yaml
5ff4ead to
5123b42
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (3)
app/config/config_test.go (1)
404-409: Server timeout expectations match the defaults.The expected values (10s/60s/90s/120s) correctly match what
ConfigureServer()sets. This test validates the full configuration flow works end-to-end.One thing to consider: it'd be great to have a few unit tests for
ServerConfig.Validate()covering edge cases like negative values and potentially zero values. This would help catch any future regressions in the validation logic.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/config/config_test.go` around lines 404 - 409, Add unit tests for ServerConfig.Validate() to cover edge cases (negative and zero durations) that could regress validation logic: create test cases that construct ServerConfig with negative ReadHeaderTimeout/ReadTimeout/WriteTimeout/IdleTimeout and with zero values, call ServerConfig.Validate() and assert it returns an error for invalid values and no error for allowed zeros if intended; use the existing ServerConfig struct and the Validate() method as the targets and mirror the style of the existing ConfigureServer() test for setup and assertions.app/config/server.go (1)
18-38: Validation allows zero values which disables timeouts.The validation only rejects negative values, meaning
0is allowed. A zero timeout effectively disables that particular timeout on thehttp.Server. If this is intentional (to give operators flexibility), that's fine! But if not, you might want to also reject zero for at leastReadHeaderTimeoutto prevent Slowloris-style attacks.Worth a quick sanity check on whether this is the desired behavior.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/config/server.go` around lines 18 - 38, The current ServerConfig.Validate permits zero timeouts which effectively disables them; update ServerConfig.Validate to treat zero as invalid for at least ReadHeaderTimeout (and optionally ReadTimeout/WriteTimeout/IdleTimeout if desired) by checking <= 0 instead of < 0 and appending clearer errors (e.g., "readHeaderTimeout must be positive"); update the error messages returned from Validate and any related tests to expect the new validation behavior so a zero value no longer passes.cmd/server/main.go (1)
254-262: Minor: error variable shadowing.The
errparameter inapiServerShutdown(err error)shadows the outererrvariable declared at the package level (line 54's error handling). Same thing happens on line 260 where the assignmenterr = apiServer.Shutdown(...)is using the function parameter, not reassigning the outer scope.This actually works fine in this case since you're not trying to use the outer
errafterward, but it can be a bit confusing. A common pattern is to name the shutdown error differently:💡 Optional clarity improvement
- apiServerShutdown := func(err error) { + apiServerShutdown := func(cause error) { - logger.Debug("shutting down API server gracefully...", "error", err) + logger.Debug("shutting down API server gracefully...", "error", cause) shutdownCtx, cancel := context.WithTimeout(context.Background(), conf.Termination.GracefulShutdownTimeout) defer cancel() - if err = apiServer.Shutdown(shutdownCtx); err != nil { + if err := apiServer.Shutdown(shutdownCtx); err != nil { logger.Warn("failed to shutdown API server", "error", err) } }The same pattern applies to
telemetryServerShutdownat lines 213-222.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cmd/server/main.go` around lines 254 - 262, The closure functions apiServerShutdown and telemetryServerShutdown currently take a parameter named err which shadows the outer err; rename the parameter (e.g., shutdownErr) and update internal uses (including the assignment from apiServer.Shutdown(...) and telemetryServer.Shutdown(...)) so they refer to the new name to avoid shadowing and improve clarity while preserving existing behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@app/config/config_test.go`:
- Around line 404-409: Add unit tests for ServerConfig.Validate() to cover edge
cases (negative and zero durations) that could regress validation logic: create
test cases that construct ServerConfig with negative
ReadHeaderTimeout/ReadTimeout/WriteTimeout/IdleTimeout and with zero values,
call ServerConfig.Validate() and assert it returns an error for invalid values
and no error for allowed zeros if intended; use the existing ServerConfig struct
and the Validate() method as the targets and mirror the style of the existing
ConfigureServer() test for setup and assertions.
In `@app/config/server.go`:
- Around line 18-38: The current ServerConfig.Validate permits zero timeouts
which effectively disables them; update ServerConfig.Validate to treat zero as
invalid for at least ReadHeaderTimeout (and optionally
ReadTimeout/WriteTimeout/IdleTimeout if desired) by checking <= 0 instead of < 0
and appending clearer errors (e.g., "readHeaderTimeout must be positive");
update the error messages returned from Validate and any related tests to expect
the new validation behavior so a zero value no longer passes.
In `@cmd/server/main.go`:
- Around line 254-262: The closure functions apiServerShutdown and
telemetryServerShutdown currently take a parameter named err which shadows the
outer err; rename the parameter (e.g., shutdownErr) and update internal uses
(including the assignment from apiServer.Shutdown(...) and
telemetryServer.Shutdown(...)) so they refer to the new name to avoid shadowing
and improve clarity while preserving existing behavior.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: f8f086ec-a7ed-4c5b-bf12-98db675fe8ab
📒 Files selected for processing (7)
app/common/telemetry.goapp/config/config.goapp/config/config_test.goapp/config/server.goapp/config/telemetry.gocmd/server/main.goconfig.example.yaml
✅ Files skipped from review due to trivial changes (1)
- config.example.yaml
🚧 Files skipped from review as they are similar to previous changes (2)
- app/common/telemetry.go
- app/config/telemetry.go
Overview
Improve connection handling by limiting the duration of http connections.
Notes for reviewer
Summary by CodeRabbit
New Features
Chores