fix: improve server timeout handling #3980

Merged
turip merged 2 commits into main from fix/add-api-server-timeout-settings
Mar 20, 2026

Conversation

@turip turip commented Mar 20, 2026

Overview

Improve connection handling by limiting the duration of HTTP connections.

Notes for reviewer

Summary by CodeRabbit

  • New Features

    • Added configurable HTTP server timeouts (read header, read, write, idle) with sensible defaults and a matching telemetry header-read timeout.
  • Chores

    • Improved graceful shutdown to use timeout-bound contexts for cleaner shutdown behavior.
    • Updated example configuration and tests to include the new server timeout settings.

@turip turip requested a review from a team as a code owner March 20, 2026 09:21
@turip turip force-pushed the fix/add-api-server-timeout-settings branch from 4024011 to 5ff4ead Compare March 20, 2026 09:21
coderabbitai Bot commented Mar 20, 2026

📝 Walkthrough

Adds a ServerConfig with four HTTP timeout fields, wires those defaults/validation into the global config and examples, applies the timeouts to both API and telemetry http.Servers, and changes graceful shutdown to use dedicated timeout-bound contexts for each server.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Server config types & defaults: app/config/server.go, app/config/config.go, app/config/config_test.go | Introduce ServerConfig (ReadHeaderTimeout, ReadTimeout, WriteTimeout, IdleTimeout), add Validate() and ConfigureServer() for Viper defaults, and embed Server ServerConfig into Configuration with validation and test updates. |
| Telemetry config & server: app/config/telemetry.go, app/common/telemetry.go | Add ReadHeaderTimeout to TelemetryConfig with a 10s default and set ReadHeaderTimeout on the telemetry http.Server during initialization. |
| API server runtime & shutdown: cmd/server/main.go | Set all four timeout fields on the API http.Server (ReadHeaderTimeout, ReadTimeout, WriteTimeout, IdleTimeout) and change graceful shutdown to use fresh timeout-bound contexts for both telemetry and API servers. |
| Examples / docs: config.example.yaml | Add server configuration block documenting defaults for the four timeout fields. |
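
As a rough illustration of the change summarized above, the sketch below shows one way a four-field timeout config could be applied to an http.Server. The struct layout and the newAPIServer helper are hypothetical, and the mapping of the 10s/60s/90s/120s values (mentioned later in the review) to fields is assumed from the order in which the fields are listed; none of this is taken from the PR's actual code.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// ServerConfig mirrors the four field names from the walkthrough.
// The exact layout in app/config/server.go is an assumption.
type ServerConfig struct {
	ReadHeaderTimeout time.Duration
	ReadTimeout       time.Duration
	WriteTimeout      time.Duration
	IdleTimeout       time.Duration
}

// newAPIServer is a hypothetical helper that copies each configured
// timeout onto the matching http.Server field.
func newAPIServer(addr string, h http.Handler, c ServerConfig) *http.Server {
	return &http.Server{
		Addr:              addr,
		Handler:           h,
		ReadHeaderTimeout: c.ReadHeaderTimeout,
		ReadTimeout:       c.ReadTimeout,
		WriteTimeout:      c.WriteTimeout,
		IdleTimeout:       c.IdleTimeout,
	}
}

func main() {
	// Assumed mapping of the defaults to fields, for illustration only.
	cfg := ServerConfig{
		ReadHeaderTimeout: 10 * time.Second,
		ReadTimeout:       60 * time.Second,
		WriteTimeout:      90 * time.Second,
		IdleTimeout:       120 * time.Second,
	}
	srv := newAPIServer(":8080", http.NotFoundHandler(), cfg)
	fmt.Println(srv.ReadHeaderTimeout, srv.ReadTimeout, srv.WriteTimeout, srv.IdleTimeout)
}
```

The server is only constructed here, never started, so the sketch can be run without binding a port.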

Sequence Diagram(s)

sequenceDiagram
    participant Main as cmd/main
    participant Config as Configuration/Viper
    participant Telemetry as TelemetryServer (http.Server)
    participant API as APIServer (http.Server)
    Main->>Config: load config (server + telemetry timeouts)
    Main->>Telemetry: NewTelemetryServer(conf.Telemetry.ReadHeaderTimeout)
    Main->>API: create http.Server with conf.Server timeouts
    Note right of Main: Run servers (goroutines)
    Main->>Main: receive shutdown signal
    Main->>Telemetry: context.WithTimeout(gracefulShutdownTimeout)
    Main->>Telemetry: TelemetryServer.Close() / Shutdown(ctx)
    Main->>API: context.WithTimeout(gracefulShutdownTimeout)
    Main->>API: APIServer.Shutdown(ctx)
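
The shutdown half of the diagram can be sketched roughly as follows. This is not the code from cmd/server/main.go: shutdownWithTimeout, runAndStop, and demoShutdown are hypothetical names, and the 5s budget is illustrative. The point is only that each server gets a context derived from context.Background() with its own timeout.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"net/http"
	"time"
)

// shutdownWithTimeout gives srv its own shutdown budget. The context is
// derived from context.Background(), so an already-cancelled parent
// context cannot shorten it.
func shutdownWithTimeout(srv *http.Server, timeout time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return srv.Shutdown(ctx)
}

// runAndStop serves on an ephemeral loopback port, then shuts down.
// It reports whether Serve ended with http.ErrServerClosed.
func runAndStop(timeout time.Duration) (bool, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return false, err
	}
	srv := &http.Server{
		Handler:           http.NotFoundHandler(),
		ReadHeaderTimeout: 10 * time.Second,
	}

	serveErr := make(chan error, 1)
	go func() { serveErr <- srv.Serve(ln) }()

	if err := shutdownWithTimeout(srv, timeout); err != nil {
		return false, err
	}
	return errors.Is(<-serveErr, http.ErrServerClosed), nil
}

// demoShutdown runs the flow with an illustrative 5s budget.
func demoShutdown() (bool, error) {
	return runAndStop(5 * time.Second)
}

func main() {
	closedCleanly, err := demoShutdown()
	fmt.Println(closedCleanly, err)
}
```

With two servers, the same helper would be called once per server so that neither consumes the other's budget.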

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 33.33%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: adding HTTP server timeout configurations to improve connection handling across telemetry and API servers. |

@turip turip requested a review from tothandras March 20, 2026 09:21
@turip turip added the release-note/bug-fix Release note: Bug Fixes label Mar 20, 2026
tothandras previously approved these changes Mar 20, 2026

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (2)
app/config/telemetry.go (1)

68-69: Consider adding validation for ReadHeaderTimeout.

The new ServerConfig.Validate() checks that timeout durations are non-negative, but TelemetryConfig.Validate() doesn't validate ReadHeaderTimeout. For consistency, you might want to add a similar check here.

♻️ Suggested validation addition in TelemetryConfig.Validate()
// Add to TelemetryConfig.Validate()
if c.ReadHeaderTimeout < 0 {
    errs = append(errs, errors.New("readHeaderTimeout must be non-negative"))
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/config/telemetry.go` around lines 68 - 69, TelemetryConfig.Validate() is
missing a check for ReadHeaderTimeout; add a non-negative duration validation
similar to ServerConfig.Validate() by checking c.ReadHeaderTimeout and appending
an error (e.g., "readHeaderTimeout must be non-negative") to errs when it's less
than zero so TelemetryConfig.Validate() enforces the same timeout invariants as
ServerConfig.
cmd/server/main.go (1)

216-221: Good fix using a fresh context for shutdown.

Using context.Background() with the graceful shutdown timeout ensures the shutdown process gets its full time budget even if the parent context is already cancelled.

One tiny edge case: if GracefulShutdownTimeout is zero (which validation currently allows), the context would be immediately cancelled. The default is 30s so this is unlikely in practice, but you might want to add validation in TerminationConfig.Validate() to require a positive duration.

Also applies to: 257-262
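
The zero-timeout edge case is easy to reproduce in isolation. In the sketch below, expiredAtCreation shows that context.WithTimeout with a zero duration returns a context that is already done, and validateGracefulShutdownTimeout is a hypothetical stand-in for the positive-duration check suggested for TerminationConfig.Validate(); neither function exists in the PR.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// expiredAtCreation reports whether a context built with the given
// timeout is already done at the moment WithTimeout returns.
func expiredAtCreation(timeout time.Duration) bool {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return ctx.Err() != nil
}

// validateGracefulShutdownTimeout is a hypothetical check that rejects
// zero and negative durations.
func validateGracefulShutdownTimeout(d time.Duration) error {
	if d <= 0 {
		return errors.New("gracefulShutdownTimeout must be positive")
	}
	return nil
}

func main() {
	fmt.Println("zero timeout expired at creation:", expiredAtCreation(0))
	fmt.Println("30s timeout expired at creation:", expiredAtCreation(30*time.Second))
	fmt.Println("validate(0):", validateGracefulShutdownTimeout(0))
}
```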

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/server/main.go` around lines 216 - 221, The graceful-shutdown code
correctly uses context.WithTimeout(shutdownCtx,
conf.Termination.GracefulShutdownTimeout) but you must prevent a zero timeout
from creating an immediately-canceled context: add validation in
TerminationConfig.Validate() to require conf.Termination.GracefulShutdownTimeout
> 0 (return an error if zero/negative), and update any other shutdown sites (the
similar block around lines 257-262 calling context.WithTimeout) to rely on this
validation; reference the symbols TerminationConfig.Validate,
conf.Termination.GracefulShutdownTimeout, and the shutdown helper code that
creates shutdownCtx and cancel to locate where to enforce the positive-duration
check.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c192378c-ee91-41c2-921f-1ac4a5f8832b

📥 Commits

Reviewing files that changed from the base of the PR and between 4971c89 and 5ff4ead.

📒 Files selected for processing (6)
  • app/common/telemetry.go
  • app/config/config.go
  • app/config/server.go
  • app/config/telemetry.go
  • cmd/server/main.go
  • config.example.yaml

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (3)
app/config/config_test.go (1)

404-409: Server timeout expectations match the defaults.

The expected values (10s/60s/90s/120s) correctly match what ConfigureServer() sets. This test validates the full configuration flow works end-to-end.

One thing to consider: it'd be great to have a few unit tests for ServerConfig.Validate() covering edge cases like negative values and potentially zero values. This would help catch any future regressions in the validation logic.
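
A table-driven set of cases along these lines might look like the sketch below. The ServerConfig and Validate() shown here are stand-ins written from the review's description (negative values rejected, zero allowed), not the code in app/config/server.go, and the cases are run from a plain program for brevity; in the repository they would more likely live in a _test.go file using the testing package.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ServerConfig is a stand-in with the four fields named in the PR.
type ServerConfig struct {
	ReadHeaderTimeout time.Duration
	ReadTimeout       time.Duration
	WriteTimeout      time.Duration
	IdleTimeout       time.Duration
}

// Validate rejects negative durations and allows zero, matching the
// behavior the review describes. The real method may differ.
func (c ServerConfig) Validate() error {
	var errs []error
	check := func(name string, d time.Duration) {
		if d < 0 {
			errs = append(errs, fmt.Errorf("%s must be non-negative", name))
		}
	}
	check("readHeaderTimeout", c.ReadHeaderTimeout)
	check("readTimeout", c.ReadTimeout)
	check("writeTimeout", c.WriteTimeout)
	check("idleTimeout", c.IdleTimeout)
	return errors.Join(errs...) // nil when errs is empty
}

type validateCase struct {
	name    string
	cfg     ServerConfig
	wantErr bool
}

// validateCases lists the edge cases mentioned in the review comment.
func validateCases() []validateCase {
	valid := ServerConfig{10 * time.Second, 60 * time.Second, 90 * time.Second, 120 * time.Second}
	negativeHeader := valid
	negativeHeader.ReadHeaderTimeout = -time.Second
	negativeIdle := valid
	negativeIdle.IdleTimeout = -time.Millisecond

	return []validateCase{
		{"defaults", valid, false},
		{"all zero", ServerConfig{}, false},
		{"negative read header timeout", negativeHeader, true},
		{"negative idle timeout", negativeIdle, true},
	}
}

// failedCases returns the names of cases whose outcome differs from wantErr.
func failedCases() []string {
	var failed []string
	for _, tc := range validateCases() {
		if gotErr := tc.cfg.Validate() != nil; gotErr != tc.wantErr {
			failed = append(failed, tc.name)
		}
	}
	return failed
}

func main() {
	fmt.Println("failed cases:", failedCases())
}
```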

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/config/config_test.go` around lines 404 - 409, Add unit tests for
ServerConfig.Validate() to cover edge cases (negative and zero durations) that
could regress validation logic: create test cases that construct ServerConfig
with negative ReadHeaderTimeout/ReadTimeout/WriteTimeout/IdleTimeout and with
zero values, call ServerConfig.Validate() and assert it returns an error for
invalid values and no error for allowed zeros if intended; use the existing
ServerConfig struct and the Validate() method as the targets and mirror the
style of the existing ConfigureServer() test for setup and assertions.
app/config/server.go (1)

18-38: Validation allows zero values which disables timeouts.

The validation only rejects negative values, meaning 0 is allowed. A zero timeout effectively disables that particular timeout on the http.Server. If this is intentional (to give operators flexibility), that's fine! But if not, you might want to also reject zero for at least ReadHeaderTimeout to prevent Slowloris-style attacks.

Worth a quick sanity check on whether this is the desired behavior.
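
To see what a zero ReadHeaderTimeout means in practice, the sketch below opens a loopback connection, sends a request header that is never terminated, and checks whether the server drops the connection before the client stops waiting. The helper names and the 100ms/300ms/2s durations are illustrative assumptions; only standard net and net/http calls are used.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/http"
	"time"
)

// closedBeforeDeadline starts a loopback server with the given
// ReadHeaderTimeout, sends an incomplete request header, and reports
// whether the server gave up on the connection before the client's own
// read deadline (clientWait) expired.
func closedBeforeDeadline(readHeaderTimeout, clientWait time.Duration) (bool, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return false, err
	}
	srv := &http.Server{
		Handler:           http.NotFoundHandler(),
		ReadHeaderTimeout: readHeaderTimeout,
	}
	go srv.Serve(ln)
	defer srv.Close()

	conn, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return false, err
	}
	defer conn.Close()

	// The header block is never finished with a blank line.
	if _, err := conn.Write([]byte("GET / HTTP/1.1\r\nHost: example\r\n")); err != nil {
		return false, err
	}
	if err := conn.SetReadDeadline(time.Now().Add(clientWait)); err != nil {
		return false, err
	}

	_, err = conn.Read(make([]byte, 1))
	var netErr net.Error
	if errors.As(err, &netErr) && netErr.Timeout() {
		return false, nil // client deadline hit first: server was still waiting
	}
	return true, nil // server responded or closed the connection
}

// demoSlowHeaders compares a 100ms header timeout with a disabled one.
func demoSlowHeaders() (withTimeout, withoutTimeout bool, err error) {
	if withTimeout, err = closedBeforeDeadline(100*time.Millisecond, 2*time.Second); err != nil {
		return false, false, err
	}
	withoutTimeout, err = closedBeforeDeadline(0, 300*time.Millisecond)
	return withTimeout, withoutTimeout, err
}

func main() {
	with, without, err := demoSlowHeaders()
	fmt.Println("dropped with timeout:", with, "dropped without timeout:", without, "err:", err)
}
```

In the second call both ReadHeaderTimeout and ReadTimeout are zero, so the server sets no read deadline and keeps the half-open request around until the client goes away.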

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/config/server.go` around lines 18 - 38, The current ServerConfig.Validate
permits zero timeouts which effectively disables them; update
ServerConfig.Validate to treat zero as invalid for at least ReadHeaderTimeout
(and optionally ReadTimeout/WriteTimeout/IdleTimeout if desired) by checking <=
0 instead of < 0 and appending clearer errors (e.g., "readHeaderTimeout must be
positive"); update the error messages returned from Validate and any related
tests to expect the new validation behavior so a zero value no longer passes.
cmd/server/main.go (1)

254-262: Minor: error variable shadowing.

The err parameter in apiServerShutdown(err error) shadows the outer err variable declared at the package level (line 54's error handling). Same thing happens on line 260 where the assignment err = apiServer.Shutdown(...) is using the function parameter, not reassigning the outer scope.

This actually works fine in this case since you're not trying to use the outer err afterward, but it can be a bit confusing. A common pattern is to name the shutdown error differently:

💡 Optional clarity improvement
-		apiServerShutdown := func(err error) {
+		apiServerShutdown := func(cause error) {
-			logger.Debug("shutting down API server gracefully...", "error", err)
+			logger.Debug("shutting down API server gracefully...", "error", cause)

 			shutdownCtx, cancel := context.WithTimeout(context.Background(), conf.Termination.GracefulShutdownTimeout)
 			defer cancel()

-			if err = apiServer.Shutdown(shutdownCtx); err != nil {
+			if err := apiServer.Shutdown(shutdownCtx); err != nil {
 				logger.Warn("failed to shutdown API server", "error", err)
 			}
 		}

The same pattern applies to telemetryServerShutdown at lines 213-222.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/server/main.go` around lines 254 - 262, The closure functions
apiServerShutdown and telemetryServerShutdown currently take a parameter named
err which shadows the outer err; rename the parameter (e.g., shutdownErr) and
update internal uses (including the assignment from apiServer.Shutdown(...) and
telemetryServer.Shutdown(...)) so they refer to the new name to avoid shadowing
and improve clarity while preserving existing behavior.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f8f086ec-a7ed-4c5b-bf12-98db675fe8ab

📥 Commits

Reviewing files that changed from the base of the PR and between 5ff4ead and 5123b42.

📒 Files selected for processing (7)
  • app/common/telemetry.go
  • app/config/config.go
  • app/config/config_test.go
  • app/config/server.go
  • app/config/telemetry.go
  • cmd/server/main.go
  • config.example.yaml
✅ Files skipped from review due to trivial changes (1)
  • config.example.yaml
🚧 Files skipped from review as they are similar to previous changes (2)
  • app/common/telemetry.go
  • app/config/telemetry.go

@turip turip enabled auto-merge (squash) March 20, 2026 10:49
@turip turip merged commit 41455a3 into main Mar 20, 2026
45 of 49 checks passed
@turip turip deleted the fix/add-api-server-timeout-settings branch March 20, 2026 10:49