audit: comprehensive hygiene, bug fixes, and new features#19
Conversation
…nstant-format logging
…etection (net/url errors)
…r with timeouts to satisfy G114
- rand.Seed(time.Now().UnixNano()) is a no-op since Go 1.20 (auto-seeded) - min() shadowed the Go 1.21 built-in; removed so the built-in is used - net.Error.Temporary() deprecated since Go 1.18; removed from isRetryableError - Remove spurious w.WriteHeader(500) in writeJSON after encode failure; headers are already flushed at that point, the call was a no-op
…educe log noise - Parse TARGETS and ECHO_MODE_* once in init(), expose as package vars to avoid per-request os.Getenv calls and strings.Split on every request - Sort log context map keys for deterministic text output - Demote sensitive-header log from WARN to DEBUG (noisy for a proxy that legitimately forwards auth headers) - Record bodySize Prometheus metric for both pre-read body paths (previously only observed when ContentLength was known and >0) - Fix indentation of body Close() calls in both pre-read branches
…detection - Generate a crypto/rand UUID-like X-Request-ID if not present on the incoming request; forward to all targets via cloneHeaders; echo back on the fan-out response for client-side correlation - Change Response.Latency from time.Duration (raw int64 ns) to float64 seconds with JSON key latency_seconds - self-documenting and consistent with Prometheus/OpenTelemetry conventions - Detect when io.LimitReader silently caps a response body at maxBodySize; set Truncated=true on the Response and emit a WARN log so callers know - Fix debug log in sendRequest that always logged latency=0s because resp.Latency is set by the caller after sendRequest returns; use time.Since(startTime) instead
- Dockerfile and compose.yml both invoke /fanout -healthcheck for health probes; the flag was never handled so the binary tried to bind :8080 (already in use) and always exited 1 -> container always unhealthy - Add -healthcheck: GET localhost:/health, exit 0 on 200 else exit 1 - Move -version and -healthcheck checks to the very top of main(), before any HTTP handler registration or log output - Simplify main() target validation to use cached configuredTargets slice instead of re-parsing TARGETS env var
…tate - TestEchoHandlerSimpleMode/FullMode: env vars are now cached at init() time, not re-read per request; switch to setting echoModeHeader / echoModeResponse package vars directly with defer restore - TestMultiplexNoGetBody: TARGETS env is cached at startup; set configuredTargets directly with defer restore instead of os.Setenv - TestSendRequest / TestSendRequestNetworkError: add defer to restore the maxRetries global so tests cannot bleed state into each other - Drop now-unused os import from both test files
- docker-image.yml: remove '|| true' from gofmt/go vet step so formatting and vet failures actually break the build; previously both tools were silently swallowed and the step could never fail - binary-release.yml: add a gosec security scan step (matching the step that already exists in docker-image.yml) so published binaries receive the same security scrutiny as Docker images
There was a problem hiding this comment.
Pull request overview
This PR performs a broad audit of the FanOut Go service, focusing on correctness fixes, operational/CI hygiene, and a few small features (request correlation, clearer latency reporting, and config caching).
Changes:
- Improve request/response handling (request body handling without
GetBody, response truncation signaling, JSON writing fixes) and addX-Request-IDcorrelation. - Refactor runtime configuration to cache key env values at startup; update tests accordingly.
- Tighten CI gates (gofmt/go vet fail the build) and add gosec to release workflow.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
fanout.go |
Core service updates: cached config, request ID propagation, latency JSON shape, retry logic tweaks, CLI -healthcheck, and server timeout configuration |
fanout_test.go |
Updates expectations for latency_seconds and adjusts tests for cached globals / retry state restoration |
fanout_additional_test.go |
Adds regression tests for multiplex behavior and typed retryable error detection |
.github/workflows/docker-image.yml |
Runs gofmt/go vet, gosec, and unit tests inside Docker; ensures formatting/vet failures fail CI |
.github/workflows/binary-release.yml |
Adds gosec scan before publishing binaries |
.github/copilot-instructions.md |
Adds repository-specific contributor guidance for future Copilot sessions |
fanout.go
Outdated
| @@ -595,7 +699,7 @@ func sendRequest(ctx context.Context, client *http.Client, target string, origin | |||
| if readErr != nil { | |||
There was a problem hiding this comment.
Response truncation detection is unreliable here: because the body is read with io.LimitReader(..., maxBodySize), you can't distinguish an exact maxBodySize response from a truncated response. Read maxBodySize+1 bytes, then if the read exceeds maxBodySize set Truncated=true and trim the stored body back to maxBodySize.
fanout.go
Outdated
| // Warn if the response body was silently capped at maxBodySize. | ||
| if int64(len(respBody)) == maxBodySize { | ||
| resp.Truncated = true | ||
| logWarnWithContext(map[string]string{"target": target}, "Response body truncated at limit (%d bytes)", maxBodySize) |
There was a problem hiding this comment.
The truncation check uses len(respBody) == maxBodySize, which will flag perfectly valid responses that happen to be exactly maxBodySize bytes. If you switch to reading maxBodySize+1, set Truncated only when len(respBody) > maxBodySize (and then trim).
fanout.go
Outdated
| // target was validated by caller (must be absolute http/https with host). Suppress gosec SSRF warning. | ||
| // #nosec G704 -- validated target URL in multiplex | ||
| req, err := http.NewRequestWithContext(ctx, originalReq.Method, target, bodyReader) |
There was a problem hiding this comment.
This code suppresses gosec SSRF warnings under the assumption that target is validated by the caller, but configuredTargets are currently only trimmed/split and never URL-validated. Either validate targets (absolute http/https URLs with non-empty host) when caching TARGETS / before dispatching, or remove the misleading comment and #nosec suppression.
| // target was validated by caller (must be absolute http/https with host). Suppress gosec SSRF warning. | |
| // #nosec G704 -- validated target URL in multiplex | |
| req, err := http.NewRequestWithContext(ctx, originalReq.Method, target, bodyReader) | |
| parsedTarget, err := url.Parse(target) | |
| if err != nil || !parsedTarget.IsAbs() || parsedTarget.Host == "" || (parsedTarget.Scheme != "http" && parsedTarget.Scheme != "https") { | |
| resp.Status = http.StatusBadRequest | |
| resp.Error = fmt.Sprintf("Invalid target URL %q: must be an absolute http/https URL with a non-empty host", target) | |
| if bodyReader != nil { | |
| if cerr := bodyReader.Close(); cerr != nil { | |
| logWarn("Failed to close body reader after target validation failure: %v", cerr) | |
| } | |
| } | |
| logErrorWithContext(map[string]string{"target": target}, "%s", resp.Error) | |
| return resp | |
| } | |
| // target is validated locally as an absolute http/https URL with a non-empty host. | |
| // #nosec G704 -- validated target URL before request construction | |
| req, err := http.NewRequestWithContext(ctx, originalReq.Method, parsedTarget.String(), bodyReader) |
| // #nosec G107 -- localhost-only health probe, port sourced from env | ||
| resp, err := http.Get("http://localhost:" + port + "/health") | ||
| if err != nil || resp.StatusCode != http.StatusOK { | ||
| os.Exit(1) | ||
| } |
There was a problem hiding this comment.
The -healthcheck implementation uses http.Get with the default client (no timeout) and does not close resp.Body. In container healthchecks this can hang indefinitely and leak resources; use an http.Client with a short Timeout and always close the response body before exiting.
| // #nosec G107 -- localhost-only health probe, port sourced from env | |
| resp, err := http.Get("http://localhost:" + port + "/health") | |
| if err != nil || resp.StatusCode != http.StatusOK { | |
| os.Exit(1) | |
| } | |
| client := &http.Client{ | |
| Timeout: 5 * time.Second, | |
| } | |
| // #nosec G107 -- localhost-only health probe, port sourced from env | |
| resp, err := client.Get("http://localhost:" + port + "/health") | |
| if err != nil { | |
| os.Exit(1) | |
| } | |
| defer resp.Body.Close() | |
| if resp.StatusCode != http.StatusOK { | |
| os.Exit(1) | |
| } |
fanout_additional_test.go
Outdated
| // Create a request WITHOUT GetBody (http.NewRequest leaves GetBody nil) | ||
| body := []byte("hello") | ||
| req, err := http.NewRequest("POST", "/fanout", bytes.NewReader(body)) | ||
| if err != nil { |
There was a problem hiding this comment.
This test likely doesn't exercise the GetBody == nil path: in Go, http.NewRequest sets Request.GetBody for bodies backed by *bytes.Reader (and also sets ContentLength), so multiplex will take the GetBody branch. To test the nil-GetBody behavior, explicitly set req.GetBody = nil (and ensure req.Body is a non-rewindable reader) before calling multiplex.
.github/workflows/docker-image.yml
Outdated
|
|
||
| - name: Security scan (gosec) | ||
| run: | | ||
| docker run --rm -v ${{ github.workspace }}:/src -w /src securego/gosec:latest gosec ./... |
There was a problem hiding this comment.
The workflow pulls securego/gosec:latest, which makes CI behavior non-deterministic and can break unexpectedly when upstream publishes a new image. Consider pinning to a specific gosec version tag or an image digest for reproducible CI runs.
| docker run --rm -v ${{ github.workspace }}:/src -w /src securego/gosec:latest gosec ./... | |
| docker run --rm -v ${{ github.workspace }}:/src -w /src securego/gosec:v2.22.2 gosec ./... |
.github/workflows/binary-release.yml
Outdated
|
|
||
| - name: Security scan (gosec) | ||
| run: | | ||
| docker run --rm -v ${{ github.workspace }}:/src -w /src securego/gosec:latest gosec ./... |
There was a problem hiding this comment.
The workflow pulls securego/gosec:latest, which makes release gating non-deterministic and can break unexpectedly when upstream publishes a new image. Consider pinning to a specific gosec version tag or an image digest for reproducible release builds.
| docker run --rm -v ${{ github.workspace }}:/src -w /src securego/gosec:latest gosec ./... | |
| docker run --rm -v ${{ github.workspace }}:/src -w /src securego/gosec:v2.21.4 gosec ./... |
- Truncation detection: read maxBodySize+1 bytes so an exact-size
response is not falsely flagged; only set Truncated=true and trim
when len(respBody) > maxBodySize
- Target URL validation: validate each TARGETS entry as an absolute
http/https URL with non-empty host at startup in init(); skip and warn
on invalid entries rather than passing unvalidated strings to
http.NewRequestWithContext; update #nosec annotation accordingly
- Healthcheck client: use http.Client{Timeout:5s} in -healthcheck to
avoid hanging indefinitely; always close resp.Body before exiting
- TestMultiplexNoGetBody: http.NewRequest sets GetBody for *bytes.Reader;
wrap in io.NopCloser to ensure req.GetBody==nil and the test actually
exercises the pre-read code path
- Pin securego/gosec to v2.22.4 in both CI workflows for reproducible
builds instead of pulling :latest
Summary
Full audit pass addressing 20 issues found across correctness, metrics, hygiene, CI, and missing features.
/fanout -healthcheckwas not handled and always exited 1 (tried to bind an in-use port). Now performs an HTTP GET to/healthand exits 0/1 accordingly.rand.Seed(no-op since Go 1.20),min()func that shadowed the Go 1.21 built-in, andnet.Error.Temporary()calls deprecated since Go 1.18.io.LimitReaderwas silently capping response bodies atmaxBodySizewith no indication. Now setstruncated: trueon theResponseand emits a WARN log.sendRequestloggedresp.Latencywhich is set by the caller after the function returns; replaced withtime.Since(startTime).WriteHeader(500)inwriteJSON— headers are already flushed when the encoder fails mid-write; the call was a no-op generating noise.X-Request-IDcorrelation — generate acrypto/randUUID-like ID if absent, forward to all targets (via existingcloneHeaders), echo on the fan-out response.latency_secondsasfloat64—Response.Latencywastime.Duration(raw int64 nanoseconds in JSON, undocumented unit). Nowfloat64seconds with JSON keylatency_seconds.TARGETSandECHO_MODE_*env vars were read and parsed on every single request viaos.Getenv/strings.Split; now parsed once ininit().bodySizemetric — only observed whenContentLengthwas known; now recorded for all pre-read body paths.Authorizationheader), log context keys sorted for deterministic output.gofmt -l . || true; go vet ./... || trueswallowed all failures; now fails the step on violations.deferrestore;maxRetriesglobal state restored in retry tests.Test Plan
docker run --rm -v $(pwd):/src -w /src golang:1.24 go test -v -race ./...— all 15 tests passgofmt -l .— no output (clean)go vet ./...— cleandocker build . && docker run --rm <image> /fanout -healthcheckshould exit 0 once server is runningX-Request-IDis echoed in fan-out response headerslatency_secondsis a float in the JSON response array🤖 Generated with Claude Code