audit: comprehensive hygiene, bug fixes, and new features #19

Merged
KingPin merged 19 commits into main from pr/refactor-getbody-retry-ci
Apr 5, 2026

@KingPin KingPin commented Apr 5, 2026

Summary

Full audit pass addressing 20 issues found across correctness, metrics, hygiene, CI, and missing features.

  • fix: -healthcheck flag — container healthcheck was broken; /fanout -healthcheck was not handled and always exited 1 (tried to bind an in-use port). Now performs an HTTP GET to /health and exits 0/1 accordingly.
  • fix: deprecated APIs — remove rand.Seed (no-op since Go 1.20), min() func that shadowed the Go 1.21 built-in, and net.Error.Temporary() calls deprecated since Go 1.18.
  • fix: response body truncation - io.LimitReader was silently capping response bodies at maxBodySize with no indication. Now sets truncated: true on the Response and emits a WARN log.
  • fix: latency always 0 in debug log - sendRequest logged resp.Latency, which is set by the caller after the function returns; replaced with time.Since(startTime).
  • fix: spurious WriteHeader(500) in writeJSON — headers are already flushed when the encoder fails mid-write; the call was a no-op generating noise.
  • feat: X-Request-ID correlation — generate a crypto/rand UUID-like ID if absent, forward to all targets (via existing cloneHeaders), echo on the fan-out response.
  • feat: latency_seconds as float64 - Response.Latency was time.Duration (raw int64 nanoseconds in JSON, undocumented unit). Now float64 seconds with JSON key latency_seconds.
  • perf: cache startup config - TARGETS and ECHO_MODE_* env vars were read and parsed on every single request via os.Getenv/strings.Split; now parsed once in init().
  • fix: bodySize metric — only observed when ContentLength was known; now recorded for all pre-read body paths.
  • fix: log noise — sensitive-header log demoted from WARN→DEBUG (fires on every request with an Authorization header), log context keys sorted for deterministic output.
  • ci: lint/vet gate - gofmt -l . || true; go vet ./... || true swallowed all failures; now fails the step on violations.
  • ci: gosec in binary-release — security scan was only run in the Docker CI workflow, not before publishing binaries.
  • test: adapt for cached config — tests that set env vars at runtime updated to set the package-level cached vars directly with defer restore; maxRetries global state restored in retry tests.
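
For readers comparing payloads before and after the latency_seconds change, a minimal sketch of the two JSON shapes follows. The struct names oldResponse and newResponse and the old "latency" key are assumptions for illustration; only the Latency field, its types, and the latency_seconds key come from the description above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// oldResponse mimics the pre-change shape: time.Duration marshals as a raw
// int64 nanosecond count. The "latency" key is an assumption.
type oldResponse struct {
	Latency time.Duration `json:"latency"`
}

// newResponse mimics the post-change shape: float64 seconds under the
// latency_seconds key named in the PR description.
type newResponse struct {
	Latency float64 `json:"latency_seconds"`
}

// encode marshals v and returns the JSON text (or the error text).
func encode(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	elapsed := 1500 * time.Millisecond
	fmt.Println(encode(oldResponse{Latency: elapsed}))           // {"latency":1500000000}
	fmt.Println(encode(newResponse{Latency: elapsed.Seconds()})) // {"latency_seconds":1.5}
}
```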

Test Plan

  • docker run --rm -v $(pwd):/src -w /src golang:1.24 go test -v -race ./... — all 15 tests pass
  • gofmt -l . — no output (clean)
  • go vet ./... — clean
  • Verify container healthcheck: docker build . && docker run --rm <image> /fanout -healthcheck should exit 0 once server is running
  • Verify X-Request-ID is echoed in fan-out response headers
  • Verify latency_seconds is a float in the JSON response array

🤖 Generated with Claude Code

KingPin added 17 commits March 10, 2026 12:57
- rand.Seed(time.Now().UnixNano()) is a no-op since Go 1.20 (auto-seeded)
- min() shadowed the Go 1.21 built-in; removed so the built-in is used
- net.Error.Temporary() deprecated since Go 1.18; removed from isRetryableError
- Remove spurious w.WriteHeader(500) in writeJSON after encode failure;
  headers are already flushed at that point, the call was a no-op
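
One way to classify retryable errors without net.Error.Temporary() is to unwrap to net.Error and consult only Timeout(). The sketch below is an illustration under that assumption, not the PR's actual isRetryableError; isRetryable, timeoutErr, and errPermanent are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// timeoutErr is a test double that satisfies net.Error and reports a timeout.
type timeoutErr struct{}

func (timeoutErr) Error() string   { return "i/o timeout" }
func (timeoutErr) Timeout() bool   { return true }
func (timeoutErr) Temporary() bool { return false } // still part of the net.Error interface

// errPermanent is a plain error that does not implement net.Error.
var errPermanent = errors.New("permanent failure")

// isRetryable (hypothetical) treats an error as retryable only when some
// error in its chain is a net.Error that reports a timeout.
func isRetryable(err error) bool {
	var netErr net.Error
	return errors.As(err, &netErr) && netErr.Timeout()
}

func main() {
	wrapped := fmt.Errorf("dial target: %w", timeoutErr{})
	fmt.Println(isRetryable(wrapped))      // true
	fmt.Println(isRetryable(errPermanent)) // false
	fmt.Println(isRetryable(nil))          // false
}
```

A real classifier would likely also consider specific connection errors; this sketch deliberately covers the timeout case only.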
…educe log noise

- Parse TARGETS and ECHO_MODE_* once in init(), expose as package vars
  to avoid per-request os.Getenv calls and strings.Split on every request
- Sort log context map keys for deterministic text output
- Demote sensitive-header log from WARN to DEBUG (noisy for a proxy that
  legitimately forwards auth headers)
- Record bodySize Prometheus metric for both pre-read body paths
  (previously only observed when ContentLength was known and >0)
- Fix indentation of body Close() calls in both pre-read branches
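
Go randomizes map iteration order, which is why unsorted context keys produce nondeterministic log lines. A minimal sketch of the sorted-key approach, assuming a hypothetical formatContext helper and a key=value text layout (neither is taken from the PR):

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// formatContext (hypothetical) renders a log context map as space-separated
// key=value pairs, with keys sorted so the output is stable across runs.
func formatContext(ctx map[string]string) string {
	keys := make([]string, 0, len(ctx))
	for k := range ctx {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		parts = append(parts, k+"="+ctx[k])
	}
	return strings.Join(parts, " ")
}

func main() {
	fmt.Println(formatContext(map[string]string{
		"target":     "http://a.example",
		"attempt":    "2",
		"request_id": "abc",
	}))
	// attempt=2 request_id=abc target=http://a.example
}
```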
…detection

- Generate a crypto/rand UUID-like X-Request-ID if not present on the
  incoming request; forward to all targets via cloneHeaders; echo back on
  the fan-out response for client-side correlation
- Change Response.Latency from time.Duration (raw int64 ns) to float64
  seconds with JSON key latency_seconds - self-documenting and consistent
  with Prometheus/OpenTelemetry conventions
- Detect when io.LimitReader silently caps a response body at maxBodySize;
  set Truncated=true on the Response and emit a WARN log so callers know
- Fix debug log in sendRequest that always logged latency=0s because
  resp.Latency is set by the caller after sendRequest returns; use
  time.Since(startTime) instead
- Dockerfile and compose.yml both invoke /fanout -healthcheck for health
  probes; the flag was never handled so the binary tried to bind :8080
  (already in use) and always exited 1 -> container always unhealthy
- Add -healthcheck: GET localhost:<port>/health, exit 0 on 200 else exit 1
- Move -version and -healthcheck checks to the very top of main(), before
  any HTTP handler registration or log output
- Simplify main() target validation to use cached configuredTargets slice
  instead of re-parsing TARGETS env var
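
The X-Request-ID behavior described in this commit can be sketched as follows. The function names newRequestID and ensureRequestID and the exact 8-4-4-4-12 hex layout are assumptions; the commit only says the ID is "UUID-like" and generated with crypto/rand. This sketch does not set RFC 4122 version or variant bits.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"net/http"
)

// newRequestID (hypothetical) returns 16 random bytes formatted in the
// familiar 8-4-4-4-12 hex layout. It is UUID-like, not a strict UUID.
func newRequestID() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	h := hex.EncodeToString(b[:])
	return h[0:8] + "-" + h[8:12] + "-" + h[12:16] + "-" + h[16:20] + "-" + h[20:32], nil
}

// ensureRequestID (hypothetical) keeps an incoming X-Request-ID if present,
// otherwise generates one and stores it on the header set.
func ensureRequestID(h http.Header) (string, error) {
	if id := h.Get("X-Request-ID"); id != "" {
		return id, nil
	}
	id, err := newRequestID()
	if err != nil {
		return "", err
	}
	h.Set("X-Request-ID", id)
	return id, nil
}

func main() {
	incoming := http.Header{}
	id, err := ensureRequestID(incoming)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(id), incoming.Get("X-Request-ID") == id) // 36 true

	preset := http.Header{}
	preset.Set("X-Request-ID", "client-supplied")
	kept, _ := ensureRequestID(preset)
	fmt.Println(kept) // client-supplied
}
```

Because the ID lives on the header set, any code that clones the incoming headers for each target forwards it without extra plumbing.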
…tate

- TestEchoHandlerSimpleMode/FullMode: env vars are now cached at init()
  time, not re-read per request; switch to setting echoModeHeader /
  echoModeResponse package vars directly with defer restore
- TestMultiplexNoGetBody: TARGETS env is cached at startup; set
  configuredTargets directly with defer restore instead of os.Setenv
- TestSendRequest / TestSendRequestNetworkError: add defer to restore
  the maxRetries global so tests cannot bleed state into each other
- Drop now-unused os import from both test files
- docker-image.yml: remove '|| true' from gofmt/go vet step so
  formatting and vet failures actually break the build; previously both
  tools were silently swallowed and the step could never fail
- binary-release.yml: add a gosec security scan step (matching the step
  that already exists in docker-image.yml) so published binaries receive
  the same security scrutiny as Docker images
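
The "set the cached var directly with defer restore" pattern used in the test updates can be sketched like this. configuredTargets is the variable named in the commit message; withTargets and countTargets are hypothetical helpers for illustration.

```go
package main

import "fmt"

// configuredTargets stands in for the package-level cache that the PR
// populates once at startup.
var configuredTargets = []string{"http://prod.example"}

// countTargets is a trivial consumer of the cached value.
func countTargets() int { return len(configuredTargets) }

// withTargets (hypothetical) swaps in a temporary target list, runs fn, and
// restores the original value even if fn panics.
func withTargets(targets []string, fn func()) {
	orig := configuredTargets
	defer func() { configuredTargets = orig }()
	configuredTargets = targets
	fn()
}

func main() {
	withTargets([]string{"http://a.test", "http://b.test"}, func() {
		fmt.Println(countTargets()) // 2
	})
	fmt.Println(countTargets()) // 1
}
```

Inside a real test function, t.Cleanup achieves the same restoration without a wrapper.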
Copilot AI review requested due to automatic review settings April 5, 2026 19:08

Copilot AI left a comment

Pull request overview

This PR performs a broad audit of the FanOut Go service, focusing on correctness fixes, operational/CI hygiene, and a few small features (request correlation, clearer latency reporting, and config caching).

Changes:

  • Improve request/response handling (request body handling without GetBody, response truncation signaling, JSON writing fixes) and add X-Request-ID correlation.
  • Refactor runtime configuration to cache key env values at startup; update tests accordingly.
  • Tighten CI gates (gofmt/go vet fail the build) and add gosec to release workflow.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| fanout.go | Core service updates: cached config, request ID propagation, latency JSON shape, retry logic tweaks, CLI -healthcheck, and server timeout configuration |
| fanout_test.go | Updates expectations for latency_seconds and adjusts tests for cached globals / retry state restoration |
| fanout_additional_test.go | Adds regression tests for multiplex behavior and typed retryable error detection |
| .github/workflows/docker-image.yml | Runs gofmt/go vet, gosec, and unit tests inside Docker; ensures formatting/vet failures fail CI |
| .github/workflows/binary-release.yml | Adds gosec scan before publishing binaries |
| .github/copilot-instructions.md | Adds repository-specific contributor guidance for future Copilot sessions |

fanout.go Outdated
Comment on lines 698 to 699
@@ -595,7 +699,7 @@ func sendRequest(ctx context.Context, client *http.Client, target string, origin
if readErr != nil {

Copilot AI Apr 5, 2026

Response truncation detection is unreliable here: because the body is read with io.LimitReader(..., maxBodySize), you can't distinguish an exact maxBodySize response from a truncated response. Read maxBodySize+1 bytes, then if the read exceeds maxBodySize set Truncated=true and trim the stored body back to maxBodySize.

Copilot uses AI. Check for mistakes.
fanout.go Outdated
Comment on lines +713 to +716
// Warn if the response body was silently capped at maxBodySize.
if int64(len(respBody)) == maxBodySize {
    resp.Truncated = true
    logWarnWithContext(map[string]string{"target": target}, "Response body truncated at limit (%d bytes)", maxBodySize)

Copilot AI Apr 5, 2026

The truncation check uses len(respBody) == maxBodySize, which will flag perfectly valid responses that happen to be exactly maxBodySize bytes. If you switch to reading maxBodySize+1, set Truncated only when len(respBody) > maxBodySize (and then trim).
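
The read-one-extra-byte approach suggested in these two comments can be sketched as below. readCapped and capString are hypothetical names; the technique itself (limit+1 via io.LimitReader, flag only when the read exceeds the limit, then trim) is the one the review describes.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// readCapped (hypothetical) reads at most limit+1 bytes. Seeing more than
// limit bytes proves the source had extra data, so the body is trimmed back
// to limit and reported as truncated. An exact-limit body is not flagged.
func readCapped(r io.Reader, limit int64) (body []byte, truncated bool, err error) {
	body, err = io.ReadAll(io.LimitReader(r, limit+1))
	if err != nil {
		return nil, false, err
	}
	if int64(len(body)) > limit {
		return body[:limit], true, nil
	}
	return body, false, nil
}

// capString is a convenience wrapper around readCapped for string inputs.
func capString(s string, limit int64) (string, bool) {
	body, truncated, _ := readCapped(strings.NewReader(s), limit)
	return string(body), truncated
}

func main() {
	for _, in := range []string{"abc", "abcd", "abcde"} {
		out, truncated := capString(in, 4)
		fmt.Printf("%q truncated=%v\n", out, truncated)
	}
	// "abc" truncated=false
	// "abcd" truncated=false
	// "abcd" truncated=true
}
```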

fanout.go Outdated
Comment on lines 614 to 616
// target was validated by caller (must be absolute http/https with host). Suppress gosec SSRF warning.
// #nosec G704 -- validated target URL in multiplex
req, err := http.NewRequestWithContext(ctx, originalReq.Method, target, bodyReader)

Copilot AI Apr 5, 2026

This code suppresses gosec SSRF warnings under the assumption that target is validated by the caller, but configuredTargets are currently only trimmed/split and never URL-validated. Either validate targets (absolute http/https URLs with non-empty host) when caching TARGETS / before dispatching, or remove the misleading comment and #nosec suppression.

Suggested change
- // target was validated by caller (must be absolute http/https with host). Suppress gosec SSRF warning.
- // #nosec G704 -- validated target URL in multiplex
- req, err := http.NewRequestWithContext(ctx, originalReq.Method, target, bodyReader)
+ parsedTarget, err := url.Parse(target)
+ if err != nil || !parsedTarget.IsAbs() || parsedTarget.Host == "" || (parsedTarget.Scheme != "http" && parsedTarget.Scheme != "https") {
+     resp.Status = http.StatusBadRequest
+     resp.Error = fmt.Sprintf("Invalid target URL %q: must be an absolute http/https URL with a non-empty host", target)
+     if bodyReader != nil {
+         if cerr := bodyReader.Close(); cerr != nil {
+             logWarn("Failed to close body reader after target validation failure: %v", cerr)
+         }
+     }
+     logErrorWithContext(map[string]string{"target": target}, "%s", resp.Error)
+     return resp
+ }
+ // target is validated locally as an absolute http/https URL with a non-empty host.
+ // #nosec G704 -- validated target URL before request construction
+ req, err := http.NewRequestWithContext(ctx, originalReq.Method, parsedTarget.String(), bodyReader)
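
The other option the comment raises, validating targets once when TARGETS is cached, might look like the sketch below. It assumes a comma-separated TARGETS value; validTarget and parseTargets are hypothetical names, and the real code would log a warning for each skipped entry rather than return it.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// validTarget (hypothetical) accepts only absolute http/https URLs that
// carry a non-empty host.
func validTarget(raw string) bool {
	u, err := url.Parse(raw)
	if err != nil {
		return false
	}
	return (u.Scheme == "http" || u.Scheme == "https") && u.Host != ""
}

// parseTargets (hypothetical) splits a comma-separated list, trims each
// entry, drops empties, and separates valid targets from skipped ones.
func parseTargets(env string) (valid, skipped []string) {
	for _, t := range strings.Split(env, ",") {
		t = strings.TrimSpace(t)
		if t == "" {
			continue
		}
		if validTarget(t) {
			valid = append(valid, t)
		} else {
			skipped = append(skipped, t)
		}
	}
	return valid, skipped
}

func main() {
	valid, skipped := parseTargets("http://a.example:8080/hook, ftp://b.example, c.example/path, https://d.example")
	fmt.Println(valid)   // [http://a.example:8080/hook https://d.example]
	fmt.Println(skipped) // [ftp://b.example c.example/path]
}
```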

Comment on lines +810 to +814
// #nosec G107 -- localhost-only health probe, port sourced from env
resp, err := http.Get("http://localhost:" + port + "/health")
if err != nil || resp.StatusCode != http.StatusOK {
    os.Exit(1)
}

Copilot AI Apr 5, 2026

The -healthcheck implementation uses http.Get with the default client (no timeout) and does not close resp.Body. In container healthchecks this can hang indefinitely and leak resources; use an http.Client with a short Timeout and always close the response body before exiting.

Suggested change
- // #nosec G107 -- localhost-only health probe, port sourced from env
- resp, err := http.Get("http://localhost:" + port + "/health")
- if err != nil || resp.StatusCode != http.StatusOK {
-     os.Exit(1)
- }
+ client := &http.Client{
+     Timeout: 5 * time.Second,
+ }
+ // #nosec G107 -- localhost-only health probe, port sourced from env
+ resp, err := client.Get("http://localhost:" + port + "/health")
+ if err != nil {
+     os.Exit(1)
+ }
+ defer resp.Body.Close()
+ if resp.StatusCode != http.StatusOK {
+     os.Exit(1)
+ }
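
One caveat when adapting this suggestion: os.Exit terminates the process without running deferred functions, so a deferred resp.Body.Close() is skipped on any path that exits directly. A minimal sketch that sidesteps this by returning an exit code from a helper follows; probe and statusServer are hypothetical names, and the test servers stand in for the real /health endpoint.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// probe (hypothetical) performs one GET with a bounded timeout and maps the
// outcome to an exit code: 0 for HTTP 200, 1 for any error or other status.
func probe(url string, timeout time.Duration) int {
	client := &http.Client{Timeout: timeout}
	resp, err := client.Get(url)
	if err != nil {
		return 1
	}
	defer resp.Body.Close() // runs because probe returns instead of exiting
	_, _ = io.Copy(io.Discard, resp.Body)
	if resp.StatusCode != http.StatusOK {
		return 1
	}
	return 0
}

// statusServer starts a local test server that answers every request with code.
func statusServer(code int) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(code)
	}))
}

func main() {
	healthy := statusServer(http.StatusOK)
	defer healthy.Close()
	unhealthy := statusServer(http.StatusServiceUnavailable)
	defer unhealthy.Close()

	fmt.Println(probe(healthy.URL+"/health", 2*time.Second))   // 0
	fmt.Println(probe(unhealthy.URL+"/health", 2*time.Second)) // 1
}
```

In a real entry point the call would be os.Exit(probe(...)), so the exit happens only after probe has returned and its deferred Close has run.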

Comment on lines +25 to +28
// Create a request WITHOUT GetBody (http.NewRequest leaves GetBody nil)
body := []byte("hello")
req, err := http.NewRequest("POST", "/fanout", bytes.NewReader(body))
if err != nil {

Copilot AI Apr 5, 2026

This test likely doesn't exercise the GetBody == nil path: in Go, http.NewRequest sets Request.GetBody for bodies backed by *bytes.Reader (and also sets ContentLength), so multiplex will take the GetBody branch. To test the nil-GetBody behavior, explicitly set req.GetBody = nil (and ensure req.Body is a non-rewindable reader) before calling multiplex.
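
The behavior described here is easy to confirm in isolation. In the sketch below, getBodySet is a hypothetical helper and the URL is a placeholder; it builds the same request twice, once with a bare *bytes.Reader and once with that reader wrapped in io.NopCloser, and reports whether http.NewRequest populated GetBody.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
)

// getBodySet (hypothetical) reports whether http.NewRequest populated
// Request.GetBody. With wrap=false the body is a bare *bytes.Reader; with
// wrap=true the reader is hidden behind io.NopCloser.
func getBodySet(wrap bool) bool {
	var body io.Reader = bytes.NewReader([]byte("hello"))
	if wrap {
		body = io.NopCloser(body)
	}
	// The URL is a placeholder; no request is actually sent.
	req, err := http.NewRequest(http.MethodPost, "http://example.invalid/fanout", body)
	if err != nil {
		return false
	}
	return req.GetBody != nil
}

func main() {
	fmt.Println(getBodySet(false)) // true
	fmt.Println(getBodySet(true))  // false
}
```

http.NewRequest only recognizes *bytes.Buffer, *bytes.Reader, and *strings.Reader bodies, so any wrapper type leaves GetBody nil.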


- name: Security scan (gosec)
  run: |
    docker run --rm -v ${{ github.workspace }}:/src -w /src securego/gosec:latest gosec ./...

Copilot AI Apr 5, 2026

The workflow pulls securego/gosec:latest, which makes CI behavior non-deterministic and can break unexpectedly when upstream publishes a new image. Consider pinning to a specific gosec version tag or an image digest for reproducible CI runs.

Suggested change
- docker run --rm -v ${{ github.workspace }}:/src -w /src securego/gosec:latest gosec ./...
+ docker run --rm -v ${{ github.workspace }}:/src -w /src securego/gosec:v2.22.2 gosec ./...


- name: Security scan (gosec)
  run: |
    docker run --rm -v ${{ github.workspace }}:/src -w /src securego/gosec:latest gosec ./...

Copilot AI Apr 5, 2026

The workflow pulls securego/gosec:latest, which makes release gating non-deterministic and can break unexpectedly when upstream publishes a new image. Consider pinning to a specific gosec version tag or an image digest for reproducible release builds.

Suggested change
- docker run --rm -v ${{ github.workspace }}:/src -w /src securego/gosec:latest gosec ./...
+ docker run --rm -v ${{ github.workspace }}:/src -w /src securego/gosec:v2.21.4 gosec ./...

KingPin added 2 commits April 5, 2026 15:34
- Truncation detection: read maxBodySize+1 bytes so an exact-size
  response is not falsely flagged; only set Truncated=true and trim
  when len(respBody) > maxBodySize
- Target URL validation: validate each TARGETS entry as an absolute
  http/https URL with non-empty host at startup in init(); skip and warn
  on invalid entries rather than passing unvalidated strings to
  http.NewRequestWithContext; update #nosec annotation accordingly
- Healthcheck client: use http.Client{Timeout:5s} in -healthcheck to
  avoid hanging indefinitely; always close resp.Body before exiting
- TestMultiplexNoGetBody: http.NewRequest sets GetBody for *bytes.Reader;
  wrap in io.NopCloser to ensure req.GetBody==nil and the test actually
  exercises the pre-read code path
- Pin securego/gosec to v2.22.4 in both CI workflows for reproducible
  builds instead of pulling :latest
@KingPin KingPin merged commit 9745001 into main Apr 5, 2026
@KingPin KingPin deleted the pr/refactor-getbody-retry-ci branch April 5, 2026 20:00