P0-7: graceful shutdown readiness drain (MR-P0-7)#120
Merged
mastermanas805 merged 5 commits on May 20, 2026
…R-P0-7) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…n (MR-P0-7) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…P0-7) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…0-7) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…MR-P0-7) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Diagnosis
Pre-existing state on master:
- The runServerWithGracefulShutdown helper was wired in main.go (calls app.ShutdownWithTimeout on SIGTERM, drains in-flight requests in 25s).
- TestRunServerWithGracefulShutdown_DrainsInflight already proved the drain contract.
Gap identified:
/readyz kept returning 200 right up to listener close. The kubelet's readinessProbe never observed a 503, so the Service kept routing new traffic to a pod that was about to stop accepting connections, producing the connection-reset symptom on every rolling restart. The k8s manifest also lacked terminationGracePeriodSeconds and preStop (companion PR in infra repo).
Diff Summary
- internal/handlers/readyz.go: MarkDraining() + IsDraining(); Get() short-circuits to 503 + overall=failed + a single shutting_down check when draining. atomic.Bool flag.
- internal/router/router.go: ShutdownHooks struct + NewWithHooks() returning (*fiber.App, ShutdownHooks) so main.go can flip Readyz.MarkDraining. Legacy New() preserved for existing tests.
- main.go: readinessDrainGrace = 3 * time.Second. runServerWithGracefulShutdown now takes router.ShutdownHooks and on SIGTERM: (A) calls hooks.Readyz.MarkDraining; (B) sleeps readinessDrainGrace; (C) app.ShutdownWithTimeout(25s). Doc updated to reflect terminationGracePeriodSeconds=35.
- graceful_shutdown_test.go: router.ShutdownHooks{}. Two new tests: TestRunServerWithGracefulShutdown_MarksReadinessDraining (asserts the flag flip on SIGTERM); TestRunServerWithGracefulShutdown_TimeoutKillsStuckRequest (asserts the 25s timeout fires non-nil and exits in bounded time).
- internal/handlers/readyz_test.go: TestReadyz_DrainingReturns503 (no sqlmock expectations, which proves the runner is NOT consulted in drain mode); TestReadyz_DrainingIsIdempotent.
Test Output (verbatim)
go build ./... and go vet ./... clean.
Coverage Block (CLAUDE.md rule 17)
Required Companion PR
infra repo: ship/p0-7-api-grace-period-2026-05-20 adds terminationGracePeriodSeconds: 35 and a preStop: sleep 5 lifecycle hook to k8s/app.yaml. Must land together, or the in-process drain logic stalls until SIGKILL at the 30s default.
Live Verify Plan (post-merge)
1. curl https://api.instanode.dev/healthz | jq .commit_id matches HEAD.
2. kubectl rollout restart deploy/instant-api -n instant
3. curl -w '%{http_code}\n' /healthz /openapi.json for 60s; assert ZERO non-2xx (no connection-resets).
4. kubectl get events -n instant shows no 'killed before terminationGracePeriod'.
🤖 Generated with Claude Code