Retry Redis ping on startup #386
Conversation
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID:
📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 Walkthrough

Startup Redis checks now use a bounded retry loop. A new `Client.PingWithRetry(ctx, total, perAttempt)` retries PING with per-attempt timeouts and capped exponential backoff; tests and three command entry points (analysis, app, worker) were updated, and the changelog documents the fix.

Changes: Redis Retry-Based Health Checking
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Entrypoint
    participant BrokerClient
    participant Redis
    Entrypoint->>BrokerClient: PingWithRetry(ctx, total, perAttempt)
    BrokerClient->>Redis: PING (per-attempt timeout)
    Redis-->>BrokerClient: PONG or error
    alt PONG
        BrokerClient-->>Entrypoint: success
    else error and time remaining
        BrokerClient->>BrokerClient: sleep (exponential backoff, capped)
        BrokerClient->>Redis: PING (next attempt)
    else deadline exceeded or ctx canceled
        BrokerClient-->>Entrypoint: return last error or ctx.Err()
    end
```
🎯 Estimated code review effort: 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Updates to Preview Branch (work/gallant-elbakyan-64b835) ↗︎
Tasks are run on every commit but only new migration files are pushed.
View logs for this Workflow Run ↗︎.
Release Versions
App patch:
Changelog: Fixed
Actionable comments posted: 1
🧹 Nitpick comments (1)
internal/broker/redis_test.go (1)
15-56: ⚡ Quick win: Add a regression case for `perAttempt > total` budget handling.

Nice coverage overall. Please add one subtest that asserts retry returns within the total budget when `perAttempt` is larger, so budget semantics stay protected.

🧪 Suggested test shape
```diff
 func TestPingWithRetry(t *testing.T) {
+	t.Run("does not exceed total budget when per-attempt timeout is larger", func(t *testing.T) {
+		start := time.Now()
+		err := pingWithRetry(context.Background(), 80*time.Millisecond, time.Second,
+			func(ctx context.Context) error {
+				<-ctx.Done()
+				return ctx.Err()
+			})
+		require.Error(t, err)
+		assert.LessOrEqual(t, time.Since(start), 200*time.Millisecond)
+	})
 	t.Run("immediate success", func(t *testing.T) {
 		var calls int
 		err := pingWithRetry(context.Background(), time.Second, 100*time.Millisecond, func(context.Context) error { calls++; return nil })
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/broker/redis_test.go` around lines 15 - 56, Add a new subtest inside TestPingWithRetry that calls pingWithRetry with total shorter than perAttempt (e.g., total=100ms, perAttempt=200ms) using a stub ping function that immediately returns a sentinel error; capture time before/after the call and assert that the call returns the expected error and that elapsed time is <= total (plus a tiny tolerance), ensuring pingWithRetry respects the overall budget when perAttempt > total. Reference the pingWithRetry helper and add the subtest under TestPingWithRetry.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@internal/broker/redis.go`:
- Around line 90-123: In pingWithRetry the per-attempt context always uses
perAttempt which can let the final try exceed the overall deadline; compute the
remaining total budget as time.Until(deadline) and clamp the attempt timeout to
min(perAttempt, remaining) before calling context.WithTimeout. If the remaining
budget is <= 0 return the lastErr (or ctx.Err() if set) instead of starting a
timed attempt; replace the direct context.WithTimeout(ctx, perAttempt) call in
pingWithRetry with this clamped timeout logic to enforce the total budget.
---
Nitpick comments:
In `@internal/broker/redis_test.go`:
- Around line 15-56: Add a new subtest inside TestPingWithRetry that calls
pingWithRetry with total shorter than perAttempt (e.g., total=100ms,
perAttempt=200ms) using a stub ping function that immediately returns a sentinel
error; capture time before/after the call and assert that the call returns the
expected error and that elapsed time is <= total (plus a tiny tolerance),
ensuring pingWithRetry respects the overall budget when perAttempt > total.
Reference the pingWithRetry helper and add the subtest under TestPingWithRetry.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 27c218b9-9459-4bc2-bbd1-0d4a1bc92a30
📒 Files selected for processing (5)
- cmd/analysis/main.go
- cmd/app/main.go
- cmd/worker/main.go
- internal/broker/redis.go
- internal/broker/redis_test.go
Codecov Report
❌ Patch coverage is
📢 Thoughts on this report? Let us know!
🐝 Review App Deployed Homepage: https://hover-pr-386.fly.dev |
Summary
- New `(*broker.Client).PingWithRetry(ctx, total, perAttempt)` with capped exponential backoff and a per-attempt timeout.
- Updated the startup `Ping` call sites in `cmd/app`, `cmd/worker`, and `cmd/analysis` to use it (30s budget, 3s per attempt).

Why
Every PR preview spin-up generated a small burst of Sentry errors on staging — `*errors.errorString: EOF` reported as `failed to ping Redis` from each of the three binaries. Triaged in Sentry:

- `hover-worker-pr-*`, 169 occurrences since 2026-04-19.
- `hover-analysis-pr-*`, 158 occurrences since 2026-04-28.
- `hover-pr-*`, 153 occurrences since 2026-04-20.

Review apps provision a fresh per-PR Upstash-on-Fly Redis and pass the URL as a secret. The Fly machine boots and calls `client.Ping(context.Background())` immediately. During the Upstash cold-start window, TCP connects but the server closes the connection with EOF before answering PING. The client's built-in `MaxRetries: 3` burns through in milliseconds inside the same dead window, the binary `Fatal`s, Fly restarts the machine, and the next boot succeeds — hence the burst-per-deploy pattern with zero prod impact (production Redis is warm).

The fix lets the binary ride out the cold-start window instead of crashing. On a healthy Redis the first ping succeeds and the helper returns immediately, so there's no production latency regression. On genuine misconfiguration the helper still exhausts its budget and `Fatal`s — Sentry still gets one signal instead of three back-to-back.

Fixes HOVER-JX HOVER-MD HOVER-JZ
Test plan
- `gofmt`, `goimports`, `go vet ./internal/broker/... ./cmd/app ./cmd/worker ./cmd/analysis`
- `go test ./internal/broker/` — full package, including the new `TestPingWithRetry`
- `go build ./cmd/app ./cmd/worker ./cmd/analysis`
- Verified review apps `hover-pr-<N>`, `hover-worker-pr-<N>`, `hover-analysis-pr-<N>` during boot and that `connected to Redis` appears in all three startup logs

Need help on this PR? Tag @codesmith with what you need.

Summary by CodeRabbit
New Features
Bug Fixes
Tests
Documentation