fix(ratelimiter:PLA-1025): fail open when redis ratelimiter panics#56
Conversation
|
@codex review |
0x666c6f
left a comment
There was a problem hiding this comment.
Automated PR Review (Claude)
Reviewed commit: c477a77
Language: Go
| Severity | Count |
|---|---|
| Critical | 0 |
| High | 1 |
| Medium | 3 |
| Low | 3 |
Reviewed by 6 specialized agents. This review will re-run when new commits are pushed.
0x666c6f
left a comment
There was a problem hiding this comment.
Automated PR Review (Claude) — Re-review
Reviewed commit: 7432d4c
Language: Go
Previous findings: All 7 findings from review #1 (c477a77) have been addressed ✓
| Severity | Count |
|---|---|
| Critical | 0 |
| High | 0 |
| Medium | 2 |
| Low | 3 |
Notable improvements in this commit:
onRedisCacheFailure→onCacheFailure: now clears cache for all drivers, prevents hot-loop panics- Timeout
MetricRateLimiterFailopenTotalmoved toevaluateRulewhereprojectIdis available — consistent metric cardinality - Updated doc comment on
doLimitWithTimeout— now accurate - Added
PanicBeforeTimeoutFailOpentest — covers the channel-based panic propagation path - Added
MetricUnexpectedPanicTotalassertion — closes the observability gap - Removed fragile
MetricNetworkAttemptReasonTotal.Reset()— uses delta pattern instead
Remaining findings are suggestions for further refinement, not blockers.
Reviewed by 6 specialized agents. This review will re-run when new commits are pushed.
0x666c6f
left a comment
There was a problem hiding this comment.
Automated PR Review (Claude) — Re-review #3
Reviewed commit: 69068b4
Language: Go
Previous findings: All 5 findings from review #2 (7432d4c) have been addressed ✓
| Severity | Count |
|---|---|
| Critical | 0 |
| High | 0 |
| Medium | 1 |
| Low | 0 |
Excellent iteration — this commit addressed every finding from review #2:
- ✅ Extracted
recordFailOpenhelper (was MEDIUM: duplicate fail-open pattern) - ✅ Threaded
agentNamethroughevaluateRule(was LOW: empty string gap) - ✅ Simplified
waitDuration— computed once before if/else (was LOW: boilerplate) - ✅ Added doc comment about detached goroutine timeout-panic behavior (was MEDIUM)
- ✅ Updated non-redis log to "(rate limiting disabled until restart)" (was LOW)
- ✅ Tests now use real requests with agent names for label verification
The single remaining finding is a suggestion to reduce the 13-parameter recordFailOpen signature with a struct. Not a blocker.
Reviewed by 6 specialized agents. This review will re-run when new commits are pushed.
0x666c6f
left a comment
There was a problem hiding this comment.
Automated PR Review (Claude) — Re-review #4
Reviewed commit: 46db19d
Language: Go
Previous findings: The single MEDIUM finding from review #3 (69068b4) has been addressed ✓
| Severity | Count |
|---|---|
| Critical | 0 |
| High | 0 |
| Medium | 0 |
| Low | 0 |
Clean refactoring — introduced rateLimitEvalContext struct to replace the 13-parameter recordFailOpen signature with 6 parameters. Call sites are now self-documenting and resistant to silent string transposition. No behavioral changes, no new issues found.
All findings from all 3 prior reviews have been addressed across 4 iterations. This PR is in excellent shape. 🎯
Reviewed by 6 specialized agents. This review will re-run when new commits are pushed.
0x666c6f
left a comment
There was a problem hiding this comment.
Automated PR Review (Claude) — Re-review #5
Reviewed commit: dae5d30
Language: Go
| Severity | Count |
|---|---|
| Critical | 0 |
| High | 0 |
| Medium | 0 |
| Low | 0 |
Incremental change since last review (46db19d): only erpc/networks_multiplexer_test.go — adds a Delay(50ms) to the mock, a start-channel barrier (start := make(chan struct{}) + <-start + close(start)) to ensure goroutines launch simultaneously, fixing a race condition in the multiplexer test. Clean test fix, no issues found.
All ratelimiter code unchanged from review #4 (clean). All findings from reviews #1-3 remain addressed.
Reviewed by 6 specialized agents. This review will re-run when new commits are pushed.
Summary
DoLimitcallsChanges
DoLimitexecution with panic recovery and record panic-specific fail-open telemetryLinear