Enable rate limit enforcement after 48h clean shadow mode#6197
Conversation
After 48h shadow mode monitoring showed clean results (297 total shadow events across 24 policies, 0 Redis errors), enable enforcement by default. Changes: - RATE_LIMIT_SHADOW_MODE default: true → false (enforcement active) - conversations:create limit: 8/hr → 10/hr (accommodates power users while still blocking abuse — shadow data showed 1 user at 73 hits) - Can revert to shadow via RATE_LIMIT_SHADOW_MODE=true env var - Boost multiplier remains available as emergency escape hatch Closes #5835 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Greptile SummaryThis PR graduates the per-UID rate limiting system from shadow/log-only mode to active enforcement, backed by 48 hours of production shadow data across 20 pods and 45 unique UIDs. The two concrete code changes are a one-line default flip ( Key changes:
Confidence Score: 5/5Safe to merge — minimal, targeted changes backed by production shadow data with a clear revert path. Both changed files are config and its test. The logic flip is correct (only "false" produces False, matching documented usage), the limit bump is data-driven, and all 50 unit tests including the shadow-mode suite remain consistent. The only finding is a pre-existing P2 style concern around boolean parsing that does not affect current behavior given the documented operator interface. No files require special attention. Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Incoming Request] --> B[Auth Dependency\nget_current_user_uid]
B --> C[_enforce_rate_limit\nuid, policy_name]
C --> D{Redis check_rate_limit\nLua INCR + TTL}
D -- RedisError --> E[Fail-open\nlog error, allow]
D -- allowed=True --> F[Request proceeds]
D -- allowed=False --> G{RATE_LIMIT_SHADOW?}
G -- True\nRATE_LIMIT_SHADOW_MODE=true --> H[Log warning only\nRequest proceeds]
G -- False\ndefault after this PR --> I[Raise HTTP 429\nRetry-After header]
style I fill:#f66,color:#fff
style H fill:#fa0,color:#fff
style E fill:#aaa,color:#fff
style F fill:#6a6,color:#fff
Reviews (1): Last reviewed commit: "Enable rate limit enforcement and bump c..." | Re-trigger Greptile |
|
|
||
| RATE_LIMIT_BOOST: float = float(os.getenv("RATE_LIMIT_BOOST", "1.0")) | ||
| RATE_LIMIT_SHADOW: bool = os.getenv("RATE_LIMIT_SHADOW_MODE", "true").lower() != "false" | ||
| RATE_LIMIT_SHADOW: bool = os.getenv("RATE_LIMIT_SHADOW_MODE", "false").lower() != "false" |
There was a problem hiding this comment.
Boolean parsing only recognises
"false" as falsy
The expression os.getenv(..., "false").lower() != "false" means that only the literal string "false" (case-insensitive) disables shadow mode; any other value — including "0", "no", or "off" — would silently enable shadow mode instead. This pre-dates this PR but becomes more operationally relevant now that shadow mode is the opt-in path and operators may try conventional truthy/falsy strings when configuring it.
Consider a more conventional boolean helper so that "0", "no", and "off" all behave as expected:
| RATE_LIMIT_SHADOW: bool = os.getenv("RATE_LIMIT_SHADOW_MODE", "false").lower() != "false" | |
| RATE_LIMIT_SHADOW: bool = os.getenv("RATE_LIMIT_SHADOW_MODE", "false").lower() in ("1", "true", "yes", "on") |
This way, shadow mode is only active when explicitly enabled with a clearly truthy value, which is consistent with the documented usage (RATE_LIMIT_SHADOW_MODE=true).
|
lgtm |
Summary
conversations:createlimit from 8/hr → 10/hr to accommodate power usersContext
PR #5836 deployed per-UID rate limiting in shadow mode (log-only, no blocking). After 48h of production monitoring:
Shadow event breakdown
conversations:reprocess(3/hr)conversations:create(8→10/hr)knowledge_graph:rebuild(2/hr)Why bump conversations:create to 10/hr
Safety
RATE_LIMIT_SHADOW_MODE=trueenv var (no code change needed)RATE_LIMIT_BOOST=2.0to instantly double all limitsTest plan
False(enforcement active)Closes #5835
by AI for @beastoin