Enable rate limit enforcement after 48h clean shadow mode by beastoin · Pull Request #6197 · BasedHardware/omi

beastoin · 2026-03-31T10:49:09Z

Summary

Enable rate limit enforcement by default (shadow mode → active 429 rejections)
Bump conversations:create limit from 8/hr → 10/hr to accommodate power users

Context

PR #5836 deployed per-UID rate limiting in shadow mode (log-only, no blocking). After 48h of production monitoring:

297 total shadow events across 24 policies — only 3 policies triggered
0 Redis errors, atomic Lua scripts stable across 20 pods
45 unique UIDs observed, 2 power users generated 60% of all events
21 of 24 policies had zero shadow events (comfortable headroom)

Shadow event breakdown

Policy	Events	Users	Analysis
`conversations:reprocess` (3/hr)	149	29	1 user = 102 hits (automated retry loop)
`conversations:create` (8→10/hr)	126	18	1 user = 73 hits (automated creation)
`knowledge_graph:rebuild` (2/hr)	22	16	Evenly spread, 1-2 hits each

Why bump conversations:create to 10/hr

8/hr could catch legitimate batch import workflows
10/hr still blocks the abuser (73 hits/48h) while giving normal power users headroom
All other limits validated by shadow data — no changes needed

Safety

Revert to shadow: Set RATE_LIMIT_SHADOW_MODE=true env var (no code change needed)
Emergency relax: Set RATE_LIMIT_BOOST=2.0 to instantly double all limits
Fail-open: Redis errors allow requests through (no blocking on infra failure)
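A minimal sketch of how the two environment knobs might be read. The parsing expressions mirror the lines shown in the review diff for `rate_limit_config.py`; the function names and the rounding rule in `effective_limit` are hypothetical and only illustrate the idea of a boost multiplier.

```python
import math
import os


def load_rate_limit_settings(env=None):
    """Return (boost, shadow) from the environment.

    The two expressions follow the diff quoted in the review; the
    wrapper function itself is a hypothetical helper for illustration.
    """
    env = os.environ if env is None else env
    boost = float(env.get("RATE_LIMIT_BOOST", "1.0"))
    shadow = env.get("RATE_LIMIT_SHADOW_MODE", "false").lower() != "false"
    return boost, shadow


def effective_limit(base_limit, boost):
    """Scale a per-hour limit by the boost multiplier.

    Assumption: limits are rounded down and never drop below 1.
    The real config module may round differently.
    """
    return max(1, math.floor(base_limit * boost))


# With RATE_LIMIT_BOOST=2.0 the conversations:create limit of 10/hr doubles.
boost, shadow = load_rate_limit_settings(
    {"RATE_LIMIT_BOOST": "2.0", "RATE_LIMIT_SHADOW_MODE": "true"}
)
print(effective_limit(10, boost), shadow)  # 20 True
```

Passing a plain dict instead of `os.environ` keeps the sketch testable without touching the process environment.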

Test plan

50 unit tests pass (policy validation, boost, enforcement, shadow mode, router wiring)
Shadow mode default test updated to expect False (enforcement active)
Post-deploy: monitor 429 response rate for first 24h
Verify no spike in user-facing errors
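For illustration, the updated default test could look roughly like the sketch below. `read_shadow_flag` is a hypothetical stand-in for the module-level assignment in the config file; the real test suite in `test_rate_limiting.py` may be structured differently.

```python
import os
import unittest
from unittest import mock


def read_shadow_flag():
    """Hypothetical stand-in for the module-level RATE_LIMIT_SHADOW assignment."""
    return os.getenv("RATE_LIMIT_SHADOW_MODE", "false").lower() != "false"


class ShadowModeDefaultTest(unittest.TestCase):
    def test_shadow_mode_default_off(self):
        # With the variable unset, enforcement is active (shadow is False).
        with mock.patch.dict(os.environ, {}, clear=True):
            self.assertFalse(read_shadow_flag())

    def test_shadow_mode_env_true(self):
        # Operators can still opt back into shadow mode.
        with mock.patch.dict(os.environ, {"RATE_LIMIT_SHADOW_MODE": "true"}):
            self.assertTrue(read_shadow_flag())
```

`mock.patch.dict` restores the original environment when the `with` block exits, so the tests do not leak state into each other.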

Closes #5835

by AI for @beastoin

After 48h shadow mode monitoring showed clean results (297 total shadow events across 24 policies, 0 Redis errors), enable enforcement by default.

Changes:
- RATE_LIMIT_SHADOW_MODE default: true → false (enforcement active)
- conversations:create limit: 8/hr → 10/hr (accommodates power users while still blocking abuse; shadow data showed 1 user at 73 hits)
- Can revert to shadow via RATE_LIMIT_SHADOW_MODE=true env var
- Boost multiplier remains available as emergency escape hatch

Closes #5835

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

greptile-apps · 2026-03-31T10:52:52Z

Greptile Summary

This PR graduates the per-UID rate limiting system from shadow/log-only mode to active enforcement, backed by 48 hours of production shadow data across 20 pods and 45 unique UIDs. The two concrete code changes are a one-line default flip (RATE_LIMIT_SHADOW from True to False) and a conversations:create limit bump from 8 to 10 req/hr.

Key changes:

RATE_LIMIT_SHADOW default changed to False so the app enforces 429 responses out-of-the-box; operators can revert to shadow mode at any time by setting RATE_LIMIT_SHADOW_MODE=true without a code deploy.
conversations:create raised to 10/hr to give legitimate batch-import workflows headroom while still blocking the observed abuser (73 hits/48 h, well above 10/hr burst limit).
The corresponding unit test is renamed (test_shadow_mode_default_on → test_shadow_mode_default_off) and now uses assertFalse.
All safety nets remain intact: fail-open on Redis errors, RATE_LIMIT_BOOST multiplier for emergency relaxation, and the shadow-mode env var for instant revert.

Confidence Score: 5/5

Safe to merge — minimal, targeted changes backed by production shadow data with a clear revert path.

The two changed files are the config module and its unit test. The logic flip is correct (only "false" produces False, matching documented usage), the limit bump is data-driven, and all 50 unit tests, including the shadow-mode suite, remain consistent. The only finding is a pre-existing P2 style concern around boolean parsing that does not affect current behavior given the documented operator interface.

No files require special attention.

Important Files Changed

Filename	Overview
backend/utils/rate_limit_config.py	Three targeted changes: default for RATE_LIMIT_SHADOW flipped from True (shadow/log-only) to False (active enforcement), comment updated to match, and conversations:create limit bumped 8→10/hr. Logic is correct and consistent.
backend/tests/unit/test_rate_limiting.py	test_shadow_mode_default_on renamed to test_shadow_mode_default_off and assertion flipped to assertFalse; all other existing tests (env_true, env_false, enforcement, shadow logging) continue to pass without modification.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Incoming Request] --> B[Auth Dependency\nget_current_user_uid]
    B --> C[_enforce_rate_limit\nuid, policy_name]
    C --> D{Redis check_rate_limit\nLua INCR + TTL}
    D -- RedisError --> E[Fail-open\nlog error, allow]
    D -- allowed=True --> F[Request proceeds]
    D -- allowed=False --> G{RATE_LIMIT_SHADOW?}
    G -- True\nRATE_LIMIT_SHADOW_MODE=true --> H[Log warning only\nRequest proceeds]
    G -- False\ndefault after this PR --> I[Raise HTTP 429\nRetry-After header]

    style I fill:#f66,color:#fff
    style H fill:#fa0,color:#fff
    style E fill:#aaa,color:#fff
    style F fill:#6a6,color:#fff
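The decision flow in the flowchart can be sketched in plain Python. This is not the repository's implementation: the in-memory `FixedWindowStore` is a hypothetical stand-in for the Redis Lua script (INCR + TTL), `StoreError` stands in for a Redis error, and `RateLimited` stands in for the HTTP 429 response.

```python
import time


class RateLimited(Exception):
    """Stand-in for an HTTP 429 response carrying a Retry-After value."""

    def __init__(self, retry_after):
        super().__init__(f"429 Too Many Requests, retry after {retry_after}s")
        self.retry_after = retry_after


class StoreError(Exception):
    """Stand-in for a Redis error."""


class FixedWindowStore:
    """In-memory imitation of an INCR + TTL counter (hypothetical)."""

    def __init__(self):
        self._counters = {}  # key -> (count, window_expiry)

    def check(self, key, limit, window_s, now=None):
        now = time.time() if now is None else now
        count, expiry = self._counters.get(key, (0, now + window_s))
        if now >= expiry:  # window elapsed: start a fresh one
            count, expiry = 0, now + window_s
        count += 1
        self._counters[key] = (count, expiry)
        return count <= limit, max(0, int(expiry - now))


def enforce_rate_limit(store, uid, policy, limit, window_s, shadow=False, now=None):
    """Follow the flowchart: fail-open, allow, shadow-log, or reject."""
    try:
        allowed, retry_after = store.check(f"{policy}:{uid}", limit, window_s, now)
    except StoreError:
        return "fail-open"      # infra failure never blocks the request
    if allowed:
        return "allowed"
    if shadow:
        return "shadow-logged"  # log only, request proceeds
    raise RateLimited(retry_after)
```

With `shadow=False` (the default after this PR), the eleventh `conversations:create` call inside one hour raises `RateLimited`; with `shadow=True` the same call returns `"shadow-logged"` and the request would proceed.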


greptile-apps · 2026-03-31T10:52:55Z


 RATE_LIMIT_BOOST: float = float(os.getenv("RATE_LIMIT_BOOST", "1.0"))
-RATE_LIMIT_SHADOW: bool = os.getenv("RATE_LIMIT_SHADOW_MODE", "true").lower() != "false"
+RATE_LIMIT_SHADOW: bool = os.getenv("RATE_LIMIT_SHADOW_MODE", "false").lower() != "false"


Boolean parsing only recognises "false" as falsy

The expression os.getenv(..., "false").lower() != "false" means that only the literal string "false" (case-insensitive) disables shadow mode; any other value — including "0", "no", or "off" — would silently enable shadow mode instead. This pre-dates this PR but becomes more operationally relevant now that shadow mode is the opt-in path and operators may try conventional truthy/falsy strings when configuring it.

Consider a more conventional boolean helper so that "0", "no", and "off" all behave as expected:

Suggested change

-RATE_LIMIT_SHADOW: bool = os.getenv("RATE_LIMIT_SHADOW_MODE", "false").lower() != "false"
+RATE_LIMIT_SHADOW: bool = os.getenv("RATE_LIMIT_SHADOW_MODE", "false").lower() in ("1", "true", "yes", "on")

This way, shadow mode is only active when explicitly enabled with a clearly truthy value, which is consistent with the documented usage (RATE_LIMIT_SHADOW_MODE=true).
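To make the difference concrete, the two expressions can be compared side by side on a few candidate values. The function names are hypothetical wrappers; the bodies are the two expressions from the suggestion.

```python
def parse_current(raw):
    """Expression as merged: anything except "false" enables shadow mode."""
    return raw.lower() != "false"


def parse_suggested(raw):
    """Expression from the suggestion: only explicit truthy strings enable it."""
    return raw.lower() in ("1", "true", "yes", "on")


# Print both results for values an operator might plausibly set.
for raw in ("true", "false", "FALSE", "0", "no", "off", "1", "yes"):
    print(f"{raw!r:8} current={parse_current(raw)!s:5} suggested={parse_suggested(raw)}")
```

The two parsers agree on "true", "false", "1", and "yes", and disagree on "0", "no", and "off", where the merged expression turns shadow mode on.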

beastoin · 2026-03-31T10:58:25Z

lgtm

greptile-apps Bot reviewed Mar 31, 2026

beastoin merged commit 532f815 into main Mar 31, 2026
3 checks passed

beastoin deleted the feat/rate-limit-enforce branch March 31, 2026 10:58
