feat(router): support per-key rate-limit overrides with regex pattern matching #2683

Merged
endigma merged 13 commits into main from jesse/eng-9286-rate-limit-overrides
Mar 31, 2026

@endigma
Member

@endigma endigma commented Mar 24, 2026

Adds an overrides array to simple_strategy in the rate limiter config. Each override specifies a matching regex pattern tested against the resolved rate-limit key; the first match wins and unmatched keys fall back to global defaults.

Includes documentation updates to the hardening guide and configuration reference.

Summary by CodeRabbit

  • New Features

    • Per-key rate limit overrides via regex matching with first-match precedence and fallback to global defaults.
    • Overrides specify per-pattern rate, burst, and period values.
  • Documentation

    • Expanded rate limiting docs and hardening guide with guidance, examples, and an updated example that derives the key suffix from the X-Api-Key header.
  • Tests

    • Added tests for key-suffix extraction, override matching, precedence, and expected rate-limit behavior.

Checklist

@coderabbitai
Contributor

coderabbitai bot commented Mar 24, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Added per-key rate-limit overrides: config types, schema, fixtures, docs, and tests updated; Cosmo rate limiter now accepts, compiles, and applies regex-based overrides (first-match) against the key suffix; graph server wires overrides into the limiter.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Config types & schema: router/pkg/config/config.go, router/pkg/config/config.schema.json, router/pkg/config/testdata/config_defaults.json, router/pkg/config/testdata/config_full.json | Added RateLimitOverride (matching, rate, burst, period) and Overrides []RateLimitOverride on RateLimitSimpleStrategy; updated JSON schema and testdata to include overrides. |
| Rate limiter implementation: router/core/ratelimiter.go, router/core/ratelimiter_test.go | Added Overrides option; compile matching regexes into internal structs; generateKey returns (key, suffix, err); added resolveLimit to pick first matching override or fallback; tests for matching, precedence, and invalid regex. |
| Graph server wiring: router/core/graph_server.go | Passes s.rateLimit.SimpleStrategy.Overrides into CosmoRateLimiterOptions when constructing the Redis-backed rate limiter. |
| Integration tests: router-tests/security/ratelimit_test.go | Added test verifying override behavior with KeySuffixExpression: "request.header.Get('X-Client-ID')" and an override matching ^premium-.*; cleans Redis keys and asserts differing limits for matching vs non-matching keys; minor test-scaffold formatting changes. |
| Documentation & guides: docs-website/router/configuration.mdx, docs-website/router/security/hardening-guide.mdx | Documented rate_limit.simple_strategy.overrides fields and first-match semantics; updated examples to show key_suffix_expression usage and set key_suffix_expression to request.header.Get('X-Api-Key'). |
| Fixtures: router/pkg/config/fixtures/full.yaml | Added example override (matching: '^premium-.*') with higher rate/burst/period. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely describes the main feature added: per-key rate-limit overrides with regex pattern matching, which aligns directly with all the changes across configuration, implementation, tests, and documentation. |

@mintlify
Contributor

mintlify bot commented Mar 24, 2026

Preview deployment for your docs. Learn more about Mintlify Previews.

| Project | Status | Preview | Updated (UTC) |
| --- | --- | --- | --- |
| wundergraphinc | 🟢 Ready | View Preview | Mar 24, 2026, 12:56 PM |

@github-actions

github-actions bot commented Mar 24, 2026

Router image scan passed

✅ No security vulnerabilities found in image:

ghcr.io/wundergraph/cosmo/router:sha-dc55643b9338464d931d3b794c64eb87f0e3d579

@endigma
Member Author

endigma commented Mar 24, 2026

Resolves #2395

@codecov

codecov bot commented Mar 24, 2026

Codecov Report

❌ Patch coverage is 96.66667% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 63.14%. Comparing base (4e2b146) to head (fc931b6).
⚠️ Report is 1 commit behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| router/core/ratelimiter.go | 96.55% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2683      +/-   ##
==========================================
- Coverage   63.34%   63.14%   -0.20%     
==========================================
  Files         249      249              
  Lines       26643    26661      +18     
==========================================
- Hits        16876    16835      -41     
- Misses       8404     8449      +45     
- Partials     1363     1377      +14     
| Files with missing lines | Coverage Δ |
| --- | --- |
| router/core/graph_server.go | 84.59% <100.00%> (-0.46%) ⬇️ |
| router/pkg/config/config.go | 80.51% <ø> (ø) |
| router/core/ratelimiter.go | 84.07% <96.55%> (+2.82%) ⬆️ |

... and 12 files with indirect coverage changes

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
router/core/graph_server.go (1)

1487-1499: ⚠️ Potential issue | 🟠 Major

Clean up partially initialized graph resources before returning here.

NewCosmoRateLimiter now adds a late failure point after the connector, pubsub providers, caches, and metric stores are already initialized. Because gm is only appended to s.graphMuxList at Line 1638, this direct return skips the normal shutdown path and leaks those resources when an override regex is invalid or rate-limiter construction otherwise fails. Please route this through the same buildGraphMux cleanup path used for other late init failures.

Based on learnings, ensure that buildGraphMux error paths clean up partially initialized resources (caches, metric stores, pub/sub providers, connectors) before returning.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@router/core/graph_server.go` around lines 1487 - 1499, The rate-limiter
failure path currently returns directly after NewCosmoRateLimiter fails and thus
leaks partially-initialized resources; instead route this failure through the
same buildGraphMux cleanup path used for other late-init errors: when
NewCosmoRateLimiter(...) returns an error, invoke the existing buildGraphMux
error/cleanup routine (the same code used elsewhere to unwind gm before
appending to s.graphMuxList) rather than returning immediately, or call the
shared teardown function that releases connector, pubsub, cache and metric store
resources for the in-progress gm; also ensure buildGraphMux's error paths (the
routines that run when late init fails) explicitly free caches, metric stores,
pub/sub providers and connectors so no resources leak on rate-limiter
construction failures.
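
One common Go shape for the cleanup concern raised above is a deferred unwind keyed on a named error return. The sketch below is only an illustration of that pattern under stated assumptions: `resource`, `build`, and the resource names are hypothetical, and it does not reflect how buildGraphMux is actually structured.

```go
package main

import (
	"errors"
	"fmt"
)

// resource is a hypothetical stand-in for a connector, cache, or metric store.
type resource struct {
	name   string
	closed bool
}

// Close marks the resource as released.
func (r *resource) Close() { r.closed = true }

// build initializes resources in order. If a later step fails, the deferred
// function closes everything created so far, in reverse order, before the
// error is returned to the caller.
func build(failLate bool) (res []*resource, err error) {
	defer func() {
		if err != nil {
			for i := len(res) - 1; i >= 0; i-- {
				res[i].Close()
			}
		}
	}()

	for _, n := range []string{"connector", "pubsub", "cache", "metrics"} {
		res = append(res, &resource{name: n})
	}
	if failLate {
		// Stands in for a late failure such as an invalid override regex.
		return res, errors.New("rate limiter: invalid override pattern")
	}
	return res, nil
}

func main() {
	res, err := build(true)
	for _, r := range res {
		fmt.Printf("%s closed=%v\n", r.name, r.closed)
	}
	fmt.Println("error:", err)
}
```

With this shape, any failure point added later in the function is covered by the same unwind without a dedicated cleanup call at each return.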
🧹 Nitpick comments (2)
router/pkg/config/config.schema.json (1)

2309-2312: Prevent accidental catch-all override by disallowing empty matching.

matching: "" is a valid regex and can silently match every key, shadowing later overrides. Add a minimum length so catch-all intent must be explicit (e.g. .*).

Suggested schema tweak
                   "matching": {
                     "type": "string",
+                    "minLength": 1,
                     "description": "A regex pattern matched against the resolved rate-limit key."
                   },
Based on learnings: In the Cosmo router project, parameter validation for configuration is handled at the JSON schema level rather than through runtime validation methods on structs.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@router/pkg/config/config.schema.json` around lines 2309 - 2312, The
"matching" property currently allows an empty string which acts as a silent
catch-all; update the JSON schema for the "matching" property to forbid empty
values (e.g., add "minLength": 1 or a pattern that requires at least one
character) so an explicit catch-all must be expressed (for example ".*"); modify
the schema entry for "matching" to include this constraint to prevent accidental
empty-regex overrides.
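
The catch-all behavior of an empty pattern is easy to confirm with Go's standard `regexp` package. The `matches` helper and the sample keys below are hypothetical and exist only for this demonstration.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether pattern matches anywhere in key.
func matches(pattern, key string) bool {
	return regexp.MustCompile(pattern).MatchString(key)
}

func main() {
	// The empty pattern matches every input, including the empty string.
	fmt.Println(matches("", "premium-42"))
	fmt.Println(matches("", ""))

	// An anchored pattern only matches keys that start with the given text.
	fmt.Println(matches(`^premium-.*`, "free-7"))
}
```

An empty `matching` value placed first in the list would therefore shadow every later override, which is the scenario the `minLength` constraint is meant to rule out.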
router-tests/security/ratelimit_test.go (1)

725-733: Consider adding a precedence case for overlapping overrides.

This test proves match vs non-match behavior, but it doesn’t lock in the “first matching override wins” rule. A second matching override with a different limit would make precedence explicit.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@router-tests/security/ratelimit_test.go` around lines 725 - 733, Add a
precedence case by adding a second overlapping RateLimitOverride in the
Overrides slice (using the same or broader Matching pattern as the existing
"^.*:premium-.*" but with a different Rate/Burst/Period) and update the test
assertions to verify that the first matching override in the slice is applied
(i.e., the limit equals the first override's Rate/Burst rather than the later
one); reference the Overrides []config.RateLimitOverride literal and the
Matching/Rate/Burst/Period fields to implement this additional override and
expectation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs-website/router/configuration.mdx`:
- Around line 1947-1950: The override example's matching regex (the matching
field shown as "^premium-.*") doesn't account for the resolved rate-limit key
format which uses the default prefix format "<key_prefix>:<suffix>"; update the
example in the overrides block so the matching pattern looks for the
colon-separated resolved key (e.g., change "^premium-.*" to "^premium:.*" or to
a pattern that includes the prefix, and optionally add a clarifying note near
the earlier explanation that matching is performed against the resolved key
which includes the prefix).

In `@docs-website/router/security/hardening-guide.mdx`:
- Around line 91-116: The example under rate_limit.simple_strategy.overrides
shows matching for "^premium-.*" and "^internal-.*" but doesn't show how those
suffixes get into the resolved key; update the snippet to either add a
key_suffix_expression that produces keys like "premium-<id>" / "internal-<id>"
or add an inline internal link to the configuration reference that defines the
full generated key format. Specifically modify the rate_limit block to include a
key_suffix_expression example (or a short reference link) so the overrides'
matching regexes can actually match, keeping the symbols rate_limit,
simple_strategy, overrides, matching and key_suffix_expression in the
documentation for discoverability.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f9cf235d-4645-4f91-a87a-aa59150ee09c

📥 Commits

Reviewing files that changed from the base of the PR and between ebd25e1 and 54d19db.

📒 Files selected for processing (11)
  • docs-website/router/configuration.mdx
  • docs-website/router/security/hardening-guide.mdx
  • router-tests/security/ratelimit_test.go
  • router/core/graph_server.go
  • router/core/ratelimiter.go
  • router/core/ratelimiter_test.go
  • router/pkg/config/config.go
  • router/pkg/config/config.schema.json
  • router/pkg/config/fixtures/full.yaml
  • router/pkg/config/testdata/config_defaults.json
  • router/pkg/config/testdata/config_full.json

Comment thread docs-website/router/configuration.mdx
Comment thread docs-website/router/security/hardening-guide.mdx
Contributor

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (1)
router/core/ratelimiter.go (1)

95-98: Rename resolveLimit parameter from key to suffix for clarity.

At Line 95, the parameter is named key, but call sites (Line 126) pass the suffix. Renaming reduces ambiguity and helps prevent regressions back to full-key matching.

Also applies to: 126-127

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@router/core/ratelimiter.go` around lines 95 - 98, Rename the parameter in
CosmoRateLimiter.resolveLimit from key to suffix to match its actual usage;
update the function signature of resolveLimit(suffix string, defaultLimit
redis_rate.Limit) and all call sites that pass a suffix to use the new parameter
name, and ensure the internal pattern matching still uses the renamed variable
(e.g., in resolveLimit and any callers of CosmoRateLimiter.resolveLimit) so the
logic remains identical but clearer that only the suffix is matched.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 92c2754d-b25f-432d-8af7-dab25eb0f629

📥 Commits

Reviewing files that changed from the base of the PR and between ef092c1 and 2db84df.

📒 Files selected for processing (4)
  • docs-website/router/configuration.mdx
  • docs-website/router/security/hardening-guide.mdx
  • router/core/ratelimiter.go
  • router/core/ratelimiter_test.go
✅ Files skipped from review due to trivial changes (1)
  • docs-website/router/security/hardening-guide.mdx
🚧 Files skipped from review as they are similar to previous changes (2)
  • docs-website/router/configuration.mdx
  • router/core/ratelimiter_test.go

Comment thread router/core/ratelimiter.go Outdated
Comment thread router/core/ratelimiter.go
@SkArchon
Contributor

Will take another look when the PR is ready for review

@endigma endigma marked this pull request as ready for review March 30, 2026 11:33
@endigma endigma requested a review from Noroth as a code owner March 30, 2026 11:33
@endigma endigma requested review from a team, StarpTech, devsergiy and jensneuse as code owners March 30, 2026 11:33
@endigma endigma requested a review from wilsonrivera March 30, 2026 11:33
Contributor

@SkArchon SkArchon left a comment

Just one doc change and two small nits.

Comment thread router-tests/security/ratelimit_test.go
Comment thread router/pkg/config/config.go
Comment thread docs-website/router/security/hardening-guide.mdx
endigma added 2 commits March 30, 2026 15:42
- Add test assertions verifying premium user is also rate-limited after
  exceeding their override allowance
- Update docs example to use JWT claim expression instead of header,
  with MCP/internal consumer type scenario
Contributor

@SkArchon SkArchon left a comment

lgtm

@endigma endigma merged commit 8529b07 into main Mar 31, 2026
39 checks passed
@endigma endigma deleted the jesse/eng-9286-rate-limit-overrides branch March 31, 2026 12:10
@rajbayer

@endigma - thank you so much. This is great.
