Rate Monitor: Per-org request rate alerting on high-cost endpoints by vprashrex · Pull Request #911 · ProjectTech4DevAI/kaapi-backend

vprashrex · 2026-06-04T13:43:33Z

Target issue is: #797

Summary

Explain the motivation for making this change. What existing problem does the pull request solve?
High-cost endpoints (llm/call, evaluations, collections) had no visibility into per-tenant request rates, risking runaway clients and server load. This PR adds monitoring and alerting (no rate limiting) for request rates in a one-minute window.

What it does:

Thresholds are checked at the project level.
New app/core/rate_monitor.py exposes monitor_rate(category), a FastAPI dependency added to the high-cost endpoints.
Counts requests per org per minute using a Redis bucket key (rate_monitor:{category}:{org_id}:{minute}), expiring after 2 minutes.
When a threshold is exceeded, emits a warning alert via record_rate_threshold in telemetry.py (Sentry → Discord channel), including org, category, request count, and threshold.
Fails open: if Redis is unavailable, the request proceeds and the check is skipped (logged, not raised).

Thresholds (requests/minute):

Category	Threshold
`llm_call`	15
`collections`	3
`evaluations`	3

Checklist

Before submitting a pull request, please ensure that you mark these tasks.

Ran fastapi run --reload app/main.py or docker compose up in the repository root and tested.
If you've fixed a bug or added code, it is tested and has test cases.

Notes

Redis stores the request counts; each per-minute key expires after 2 minutes.
No rate limiting: requests are never blocked, only counted and alerted.
To extend this code into a rate limiter, raise an HTTPException with status code 429 inside monitor_rate().
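A rough sketch of what that extension might look like is below. To keep it runnable without FastAPI, a local `TooManyRequests` exception stands in for `fastapi.HTTPException(status_code=429)`; both that class and the helper `enforce_threshold` are hypothetical names, not part of this PR.

```python
from typing import Optional


class TooManyRequests(Exception):
    """Stand-in for fastapi.HTTPException(status_code=429) in this sketch."""

    status_code = 429


def enforce_threshold(count: Optional[int], threshold: int) -> None:
    """Raise when the count exceeds the threshold; do nothing otherwise.

    A count of None means the counter was unavailable, so the check is
    skipped and the request is allowed through (fail open).
    """
    if count is not None and count > threshold:
        raise TooManyRequests(f"rate limit exceeded: {count} > {threshold}")
```

In the real dependency this would presumably become `raise HTTPException(status_code=429, detail=...)` at the point where the alert is emitted today.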

coderabbitai · 2026-06-04T13:43:41Z

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 45b14155-1bbd-4022-97cf-f3bcff7c9d96

📥 Commits

Reviewing files that changed from the base of the PR and between 5021d0d and abd4725.

📒 Files selected for processing (3)

backend/app/core/rate_monitor.py
backend/app/core/telemetry.py
backend/app/tests/core/test_rate_monitor.py

📝 Walkthrough

This PR introduces per-organization rate monitoring on three API endpoints using Redis-backed counters with configurable per-minute thresholds and Sentry alerting; requests are counted and alerted on, never blocked. Configuration defines thresholds (15 for LLM, 3 for collections and evaluations). The core rate_monitor module provides atomic Redis increment-and-get logic and a FastAPI dependency factory that checks per-org request counts, logs warnings, and triggers telemetry alerts when thresholds are exceeded. Three routes wire this into their dependency chains, and comprehensive tests cover normal paths, edge cases, and error handling.

Changes

Rate Limiting Infrastructure and Route Integration

Layer / File(s)	Summary
Configuration: rate thresholds `backend/app/core/config.py`	Settings class adds three per-minute rate threshold fields for LLM calls (15), collections (3), and evaluations (3).
Core rate monitor infrastructure `backend/app/core/rate_monitor.py`	RateCategory type and THRESHOLDS mapping load from config. Module-level Redis client initialized. `increment_and_get_count()` atomically increments a Redis key with 120-second expiry, returning the count or None on error. `monitor_rate(category)` returns a FastAPI dependency checker that reads the authenticated project, computes per-minute per-org counters, compares against thresholds, and dispatches Sentry alerts when breached, with graceful degradation on Redis errors.
Telemetry alerts for rate threshold events `backend/app/core/telemetry.py`	`record_rate_threshold()` emits warning-level Sentry events with org, category, request count, and threshold metadata as tags and extras, with no-op behavior if Sentry client is inactive or emission fails.
Route endpoint rate limiting `backend/app/api/routes/collections.py`, `backend/app/api/routes/evaluations/evaluation.py`, `backend/app/api/routes/llm.py`	Three POST endpoints (collections, evaluations, llm/call) import and wire `monitor_rate("collections")`, `monitor_rate("evaluations")`, and `monitor_rate("llm_call")` into their FastAPI dependency lists alongside existing project-permission checks.
Comprehensive test coverage `backend/app/tests/core/test_rate_monitor.py`	Test module validates increment_and_get_count (Redis pipeline, error handling), monitor_rate factory (org/category early exits, threshold comparison, telemetry dispatch, Redis error swallowing), and record_rate_threshold (Sentry active/inactive, tag emission, exception suppression). All external calls mocked; includes AuthContext helper.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related issues

ProjectTech4DevAI/kaapi-frontend#129: This PR implements per-endpoint rate monitoring for llm/call, evaluations, and collections with Redis-backed organization-scoped counters and Sentry alerting; the linked issue requests similar per-API-key monitoring with Discord alerts.

Poem

A rabbit hops through Redis gates,
Counting requests, checking rates,
Sentry sings when limits break,
Three endpoints now can monitor their stake! 🐰⏱️

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and concisely summarizes the main change: implementing per-organization request rate alerting for high-cost endpoints.
Docstring Coverage	✅ Passed	Docstring coverage is 90.48% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.


github-actions · 2026-06-04T13:44:38Z

OpenAPI changes ⚪ No API surface changes

Note

This PR does not modify the API contract.

main ↔ e9943aec · generated by oasdiff

sentry · 2026-06-04T14:06:42Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

vprashrex · 2026-06-05T02:27:51Z

+
+        try:
+            count = increment_and_get_count(redis_key)
+            if count is not None and count > threshold:


@AkhileshNegi if we want to enforce a rate limit, we can raise an HTTPException with status code 429 here.

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (2)

backend/app/tests/core/test_rate_monitor.py (1)
13-211: ⚡ Quick win

Add explicit return annotations to helper/test functions.

Line 13 (_auth_context) and test methods throughout this file are missing return type annotations (e.g., -> SimpleNamespace / -> None).

As per coding guidelines, "**/*.py: Always add type hints to all function parameters and return values in Python code".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/app/tests/core/test_rate_monitor.py` around lines 13 - 211, Add
explicit return type annotations: update the helper _auth_context to declare its
return type (e.g., -> SimpleNamespace) and annotate every test method to return
None (e.g., def test_returns_count_and_sets_expiry(self) -> None). Locate
functions by their names (_auth_context and each test_* method in classes
TestIncrementAndGetCount, TestMonitorRate, and TestRecordRateThreshold) and add
the appropriate return annotations without changing behavior.
backend/app/core/rate_monitor.py (1)
49-49: ⚡ Quick win

Add an explicit return type to monitor_rate.

Line 49 is missing a return annotation for the dependency factory.
As per coding guidelines, "**/*.py: Always add type hints to all function parameters and return values in Python code".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/app/core/rate_monitor.py` at line 49, monitor_rate is missing a
return type annotation; update the signature of monitor_rate(category:
RateCategory) to include an explicit return type that matches the dependency
factory it returns (e.g., import typing.Callable and annotate as ->
Callable[..., RateMonitor] or, if uncertain, -> Callable[..., Any]) and ensure
any referenced type (RateMonitor or Any) is imported or added to typing imports
so the function has a full return type hint.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/app/core/rate_monitor.py`:
- Around line 60-83: The code uses project (auth_context.project) for the Redis
key but then labels telemetry as org-scoped via
record_rate_threshold(org_id=project.id,...), causing inconsistent scoping;
locate where auth_context.project is read and instead resolve the organization
identity (e.g., auth_context.organization or project.organization_id /
project.organization) and use that organization id/name for both the redis_key
and the record_rate_threshold call (update redis_key =
f"rate_monitor:{category}:{org.id}:{minute_bucket}" and pass org.id/org.name
into record_rate_threshold) and keep increment_and_get_count and threshold logic
unchanged so counters and telemetry are consistently org-scoped.
- Around line 76-86: The code currently logs and calls record_rate_threshold for
every count > threshold; change the check to emit the alert only when the bucket
first crosses the threshold (e.g., when count == threshold + 1) so repeated
increments in the same minute don't spam alerts. Update the condition around
monitor logic that uses variables count and threshold (the block that calls
logger.warning and record_rate_threshold for project.id, project.name, and
category) to only run when the count has just moved from <=threshold to
>threshold (count == threshold + 1).

In `@backend/app/core/telemetry.py`:
- Around line 481-483: In function record_rate_threshold update the
logger.exception call to use the correct log prefix "[record_rate_threshold]"
(instead of "[record_rate_threshold_exceeded]") so the message follows the
convention; keep the same exception context (exc_info=e) and message text
otherwise to preserve error detail.

---

Nitpick comments:
In `@backend/app/core/rate_monitor.py`:
- Line 49: monitor_rate is missing a return type annotation; update the
signature of monitor_rate(category: RateCategory) to include an explicit return
type that matches the dependency factory it returns (e.g., import
typing.Callable and annotate as -> Callable[..., RateMonitor] or, if uncertain,
-> Callable[..., Any]) and ensure any referenced type (RateMonitor or Any) is
imported or added to typing imports so the function has a full return type hint.

In `@backend/app/tests/core/test_rate_monitor.py`:
- Around line 13-211: Add explicit return type annotations: update the helper
_auth_context to declare its return type (e.g., -> SimpleNamespace) and annotate
every test method to return None (e.g., def
test_returns_count_and_sets_expiry(self) -> None). Locate functions by their
names (_auth_context and each test_* method in classes TestIncrementAndGetCount,
TestMonitorRate, and TestRecordRateThreshold) and add the appropriate return
annotations without changing behavior.
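The first-crossing check suggested in the inline comment for rate_monitor.py (alert only when `count == threshold + 1`) could look roughly like this. The helper name `should_alert` is hypothetical; the sketch assumes the counter increases by exactly one per request within a bucket.

```python
def should_alert(count: int, threshold: int) -> bool:
    """Return True only for the request that first pushes the bucket over.

    Later requests in the same minute still have count > threshold, but they
    do not trigger another alert.
    """
    return count == threshold + 1
```

With a threshold of 3, only the fourth request in a minute would alert; the fifth and later ones stay silent until the next bucket starts.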

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: fa7d503f-de2a-4ef8-8046-2d1e522c6a82

📥 Commits

Reviewing files that changed from the base of the PR and between b06fec6 and 5021d0d.

📒 Files selected for processing (7)

backend/app/api/routes/collections.py
backend/app/api/routes/evaluations/evaluation.py
backend/app/api/routes/llm.py
backend/app/core/config.py
backend/app/core/rate_monitor.py
backend/app/core/telemetry.py
backend/app/tests/core/test_rate_monitor.py

kartpop

approved with comments

kartpop · 2026-06-05T07:22:20Z

+                    threshold=threshold,
+                )
+
+        except redis.RedisError as e:


increment_and_get_count returns None after an exception, so this redis.RedisError handler will practically never fire, right? Should we remove the exception handler there and let this redis.RedisError handler deal with it?
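One way to read this suggestion is to let the increment helper raise and keep a single fail-open handler in the caller. The sketch below illustrates that shape under stated assumptions: `CounterError` is a hypothetical stand-in for redis.RedisError, and `check_rate` and its `increment` parameter are illustrative names rather than the PR's code.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger(__name__)


class CounterError(Exception):
    """Hypothetical stand-in for redis.RedisError."""


def check_rate(
    increment: Callable[[str], int], key: str, threshold: int
) -> Optional[bool]:
    """Return whether the threshold is exceeded, or None if the check is skipped."""
    try:
        count = increment(key)  # raises CounterError on backend failure
    except CounterError as exc:
        # Fail open: log the problem and let the request proceed.
        logger.warning("[check_rate] skipping rate check for %s: %s", key, exc)
        return None
    return count > threshold
```

With only one handler, there is a single place that decides what a backend failure means for the request.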

kartpop · 2026-06-05T07:26:22Z

+        pipe.incr(key)
+        pipe.expire(key, _EXPIRATION_SECONDS)


increment and expire are not atomic; if the increment executes and the system crashes before the expire runs, the key will remain in Redis forever

    def increment_and_get_count(key: str) -> int | None:
        try:
            # SET NX atomically creates the key with TTL only on first call.
            _redis_client.set(key, 0, ex=_EXPIRATION_SECONDS, nx=True)
            return _redis_client.incr(key)
        except Exception as e:
            logger.error(
                f"[increment_and_get_count] Error incrementing count for {key}: {e}"
            )
            return None
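A different technique from the SET NX approach above, shown only for comparison, is to run INCR and EXPIRE inside one Lua script so that Redis executes them as a single unit. The sketch assumes a redis-py style client whose `eval(script, numkeys, *keys_and_args)` method is available; the function name `increment_with_ttl` and the default TTL of 120 seconds are illustrative choices.

```python
from typing import Any

# INCR and EXPIRE run inside one script, so the key cannot be left
# without a TTL between the two commands.
_INCR_WITH_TTL = """
local count = redis.call('INCR', KEYS[1])
if count == 1 then
    redis.call('EXPIRE', KEYS[1], ARGV[1])
end
return count
"""


def increment_with_ttl(client: Any, key: str, ttl_seconds: int = 120) -> int:
    """Increment key and set its TTL on first creation in one script call."""
    return int(client.eval(_INCR_WITH_TTL, 1, key, ttl_seconds))
```

Errors are deliberately not caught here, so the caller decides how to fail open.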

feat: implement rate monitoring for API endpoints

6c502ea

feat: add threshold rates for monitoring LLM calls, collections, and evaluations

4a77d05

vprashrex self-assigned this Jun 4, 2026

vprashrex added the enhancement New feature or request label Jun 4, 2026

vprashrex linked an issue Jun 4, 2026 that may be closed by this pull request

Monitoring: Add per API key rate monitoring and Discord alerts #797

Open

feat: update monitor_rate usage to accept dynamic category parameter

4263c46

feat: add unit tests for rate_monitor and telemetry.record_rate_threshold

6b4ce6c

vprashrex requested review from AkhileshNegi and Prajna1999 June 5, 2026 02:19

vprashrex added the ready-for-review label Jun 5, 2026

vprashrex force-pushed the feat/threshold-monitor branch from fab3581 to 6b4ce6c Compare June 5, 2026 02:20

vprashrex commented Jun 5, 2026

Merge branch 'main' into feat/threshold-monitor

b8d6f65

vprashrex requested review from Ayush8923 and kartpop and removed request for Prajna1999 June 5, 2026 05:01

feat: update monitor_rate to use project context instead of organization

5021d0d

coderabbitai Bot reviewed Jun 5, 2026

Comment thread backend/app/core/rate_monitor.py

Comment thread backend/app/core/rate_monitor.py Outdated

Comment thread backend/app/core/telemetry.py Outdated

vprashrex added 2 commits June 5, 2026 10:47

feat: update telemetry and rate_monitor to use project context instead of organization

9ffcc83

feat: update rate_monitor and telemetry to use project context instead of organization

abd4725

kartpop approved these changes Jun 5, 2026

vprashrex wants to merge 8 commits into main from feat/threshold-monitor
