-
Notifications
You must be signed in to change notification settings - Fork 55
LCORE-741: consume quota mechanism #722
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LCORE-741: consume quota mechanism #722
Conversation
|
Warning Rate limit exceeded@tisnik has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 7 minutes and 51 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📒 Files selected for processing (1)
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
3aea405 to
4d46ee1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (3)
src/utils/quota.py (3)
14-28: Improve docstring and consider error handling consistency.The docstring should follow Google Python style with an Args section per coding guidelines. Additionally, this function lacks error handling while
check_tokens_availablehas extensive error handling—consider whether errors fromconsume_tokensshould also be caught and translated to appropriate HTTP exceptions.As per coding guidelines.
Apply this diff to improve the docstring:
- """Consume tokens from cluster and/or user quotas.""" + """Consume tokens from cluster and/or user quotas. + + Args: + quota_limiters: List of quota limiter instances to consume tokens from. + user_id: Identifier of the user consuming tokens. + input_tokens: Number of input tokens to consume. + output_tokens: Number of output tokens to consume. + """
30-31: Improve docstring to follow Google Python style.The docstring should include Args and Raises sections per coding guidelines.
As per coding guidelines.
Apply this diff:
- """Check if tokens are available for user.""" + """Check if tokens are available for user. + + Args: + quota_limiters: List of quota limiter instances to check. + user_id: Identifier of the user to check quota for. + + Raises: + HTTPException: If quota is exceeded or database error occurs. + """
57-67: Improve docstring to follow Google Python style.The docstring should include Args and Returns sections per coding guidelines. Also consider whether this function should handle errors from
available_quotacalls, similar to the error handling incheck_tokens_available.As per coding guidelines.
Apply this diff to improve the docstring:
- """Get quota available from all quota limiters.""" + """Get quota available from all quota limiters. + + Args: + quota_limiters: List of quota limiter instances to query. + user_id: Identifier of the user to retrieve quotas for. + + Returns: + Dictionary mapping limiter class names to available quota amounts. + """
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/utils/quota.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use absolute imports for internal modules (e.g., from auth import get_auth_dependency)
Files:
src/utils/quota.py
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: All modules start with descriptive module-level docstrings explaining purpose
Use logger = logging.getLogger(name) for module logging after import logging
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @AbstractMethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed
Files:
src/utils/quota.py
🧬 Code graph analysis (1)
src/utils/quota.py (2)
src/quota/quota_limiter.py (1)
QuotaLimiter(19-93)src/log.py (1)
get_logger(7-13)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: build-pr
- GitHub Check: e2e_tests (ci)
- GitHub Check: e2e_tests (azure)
| from log import get_logger | ||
|
|
||
| logger = get_logger(__name__) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use standard logging setup per coding guidelines.
The coding guidelines specify using logger = logging.getLogger(__name__) after importing the logging module directly, rather than using the custom get_logger function.
As per coding guidelines.
Apply this diff to align with the guidelines:
-from log import get_logger
+import logging
-logger = get_logger(__name__)
+logger = logging.getLogger(__name__)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| from log import get_logger | |
| logger = get_logger(__name__) | |
| import logging | |
| logger = logging.getLogger(__name__) |
🤖 Prompt for AI Agents
In src/utils/quota.py around lines 9 to 11, the module currently imports and
uses a custom get_logger; replace it with the standard logging setup by
importing the logging module and creating the logger via
logging.getLogger(__name__); remove the custom get_logger import and ensure only
the standard logging import and logger assignment remain.
4d46ee1 to
c10c8f2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (2)
src/utils/quota.py (2)
10-12: Use standard logging setup per coding guidelines.This issue was flagged in a previous review but remains unaddressed. The coding guidelines mandate using
logger = logging.getLogger(__name__)after importing theloggingmodule directly.As per coding guidelines.
Apply this diff to align with the guidelines:
-from log import get_logger +import logging -logger = get_logger(__name__) +logger = logging.getLogger(__name__)
46-55: Uselogger.warningfor quota exhaustion.The previous review comment recommended changing
logger.errortologger.warningon line 48, since quota exhaustion is expected client behavior (not a server error). This change wasn't applied.Apply this diff:
except QuotaExceedError as quota_exceed_error: message = "The quota has been exceeded" - logger.error(message) + logger.warning(message) raise HTTPException(
🧹 Nitpick comments (3)
src/utils/quota.py (3)
15-28: Enhance docstring to follow Google Python style.The coding guidelines require docstrings with Args and Returns sections. The current one-line docstring should be expanded.
As per coding guidelines.
Apply this diff:
- """Consume tokens from cluster and/or user quotas.""" + """Consume tokens from cluster and/or user quotas. + + Args: + quota_limiters: List of quota limiter instances to consume tokens from. + user_id: Identifier of the user consuming tokens. + input_tokens: Number of input tokens to consume. + output_tokens: Number of output tokens to consume. + + Returns: + None + """
31-55: Enhance docstring to follow Google Python style.The coding guidelines require docstrings with Args, Returns, and Raises sections for functions. The current one-line docstring should be expanded to document the exception behavior.
As per coding guidelines.
Apply this diff:
- """Check if tokens are available for user.""" + """Check if tokens are available for user. + + Args: + quota_limiters: List of quota limiter instances to check. + user_id: Identifier of the user to check quota for. + + Returns: + None + + Raises: + HTTPException: With status 500 if database communication fails, + or status 429 if quota is exceeded. + """
58-68: Enhance docstring to follow Google Python style.The coding guidelines require docstrings with Args and Returns sections. The current one-line docstring should be expanded to describe the return value structure.
As per coding guidelines.
Apply this diff:
- """Get quota available from all quota limiters.""" + """Get quota available from all quota limiters. + + Args: + quota_limiters: List of quota limiter instances to query. + user_id: Identifier of the user to get quotas for. + + Returns: + Dictionary mapping quota limiter class names to available token counts. + """
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/utils/quota.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use absolute imports for internal modules (e.g., from auth import get_auth_dependency)
Files:
src/utils/quota.py
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: All modules start with descriptive module-level docstrings explaining purpose
Use logger = logging.getLogger(name) for module logging after import logging
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @AbstractMethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed
Files:
src/utils/quota.py
🧬 Code graph analysis (1)
src/utils/quota.py (3)
src/quota/quota_limiter.py (1)
QuotaLimiter(19-93)src/quota/quota_exceed_error.py (1)
QuotaExceedError(6-38)src/log.py (1)
get_logger(7-13)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: build-pr
- GitHub Check: e2e_tests (ci)
- GitHub Check: e2e_tests (azure)
c10c8f2 to
9856806
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
src/app/endpoints/query.py (3)
281-289: Bug: passing method instead of list; call configuration.quota_limiters().This is the root cause of the TypeError seen in tests. Apply the fix here (and similarly below).
- check_tokens_available(configuration.quota_limiters, user_id) + check_tokens_available(configuration.quota_limiters(), user_id)
388-401: Same call-site bug for get_available_quotas.Ensure the method is invoked.
- available_quotas = get_available_quotas(configuration.quota_limiters, user_id) + available_quotas = get_available_quotas(configuration.quota_limiters(), user_id)
281-358: TOCTOU: quota check and consumption are not atomic.Between check_tokens_available and consume_tokens another request can consume quota, causing consume_tokens to fail after a successful LLM call. Consider:
- Implementing reserve-then-commit semantics, or
- A single atomic check-and-consume operation in the limiter, or
- Handling QuotaExceedError during consume as a warning and continuing (business decision).
Happy to draft an atomic helper in utils/quota if desired.
♻️ Duplicate comments (2)
src/utils/quota.py (2)
69-78: Log quota exhaustion as warning, not error (duplicate of prior review).QuotaExceedError is expected user behavior; prefer warning to avoid noisy error logs.
- message = "The quota has been exceeded" - logger.error(message) + message = "The quota has been exceeded" + logger.warning(message)
10-12: Use standard logging per guidelines (duplicate of prior review).Replace custom get_logger with standard logging.getLogger(name).
-from log import get_logger - -logger = get_logger(__name__) +import logging + +logger = logging.getLogger(__name__)As per coding guidelines.
🧹 Nitpick comments (3)
src/app/endpoints/query.py (1)
353-358: Don’t let quota consume failure 500 a successful request.Wrap consume_tokens in a targeted try/except (psycopg2.Error as 500, QuotaExceedError as 429 or warning per policy) and ensure cache/transcript persistence decisions are intentional.
- consume_tokens( + try: + consume_tokens( configuration.quota_limiters(), user_id, input_tokens=token_usage.input_tokens, output_tokens=token_usage.output_tokens, - ) + ) + except HTTPException: + raise + except Exception as e: + logger.error("Failed to consume tokens for %s: %s", user_id, e) + # Decide: re-raise, or continue with best-effort. If continuing, add a metric.src/utils/quota.py (2)
55-59: Defensive input validation for clearer failures.If a non-iterable (e.g., a bound method) is passed, raise a clear TypeError pointing to the likely mistake.
try: + if not isinstance(quota_limiters, list): + raise TypeError( + "quota_limiters must be a list[QuotaLimiter]; did you forget to call configuration.quota_limiters()?" + ) # check available tokens using all configured quota limiters for quota_limiter in quota_limiters: quota_limiter.ensure_available_quota(subject_id=user_id)
81-101: Minor: mirror the same defensive check in get_available_quotas (optional).Improves error clarity in tests/mocks.
def get_available_quotas( quota_limiters: list[QuotaLimiter], user_id: str, ) -> dict[str, int]: @@ - # retrieve available tokens using all configured quota limiters + # retrieve available tokens using all configured quota limiters + if not isinstance(quota_limiters, list): + raise TypeError( + "quota_limiters must be a list[QuotaLimiter]; did you forget to call configuration.quota_limiters()?" + )
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/app/endpoints/query.py(5 hunks)src/utils/quota.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use absolute imports for internal modules (e.g., from auth import get_auth_dependency)
Files:
src/app/endpoints/query.pysrc/utils/quota.py
src/app/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use standard FastAPI imports (from fastapi import APIRouter, HTTPException, Request, status, Depends) in FastAPI app code
Files:
src/app/endpoints/query.py
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: All modules start with descriptive module-level docstrings explaining purpose
Use logger = logging.getLogger(name) for module logging after import logging
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @AbstractMethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed
Files:
src/app/endpoints/query.pysrc/utils/quota.py
src/{app/**/*.py,client.py}
📄 CodeRabbit inference engine (CLAUDE.md)
Use async def for I/O-bound operations and external API calls
Files:
src/app/endpoints/query.py
src/app/endpoints/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
In API endpoints, raise FastAPI HTTPException with appropriate status codes for error handling
Files:
src/app/endpoints/query.py
🧬 Code graph analysis (2)
src/app/endpoints/query.py (2)
src/utils/quota.py (3)
get_available_quotas(81-101)check_tokens_available(41-78)consume_tokens(15-38)src/configuration.py (2)
configuration(73-77)quota_limiters(173-181)
src/utils/quota.py (3)
src/quota/quota_limiter.py (1)
QuotaLimiter(19-93)src/quota/quota_exceed_error.py (1)
QuotaExceedError(6-38)src/log.py (1)
get_logger(7-13)
🪛 GitHub Actions: Unit tests
src/utils/quota.py
[error] 57-57: TypeError: 'Mock' object is not iterable while iterating over quota_limiters in check_tokens_available. This occurs during pytest run when quota_limiters is mocked in query_endpoint_handler tests.
[error] 57-57: TypeError: 'Mock' object is not iterable while iterating over quota_limiters in check_tokens_available. This occurs during pytest run when quota_limiters is mocked in query_endpoint_handler tests.
[error] 57-57: TypeError: 'Mock' object is not iterable while iterating over quota_limiters in check_tokens_available. This occurs during pytest run when quota_limiters is mocked in query_endpoint_handler tests.
[error] 57-57: TypeError: 'Mock' object is not iterable while iterating over quota_limiters in check_tokens_available. This occurs during pytest run when quota_limiters is mocked in query_endpoint_handler tests.
[error] 57-57: TypeError: 'Mock' object is not iterable while iterating over quota_limiters in check_tokens_available. This occurs during pytest run when quota_limiters is mocked in query_endpoint_handler tests.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: build-pr
- GitHub Check: e2e_tests (azure)
- GitHub Check: e2e_tests (ci)
| from utils.quota import ( | ||
| get_available_quotas, | ||
| check_tokens_available, | ||
| consume_tokens, | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
Imports LGTM; ensure utils.quota adheres to logging guideline.
The imported helpers are fine; see separate comment in utils/quota.py about replacing get_logger with logging.getLogger(name).
Scan for any other missing parentheses usages:
🏁 Script executed:
#!/bin/bash
rg -nP "configuration\\.quota_limiters(?!\\s*\\()" -g "src/**.py" -S -C2Length of output: 0
---
🏁 Script executed:
```shell
#!/bin/bash
# Verify the file at src/app/endpoints/query.py exists and check imports
fd "query.py" src/app/endpoints/
Length of output: 63
🏁 Script executed:
#!/bin/bash
# Check if utils/quota.py exists and examine its logging implementation
fd "quota.py" src/Length of output: 18
🏁 Script executed:
#!/bin/bash
# Search for get_logger usage in utils/quota.py
rg -n "get_logger|logging\.getLogger" src/utils/quota.pyLength of output: 62
---
🏁 Script executed:
```shell
#!/bin/bash
# Examine query.py lines 50-65 to verify imports context
sed -n '50,65p' src/app/endpoints/query.py
Length of output: 508
🏁 Script executed:
#!/bin/bash
# Examine utils/quota.py to see full logging implementation and scope
cat src/utils/quota.pyLength of output: 3113
Fix logging in utils/quota.py to meet guideline requirements.
Imports in query.py are syntactically correct. However, utils/quota.py violates the logging guideline: it uses custom get_logger (line 10) instead of the standard logging.getLogger(__name__). Replace:
from log import get_logger
logger = get_logger(__name__)With:
import logging
logger = logging.getLogger(__name__)🤖 Prompt for AI Agents
In utils/quota.py around line 10, replace the custom logger import and
instantiation that uses get_logger with the standard logging module: remove
"from log import get_logger" and "logger = get_logger(__name__)" and instead
import the built-in logging module and instantiate the logger with
logging.getLogger(__name__); keep the same logger variable name so existing
calls to logger.debug/info/error continue to work.
| consume_tokens( | ||
| configuration.quota_limiters, | ||
| user_id, | ||
| input_tokens=token_usage.input_tokens, | ||
| output_tokens=token_usage.output_tokens, | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same call-site bug for consume_tokens.
Call the method to get the list; consider storing it once in a local variable to avoid repeated calls.
- consume_tokens(
- configuration.quota_limiters,
+ consume_tokens(
+ configuration.quota_limiters(),
user_id,
input_tokens=token_usage.input_tokens,
output_tokens=token_usage.output_tokens,
)Optional small refactor:
+ quota_limiters = configuration.quota_limiters()
- check_tokens_available(configuration.quota_limiters(), user_id)
+ check_tokens_available(quota_limiters, user_id)
...
- consume_tokens(configuration.quota_limiters(), user_id,
+ consume_tokens(quota_limiters, user_id,
input_tokens=token_usage.input_tokens,
output_tokens=token_usage.output_tokens)
...
- available_quotas = get_available_quotas(configuration.quota_limiters(), user_id)
+ available_quotas = get_available_quotas(quota_limiters, user_id)🤖 Prompt for AI Agents
In src/app/endpoints/query.py around lines 353 to 358, the call to
consume_tokens is passing configuration.quota_limiters directly which likely
triggers repeated property access or a method call; call the method/property
once, assign the resulting list to a local variable (e.g., quota_limiters =
configuration.quota_limiters or configuration.get_quota_limiters()) and then
pass that local variable into consume_tokens so the list is retrieved once and
reused; update the call site accordingly to use the local variable.
Description
LCORE-741: consume quota mechanism
Type of change
Related Tickets & Documents
Summary by CodeRabbit