RSPEED-2689: Record token metrics when verbose infer post-processing fails #1364
🧹 Nitpick comments (1)
src/app/endpoints/rlsapi_v1.py (1)
477-479: Redundant reassignment of `response = None`.
Line 478 reassigns `response = None`, but this is already the value from line 460. The reassignment is harmless but unnecessary noise.
♻️ Remove redundant assignment
 else:
-    response = None
     response_text = await retrieve_simple_response(
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/app/endpoints/rlsapi_v1.py` around lines 477-479, the assignment `response = None` before calling `retrieve_simple_response` is redundant because `response` was already set to `None` earlier; remove that extra assignment to clean up the code (inside the else branch where `response` and `response_text` are set) and leave the call to `retrieve_simple_response(...)` intact so behavior does not change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: af023a3a-1b35-4c9a-9ac1-12649b93ad35
📒 Files selected for processing (2)
src/app/endpoints/rlsapi_v1.py
tests/unit/app/endpoints/test_rlsapi_v1.py
Signed-off-by: Major Hayden <major@redhat.com>
6de788a to 0cb2f9f
🧹 Nitpick comments (1)
tests/unit/app/endpoints/test_rlsapi_v1.py (1)
5-5: Consider splitting this test module over time instead of broad lint suppression.
`too-many-lines` suppression is acceptable short-term, but this file is already very large and hard to navigate. Incremental split by endpoint behavior/theme would improve maintainability.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/app/endpoints/test_rlsapi_v1.py` at line 5, the module-level suppression "# pylint: disable=too-many-lines" hides that this test module has grown too large; instead remove that broad suppression and split the tests by endpoint/behavior into smaller test modules so each file is under lint limits (e.g., move groups of tests for a single endpoint or behavior from test_rlsapi_v1 into separate modules like test_rlsapi_v1_auth.py, test_rlsapi_v1_items.py, etc.), update any shared fixtures/imports to a common conftest or helper module, and re-run linting to ensure each new file no longer triggers too-many-lines.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 78da4bbd-626f-4430-9300-9f3326bdbbc8
📒 Files selected for processing (2)
src/app/endpoints/rlsapi_v1.py
tests/unit/app/endpoints/test_rlsapi_v1.py
🚧 Files skipped from review as they are similar to previous changes (1)
- src/app/endpoints/rlsapi_v1.py
Description
The verbose `/infer` path records metrics via `build_turn_summary()` on success, but if `extract_text_from_response_items()` throws after a successful LLM call, `build_turn_summary()` is never reached and token metrics (`llm_calls_total`, `llm_token_sent_total`, `llm_token_received_total`) are lost for tokens that were consumed. This adds `extract_token_usage()` to the exception handler (guarded by `verbose_enabled and response is not None`) so metrics are captured even when post-processing fails. Also adds `response = None` initialization before the try block to avoid `UnboundLocalError`.
Type of change
Tools used to create PR
Identify any AI code assistants used in this PR (for transparency and review context)
Related Tickets & Documents
Checklist before requesting a review
Testing
Unit tests verify `extract_token_usage` is called in the except block when the verbose LLM call succeeds but post-processing fails (`extract_text_from_response_items` raises). A second test verifies the non-verbose exception path does NOT call `extract_token_usage` again (already called inside `retrieve_simple_response()`). All 43 existing tests continue to pass. `uv run make test-unit` and `uv run make verify` pass with 0 errors.
Summary by CodeRabbit
Bug Fixes
Tests
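The control flow described in the Description can be illustrated with a minimal, self-contained sketch. This is not the code from `rlsapi_v1.py`: only the names `extract_token_usage`, `extract_text_from_response_items`, `retrieve_simple_response`, `verbose_enabled`, and `response` come from the PR text. The function bodies, signatures, the `infer` and `call_llm_verbose` helpers, and the `recorded_usage` list are hypothetical stand-ins, and the success-path call to `build_turn_summary()` is omitted because the stand-in post-processing always fails.

```python
import asyncio

# Hypothetical stand-ins: the real helpers live in the project and may have
# different signatures. Only their names are taken from the PR description.
recorded_usage = []  # every response whose token usage was recorded


def extract_token_usage(response):
    """Stand-in: record token usage for a completed LLM response."""
    recorded_usage.append(response)


def extract_text_from_response_items(response):
    """Stand-in: simulate post-processing failing after a successful call."""
    raise ValueError("post-processing failed")


async def call_llm_verbose(question):
    """Hypothetical verbose LLM call; tokens are consumed here."""
    return {"question": question, "input_tokens": 12, "output_tokens": 34}


async def retrieve_simple_response(question):
    """Stand-in for the non-verbose path, which records usage itself."""
    response = {"question": question, "input_tokens": 5, "output_tokens": 7}
    extract_token_usage(response)
    raise RuntimeError("failure after usage was already recorded")


async def infer(question, verbose_enabled):
    # Initialized before the try block so the handler can test it without
    # risking UnboundLocalError when the LLM call itself fails.
    response = None
    try:
        if verbose_enabled:
            response = await call_llm_verbose(question)
            return extract_text_from_response_items(response)
        return await retrieve_simple_response(question)
    except Exception:
        # Record usage only for the verbose path with a completed LLM call.
        # On the non-verbose path response stays None, so usage that the
        # helper already recorded is not counted a second time.
        if verbose_enabled and response is not None:
            extract_token_usage(response)
        raise
```

In this sketch, a verbose call that fails in post-processing still leaves one entry in `recorded_usage`, while a failing non-verbose call leaves only the entry recorded inside `retrieve_simple_response()`. This mirrors the two unit tests described under Testing.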