Bugfix/usage for openrouter #11627
Conversation
Commits:
- …sage handling and calculation logic for better modularity and accuracy."
- …of cost and token details, enhance chunk processing, and fix PromptTokensDetails to PromptTokensDetailsWrapper.
- …ng response processing.
- … to ignore new usage metadata fields.
@krrishdholakia the test fails here but works locally, can you please have a look and let me know what I need to fix?
Hey @daarko10 this PR changes a lot of files, including core components which could impact other providers. Can we reduce the scope somehow?
Unfortunately, @krrishdholakia, this was necessary: the current implementation completely swallows the usage with OpenRouter, because it comes in the chunk after the chunk with finish_reason. So I added another state tracker for the usage and construct it based on whether it arrives or not. `make test-unit` works nicely locally, and I tested it with OpenRouter plus multiple providers (Anthropic, Gemini). I also added fallbacks that revert to the earlier behaviour in case it fails for whatever reason.
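The ordering issue described in this comment is easiest to see in code. Below is a minimal, hypothetical sketch (plain dicts rather than litellm's actual types) of a chunk collector that keeps separate state for usage, so a usage-only chunk arriving after the finish_reason chunk is not dropped:

```python
def collect_usage(chunks):
    """Accumulate streamed text and capture usage even when the usage-only
    chunk arrives AFTER the chunk carrying finish_reason (OpenRouter's order)."""
    text, usage, finished = "", None, False
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            text += choice.get("delta", {}).get("content") or ""
            if choice.get("finish_reason"):
                finished = True
        # Do NOT stop at finish_reason: the usage chunk may still be coming.
        if chunk.get("usage"):
            usage = chunk["usage"]
    return text, usage, finished

# OpenRouter-style stream: usage arrives in a trailing, choice-less chunk.
stream = [
    {"choices": [{"delta": {"content": "Hi"}, "finish_reason": None}]},
    {"choices": [{"delta": {}, "finish_reason": "stop"}]},
    {"choices": [], "usage": {"prompt_tokens": 3, "completion_tokens": 1,
                              "total_tokens": 4, "cost": 0.00002,
                              "is_byok": False}},
]
```

A builder that breaks out of the loop on finish_reason would return `usage = None` here, which is exactly the swallowing behaviour the comment describes.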
Any chance of getting this approved, @krrishdholakia? We are using this extensively at kodu-ai and need accurate usage reporting for OpenRouter users.
Merged 6c18b9a into BerriAI:litellm_openrouter_improvement_staging
Hey @krrishdholakia, I played with that a bit and hit a few inconsistency issues caused by pydantic failures; this change made it stable and kept it from failing.
Propagate the OpenRouter usage information back correctly: add OpenRouter `include_usage` flag mapping and propagate `cost` & `is_byok` in responses.

Relevant issues

Fixes #11626
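For context, when usage accounting is enabled OpenRouter attaches a usage object carrying its computed cost to the final streaming chunk; the shape below is illustrative only (field values and the exact set of fields are assumptions for the example, not taken from OpenRouter's documentation):

```python
# Illustrative shape of a final OpenRouter streaming chunk with usage
# accounting enabled (values are invented for the example).
final_chunk = {
    "id": "gen-abc123",
    "choices": [],
    "usage": {
        "prompt_tokens": 12,
        "completion_tokens": 34,
        "total_tokens": 46,
        "cost": 0.000123,   # credits charged by OpenRouter for this call
        "is_byok": False,   # True when a bring-your-own-key was used
    },
}

cost = final_chunk["usage"].get("cost")
is_byok = final_chunk["usage"].get("is_byok")
```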
Pre-Submission checklist

- Relevant tests added under `tests/litellm/`
- `make test-unit` passes locally
Type
🐛 Bug Fix
Changes

- **Parameter mapping**: in `litellm/llms/openrouter/chat/transformation.py`, detect `stream_options.include_usage` and emit `include_usage: true` when it is set.
- **OpenRouter chunk parsing**: in `OpenRouterChatCompletionStreamingHandler.chunk_parser`, read `chunk["usage"]["cost"]` and `chunk["usage"]["is_byok"]` and assign them directly to `ModelResponseStream.usage`.
- **Final stream builder**: in `litellm/main.py`'s `stream_chunk_builder`, prefer the OpenRouter-provided `usage` (including `cost`/`is_byok`) over any manual fallback computation.
- **Cost calculator wrappers**: in `litellm/cost_calculator.py`, switched to using `PromptTokensDetailsWrapper` and `CompletionTokensDetailsWrapper` in `combine_usage_objects` to carry nested token details.
- **Response conversion**: in `litellm_core_utils/llm_response_utils/convert_dict_to_response.py`, unified both streaming and non-streaming paths to use:
- **Logging utils**: updated `litellm_core_utils/logging_utils.py` to preserve exception context when logging errors in `_assemble_complete_response_from_streaming_chunks`.
- **Streaming chunk builder utilities**: in `litellm_core_utils/streaming_chunk_builder_utils.py`, extended `_usage_chunk_calculation_helper` to include `cost` and `is_byok` in its returned dict, and `_process_usage_chunks` to pull `cost`/`is_byok` from each chunk and feed them into the final usage data.
- **Streaming handler iteration**: in `litellm_core_utils/streaming_handler.py`, refactored both `__next__` and `__anext__` to propagate intermediate chunk `usage` objects and combine them into a final `Usage` without re-computing cost.
- **Type extensions**: in `litellm/types/utils.py`, extended the `Usage` model to include:

Tests

- `tests/test_litellm/llms/openrouter/chat/test_openrouter_chat_transformation.py`: new test verifying that, when `stream_options.include_usage` is true, the parsed `ModelResponseStream.usage` includes the OpenRouter `cost` and `is_byok`.
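To illustrate the chunk-parsing change, here is a simplified, hypothetical stand-in for the described `chunk_parser` behaviour. The `Usage` dataclass below is an assumption made for the sketch, not litellm's real `Usage` type:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Usage:
    """Simplified stand-in for litellm's Usage type, for illustration only."""
    prompt_tokens: int = 0
    completion_tokens: int = 0
    total_tokens: int = 0
    cost: Optional[float] = None      # OpenRouter-reported cost
    is_byok: Optional[bool] = None    # OpenRouter bring-your-own-key flag

def parse_usage(chunk: dict) -> Optional[Usage]:
    """Lift OpenRouter's cost/is_byok straight into the usage object
    instead of dropping them (the idea behind the chunk_parser change)."""
    raw = chunk.get("usage")
    if not raw:
        return None
    return Usage(
        prompt_tokens=raw.get("prompt_tokens", 0),
        completion_tokens=raw.get("completion_tokens", 0),
        total_tokens=raw.get("total_tokens", 0),
        cost=raw.get("cost"),
        is_byok=raw.get("is_byok"),
    )
```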
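The "prefer provider-reported usage over manual fallback computation" rule described for `stream_chunk_builder` could look roughly like the following hypothetical helper; the fallback token count is deliberately crude and only stands in for litellm's real recomputation:

```python
def resolve_final_usage(provider_usage, chunks):
    """Prefer the provider-reported usage; fall back to manual counting
    only when the provider did not send one (illustrative logic only)."""
    if provider_usage is not None and provider_usage.get("total_tokens"):
        return dict(provider_usage)  # keeps cost / is_byok intact
    # Fallback path: rough recount from chunk contents; no cost available,
    # mirroring the pre-existing behaviour the PR preserves as a safety net.
    completion = sum(len((c.get("content") or "").split()) for c in chunks)
    return {"prompt_tokens": 0, "completion_tokens": completion,
            "total_tokens": completion, "cost": None, "is_byok": None}
```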
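The nested token-details merging that `combine_usage_objects` performs with the wrapper types can be sketched, under the simplifying assumption that usage objects are plain dicts, as a recursive merge (hypothetical helper, not litellm's implementation):

```python
def combine_usage(a: dict, b: dict) -> dict:
    """Merge two usage dicts: sum numeric counts, recurse into nested
    token-details dicts (stand-in for the *DetailsWrapper handling),
    and carry flags like is_byok through, preferring the later value."""
    out = {}
    for key in set(a) | set(b):
        va, vb = a.get(key), b.get(key)
        if isinstance(va, dict) or isinstance(vb, dict):
            out[key] = combine_usage(va or {}, vb or {})
        elif isinstance(va, bool) or isinstance(vb, bool):
            out[key] = vb if vb is not None else va  # flags: no summing
        elif isinstance(va, (int, float)) or isinstance(vb, (int, float)):
            out[key] = (va or 0) + (vb or 0)
        else:
            out[key] = vb if vb is not None else va
    return out
```

Note the explicit `bool` branch before the numeric one: in Python `True` is an `int`, so without it a flag like `is_byok` would be summed instead of preserved.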