Bugfix/usage for openrouter #11627

Conversation

@daarko10 (Contributor) commented Jun 11, 2025

Propagate the OpenRouter usage information back correctly

Add OpenRouter include_usage flag mapping and propagate cost & is_byok in responses

Relevant issues

Fixes #11626

Pre-Submission checklist

  • Added or updated tests under tests/litellm/
  • Tests pass locally (see attached log snippet)
  • All unit tests pass via make test-unit
  • Changes are scoped to only OpenRouter usage-flag handling and cost propagation

Type

🐛 Bug Fix

Changes

  • Parameter mapping

    • In litellm/llms/openrouter/chat/transformation.py, detect stream_options.include_usage and emit:
      extra_body["usage"] = {"include": True}
    • Removes the manual default; the usage flag is emitted only when include_usage: true is set (see the first sketch after this list).
  • OpenRouter chunk parsing

    • In OpenRouterChatCompletionStreamingHandler.chunk_parser, read chunk["usage"]["cost"] and chunk["usage"]["is_byok"] and assign them directly to ModelResponseStream.usage (sketched after the list).
  • Final stream builder

    • In litellm/main.py’s stream_chunk_builder, prefer the OpenRouter-provided usage (including cost/is_byok) over any manual fallback computation (see the preference sketch after the list).
  • Cost calculator wrappers

    • In litellm/cost_calculator.py, switched to using PromptTokensDetailsWrapper and CompletionTokensDetailsWrapper in combine_usage_objects to carry nested token details.
  • Response conversion

    • In litellm_core_utils/llm_response_utils/convert_dict_to_response.py, unified both streaming and non-streaming paths to use:
      usage_object = Usage(**response_object["usage"])
      setattr(model_response_object, "usage", usage_object)
  • Logging utils

    • Minor tweak in litellm_core_utils/logging_utils.py to preserve exception context when logging errors in _assemble_complete_response_from_streaming_chunks (a generic version of the pattern is sketched after the list).
  • Streaming chunk builder utilities

    • In litellm_core_utils/streaming_chunk_builder_utils.py:
      • Updated _usage_chunk_calculation_helper to include cost and is_byok in its returned dict.
      • Revised _process_usage_chunks to pull cost/is_byok from each chunk and feed them into the final usage data (see the accumulation sketch after the list).
  • Streaming handler iteration

    • In litellm_core_utils/streaming_handler.py, refactored both __next__ and __anext__ to propagate intermediate chunk usage objects and combine them into a final Usage without re-computing cost (a toy iterator version is sketched after the list).
  • Type extensions

    • In litellm/types/utils.py, extended the Usage model to include:
      cost: Optional[float]
      is_byok: Optional[bool]
      and retained private cache-token attributes for prompt caching (a simplified model is sketched after the list).
  • Tests

    • tests/test_litellm/llms/openrouter/chat/test_openrouter_chat_transformation.py: new test verifying that, when stream_options.include_usage is true, the parsed ModelResponseStream.usage includes the OpenRouter cost and is_byok (a trimmed-down version is sketched below).
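The sketches below illustrate the changes above in simplified, self-contained Python; every function and class name in them is an assumption made for illustration, not the actual litellm code. First, the parameter mapping: when the caller sets stream_options.include_usage, forward it as OpenRouter's own usage accounting flag.

# Hypothetical helper; the real logic lives in
# litellm/llms/openrouter/chat/transformation.py.
from typing import Any, Dict, Optional

def map_openrouter_usage_flag(
    stream_options: Optional[Dict[str, Any]],
    extra_body: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Emit OpenRouter's usage flag only when include_usage is truthy."""
    extra_body = dict(extra_body or {})
    if stream_options and stream_options.get("include_usage"):
        extra_body["usage"] = {"include": True}
    return extra_body

assert map_openrouter_usage_flag({"include_usage": True}) == {"usage": {"include": True}}
assert map_openrouter_usage_flag(None) == {}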
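Next, the chunk-parsing idea: OpenRouter appends cost and is_byok to the usage block of a late streamed chunk. The Usage dataclass here is a trimmed stand-in for litellm's own type.

from dataclasses import dataclass
from typing import Any, Dict, Optional

@dataclass
class Usage:  # simplified stand-in for litellm.types.utils.Usage
    prompt_tokens: int = 0
    completion_tokens: int = 0
    total_tokens: int = 0
    cost: Optional[float] = None
    is_byok: Optional[bool] = None

def parse_usage_chunk(chunk: Dict[str, Any]) -> Optional[Usage]:
    raw = chunk.get("usage")
    if raw is None:
        return None  # ordinary content chunks carry no usage block
    return Usage(
        prompt_tokens=raw.get("prompt_tokens", 0),
        completion_tokens=raw.get("completion_tokens", 0),
        total_tokens=raw.get("total_tokens", 0),
        cost=raw.get("cost"),        # OpenRouter-specific field
        is_byok=raw.get("is_byok"),  # OpenRouter-specific field
    )

usage = parse_usage_chunk({"usage": {"prompt_tokens": 12, "completion_tokens": 34,
                                     "total_tokens": 46, "cost": 0.00021, "is_byok": False}})
assert usage is not None and usage.cost == 0.00021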
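The preference rule in the final stream builder reduces to: trust a provider-reported usage object when one exists, since it already carries cost/is_byok; otherwise keep the locally computed fallback.

def choose_final_usage(provider_usage, computed_usage):
    """Prefer provider-reported usage over local recomputation."""
    return provider_usage if provider_usage is not None else computed_usage

assert choose_final_usage({"cost": 0.0001}, {"total_tokens": 10}) == {"cost": 0.0001}
assert choose_final_usage(None, {"total_tokens": 10}) == {"total_tokens": 10}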
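The logging tweak concerns exception context. A generic pattern for preserving it (assumed here, not quoted from the PR) is to log with the active traceback attached and chain the re-raise with "from":

import logging

logger = logging.getLogger(__name__)

def assemble_or_raise(assemble, chunks):
    try:
        return assemble(chunks)
    except Exception as exc:
        # logger.exception records the current traceback with the message
        logger.exception("failed to assemble complete response from chunks")
        raise RuntimeError("stream assembly failed") from exc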
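The accumulation in the chunk-builder utilities can be pictured as a single pass that keeps the last usage block seen; with OpenRouter that block arrives after the finish_reason chunk and already contains cost and is_byok.

from typing import Any, Dict, Iterable, Optional

def accumulate_usage(chunks: Iterable[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    final: Optional[Dict[str, Any]] = None
    for chunk in chunks:
        raw = chunk.get("usage")
        if raw:
            final = dict(raw)  # a later usage chunk overrides an earlier one
    return final

chunks = [
    {"choices": [{"delta": {"content": "hi"}}]},
    {"choices": [{"delta": {}, "finish_reason": "stop"}]},
    {"usage": {"prompt_tokens": 5, "completion_tokens": 7,
               "total_tokens": 12, "cost": 0.0001, "is_byok": True}},
]
assert accumulate_usage(chunks)["cost"] == 0.0001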
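The streaming-handler refactor applies the same idea inside the iterator: remember the most recent usage object while yielding chunks, then attach it to the final response instead of recomputing cost. A toy synchronous version (the PR also covers __anext__):

from typing import Any, Dict, Iterator, Optional

class UsageTrackingStream:
    """Toy iterator that tracks the late-arriving usage chunk."""

    def __init__(self, inner: Iterator[Dict[str, Any]]):
        self._inner = inner
        self.final_usage: Optional[Dict[str, Any]] = None

    def __iter__(self):
        return self

    def __next__(self) -> Dict[str, Any]:
        chunk = next(self._inner)  # StopIteration ends the stream
        if chunk.get("usage"):
            self.final_usage = chunk["usage"]  # remember, don't recompute
        return chunk

stream = UsageTrackingStream(iter([
    {"choices": [{"delta": {"content": "ok"}, "finish_reason": "stop"}]},
    {"usage": {"total_tokens": 9, "cost": 0.00005, "is_byok": False}},
]))
for _ in stream:
    pass
assert stream.final_usage["cost"] == 0.00005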
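The type extension amounts to two optional fields that default to None, so providers that never set them are unaffected. A simplified Pydantic model (litellm's real Usage type has many more fields):

from typing import Optional
from pydantic import BaseModel

class Usage(BaseModel):  # trimmed stand-in, not litellm's full model
    prompt_tokens: int = 0
    completion_tokens: int = 0
    total_tokens: int = 0
    cost: Optional[float] = None      # OpenRouter-reported spend
    is_byok: Optional[bool] = None    # whether a bring-your-own key was used

u = Usage(prompt_tokens=3, completion_tokens=4, total_tokens=7, cost=0.0002)
assert u.is_byok is None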
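Finally, a pytest-style reduction of what the new transformation test asserts, using a stand-in parser rather than litellm's real handler:

def parse_usage_chunk(chunk):
    raw = chunk.get("usage") or {}
    return {"cost": raw.get("cost"), "is_byok": raw.get("is_byok")}

def test_openrouter_usage_fields_propagate():
    chunk = {"usage": {"prompt_tokens": 1, "completion_tokens": 2,
                       "total_tokens": 3, "cost": 0.00042, "is_byok": True}}
    parsed = parse_usage_chunk(chunk)
    assert parsed["cost"] == 0.00042
    assert parsed["is_byok"] is True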
@daarko10 (Contributor, Author) commented:
@krrishdholakia, the test fails in CI but passes locally; can you please have a look and let me know what I need to fix?

@krrishdholakia (Contributor) commented:
Hey @daarko10, this PR changes a lot of files, including core components that could impact other providers. Can we reduce the scope somehow?

@daarko10 (Contributor, Author) commented:
Unfortunately, @krrishdholakia, this was necessary: the current implementation completely swallows the usage with OpenRouter, since it arrives in the chunk after the one carrying finish_reason. I added another state tracker for the usage and construct it based on whether it arrives or not; make test-unit works nicely locally, and I tested it with OpenRouter plus multiple providers (Anthropic, Gemini).

I also added fallbacks that revert to the earlier code path, preserving the previous behaviour in case the new one fails for whatever reason (a rough sketch follows).
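A rough picture of that fallback, with hypothetical function names; any failure in the new provider-usage path reverts to the legacy computation:

def resolve_usage(new_path, legacy_path):
    try:
        usage = new_path()
        if usage is not None:
            return usage
    except Exception:
        pass  # swallow and revert to the pre-existing behaviour
    return legacy_path()

assert resolve_usage(lambda: None, lambda: {"total_tokens": 1}) == {"total_tokens": 1}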

@matannahmani commented Jun 16, 2025:

Any chance of getting this approved, @krrishdholakia? We use this extensively at kodu-ai and need accurate usage reporting for OpenRouter users.

@krrishdholakia krrishdholakia changed the base branch from main to litellm_openrouter_improvement_staging June 17, 2025 02:24
@krrishdholakia krrishdholakia merged commit 6c18b9a into BerriAI:litellm_openrouter_improvement_staging Jun 17, 2025
5 of 6 checks passed
@daarko10 (Contributor, Author) commented:
Hey @krrishdholakia, I played with that a bit and hit a few inconsistency issues caused by Pydantic validation failures; this change made it stable so it no longer fails.