feat: flatten LiteLLM cache/reasoning usage sub-counts in _usage_to_dict #6033

Merged
lucasgomide merged 2 commits into main from luzk/surface-cache-litellm on Jun 3, 2026


@lucasgomide lucasgomide commented Jun 3, 2026

LiteLLM returns provider usage as-is, nesting cache-read / cache-creation / reasoning counts under provider-specific shapes (e.g. prompt_tokens_details.cached_tokens, Anthropic-style cache_read_input_tokens). Surface them as flat cached_prompt_tokens / reasoning_tokens / cache_creation_tokens keys so the span pipeline can read them; prompt / completion / total token counts are left untouched.
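The flattening described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `flatten_usage` is a hypothetical name, the signature of `_nested` is a guess, the lookup order is an assumption, and the field names `completion_tokens_details.reasoning_tokens` and `cache_creation_input_tokens` are assumed from common provider payloads rather than taken from the diff.

```python
from typing import Any


def _nested(data: dict[str, Any], parent: str, key: str) -> Any:
    """Return data[parent][key] when the parent is a dict, else None."""
    details = data.get(parent)
    return details.get(key) if isinstance(details, dict) else None


def flatten_usage(usage: dict[str, Any]) -> dict[str, Any]:
    """Copy usage and promote cache/reasoning sub-counts to top-level keys."""
    out = dict(usage)  # shallow copy; the caller's dict is not modified

    # Assumed lookup order. "or" means a 0 falls through to the next source.
    cached = (
        usage.get("cached_prompt_tokens")
        or _nested(usage, "prompt_tokens_details", "cached_tokens")
        or usage.get("cache_read_input_tokens")
    )
    reasoning = usage.get("reasoning_tokens") or _nested(
        usage, "completion_tokens_details", "reasoning_tokens"
    )
    creation = usage.get("cache_creation_tokens") or usage.get(
        "cache_creation_input_tokens"
    )

    # Only add a derived key when some source actually reported a count.
    for key, value in (
        ("cached_prompt_tokens", cached),
        ("reasoning_tokens", reasoning),
        ("cache_creation_tokens", creation),
    ):
        if value:
            out[key] = value
    return out


# Nested shape (details objects under the core counts).
nested_style = {
    "prompt_tokens": 100,
    "completion_tokens": 20,
    "total_tokens": 120,
    "prompt_tokens_details": {"cached_tokens": 64},
    "completion_tokens_details": {"reasoning_tokens": 8},
}
# Anthropic-style shape (flat, provider-specific key names).
anthropic_style = {
    "prompt_tokens": 50,
    "completion_tokens": 10,
    "total_tokens": 60,
    "cache_read_input_tokens": 30,
    "cache_creation_input_tokens": 12,
}
flat_nested = flatten_usage(nested_style)
flat_anthropic = flatten_usage(anthropic_style)
```

In this sketch the core counts pass through untouched, and a derived key appears only when one of its sources reported a non-zero count.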


Note

Low Risk
Observability-only normalization on usage payloads before events; behavior change is copying dicts instead of returning the same reference, with broad test coverage.

Overview
LiteLLM usage dicts are normalized before LLM call events and spans see them. LLM._usage_to_dict no longer returns raw provider shapes unchanged; it copies dict/Pydantic/__dict__ usage into a new dict and promotes cache-read, cache-creation, and reasoning counts from nested or Anthropic-style fields into top-level cached_prompt_tokens, reasoning_tokens, and cache_creation_tokens, matching what BaseLLM._track_token_usage_internal and the span pipeline already expect. Core counts (prompt_tokens, completion_tokens, total_tokens) are not rewritten, and plain usage without those buckets is left without the derived keys.
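As a rough illustration of the copy step, the sketch below turns the three input kinds mentioned above into a fresh dict. The name `to_plain_dict`, the order of the checks, and the `None` handling are assumptions made for this example; `FakeUsageModel` is a stand-in for a Pydantic model so the snippet runs without third-party imports.

```python
from types import SimpleNamespace
from typing import Any, Optional


def to_plain_dict(usage: Any) -> Optional[dict[str, Any]]:
    """Return a fresh dict for dict, Pydantic-like, or plain-object usage."""
    if usage is None:
        return None  # assumed behavior for missing usage
    if isinstance(usage, dict):
        return dict(usage)  # new object, so later writes never touch the input
    if hasattr(usage, "model_dump"):  # Pydantic v2 models expose model_dump()
        return dict(usage.model_dump())
    if hasattr(usage, "__dict__"):
        return dict(vars(usage))
    return None


class FakeUsageModel:
    """Hypothetical stand-in for a Pydantic usage model."""

    def __init__(self, **fields: int) -> None:
        self._fields = fields

    def model_dump(self) -> dict[str, int]:
        return dict(self._fields)


raw = {"prompt_tokens": 10, "completion_tokens": 5, "total_tokens": 15}
copied = to_plain_dict(raw)
copied["cached_prompt_tokens"] = 4  # mutate the copy only

from_model = to_plain_dict(FakeUsageModel(prompt_tokens=3))
from_object = to_plain_dict(SimpleNamespace(prompt_tokens=7))
```

Because the result is always a new dict, adding derived keys to it cannot leak back into the provider's original payload, which is the behavior change the risk note calls out.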

Tests replace the old “dict pass-through” assertion with coverage that inputs are copied (not mutated), parametrized provider shapes normalize correctly, core totals stay intact, and missing buckets are omitted.

Reviewed by Cursor Bugbot for commit 9123b32.

Summary by CodeRabbit

  • Bug Fixes

    • Enhanced token usage normalization to ensure consistent handling across different LLM providers, properly flattening nested token details into expected top-level fields for cached tokens, reasoning tokens, and cache creation tokens.
  • Tests

    • Expanded test coverage for token usage handling, adding parametrized tests to verify normalization across various provider formats and confirming core token counts are preserved correctly.

@github-actions github-actions Bot added the size/M label Jun 3, 2026

coderabbitai Bot commented Jun 3, 2026

Walkthrough

The LLM _usage_to_dict method now normalizes provider usage objects (dict, BaseModel, or plain object instances) into plain dicts and flattens nested token fields into standardized top-level keys (cached_prompt_tokens, reasoning_tokens, cache_creation_tokens) with multi-source fallback precedence. The test suite validates normalization across flat and nested usage shapes while preserving primary token counts.

Changes

Token Usage Normalization

| Layer / File(s) | Summary |
| --- | --- |
| _usage_to_dict normalization implementation (lib/crewai/src/crewai/llm.py) | LLM._usage_to_dict converts provider usage objects and dicts to plain dicts, extracts nested/provider-specific token fields (`*_tokens_details`, `cache_read_*`, `cache_creation_*`) with key precedence lookup, and flattens them into standardized top-level keys while preserving prompt_tokens, completion_tokens, and total_tokens. |
| Test coverage for usage normalization (lib/crewai/tests/events/test_llm_usage_event.py) | Tests assert flat dicts are returned unchanged without adding derived bucket keys, parametrized variants verify normalization of nested LiteLLM-style buckets into flattened token fields, primary token counts are preserved, and absent bucket inputs do not produce added derived fields. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 Token fields, once scattered and deep,
Now flattened and tidy, a promise to keep,
With nested buckets brought into the light,
Usage normalization shines crystal bright! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 22.22% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately and specifically describes the main change: flattening LiteLLM cache/reasoning usage sub-counts in the _usage_to_dict method, which aligns with the changeset's core objective. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |



@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
lib/crewai/src/crewai/llm.py (1)

1962-1967: ⚡ Quick win

Be aware: "or" chaining treats an explicit zero as absent.

The precedence chain uses "or", so if cached_tokens is explicitly 0, the chain continues and may select a non-zero value from cached_prompt_tokens or another source. Example: {"cached_tokens": 0, "cached_prompt_tokens": 5} yields 5.

This mirrors the existing pattern in BaseLLM._track_token_usage_internal (as documented), and in practice providers typically populate only one field, so the risk is low. However, it's worth being aware of this behavior.

📋 Consider adding test coverage for zero-value edge case

Add a test case to document the intended behavior when a field is explicitly 0:

```python
from crewai.llm import LLM


def test_zero_cached_tokens_with_alternative_source():
    """Document behavior when primary source is 0 and alternative exists."""
    usage = {
        "cached_tokens": 0,
        "cached_prompt_tokens": 5,
    }
    result = LLM._usage_to_dict(usage)
    # Current behavior: returns 5 (continues precedence chain)
    # Alternative: could return 0 (stop at first explicit value)
    assert result["cached_prompt_tokens"] == 5  # or 0, depending on intent
```

This would clarify whether 0 means "no cached tokens" (continue checking) or "explicitly zero cached tokens" (stop here).
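The two readings of an explicit 0 can be compared side by side in a small standalone sketch. The helper names `pick_with_or`, `first_not_none`, and `pick_explicit` are hypothetical and only two of the source keys are modeled; this is not the code in llm.py.

```python
from typing import Any


def pick_with_or(usage: dict[str, Any]) -> Any:
    """Precedence via "or": any falsy value, including 0, is skipped."""
    return usage.get("cached_tokens") or usage.get("cached_prompt_tokens")


def first_not_none(*values: Any) -> Any:
    """Return the first value that is not None, so an explicit 0 is kept."""
    for value in values:
        if value is not None:
            return value
    return None


def pick_explicit(usage: dict[str, Any]) -> Any:
    """Precedence via None checks: only a missing key is skipped."""
    return first_not_none(
        usage.get("cached_tokens"),
        usage.get("cached_prompt_tokens"),
    )


usage = {"cached_tokens": 0, "cached_prompt_tokens": 5}
with_or = pick_with_or(usage)    # 5: the 0 is treated as absent
explicit = pick_explicit(usage)  # 0: the 0 is treated as a real count
```

Which variant is appropriate depends on whether a provider reporting 0 should be trusted over a non-zero value under another key; both agree whenever only one source is populated.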

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/crewai/src/crewai/llm.py` around lines 1962 - 1967, The current
precedence chain for cached_prompt_tokens uses boolean "or" which treats 0 as
falsy and skips it; update the selection logic in the block that computes
cached_prompt_tokens (referencing variables/data keys cached_tokens,
cached_prompt_tokens, cache_read_input_tokens and helper _nested) to explicitly
check for None (e.g., use "is not None" or a sentinel) so an explicit 0 is
preserved as a valid value; also add a unit test (e.g., in tests covering
LLM._usage_to_dict or the relevant conversion path) that asserts when
{"cached_tokens": 0, "cached_prompt_tokens": 5} the function returns
cached_prompt_tokens == 0 (or documents the chosen behavior) to prevent
regressions.

ℹ️ Review info

📥 Commits

Reviewing files that changed from the base of the PR and between ea88904 and 3da3d46.

📒 Files selected for processing (2)
  • lib/crewai/src/crewai/llm.py
  • lib/crewai/tests/events/test_llm_usage_event.py


@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Comment thread lib/crewai/src/crewai/llm.py
@lucasgomide lucasgomide merged commit d09e3f4 into main Jun 3, 2026
56 checks passed
@lucasgomide lucasgomide deleted the luzk/surface-cache-litellm branch June 3, 2026 19:13

3 participants