Fix TypeError when tracking usage with Anthropic models returning Pydantic objects #8978

Copilot · 2025-10-27T01:22:16Z

Anthropic's API returns CacheCreation Pydantic model objects in usage data when prompt caching is enabled. UsageTracker._merge_usage_entries() attempted to add these objects arithmetically, causing TypeError: unsupported operand type(s) for +: 'CacheCreation' and 'CacheCreation'.

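# Before: TypeError once usage from a second call has to be merged
import asyncio
import dspy
from dspy.utils.usage_tracker import track_usage

async def main():
    with track_usage() as tracker:
        lm = dspy.LM("claude-sonnet-4-5-20250929", cache=True)
        with dspy.context(lm=lm):
            predictor = dspy.Predict("question -> answer")
            await predictor.acall(question="What is 2+2?")
            await predictor.acall(question="What is 3+3?")  # second usage entry recorded
        tracker.get_total_tokens()  # TypeError: merging the two entries fails here

asyncio.run(main())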

# After: Works correctly, merges nested token counts
tracker.get_total_tokens()
# {'claude-sonnet-4': {
#     'input_tokens': 300,
#     'cache_creation_input_tokens': {
#         'ephemeral_1h_input_tokens': 3072,  # 1024 + 2048
#         'ephemeral_5m_input_tokens': 1536   # 512 + 1024
#     }
# }}

Changes

_flatten_usage_entry(): Convert Pydantic BaseModel instances to dicts via model_dump() when usage is added

_merge_usage_entries(): Detect and convert any Pydantic models before merging; recursively merge nested dicts and sum numeric fields

Tests: Added test_merge_usage_entries_with_pydantic_models() validating multiple CacheCreation objects merge correctly

Fix applies to any Pydantic model objects from any LM provider.
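The two changes above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the merged DSPy code: `to_plain`, `merge_usage`, and `FakeCacheCreation` are hypothetical names, and the stand-in class only mimics Pydantic's `model_dump()` so the sketch runs without Pydantic installed. It also assumes both entries agree on which keys hold nested dicts.

```python
from typing import Any, Optional


class FakeCacheCreation:
    """Hypothetical stand-in for a Pydantic model; it only mimics model_dump()."""

    def __init__(self, ephemeral_1h_input_tokens: int, ephemeral_5m_input_tokens: int):
        self.ephemeral_1h_input_tokens = ephemeral_1h_input_tokens
        self.ephemeral_5m_input_tokens = ephemeral_5m_input_tokens

    def model_dump(self) -> dict:
        return dict(vars(self))


def to_plain(value: Any) -> Any:
    """Turn model-like objects (anything exposing model_dump) into plain dicts."""
    if hasattr(value, "model_dump"):
        value = value.model_dump()
    if isinstance(value, dict):
        return {k: to_plain(v) for k, v in value.items()}
    return value


def merge_usage(a: Optional[dict], b: Optional[dict]) -> dict:
    """Recursively merge two usage dicts, summing numeric leaves (None counts as 0)."""
    a, b = to_plain(a or {}), to_plain(b or {})
    result = dict(a)
    for key, b_val in b.items():
        a_val = result.get(key)
        if isinstance(a_val, dict) or isinstance(b_val, dict):
            result[key] = merge_usage(a_val, b_val)
        else:
            result[key] = (a_val or 0) + (b_val or 0)
    return result


first = {"input_tokens": 100, "cache_creation_input_tokens": FakeCacheCreation(1024, 512)}
second = {"input_tokens": 200, "cache_creation_input_tokens": FakeCacheCreation(2048, 1024)}
total = merge_usage(first, second)
print(total)
# {'input_tokens': 300, 'cache_creation_input_tokens':
#  {'ephemeral_1h_input_tokens': 3072, 'ephemeral_5m_input_tokens': 1536}}
```

Converting to plain dicts before merging keeps the nested token counts, which is why the totals match the "After" output shown earlier.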

Original prompt

This section details the original issue you should resolve

<issue_title>[Bug] TypeError when using track_usage() with Anthropic models that return CacheCreation objects</issue_title>
<issue_description>### What happened?

Summary

The UsageTracker crashes with a TypeError when tracking usage for Anthropic models that use prompt caching. Anthropic's API returns CacheCreation Pydantic model objects in the usage data, but DSPy's _merge_usage_entries() attempts to add these objects arithmetically, which fails.

Expected Behavior

Usage tracking should work correctly with all LM providers, including Anthropic models with prompt caching enabled. The tracker should either:

- Skip or ignore non-numeric fields
- Convert Pydantic objects to their numeric values
- Handle these objects gracefully in some other way

Actual Behavior

When calling tracker.get_total_tokens() after making multiple API calls to Anthropic with caching enabled, a TypeError is raised:

$ DSPY_CACHEDIR=./ignored/dspycache_$(date +%s).cache uv run --prerelease=allow repro_dspy_cache_bug.py

   ERROR: unsupported operand type(s) for +: 'CacheCreationTokenDetails' and 'CacheCreationTokenDetails'

============================================================
✗ FAILED: Bug reproduced!
Traceback (most recent call last):
  File "/Users/ndr/coding/github/stanfordnlp/dspy/repro_dspy_cache_bug.py", line 74, in <module>
    asyncio.run(main())
    ~~~~~~~~~~~^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.13/3.13.5/Frameworks/Python.framework/Versions/3.13/lib/python3.13/asyncio/runners.py", line 195, in run
    return runner.run(main)
           ~~~~~~~~~~^^^^^^
  File "/opt/homebrew/Cellar/python@3.13/3.13.5/Frameworks/Python.framework/Versions/3.13/lib/python3.13/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^
  File "/opt/homebrew/Cellar/python@3.13/3.13.5/Frameworks/Python.framework/Versions/3.13/lib/python3.13/asyncio/base_events.py", line 725, in run_until_complete
    return future.result()
           ~~~~~~~~~~~~~^^
  File "/Users/ndr/coding/github/stanfordnlp/dspy/repro_dspy_cache_bug.py", line 62, in main
    total_usage = tracker.get_total_tokens()
  File "/Users/ndr/coding/github/stanfordnlp/dspy/dspy/utils/usage_tracker.py", line 58, in get_total_tokens
    total_usage = self._merge_usage_entries(total_usage, usage_entry)
  File "/Users/ndr/coding/github/stanfordnlp/dspy/dspy/utils/usage_tracker.py", line 42, in _merge_usage_entries
    result[k] = self._merge_usage_entries(current_v, v)
                ~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^
  File "/Users/ndr/coding/github/stanfordnlp/dspy/dspy/utils/usage_tracker.py", line 44, in _merge_usage_entries
    result[k] = (current_v or 0) + (v or 0)
                ~~~~~~~~~~~~~~~~~^~~~~~~~~~
TypeError: unsupported operand type(s) for +: 'CacheCreationTokenDetails' and 'CacheCreationTokenDetails'

Root Cause

The bug seems to be in dspy/utils/usage_tracker.py:44:

result[k] = (current_v or 0) + (v or 0)

This line assumes all usage values are numeric (int/float), but Anthropic's API returns structured objects like:

Usage(
    input_tokens=100,
    output_tokens=50,
    cache_creation_input_tokens=CacheCreation(ephemeral_1h_input_tokens=1024, ephemeral_5m_input_tokens=512),
    cache_read_input_tokens=None
)

When merging usage from multiple API calls, the code tries to add two CacheCreation objects together, which fails because Pydantic models don't support addition.
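The failure can be reproduced without any API call. In the sketch below, `CacheCreationStandIn` is a hypothetical plain class (not Anthropic's or LiteLLM's real type) that, like a Pydantic model, defines no `__add__`; `naive_sum` is an illustrative wrapper around the expression from the failing line.

```python
class CacheCreationStandIn:
    """Hypothetical stand-in: holds token counts but does not implement __add__."""

    def __init__(self, ephemeral_1h_input_tokens: int, ephemeral_5m_input_tokens: int):
        self.ephemeral_1h_input_tokens = ephemeral_1h_input_tokens
        self.ephemeral_5m_input_tokens = ephemeral_5m_input_tokens


def naive_sum(current_v, v):
    # Same expression as the failing line shown in the traceback.
    return (current_v or 0) + (v or 0)


first = CacheCreationStandIn(1024, 512)
second = CacheCreationStandIn(2048, 1024)

try:
    naive_sum(first, second)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
# unsupported operand type(s) for +: 'CacheCreationStandIn' and 'CacheCreationStandIn'

# Purely numeric fields are unaffected by the same expression.
numeric_total = naive_sum(None, 100)
```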

Steps to Reproduce

I've included a standalone reproduction script (repro_dspy_cache_bug.py) that demonstrates the issue. To run it:

ANTHROPIC_API_KEY=your_key_here DSPY_CACHEDIR=./cache uv run --prerelease=allow repro_dspy_cache_bug.py

Additional Context

The bug only manifests when:
1. Using Anthropic models (Claude)
2. Making 2+ API calls within a track_usage() context
3. The API returns CacheCreation objects (when caching is used)
- First call doesn't crash (nothing to merge)
- OpenAI and other providers that return only numeric usage values are unaffected

Proposed Solution

The _merge_usage_entries() method should detect non-numeric values and handle them appropriately. Possible approaches:

- Check if values support addition before attempting it
- Convert Pydantic models to dicts before merging
- Store non-numeric values in a way that doesn't require arithmetic operations
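As an illustration of the first approach only, a guard could sum values that are plain numbers and skip everything else. This is a sketch under assumptions: `sum_numeric_fields` is a hypothetical helper, it handles flat dicts only, and it silently drops non-numeric values, so any nested cache token counts would be lost.

```python
from numbers import Number


def sum_numeric_fields(a: dict, b: dict) -> dict:
    """Merge two flat usage dicts, summing numeric values and skipping the rest."""
    result = {}
    for key in a.keys() | b.keys():
        values = [v for v in (a.get(key), b.get(key)) if v is not None]
        # bool is a Number subclass in Python, so exclude it explicitly.
        if all(isinstance(v, Number) and not isinstance(v, bool) for v in values):
            result[key] = sum(values)
        # Non-numeric values (for example model objects) are dropped, not added.
    return result


opaque = object()  # stands in for any value that does not support addition
merged = sum_numeric_fields(
    {"input_tokens": 100, "cache_creation_input_tokens": opaque},
    {"input_tokens": 200, "output_tokens": 50, "cache_creation_input_tokens": opaque},
)
print(merged == {"input_tokens": 300, "output_tokens": 50})
# True
```

The trade-off is visible in the result: the crash is avoided, but the cache-creation counts disappear, which is one reason converting models to dicts before merging can be preferable.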

I can submit a PR with a fix if helpful!

Steps to reproduce

Run the following with:

DSPY_CACHEDIR=./dspycache_$(date +%s).cache uv run --prerelease=allow repro_dspy_cache_bug.py

#!/usr/bin/env -S uv run --prerelease=allow
# /// script
# requires-python = ">=3.12"
# dependencies = [
#     "dspy==3.0.4b1",
#     "anthropic",
# ]
# ///
"""
Reproduces ...

</details>

- Fixes stanfordnlp/dspy#8965

chenmoneygithub

I simplified the implementation a bit, but Copilot did accurately find the error and make the fix.

Meanwhile, it seems to understand our unit tests poorly: its code uses Anthropic's format, whereas we should use litellm's.

commit 31b96af
Author: Dushmanta <dushmanta0511@gmail.com>
Date: Thu Oct 30 13:52:40 2025 +0530

    fix: broken PyPI downloads badge from pepy.tech in README and docs home page (stanfordnlp#8995)
    * fix: update broken pypi download badge in readme
    * fix: update broken pypi download badge in docs home page

commit 056d54e
Author: Isaac Miller <17116851+isaacbmiller@users.noreply.github.com>
Date: Wed Oct 29 17:23:09 2025 +0100

    fix(MIPROv2): zero shot not taking .compile parameters into account before determining if the program was zero shot (stanfordnlp#8909)
    * fix(MIPROv2): zero shot not taking .compile parameters into account before determining if the program was zero shot
    * remove extra logs
    * Remove log
    * Fix merge conflict
    * Remove extra whitespace

commit da69f9d
Author: TomuHirata <tomu.hirata@gmail.com>
Date: Wed Oct 29 13:23:34 2025 +0900

    Update anthropic model name (stanfordnlp#8992)
    Signed-off-by: TomuHirata <tomu.hirata@gmail.com>

commit aaadf05
Author: Chen Qian <chen.qian@databricks.com>
Date: Tue Oct 28 12:21:55 2025 -0700

    lints (stanfordnlp#8987)

commit e842ba1
Author: eramis73 <130156545+eramis73@users.noreply.github.com>
Date: Tue Oct 28 02:40:34 2025 +0300

    [docs] Add Google-style docstrings for dspy/evaluate/metrics.py (stanfordnlp#8954)
    * docs(metrics): add Google-style docstrings for public metrics
    * docs(metrics): address review feedback (concise openings, mkdocs block examples); revert non-doc changes
    * fixes
    ---------
    Co-authored-by: chenmoneygithub <chen.qian@databricks.com>

commit 6c43880
Author: TomuHirata <tomu.hirata@gmail.com>
Date: Tue Oct 28 07:21:06 2025 +0900

    Cache Ollama to speed up CI (stanfordnlp#8972)
    * Cache Ollama to speed up CI
    * fix permission

commit 462baef
Author: Copilot <198982749+Copilot@users.noreply.github.com>
Date: Mon Oct 27 11:57:27 2025 -0700

    Fix TypeError when tracking usage with Anthropic models returning Pydantic objects (stanfordnlp#8978)
    * Initial plan
    * Fix TypeError when merging Anthropic CacheCreation objects in usage tracker
      Co-authored-by: TomeHirata <33407409+TomeHirata@users.noreply.github.com>
    * Enhance _flatten_usage_entry to convert Pydantic models on first add
      Co-authored-by: TomeHirata <33407409+TomeHirata@users.noreply.github.com>
    * Fix potential TypeError when both usage entries are None
      Co-authored-by: TomeHirata <33407409+TomeHirata@users.noreply.github.com>
    * simplify
    * small fix
    * lint
    * robust version handling
    ---------
    Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
    Co-authored-by: TomeHirata <33407409+TomeHirata@users.noreply.github.com>
    Co-authored-by: chenmoneygithub <chen.qian@databricks.com>

commit 9b467b5
Author: Noah Ziems <nziems2@nd.edu>
Date: Mon Oct 27 13:32:07 2025 -0400

    Add Disable Fallback Option in ChatAdapter (stanfordnlp#8984)

commit bf022c7
Author: Lakshya A Agrawal <lakshyaaagrawal@berkeley.edu>
Date: Sat Oct 25 23:37:42 2025 +0530

    Update gepa[dspy] dependency version to 0.0.18 (stanfordnlp#8969)
    * Update gepa[dspy] dependency version to 0.0.18
    * Update pyproject.toml
    * fix test
    ---------
    Co-authored-by: TomuHirata <tomu.hirata@gmail.com>

Initial plan

208bdd6

Copilot AI assigned Copilot and TomeHirata Oct 27, 2025

Copilot started work on behalf of TomeHirata October 27, 2025 01:22

Copilot AI and others added 3 commits October 27, 2025 01:29

Fix TypeError when merging Anthropic CacheCreation objects in usage tracker

54f1744

Co-authored-by: TomeHirata <33407409+TomeHirata@users.noreply.github.com>

Enhance _flatten_usage_entry to convert Pydantic models on first add

0e668a8

Co-authored-by: TomeHirata <33407409+TomeHirata@users.noreply.github.com>

Fix potential TypeError when both usage entries are None

a065e83

Co-authored-by: TomeHirata <33407409+TomeHirata@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Fix TypeError when tracking usage with Anthropic models~~ Fix TypeError when tracking usage with Anthropic models returning Pydantic objects Oct 27, 2025

Copilot AI requested a review from TomeHirata October 27, 2025 01:38

Copilot finished work on behalf of TomeHirata October 27, 2025 01:38

TomeHirata marked this pull request as ready for review October 27, 2025 02:10

TomeHirata mentioned this pull request Oct 27, 2025

[Bug] TypeError when using track_usage() with Anthropic models that return CacheCreation objects #8965

Closed

chenmoneygithub added 3 commits October 27, 2025 11:43

simplify

13a40ec

small fix

bb39502

lint

23db9a8

chenmoneygithub approved these changes Oct 27, 2025

robust version handling

751a4f9

chenmoneygithub merged commit 462baef into main Oct 27, 2025
10 checks passed

3 participants
