Copilot AI commented Oct 27, 2025

Anthropic's API returns CacheCreation Pydantic model objects in usage data when prompt caching is enabled. UsageTracker._merge_usage_entries() attempted to add these objects arithmetically, causing TypeError: unsupported operand type(s) for +: 'CacheCreation' and 'CacheCreation'.

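# Before: TypeError once usage from the second call has to be merged
# (run inside an async context, e.g. a notebook cell or an `async def`)
import dspy
from dspy.utils.usage_tracker import track_usage

with track_usage() as tracker:
    lm = dspy.LM("claude-sonnet-4-5-20250929", cache=True)
    with dspy.context(lm=lm):
        predictor = dspy.Predict("question -> answer")
        await predictor.acall(question="What is 2+2?")
        await predictor.acall(question="What is 3+3?")  # Records a second usage entry
    tracker.get_total_tokens()  # TypeError: the two entries cannot be merged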

# After: Works correctly, merges nested token counts
tracker.get_total_tokens()
# {'claude-sonnet-4': {
#     'input_tokens': 300,
#     'cache_creation_input_tokens': {
#         'ephemeral_1h_input_tokens': 3072,  # 1024 + 2048
#         'ephemeral_5m_input_tokens': 1536   # 512 + 1024
#     }
# }}

Changes

_flatten_usage_entry(): Convert Pydantic BaseModel instances to dicts via model_dump() when usage is added

_merge_usage_entries(): Detect and convert any Pydantic models before merging; recursively merge nested dicts and sum numeric fields

Tests: Added test_merge_usage_entries_with_pydantic_models() validating multiple CacheCreation objects merge correctly

Fix applies to any Pydantic model objects from any LM provider.
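
A minimal sketch of the merge behaviour listed under Changes. It is not the code that was merged, and the names `FakeCacheCreation`, `to_plain`, and `merge_usage` are hypothetical. The stand-in class only mimics the `model_dump()` method of a Pydantic model so that the example runs without third-party packages.

```python
from dataclasses import asdict, dataclass


@dataclass
class FakeCacheCreation:
    """Hypothetical stand-in for a Pydantic usage model; only mimics model_dump()."""

    ephemeral_1h_input_tokens: int = 0
    ephemeral_5m_input_tokens: int = 0

    def model_dump(self) -> dict:
        return asdict(self)


def to_plain(value):
    """Convert objects exposing model_dump() to dicts, recursing into nested dicts."""
    if hasattr(value, "model_dump"):
        value = value.model_dump()
    if isinstance(value, dict):
        return {k: to_plain(v) for k, v in value.items()}
    return value


def merge_usage(a, b):
    """Recursively merge two usage entries, summing numeric leaves (None counts as 0)."""
    a, b = to_plain(a) or {}, to_plain(b) or {}
    merged = dict(a)
    for key, b_val in b.items():
        a_val = merged.get(key)
        if isinstance(a_val, dict) or isinstance(b_val, dict):
            merged[key] = merge_usage(a_val, b_val)
        else:
            merged[key] = (a_val or 0) + (b_val or 0)
    return merged


first = {"input_tokens": 100, "cache_creation_input_tokens": FakeCacheCreation(1024, 512)}
second = {"input_tokens": 200, "cache_creation_input_tokens": FakeCacheCreation(2048, 1024)}
total = merge_usage(first, second)
print(total)
```

Converting to plain dicts before merging keeps the nested token counts, which is why the totals in the "After" example can report per-field sums instead of dropping the cache details.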

Original prompt

This section details the original issue you should resolve

<issue_title>[Bug] TypeError when using track_usage() with Anthropic models that return CacheCreation objects</issue_title>
<issue_description>### What happened?

Summary

The UsageTracker crashes with a TypeError when tracking usage for Anthropic models that use prompt caching. Anthropic's API returns CacheCreation Pydantic model objects in the usage data, but DSPy's _merge_usage_entries() attempts to add these objects arithmetically, which fails.

Expected Behavior

Usage tracking should work correctly with all LM providers, including Anthropic models with prompt caching enabled. The tracker should either:

  • Skip or ignore non-numeric fields
  • Convert Pydantic objects to their numeric values
  • Handle these objects gracefully in some other way

Actual Behavior

When calling tracker.get_total_tokens() after making multiple API calls to Anthropic with caching enabled, a TypeError is raised:

$ DSPY_CACHEDIR=./ignored/dspycache_$(date +%s).cache uv run --prerelease=allow repro_dspy_cache_bug.py

   ERROR: unsupported operand type(s) for +: 'CacheCreationTokenDetails' and 'CacheCreationTokenDetails'

============================================================
✗ FAILED: Bug reproduced!
Traceback (most recent call last):
  File "/Users/ndr/coding/github/stanfordnlp/dspy/repro_dspy_cache_bug.py", line 74, in <module>
    asyncio.run(main())
    ~~~~~~~~~~~^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.13/3.13.5/Frameworks/Python.framework/Versions/3.13/lib/python3.13/asyncio/runners.py", line 195, in run
    return runner.run(main)
           ~~~~~~~~~~^^^^^^
  File "/opt/homebrew/Cellar/python@3.13/3.13.5/Frameworks/Python.framework/Versions/3.13/lib/python3.13/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^
  File "/opt/homebrew/Cellar/python@3.13/3.13.5/Frameworks/Python.framework/Versions/3.13/lib/python3.13/asyncio/base_events.py", line 725, in run_until_complete
    return future.result()
           ~~~~~~~~~~~~~^^
  File "/Users/ndr/coding/github/stanfordnlp/dspy/repro_dspy_cache_bug.py", line 62, in main
    total_usage = tracker.get_total_tokens()
  File "/Users/ndr/coding/github/stanfordnlp/dspy/dspy/utils/usage_tracker.py", line 58, in get_total_tokens
    total_usage = self._merge_usage_entries(total_usage, usage_entry)
  File "/Users/ndr/coding/github/stanfordnlp/dspy/dspy/utils/usage_tracker.py", line 42, in _merge_usage_entries
    result[k] = self._merge_usage_entries(current_v, v)
                ~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^
  File "/Users/ndr/coding/github/stanfordnlp/dspy/dspy/utils/usage_tracker.py", line 44, in _merge_usage_entries
    result[k] = (current_v or 0) + (v or 0)
                ~~~~~~~~~~~~~~~~~^~~~~~~~~~
TypeError: unsupported operand type(s) for +: 'CacheCreationTokenDetails' and 'CacheCreationTokenDetails'

Root Cause

The bug seems to be in dspy/utils/usage_tracker.py:44:

result[k] = (current_v or 0) + (v or 0)

This line assumes all usage values are numeric (int/float), but Anthropic's API returns structured objects like:

Usage(
    input_tokens=100,
    output_tokens=50,
    cache_creation_input_tokens=CacheCreation(ephemeral_1h_input_tokens=1024, ephemeral_5m_input_tokens=512),
    cache_read_input_tokens=None
)

When merging usage from multiple API calls, the code tries to add two CacheCreation objects together, which fails because Pydantic models don't support addition.
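
The failure can be illustrated without calling any API. The sketch below reuses the expression quoted above from usage_tracker.py; `FakeCacheCreation` and `naive_sum` are hypothetical names, and the dataclass merely stands in for a Pydantic model that, like the real one, defines no `__add__`.

```python
from dataclasses import dataclass


@dataclass
class FakeCacheCreation:
    """Hypothetical stand-in for a usage object that does not support addition."""

    ephemeral_1h_input_tokens: int = 0
    ephemeral_5m_input_tokens: int = 0


def naive_sum(current_v, v):
    # Same expression as the line quoted from usage_tracker.py.
    return (current_v or 0) + (v or 0)


print(naive_sum(100, 200))  # plain numbers add normally
print(naive_sum(None, 50))  # None falls back to 0

try:
    # Both objects are truthy, so "or 0" keeps them and "+" is applied to the objects.
    naive_sum(FakeCacheCreation(1024, 512), FakeCacheCreation(2048, 1024))
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

The `or 0` guard only protects against `None` (and other falsy values); any truthy non-numeric object passes straight through to the `+`.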

Steps to Reproduce

I've included a standalone reproduction script (repro_dspy_cache_bug.py) that demonstrates the issue. To run it:

ANTHROPIC_API_KEY=your_key_here DSPY_CACHEDIR=./cache uv run --prerelease=allow repro_dspy_cache_bug.py

Additional Context

  • The bug only manifests when:
    1. Using Anthropic models (Claude)
    2. Making 2+ API calls within a track_usage() context
    3. The API returns CacheCreation objects (when caching is used)
  • First call doesn't crash (nothing to merge)
  • OpenAI and other providers that return only numeric usage values are unaffected

Proposed Solution

The _merge_usage_entries() method should detect non-numeric values and handle them appropriately. Possible approaches:

  1. Check if values support addition before attempting it
  2. Convert Pydantic models to dicts before merging
  3. Store non-numeric values in a way that doesn't require arithmetic operations
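
As a rough sketch of approaches 1 and 3 combined, the merge could add only values that are known to be numeric and carry anything else over unchanged. `is_summable` and `merge_numeric_only` are hypothetical helpers written for illustration, not part of DSPy.

```python
from numbers import Number


def is_summable(value) -> bool:
    """True for None and numbers; bool is excluded although it subclasses int."""
    return value is None or (isinstance(value, Number) and not isinstance(value, bool))


def merge_numeric_only(a: dict, b: dict) -> dict:
    """Sum numeric fields, recurse into dicts, and keep the latest value otherwise."""
    merged = dict(a)
    for key, b_val in b.items():
        a_val = merged.get(key)
        if isinstance(a_val, dict) and isinstance(b_val, dict):
            merged[key] = merge_numeric_only(a_val, b_val)
        elif is_summable(a_val) and is_summable(b_val):
            merged[key] = (a_val or 0) + (b_val or 0)
        else:
            # Not summable: keep the most recent value instead of raising.
            merged[key] = b_val
    return merged


opaque = object()  # placeholder for any value that does not support "+"
result = merge_numeric_only(
    {"input_tokens": 100, "details": {"cached": 10}, "extra": opaque},
    {"input_tokens": 200, "details": {"cached": 5}, "extra": opaque},
)
print(result["input_tokens"], result["details"])
```

This avoids the crash but discards the counts held inside non-numeric objects, so converting the objects to dicts first (approach 2) preserves more information.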

I can submit a PR with a fix if helpful!

Steps to reproduce

Run the following with:

DSPY_CACHEDIR=./dspycache_$(date +%s).cache uv run --prerelease=allow repro_dspy_cache_bug.py

#!/usr/bin/env -S uv run --prerelease=allow
# /// script
# requires-python = ">=3.12"
# dependencies = [
#     "dspy==3.0.4b1",
#     "anthropic",
# ]
# ///
"""
Reproduces ...

</details>

- Fixes stanfordnlp/dspy#8965

Copilot AI and others added 3 commits October 27, 2025 01:29
…racker

Co-authored-by: TomeHirata <33407409+TomeHirata@users.noreply.github.com>
Co-authored-by: TomeHirata <33407409+TomeHirata@users.noreply.github.com>
Co-authored-by: TomeHirata <33407409+TomeHirata@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix TypeError when tracking usage with Anthropic models Fix TypeError when tracking usage with Anthropic models returning Pydantic objects Oct 27, 2025
Copilot AI requested a review from TomeHirata October 27, 2025 01:38
Copilot finished work on behalf of TomeHirata October 27, 2025 01:38
@TomeHirata TomeHirata marked this pull request as ready for review October 27, 2025 02:10
@chenmoneygithub chenmoneygithub left a comment

I simplified the implementation a bit, but Copilot did accurately find the error and make the fix.

Meanwhile, it seems to understand our unit tests poorly: its test code uses Anthropic's format, whereas we should use litellm's.

@chenmoneygithub chenmoneygithub merged commit 462baef into main Oct 27, 2025
10 checks passed
hironow added a commit to hironow/dspy that referenced this pull request Oct 30, 2025
commit 056d54e
Author: Isaac Miller <17116851+isaacbmiller@users.noreply.github.com>
Date:   Wed Oct 29 17:23:09 2025 +0100

    fix(MIPROv2): zero shot not taking .compile parameters into account before determining if the program was zero shot (stanfordnlp#8909)

    * fix(MIPROv2): zero shot not taking .compile parameters into account before determining if the program was zero shot

    * remove extra logs

    * Remove log

    * Fix merge conflict

    * Remove extra whitespace

commit da69f9d
Author: TomuHirata <tomu.hirata@gmail.com>
Date:   Wed Oct 29 13:23:34 2025 +0900

    Update anthropic model name (stanfordnlp#8992)

    Signed-off-by: TomuHirata <tomu.hirata@gmail.com>

commit aaadf05
Author: Chen Qian <chen.qian@databricks.com>
Date:   Tue Oct 28 12:21:55 2025 -0700

    lints (stanfordnlp#8987)

commit e842ba1
Author: eramis73 <130156545+eramis73@users.noreply.github.com>
Date:   Tue Oct 28 02:40:34 2025 +0300

    [docs] Add Google-style docstrings for dspy/evaluate/metrics.py (stanfordnlp#8954)

    * docs(metrics): add Google-style docstrings for public metrics

    * docs(metrics): address review feedback (concise openings, mkdocs block examples); revert non-doc changes

    * fixes

    ---------

    Co-authored-by: chenmoneygithub <chen.qian@databricks.com>

commit 6c43880
Author: TomuHirata <tomu.hirata@gmail.com>
Date:   Tue Oct 28 07:21:06 2025 +0900

    Cache Ollama to speed up CI (stanfordnlp#8972)

    * Cache Ollama to speed up CI

    * fix permission

commit 462baef
Author: Copilot <198982749+Copilot@users.noreply.github.com>
Date:   Mon Oct 27 11:57:27 2025 -0700

    Fix TypeError when tracking usage with Anthropic models returning Pydantic objects (stanfordnlp#8978)

    * Initial plan

    * Fix TypeError when merging Anthropic CacheCreation objects in usage tracker

    Co-authored-by: TomeHirata <33407409+TomeHirata@users.noreply.github.com>

    * Enhance _flatten_usage_entry to convert Pydantic models on first add

    Co-authored-by: TomeHirata <33407409+TomeHirata@users.noreply.github.com>

    * Fix potential TypeError when both usage entries are None

    Co-authored-by: TomeHirata <33407409+TomeHirata@users.noreply.github.com>

    * simplify

    * small fix

    * lint

    * robust version handling

    ---------

    Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
    Co-authored-by: TomeHirata <33407409+TomeHirata@users.noreply.github.com>
    Co-authored-by: chenmoneygithub <chen.qian@databricks.com>

commit 9b467b5
Author: Noah Ziems <nziems2@nd.edu>
Date:   Mon Oct 27 13:32:07 2025 -0400

    Add Disable Fallback Option in ChatAdapter (stanfordnlp#8984)

commit bf022c7
Author: Lakshya A Agrawal <lakshyaaagrawal@berkeley.edu>
Date:   Sat Oct 25 23:37:42 2025 +0530

    Update gepa[dspy] dependency version to 0.0.18 (stanfordnlp#8969)

    * Update gepa[dspy] dependency version to 0.0.18

    * Update pyproject.toml

    * fix test

    ---------

    Co-authored-by: TomuHirata <tomu.hirata@gmail.com>
hironow added a commit to hironow/dspy that referenced this pull request Oct 30, 2025
commit 31b96af
Author: Dushmanta <dushmanta0511@gmail.com>
Date:   Thu Oct 30 13:52:40 2025 +0530

    fix: broken PyPI downloads badge from pepy.tech in README and docs home page (stanfordnlp#8995)

    * fix: update broken pypi download badge in readme

    * fix: update broken pypi download badge in docs home page
