
Add LLM cost change analysis to CI #1357

Open
Jwrede wants to merge 8 commits into AgentOps-AI:main from Jwrede:add-tokentoll-cost-analysis

Conversation


@Jwrede Jwrede commented May 3, 2026

Summary

Add tokentoll GitHub Action to analyze LLM API cost changes on pull requests.

When a PR modifies code that makes LLM API calls (model swaps, max_tokens changes, new call sites), tokentoll posts a comment showing the projected cost impact. It stays silent on PRs that don't touch LLM code.

What it detects

tokentoll uses Python AST analysis (zero runtime dependencies) to find calls to:

  • OpenAI (chat.completions.create, responses.create, embeddings.create)
  • Anthropic (messages.create)
  • Google GenAI (generate_content)
  • LiteLLM (completion, embedding)
  • LangChain (ChatOpenAI, ChatAnthropic, init_chat_model)
  • Zhipu AI / GLM (ZhipuAiClient, ZhipuAI)
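
For illustration, here is a minimal sketch of the kind of static detection this relies on, built only on Python's standard-library ast module. It is not tokentoll's actual implementation; the sample source and the printed fields are made up for the example.

```python
# Minimal sketch of AST-based call-site detection -- NOT tokentoll's actual code.
# It finds OpenAI-style chat.completions.create calls and reads the literal
# model / max_tokens keyword arguments, without ever importing the scanned code.
import ast

SAMPLE = '''
client.chat.completions.create(model="gpt-4o-mini", max_tokens=256, messages=[])
'''

def dotted_name(node: ast.AST) -> str:
    """Render an attribute chain such as client.chat.completions.create as a string."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
    return ".".join(reversed(parts))

for node in ast.walk(ast.parse(SAMPLE)):
    if isinstance(node, ast.Call) and dotted_name(node.func).endswith("chat.completions.create"):
        # Keyword arguments of the call; only literal values can be resolved statically.
        kwargs = {kw.arg: kw.value for kw in node.keywords if kw.arg}
        model = kwargs.get("model")
        max_tokens = kwargs.get("max_tokens")
        print(
            f"line {node.lineno}:",
            model.value if isinstance(model, ast.Constant) else "(dynamic model)",
            max_tokens.value if isinstance(max_tokens, ast.Constant) else "(no max_tokens)",
        )
```

Because the analysis only parses source text and never imports or runs it, it works the same whether or not the project's dependencies are installed, which is what makes the zero-runtime-dependency claim possible.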

Current scan of this project

tokentoll Scan Report

File Line SDK Model Est. Cost/Call Monthly
app/e2e/sdk-api/src/agents/basic_agent.py 25 openai gpt-3.5-turbo $0.001539 $1.54
app/e2e/sdk-api/src/agents/basic_agent.py 36 openai gpt-3.5-turbo $0.001539 $1.54
app/e2e/sdk-api/src/agents/basic_agent.py 48 openai gpt-3.5-turbo $0.001538 $1.54
app/e2e/sdk-api/src/agents/basic_agent.py 61 openai gpt-3.5-turbo $0.001540 $1.54
app/e2e/sdk-api/src/agents/basic_agent.py 72 openai gpt-3.5-turbo $0.001540 $1.54
examples/anthropic/agentops-anthropic-understanding-tools.py 104 anthropic claude-3-7-sonnet-20250219 $0.08 $76.50
examples/anthropic/agentops-anthropic-understanding-tools.py 150 anthropic claude-3-7-sonnet-20250219 $0.08 $76.50
examples/anthropic/agentops-anthropic-understanding-tools.py 264 anthropic claude-3-7-sonnet-20250219 $0.08 $76.50
examples/anthropic/agentops-anthropic-understanding-tools.py 345 anthropic claude-3-7-sonnet-20250219 $0.08 $76.50
examples/anthropic/anthropic-example-async.py 49 anthropic claude-3-7-sonnet-20250219 $0.02 $15.82
examples/anthropic/anthropic-example-sync.py 80 anthropic claude-3-7-sonnet-20250219 $0.04 $36.20
examples/google_genai/gemini_example.py 23 google_genai gemini-1.5-flash $0.000001 $0.000600
examples/google_genai/gemini_example.py 28 google_genai gemini-1.5-flash $0.000001 $0.000825
examples/google_genai/gemini_example.py 38 google_genai gemini-1.5-flash $0.000001 $0.000900
examples/langchain/langchain_examples.py 38 langchain gpt-3.5-turbo $0.001786 $1.79
examples/langgraph/langgraph_example.py 50 langchain gpt-4o-mini $0.002533 $2.53
examples/litellm/litellm_example.py 49 litellm gpt-4o-mini $0.002533 $2.53
examples/openai/multi_tool_orchestration.py 58 openai text-embedding-3-small $0.000010 $0.01
examples/openai/multi_tool_orchestration.py 145 openai gpt-4o $0.04 $42.21
examples/openai/multi_tool_orchestration.py 282 openai gpt-4o $0.04 $42.21
examples/openai/multi_tool_orchestration.py 340 openai gpt-4o $0.04 $42.21
examples/openai/multi_tool_orchestration.py 207 openai gpt-4o $0.04 $42.21
examples/openai/multi_tool_orchestration.py 90 openai text-embedding-3-small $0.000010 $0.01
examples/openai/multi_tool_orchestration.py 249 openai gpt-4o $0.04 $42.21
examples/openai/multi_tool_orchestration.py 113 openai gpt-4o (default) $0.04 $42.21
examples/openai/multi_tool_orchestration.py 137 openai text-embedding-3-small $0.000010 $0.01
examples/openai/o3_responses_example.py 116 openai gpt-4o (default) $0.04 $42.21
examples/openai/o3_responses_example.py 163 openai gpt-4o (default) $0.04 $42.21
examples/openai/o3_responses_example.py 260 openai gpt-4o (default) $0.04 $42.21
examples/openai/o3_responses_example.py 307 openai gpt-4o (default) $0.04 $42.21
examples/openai/openai_example_async.py 57 openai gpt-4o-mini $0.002533 $2.53
examples/openai/openai_example_async.py 71 openai gpt-4o-mini $0.002533 $2.53
examples/openai/openai_example_sync.py 45 openai gpt-4o-mini $0.002533 $2.53
examples/openai/openai_example_sync.py 55 openai gpt-4o-mini $0.002533 $2.53
examples/openai/web_search.py 37 openai gpt-4o-mini $0.002458 $2.46
examples/openai/web_search.py 49 openai gpt-4o-mini $0.002458 $2.46
examples/openai/web_search.py 53 openai gpt-4o-mini $0.002460 $2.46
examples/openai/web_search.py 66 openai gpt-4o $0.04 $40.98
examples/openai/web_search.py 82 openai gpt-4o $0.04 $42.21
examples/xai/grok_examples.py 68 openai grok-3-mini $0.04 $41.62
examples/xai/grok_vision_examples.py 56 openai grok-2-vision-1212 $0.04 $40.96
examples/xpander/coding_agent.py 73 openai gpt-4.1 $0.07 $66.54
tests/core_manual_tests/benchmark.py 12 openai gpt-3.5-turbo $0.001543 $1.54
tests/core_manual_tests/canary.py 13 openai gpt-3.5-turbo $0.001786 $1.79
tests/core_manual_tests/multi_session_llm.py 28 openai gpt-3.5-turbo $0.001786 $1.79
tests/core_manual_tests/providers/anthropic_canary.py 12 anthropic claude-3-5-sonnet-20240620 $0.02 $15.37
tests/core_manual_tests/providers/anthropic_canary.py 24 anthropic claude-3-5-sonnet-20240620 $0.02 $15.38
tests/core_manual_tests/providers/anthropic_canary.py 62 anthropic claude-3-5-sonnet-20240620 $0.02 $15.38
tests/core_manual_tests/providers/anthropic_canary.py 45 anthropic claude-3-5-sonnet-20240620 $0.02 $15.38
tests/core_manual_tests/providers/litellm_canary.py 10 litellm gpt-3.5-turbo $0.001540 $1.54
tests/core_manual_tests/providers/litellm_canary.py 12 litellm gpt-3.5-turbo $0.001540 $1.54
tests/core_manual_tests/providers/litellm_canary.py 34 litellm gpt-3.5-turbo $0.001540 $1.54
tests/core_manual_tests/providers/litellm_canary.py 23 litellm gpt-3.5-turbo $0.001540 $1.54
tests/core_manual_tests/providers/openai_canary.py 12 openai gpt-3.5-turbo $0.001537 $1.54
tests/core_manual_tests/providers/openai_canary.py 18 openai gpt-3.5-turbo $0.001538 $1.54
tests/core_manual_tests/providers/openai_canary.py 28 openai gpt-3.5-turbo $0.001538 $1.54
tests/core_manual_tests/providers/openai_canary.py 36 openai gpt-3.5-turbo $0.001538 $1.54
tests/smoke/test_openai.py 11 openai gpt-3.5-turbo $0.001539 $1.54
tests/unit/instrumentation/fixtures/generate_anthropic_fixtures.py 33 anthropic claude-3-opus-20240229 $0.007620 $7.62
tests/unit/instrumentation/fixtures/generate_anthropic_fixtures.py 41 anthropic claude-3-opus-20240229 $0.007725 $7.73
tests/unit/instrumentation/fixtures/generate_anthropic_fixtures.py 50 anthropic claude-3-opus-20240229 $0.007995 $8.00
tests/unit/instrumentation/fixtures/generate_anthropic_fixtures.py 63 anthropic claude-3-opus-20240229 $0.007650 $7.65
tests/unit/instrumentation/fixtures/generate_anthropic_fixtures.py 74 anthropic claude-3-opus-20240229 $0.007650 $7.65

Total estimated monthly cost: $1163.61

Assumptions
  • 1000 calls/month per call site

Generated by tokentoll
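
The monthly figures follow directly from the stated assumption: monthly ≈ estimated cost per call × 1000. A quick sanity check against one of the gpt-4o-mini rows above (values copied from the table):

```python
# Sanity check of one row: monthly projection = estimated cost per call x assumed call volume.
cost_per_call = 0.002533   # gpt-4o-mini row, "Est. Cost/Call" column
calls_per_month = 1000     # assumption stated in the report
print(f"${cost_per_call * calls_per_month:.2f}/month")   # -> $2.53/month
```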

How it works

  • Runs only on PRs, compares base vs head (see the sketch after this list)
  • Posts a sticky comment (updates in place, no duplicates)
  • Only comments when LLM calls are added, removed, or modified
  • First run posts an initial scan (this PR) so you can see the output
  • Zero runtime dependencies, ~1s scan time for most projects
  • PyPI | GitHub
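
To make the comparison concrete, here is a conceptual sketch of the base-vs-head diff that decides whether a comment is posted. The data shapes (call sites keyed by file and line) are assumptions for the example, not tokentoll's real internals.

```python
# Conceptual sketch of the base-vs-head comparison -- not tokentoll's real internals.
# Each scan is assumed to yield {(file, line): (model, est_cost_per_call)}.

def diff_scans(base: dict, head: dict) -> dict:
    added = {k: head[k] for k in head.keys() - base.keys()}
    removed = {k: base[k] for k in base.keys() - head.keys()}
    modified = {k: (base[k], head[k]) for k in base.keys() & head.keys() if base[k] != head[k]}
    return {"added": added, "removed": removed, "modified": modified}

# Hypothetical example: a PR swaps the model used at one existing call site.
base = {("src/agent.py", 25): ("gpt-3.5-turbo", 0.001539)}
head = {("src/agent.py", 25): ("gpt-4o", 0.04221)}

changes = diff_scans(base, head)
if any(changes.values()):
    print("post or update the sticky cost comment:", changes)
else:
    print("no LLM call changes -- stay silent")
```

Keying call sites by file and line means a call that merely moves shows up as a removal plus an addition; a real implementation would likely match sites more robustly.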

Changes

  • Adds .github/workflows/llm-costs.yml (17 lines)
  • No code changes, no new dependencies

Jwrede added 8 commits May 3, 2026 12:29

  • Adds a CI workflow that analyzes LLM API cost changes on pull requests. Posts a comment when model swaps, token limit changes, or new call sites are detected. Stays silent on PRs that don't touch LLM code.
  • Set default model for dynamic call sites so cost estimates are shown instead of N/A. Bump action to v0.4.0 which supports config files.
  • v0.5.0 includes built-in per-SDK default models (Anthropic calls use claude-sonnet, Google calls use gemini-flash, etc.) so a .tokentoll.yml config is no longer needed for basic usage.
  • v0.5.2 fixes false positives where OpenAI-compatible SDKs were misidentified as openai, and skips AzureChatOpenAI calls without an explicit model name. Also pins the pip install version inside the action so the SHA pin is meaningful.
  • v0.6.0 adds a ZhipuDetector (so Zhipu/GLM calls are attributed correctly) and a constructor-call fallback for OpenAI/Anthropic detectors, recovering DI-style detections that the v0.5.2 strict import check had dropped.
  • v0.6.1 reverts the v0.6.0 special-case that silently skipped AzureChatOpenAI(deployment_name=...) calls. Those calls now flow through the standard dynamic-default path like every other unresolved model, restoring consistency with the rest of the SDK detectors. The new skip_dynamic_models config option lets projects opt out of cost estimation for dynamic models project-wide or per path.
