
Add LLM cost change analysis to CI #1357

Open
Jwrede wants to merge 8 commits into AgentOps-AI:main from Jwrede:add-tokentoll-cost-analysis

Conversation


@Jwrede Jwrede commented May 3, 2026

Summary

Add tokentoll GitHub Action to analyze LLM API cost changes on pull requests.

When a PR modifies code that makes LLM API calls (model swaps, max_tokens changes, new call sites), tokentoll posts a comment showing the projected cost impact. It stays silent on PRs that don't touch LLM code.

What it detects

tokentoll uses Python AST analysis (zero runtime dependencies) to find calls to:

  • OpenAI (chat.completions.create, responses.create, embeddings.create)
  • Anthropic (messages.create)
  • Google GenAI (generate_content)
  • LiteLLM (completion, embedding)
  • LangChain (ChatOpenAI, ChatAnthropic, init_chat_model)
  • Zhipu AI / GLM (ZhipuAiClient, ZhipuAI)
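
For illustration, here is a minimal sketch of the kind of static detection this relies on, built only on Python's standard-library ast module. It is not tokentoll's actual implementation; the sample source and the printed fields are made up for the example.

```python
# Minimal sketch of AST-based call-site detection -- NOT tokentoll's actual code.
# It finds OpenAI-style chat.completions.create calls and reads the literal
# model / max_tokens keyword arguments, without ever importing the scanned code.
import ast

SAMPLE = '''
client.chat.completions.create(model="gpt-4o-mini", max_tokens=256, messages=[])
'''

def dotted_name(node: ast.AST) -> str:
    """Render an attribute chain such as client.chat.completions.create as a string."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
    return ".".join(reversed(parts))

for node in ast.walk(ast.parse(SAMPLE)):
    if isinstance(node, ast.Call) and dotted_name(node.func).endswith("chat.completions.create"):
        # Keyword arguments of the call; only literal values can be resolved statically.
        kwargs = {kw.arg: kw.value for kw in node.keywords if kw.arg}
        model = kwargs.get("model")
        max_tokens = kwargs.get("max_tokens")
        print(
            f"line {node.lineno}:",
            model.value if isinstance(model, ast.Constant) else "(dynamic model)",
            max_tokens.value if isinstance(max_tokens, ast.Constant) else "(no max_tokens)",
        )
```

Because the analysis only parses source text and never imports or runs it, it works the same whether or not the project's dependencies are installed, which is what makes the zero-runtime-dependency claim possible.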

Current scan of this project

tokentoll Scan Report

File Line SDK Model Est. Cost/Call Monthly
app/e2e/sdk-api/src/agents/basic_agent.py 25 openai gpt-3.5-turbo $0.001539 $1.54
app/e2e/sdk-api/src/agents/basic_agent.py 36 openai gpt-3.5-turbo $0.001539 $1.54
app/e2e/sdk-api/src/agents/basic_agent.py 48 openai gpt-3.5-turbo $0.001538 $1.54
app/e2e/sdk-api/src/agents/basic_agent.py 61 openai gpt-3.5-turbo $0.001540 $1.54
app/e2e/sdk-api/src/agents/basic_agent.py 72 openai gpt-3.5-turbo $0.001540 $1.54
examples/anthropic/agentops-anthropic-understanding-tools.py 104 anthropic claude-3-7-sonnet-20250219 $0.08 $76.50
examples/anthropic/agentops-anthropic-understanding-tools.py 150 anthropic claude-3-7-sonnet-20250219 $0.08 $76.50
examples/anthropic/agentops-anthropic-understanding-tools.py 264 anthropic claude-3-7-sonnet-20250219 $0.08 $76.50
examples/anthropic/agentops-anthropic-understanding-tools.py 345 anthropic claude-3-7-sonnet-20250219 $0.08 $76.50
examples/anthropic/anthropic-example-async.py 49 anthropic claude-3-7-sonnet-20250219 $0.02 $15.82
examples/anthropic/anthropic-example-sync.py 80 anthropic claude-3-7-sonnet-20250219 $0.04 $36.20
examples/google_genai/gemini_example.py 23 google_genai gemini-1.5-flash $0.000001 $0.000600
examples/google_genai/gemini_example.py 28 google_genai gemini-1.5-flash $0.000001 $0.000825
examples/google_genai/gemini_example.py 38 google_genai gemini-1.5-flash $0.000001 $0.000900
examples/langchain/langchain_examples.py 38 langchain gpt-3.5-turbo $0.001786 $1.79
examples/langgraph/langgraph_example.py 50 langchain gpt-4o-mini $0.002533 $2.53
examples/litellm/litellm_example.py 49 litellm gpt-4o-mini $0.002533 $2.53
examples/openai/multi_tool_orchestration.py 58 openai text-embedding-3-small $0.000010 $0.01
examples/openai/multi_tool_orchestration.py 145 openai gpt-4o $0.04 $42.21
examples/openai/multi_tool_orchestration.py 282 openai gpt-4o $0.04 $42.21
examples/openai/multi_tool_orchestration.py 340 openai gpt-4o $0.04 $42.21
examples/openai/multi_tool_orchestration.py 207 openai gpt-4o $0.04 $42.21
examples/openai/multi_tool_orchestration.py 90 openai text-embedding-3-small $0.000010 $0.01
examples/openai/multi_tool_orchestration.py 249 openai gpt-4o $0.04 $42.21
examples/openai/multi_tool_orchestration.py 113 openai gpt-4o (default) $0.04 $42.21
examples/openai/multi_tool_orchestration.py 137 openai text-embedding-3-small $0.000010 $0.01
examples/openai/o3_responses_example.py 116 openai gpt-4o (default) $0.04 $42.21
examples/openai/o3_responses_example.py 163 openai gpt-4o (default) $0.04 $42.21
examples/openai/o3_responses_example.py 260 openai gpt-4o (default) $0.04 $42.21
examples/openai/o3_responses_example.py 307 openai gpt-4o (default) $0.04 $42.21
examples/openai/openai_example_async.py 57 openai gpt-4o-mini $0.002533 $2.53
examples/openai/openai_example_async.py 71 openai gpt-4o-mini $0.002533 $2.53
examples/openai/openai_example_sync.py 45 openai gpt-4o-mini $0.002533 $2.53
examples/openai/openai_example_sync.py 55 openai gpt-4o-mini $0.002533 $2.53
examples/openai/web_search.py 37 openai gpt-4o-mini $0.002458 $2.46
examples/openai/web_search.py 49 openai gpt-4o-mini $0.002458 $2.46
examples/openai/web_search.py 53 openai gpt-4o-mini $0.002460 $2.46
examples/openai/web_search.py 66 openai gpt-4o $0.04 $40.98
examples/openai/web_search.py 82 openai gpt-4o $0.04 $42.21
examples/xai/grok_examples.py 68 openai grok-3-mini $0.04 $41.62
examples/xai/grok_vision_examples.py 56 openai grok-2-vision-1212 $0.04 $40.96
examples/xpander/coding_agent.py 73 openai gpt-4.1 $0.07 $66.54
tests/core_manual_tests/benchmark.py 12 openai gpt-3.5-turbo $0.001543 $1.54
tests/core_manual_tests/canary.py 13 openai gpt-3.5-turbo $0.001786 $1.79
tests/core_manual_tests/multi_session_llm.py 28 openai gpt-3.5-turbo $0.001786 $1.79
tests/core_manual_tests/providers/anthropic_canary.py 12 anthropic claude-3-5-sonnet-20240620 $0.02 $15.37
tests/core_manual_tests/providers/anthropic_canary.py 24 anthropic claude-3-5-sonnet-20240620 $0.02 $15.38
tests/core_manual_tests/providers/anthropic_canary.py 62 anthropic claude-3-5-sonnet-20240620 $0.02 $15.38
tests/core_manual_tests/providers/anthropic_canary.py 45 anthropic claude-3-5-sonnet-20240620 $0.02 $15.38
tests/core_manual_tests/providers/litellm_canary.py 10 litellm gpt-3.5-turbo $0.001540 $1.54
tests/core_manual_tests/providers/litellm_canary.py 12 litellm gpt-3.5-turbo $0.001540 $1.54
tests/core_manual_tests/providers/litellm_canary.py 34 litellm gpt-3.5-turbo $0.001540 $1.54
tests/core_manual_tests/providers/litellm_canary.py 23 litellm gpt-3.5-turbo $0.001540 $1.54
tests/core_manual_tests/providers/openai_canary.py 12 openai gpt-3.5-turbo $0.001537 $1.54
tests/core_manual_tests/providers/openai_canary.py 18 openai gpt-3.5-turbo $0.001538 $1.54
tests/core_manual_tests/providers/openai_canary.py 28 openai gpt-3.5-turbo $0.001538 $1.54
tests/core_manual_tests/providers/openai_canary.py 36 openai gpt-3.5-turbo $0.001538 $1.54
tests/smoke/test_openai.py 11 openai gpt-3.5-turbo $0.001539 $1.54
tests/unit/instrumentation/fixtures/generate_anthropic_fixtures.py 33 anthropic claude-3-opus-20240229 $0.007620 $7.62
tests/unit/instrumentation/fixtures/generate_anthropic_fixtures.py 41 anthropic claude-3-opus-20240229 $0.007725 $7.73
tests/unit/instrumentation/fixtures/generate_anthropic_fixtures.py 50 anthropic claude-3-opus-20240229 $0.007995 $8.00
tests/unit/instrumentation/fixtures/generate_anthropic_fixtures.py 63 anthropic claude-3-opus-20240229 $0.007650 $7.65
tests/unit/instrumentation/fixtures/generate_anthropic_fixtures.py 74 anthropic claude-3-opus-20240229 $0.007650 $7.65

Total estimated monthly cost: $1163.61

Assumptions
  • 1000 calls/month per call site

Generated by tokentoll
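
The monthly figures follow directly from the stated assumption: monthly ≈ estimated cost per call × 1000. A quick sanity check against one of the gpt-4o-mini rows above (values copied from the table):

```python
# Sanity check of one row: monthly projection = estimated cost per call x assumed call volume.
cost_per_call = 0.002533   # gpt-4o-mini row, "Est. Cost/Call" column
calls_per_month = 1000     # assumption stated in the report
print(f"${cost_per_call * calls_per_month:.2f}/month")   # -> $2.53/month
```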

How it works

  • Runs only on PRs, compares base vs head (see the sketch after this list)
  • Posts a sticky comment (updates in place, no duplicates)
  • Only comments when LLM calls are added, removed, or modified
  • First run posts an initial scan (this PR) so you can see the output
  • Zero runtime dependencies, ~1s scan time for most projects
  • PyPI | GitHub
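
To make the comparison concrete, here is a conceptual sketch of the base-vs-head diff that decides whether a comment is posted. The data shapes (call sites keyed by file and line) are assumptions for the example, not tokentoll's real internals.

```python
# Conceptual sketch of the base-vs-head comparison -- not tokentoll's real internals.
# Each scan is assumed to yield {(file, line): (model, est_cost_per_call)}.

def diff_scans(base: dict, head: dict) -> dict:
    added = {k: head[k] for k in head.keys() - base.keys()}
    removed = {k: base[k] for k in base.keys() - head.keys()}
    modified = {k: (base[k], head[k]) for k in base.keys() & head.keys() if base[k] != head[k]}
    return {"added": added, "removed": removed, "modified": modified}

# Hypothetical example: a PR swaps the model used at one existing call site.
base = {("src/agent.py", 25): ("gpt-3.5-turbo", 0.001539)}
head = {("src/agent.py", 25): ("gpt-4o", 0.04221)}

changes = diff_scans(base, head)
if any(changes.values()):
    print("post or update the sticky cost comment:", changes)
else:
    print("no LLM call changes -- stay silent")
```

Keying call sites by file and line means a call that merely moves shows up as a removal plus an addition; a real implementation would likely match sites more robustly.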

Changes

  • Adds .github/workflows/llm-costs.yml (17 lines)
  • No code changes, no new dependencies

Jwrede added 8 commits May 3, 2026 12:29

  • Adds a CI workflow that analyzes LLM API cost changes on pull requests. Posts a comment when model swaps, token limit changes, or new call sites are detected. Stays silent on PRs that don't touch LLM code.
  • Set default model for dynamic call sites so cost estimates are shown instead of N/A. Bump action to v0.4.0 which supports config files.
  • v0.5.0 includes built-in per-SDK default models (Anthropic calls use claude-sonnet, Google calls use gemini-flash, etc.) so a .tokentoll.yml config is no longer needed for basic usage.
  • v0.5.2 fixes false positives where OpenAI-compatible SDKs were misidentified as openai, and skips AzureChatOpenAI calls without an explicit model name. Also pins the pip install version inside the action so the SHA pin is meaningful.
  • v0.6.0 adds a ZhipuDetector (so Zhipu/GLM calls are attributed correctly) and a constructor-call fallback for OpenAI/Anthropic detectors, recovering DI-style detections that the v0.5.2 strict import check had dropped.
  • v0.6.1 reverts the v0.6.0 special-case that silently skipped AzureChatOpenAI(deployment_name=...) calls. Those calls now flow through the standard dynamic-default path like every other unresolved model, restoring consistency with the rest of the SDK detectors. The new skip_dynamic_models config option lets projects opt out of cost estimation for dynamic models project-wide or per path.
