feat(llm-gateway): emit $ai_input_cost_usd and $ai_output_cost_usd #60660
Merged
VojtechBartos merged 5 commits on May 29, 2026
The llm-gateway only emits $ai_total_cost_usd from LiteLLM's response_cost. Ingestion's cost calculator (nodejs/src/ingestion/ai/costs/index.ts::processCost) then recomputes $ai_input_cost_usd and $ai_output_cost_usd from its bundled model catalog because the passthrough short-circuit requires both per-side cost properties to be present. The recomputation misprices cache-read tokens versus LiteLLM's cost_breakdown, inflating $ai_input_cost_usd + $ai_output_cost_usd well above $ai_total_cost_usd on cache-heavy traffic.

Emit $ai_input_cost_usd and $ai_output_cost_usd from LiteLLM's cost_breakdown so the per-side and total properties stay in agreement. Fold cache read/creation components into the input total to match the PostHog property semantics. Once both properties are present, processCost flips to passthrough and preserves the gateway-supplied numbers.

Billing already reads $ai_total_cost_usd, so this changes the user-visible per-side breakdown without altering billed amounts.

Generated-By: PostHog Code
Task-Id: ec6de343-7e98-492f-87af-ea2d88c14ff7
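The passthrough rule described in this commit message can be sketched in Python. This is not the TypeScript in processCost, only a minimal rendering of the rule as stated here; the helper name `uses_passthrough`, the treatment of `None` as "absent", and the numbers are assumptions for illustration.

```python
from typing import Any, Mapping

INPUT_COST = "$ai_input_cost_usd"
OUTPUT_COST = "$ai_output_cost_usd"


def uses_passthrough(properties: Mapping[str, Any]) -> bool:
    """Return True only when both per-side costs are already on the event.

    Hypothetical helper: treating None as "absent" is an assumption.
    """
    return (
        properties.get(INPUT_COST) is not None
        and properties.get(OUTPUT_COST) is not None
    )


# Illustrative event carrying only the total: per-side costs get recomputed.
total_only = {"$ai_total_cost_usd": 4.1e-05}

# Illustrative event carrying both per-side costs: supplied numbers are kept.
with_sides = {
    "$ai_input_cost_usd": 1.6e-05,
    "$ai_output_cost_usd": 2.5e-05,
    "$ai_total_cost_usd": 4.1e-05,
}

print(uses_passthrough(total_only), uses_passthrough(with_sides))
```

An event with only one of the two per-side properties would still fall back to recomputation under this rule, which is why the gateway has to emit both.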
Initial revision summed cache_read_cost and cache_creation_cost into $ai_input_cost_usd to match the ingestion calculator's gross-input semantic. That conflicts with the convention the ai-gateway emitter established (each component emitted as its own disjoint property: $ai_input_cost_usd, $ai_output_cost_usd, $ai_cache_read_cost_usd, $ai_cache_creation_cost_usd, summing to $ai_total_cost_usd) and with LiteLLM's own semantic where input_cost is non-cached input only.

Emit each cost_breakdown component to its own PostHog property so the data is internally consistent (sum of components reconciles to total) and matches what the ai-gateway already emits. The ingestion passthrough short-circuit only requires the input and output cost to be present, so cache components remain optional bonus context.

Generated-By: PostHog Code
Task-Id: ec6de343-7e98-492f-87af-ea2d88c14ff7
mypy narrowed the outer `value` name from the posthog_properties loop, so reusing it for cost_breakdown values failed with an assignment incompatibility. Rename to `cost_value`.

Also collapse the rationale comment down to one short line; the reasoning lives in the PR description.

Generated-By: PostHog Code
Task-Id: ec6de343-7e98-492f-87af-ea2d88c14ff7
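A minimal sketch of the kind of name clash this commit describes. The dictionaries and their element types below are assumptions made up for the example, not the gateway's actual data; the point is only that mypy fixes a loop variable's type at its first binding within a scope.

```python
# Assumed shapes for illustration: string-valued custom properties and
# float-valued cost components.
posthog_properties: dict[str, str] = {"team": "growth"}
cost_breakdown: dict[str, float] = {"input_cost": 1.6e-05, "output_cost": 2.5e-05}

event: dict[str, object] = {}

for key, value in posthog_properties.items():  # mypy infers `value: str` here
    event[key] = value

# Reusing `value` as the loop variable below would make mypy report an
# incompatible assignment (float assigned to a str variable), even though the
# code runs fine. A distinct name sidesteps the clash.
for name, cost_value in cost_breakdown.items():
    event[f"$ai_{name}_usd"] = cost_value

print(event)
```

The rename changes nothing at runtime; it only gives the type checker two separately typed variables.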
joshsny approved these changes on May 29, 2026
Greptile (P1) caught that only $ai_input_cost_usd / $ai_output_cost_usd / $ai_total_cost_usd are materialized as columns in posthog/models/ai_events/sql.py. The previously emitted $ai_cache_read_cost_usd and $ai_cache_creation_cost_usd would have been invisible to analytics, breaking the input + output ≈ total invariant on cache-heavy traffic.

Sum the cache_read_cost and cache_creation_cost components from LiteLLM's cost_breakdown into $ai_input_cost_usd so the gross input cost is captured in the materialized column. Output stays straight from cost_breakdown.output_cost.

Also collapse the two cost-breakdown tests into a single parametrized test per Greptile (P2).

Generated-By: PostHog Code
Task-Id: ec6de343-7e98-492f-87af-ea2d88c14ff7
Reverts the fold-cache-into-input change. Each cost_breakdown component maps 1:1 to its own PostHog property: $ai_input_cost_usd, $ai_output_cost_usd, $ai_cache_read_cost_usd, $ai_cache_creation_cost_usd. Parametrized test kept; assertions updated to check each component.

Generated-By: PostHog Code
Task-Id: ec6de343-7e98-492f-87af-ea2d88c14ff7
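The 1:1 mapping described in this commit might look roughly like the sketch below. It is not the code from PostHogCallback._on_success: the helper `cost_properties`, the lookup table, the assumption that cost_breakdown behaves like a mapping, and the numbers are all hypothetical. Only the component and property names come from the commit messages.

```python
import math
from typing import Mapping, Optional

# Hypothetical lookup table: cost_breakdown component -> PostHog property.
COMPONENT_TO_PROPERTY = {
    "input_cost": "$ai_input_cost_usd",
    "output_cost": "$ai_output_cost_usd",
    "cache_read_cost": "$ai_cache_read_cost_usd",
    "cache_creation_cost": "$ai_cache_creation_cost_usd",
}


def cost_properties(
    cost_breakdown: Optional[Mapping[str, Optional[float]]],
) -> dict[str, float]:
    """Map each component that is present to its own property; skip the rest."""
    if not cost_breakdown:
        return {}
    properties: dict[str, float] = {}
    for component, property_name in COMPONENT_TO_PROPERTY.items():
        cost_value = cost_breakdown.get(component)
        if cost_value is not None:  # absent components are omitted, not zeroed
            properties[property_name] = cost_value
    return properties


# Illustrative numbers only, not taken from a real response.
breakdown = {
    "input_cost": 1.0e-05,
    "output_cost": 2.5e-05,
    "cache_read_cost": 2.0e-06,
    "cache_creation_cost": 4.0e-06,
}
props = cost_properties(breakdown)
total = 4.1e-05

# The disjoint components should add up to the total, within float tolerance.
reconciles = math.isclose(sum(props.values()), total, rel_tol=1e-9)
print(props, reconciles)
```

Keeping the components disjoint means the total can always be rebuilt by summing whichever properties are present, and a breakdown without cache components simply yields the two per-side properties.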
Problem
The llm-gateway captures `$ai_generation` events with only `$ai_total_cost_usd` set (sourced from LiteLLM's `response_cost`). Ingestion's cost calculator in `nodejs/src/ingestion/ai/costs/index.ts::processCost` short-circuits to the passthrough path only when both `$ai_input_cost_usd` and `$ai_output_cost_usd` are present on the event. Since the gateway sets neither, ingestion falls into the model-lookup path and rederives the per-side costs from its bundled OpenRouter catalog.

That recomputation misprices cache-read tokens versus LiteLLM's `cost_breakdown`, observed in production as `$ai_input_cost_usd + $ai_output_cost_usd` running ~4x higher than `$ai_total_cost_usd` on cache-heavy Anthropic traffic. Billing already reads `$ai_total_cost_usd`, so the billed amount is correct, but per-side cost breakdowns shown to customers (and to ourselves in usage exploration) are inflated.

Changes

- Read `cost_breakdown` from LiteLLM's `standard_logging_object` in `PostHogCallback._on_success`.
- Emit `$ai_input_cost_usd` as the sum of `input_cost + cache_read_cost + cache_creation_cost` so the property keeps PostHog's gross-input semantics (matches what the ingestion calculator would have produced from raw tokens).
- Emit `$ai_output_cost_usd` from `cost_breakdown.output_cost`.

Once both properties are present, `processCost` flips to passthrough (`$ai_cost_model_source = passthrough`) and preserves the gateway-supplied numbers verbatim. No ingestion-side change required.

How did you test this code?

Agent-authored; I am Claude (Opus 4.7).

Automated tests (`uv run pytest tests/callbacks/test_posthog.py`):

- `test_on_success_emits_cost_breakdown_components_separately`: a full `cost_breakdown` maps 1:1 to per-side and per-cache PostHog properties; the sum reconciles to `$ai_total_cost_usd`.
- `test_on_success_omits_cache_costs_when_breakdown_lacks_them`: breakdowns without cache components don't emit zeroed cache properties.
- `test_on_success_omits_cost_breakdown_when_litellm_omits_it`: providers that don't populate `cost_breakdown` continue to emit only `$ai_total_cost_usd`.

Manual verification against a local llm-gateway built from this branch: a single Anthropic `claude-haiku-4-5` request via `/v1/chat/completions`. The captured `$ai_generation` event carried `$ai_input_cost_usd = 1.6e-05`, `$ai_output_cost_usd = 2.5e-05`, `$ai_total_cost_usd = 4.1e-05`, so the per-side numbers reconcile to the total. Cache cost properties were correctly omitted on this run since LiteLLM's `cost_breakdown` didn't include cache components.

Automatic notifications
Docs update
No docs change; this is an internal data-correctness fix.
🤖 Agent context
Authored by PostHog Code in a Slack-triggered investigation. Josh observed in production that `$ai_input_cost_usd + $ai_output_cost_usd` ran ~4x higher than `$ai_total_cost_usd` on cache-heavy traffic. Root cause traced through three layers: (1) the production llm-gateway only emits `$ai_total_cost_usd`, (2) the ingestion-side `processCost` passthrough short-circuit requires both per-side properties to be present, and (3) the recomputation path doesn't match LiteLLM's cache-aware cost split. Fix applied at the gateway: the smallest change that lets the existing passthrough path do its job.

Considered but rejected: adding an ingestion-side skip flag (e.g. `$ai_cache_reporting_exclusive`-style). That property already exists with a different meaning and would conflate two unrelated concerns. The passthrough mechanism is already there; the gateway just wasn't feeding it.

Created with PostHog Code