Add provider-specific OPTIONAL fields to usage.jsonl + drop spec-requires framing#5

Merged
riddim-developer-bot[bot] merged 1 commit into
mainfrom
sunny/cost-spec-token-breakdown-and-quota
May 27, 2026

@riddim-developer-bot

Summary

Previously the backfilled usage.jsonl dropped three high-signal cost fields:

  • Codex per-window quota readouts (the most visceral cost-as-it-happens metric)
  • Codex reasoning-vs-visible output split
  • Claude cache-tier split (5m vs 1h ephemeral cache writes)

The cost-telemetry spec allows OPTIONAL field additions without bumping schemaVersion (per its own §6.3), so this PR adds them. The bake path is now lossless for every signal the cost analysis actually uses.

Spec additions (specs/symphony-cost-telemetry-extension/SPEC.md)

§5.2.1 Input-Token Breakdown

  • inputUncachedTokens, inputCachedReadTokens, inputCacheWriteTokens
  • inputCacheWriteEphemeral5mTokens, inputCacheWriteEphemeral1hTokens (Anthropic-only; other writers MUST omit)

§5.2.2 Output-Token Breakdown

  • outputVisibleTokens, outputReasoningTokens (reasoning is Codex/o-series-only)

§5.2.3 Quota Sample

"quota": {
  "planType": "pro",
  "windows": [
    { "label": "primary",   "windowMinutes": 300,   "usedPercent": 64, "resetsAt": 1779863673 },
    { "label": "secondary", "windowMinutes": 10080, "usedPercent": 57, "resetsAt": 1780187884 }
  ]
}

Generic shape — Codex emits primary (5h) and secondary (7d), but any provider with any number of rate-limit windows fits the structure.
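A minimal sketch of what "generic over windows" could look like when rendering this object. The `formatQuota` helper is hypothetical and is not the printer in bin/llm-cost.mjs. It also assumes `resetsAt` is a Unix timestamp in seconds, which the sample values suggest but the text does not state.

```javascript
// Hypothetical helper, not the package's actual quota printer.
// Renders one line per window, whatever labels the provider reports.
function formatQuota(quota) {
  if (!quota || !Array.isArray(quota.windows)) return [];
  return quota.windows.map((w) => {
    const hours = w.windowMinutes / 60;
    // Show whole days for long windows, hours otherwise.
    const span = hours >= 24 ? `${hours / 24}d` : `${hours}h`;
    // resetsAt is optional; assumed here to be epoch seconds.
    const reset =
      w.resetsAt === undefined
        ? ""
        : ` (resets ${new Date(w.resetsAt * 1000).toISOString()})`;
    return `${w.label} [${span}]: ${w.usedPercent}% used${reset}`;
  });
}

const sampleQuota = {
  planType: "pro",
  windows: [
    { label: "primary", windowMinutes: 300, usedPercent: 64, resetsAt: 1779863673 },
    { label: "secondary", windowMinutes: 10080, usedPercent: 57, resetsAt: 1780187884 },
  ],
};

console.log(formatQuota(sampleQuota).join("\n"));
```

Because the loop never names `primary` or `secondary`, a provider reporting one window or five would render without code changes.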

§5.3 Semantics adds SHOULD-sum relations across the breakdown buckets.
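The exact relations are not spelled out here, so the following is only a sketch under stated assumptions: that input totals should equal uncached + cached-read + cache-write, that cache-write should equal the 5m + 1h tiers, and that output totals should equal visible + reasoning. The `checkBreakdownSums` name is hypothetical. Because the relations are SHOULD-level, mismatches are collected as warnings rather than thrown.

```javascript
// ASSUMED relations (not quoted from the spec):
//   inputTokens           = uncached + cachedRead + cacheWrite
//   inputCacheWriteTokens = ephemeral5m + ephemeral1h
//   outputTokens          = visible + reasoning
const ASSUMED_RELATIONS = [
  ["inputTokens", ["inputUncachedTokens", "inputCachedReadTokens", "inputCacheWriteTokens"]],
  ["inputCacheWriteTokens", ["inputCacheWriteEphemeral5mTokens", "inputCacheWriteEphemeral1hTokens"]],
  ["outputTokens", ["outputVisibleTokens", "outputReasoningTokens"]],
];

// Hypothetical checker: returns a list of warning strings, never throws.
function checkBreakdownSums(rec) {
  const warnings = [];
  for (const [totalKey, partKeys] of ASSUMED_RELATIONS) {
    // Breakdown fields are OPTIONAL, so only check a relation
    // when the total and every part are present as numbers.
    const keys = [totalKey, ...partKeys];
    if (!keys.every((k) => typeof rec[k] === "number")) continue;
    const sum = partKeys.reduce((acc, k) => acc + rec[k], 0);
    if (sum !== rec[totalKey]) {
      warnings.push(`${totalKey}=${rec[totalKey]} but parts sum to ${sum}`);
    }
  }
  return warnings;
}

// Illustrative record; the outputTokens total is made up for the example.
const illustrative = { outputTokens: 195, outputVisibleTokens: 135, outputReasoningTokens: 60 };
console.log(checkBreakdownSums(illustrative)); // no warnings for this record
```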

Implementation

  • transcripts/codex.mjs emits per-turn quota samples in the spec's windows shape directly (no lossy reshape).
  • transcript-to-usage.mjs emits every breakdown field the transcript carries plus the quota object.
  • usage-aggregator.mjs prefers the breakdown fields when present; falls back to the REQUIRED inputTokens/outputTokens totals when not.
  • bin/llm-cost.mjs quota printer is now generic over windows (renders every label the provider reports, not hard-coded to primary/secondary).
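The prefer-breakdown-then-fall-back behaviour described for the aggregator could be sketched roughly as below. This is an illustration only: `summarizeRecord` and `pickParts` are hypothetical names, and the real usage-aggregator.mjs may structure this differently.

```javascript
const INPUT_PARTS = ["inputUncachedTokens", "inputCachedReadTokens", "inputCacheWriteTokens"];
const OUTPUT_PARTS = ["outputVisibleTokens", "outputReasoningTokens"];

// Returns the breakdown fields that are present, or null if none are.
function pickParts(rec, parts) {
  const present = parts.filter((k) => typeof rec[k] === "number");
  if (present.length === 0) return null;
  return Object.fromEntries(present.map((k) => [k, rec[k]]));
}

// Hypothetical per-record summary: use the OPTIONAL breakdown when a
// writer supplied it, otherwise fall back to the REQUIRED totals.
function summarizeRecord(rec) {
  return {
    input: pickParts(rec, INPUT_PARTS) ?? { inputTokens: rec.inputTokens },
    output: pickParts(rec, OUTPUT_PARTS) ?? { outputTokens: rec.outputTokens },
  };
}

// A record with an input breakdown but only an output total (values are illustrative).
console.log(
  summarizeRecord({
    inputTokens: 10,
    outputTokens: 5,
    inputUncachedTokens: 7,
    inputCachedReadTokens: 3,
  })
);
```

Deciding per side (input vs. output) rather than per record means a writer that reports only one of the two breakdowns still contributes its finer-grained data.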

Framing fixes folded in

The broken "Symphony Telemetry Extension Spec" framing for the workspace convention was still on main. The per-issue-workspace requirement is in OpenAI Symphony's parent SPEC.md §4.1.4; there is no extension spec for it. This PR re-applies the correction that was lost when its prior worktree was torn down:

  • DEFAULT_CWD_PATTERN broadened to match both spec-default <system-temp>/symphony_workspaces/<ID> and the common in-repo <repo>/.symphony/workspaces/<ID>.
  • Issue-ID character class widened from [A-Z]+-\d+ (Linear-specific) to [A-Za-z0-9._-]+ (matches the spec's workspace_key sanitization rule).
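As a rough illustration of the two changes above, a pattern along these lines would accept both workspace layouts and the wider ID character class. `CWD_PATTERN_SKETCH` and `issueIdFromCwd` are hypothetical names, the actual DEFAULT_CWD_PATTERN in the package may differ, and the sketch assumes POSIX-style `/` separators.

```javascript
// Hypothetical pattern, not the package's DEFAULT_CWD_PATTERN.
// Matches .../symphony_workspaces/<ID> and .../.symphony/workspaces/<ID>,
// where <ID> uses the widened class [A-Za-z0-9._-]+.
const CWD_PATTERN_SKETCH =
  /(?:^|\/)(?:symphony_workspaces|\.symphony\/workspaces)\/([A-Za-z0-9._-]+)(?:\/|$)/;

// Returns the workspace ID embedded in a cwd, or null if there is none.
function issueIdFromCwd(cwd) {
  const m = CWD_PATTERN_SKETCH.exec(cwd);
  return m ? m[1] : null;
}

// Example paths are made up for illustration.
console.log(issueIdFromCwd("/tmp/symphony_workspaces/EPAC-1940"));
console.log(issueIdFromCwd("/home/me/repo/.symphony/workspaces/gh_123.fix/src"));
console.log(issueIdFromCwd("/home/me/repo/src"));
```

An ID such as `gh_123.fix` would not have matched the old `[A-Z]+-\d+` class, which is the case the widening is meant to cover.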

README prose changes (root README + package README) — per the user's explicit guidance:

  • No prose claims that llm-cost-attribution REQUIRES any extension spec.
  • The usage.jsonl bake feature is presented as built-in; spec interop with other tools is mentioned only as an optional side-benefit.
  • The original Symphony spec (https://github.com/openai/symphony/blob/main/SPEC.md) is cited correctly for the per-issue-workspace convention it actually requires.

End-to-end verification on real data

Re-backfilled the full 4,309 sessions / 5 GB of transcripts on this machine:

| | Before this PR | After this PR |
|---|---|---|
| usage.jsonl size | 83 MB | 125 MB (still 40× smaller than the 5 GB source) |
| llm-cost EPAC-1940 --from-usage output matches transcript-source? | partial (lost quota / cache split / reasoning) | identical |
| Query time | 0.3s | 0.3s |

Sample backfilled Codex record now includes:

{
  ...
  "inputUncachedTokens": 19975,
  "inputCachedReadTokens": 3456,
  "outputVisibleTokens": 135,
  "outputReasoningTokens": 60,
  "quota": {
    "planType": "pro",
    "windows": [
      { "label": "primary",   "windowMinutes": 300,   "usedPercent": 6, "resetsAt": 1778021087 },
      { "label": "secondary", "windowMinutes": 10080, "usedPercent": 7, "resetsAt": 1778548110 }
    ]
  }
}

Sample backfilled Claude record with cache writes:

{
  ...
  "inputUncachedTokens": 3,
  "inputCachedReadTokens": 0,
  "inputCacheWriteTokens": 45728,
  "inputCacheWriteEphemeral5mTokens": 45728
}

Test plan

  • 33 of 33 tests pass (was 32) via node --test. Added: Symphony-spec-default cwd test, breakdown field tests, quota round-trip test. Updated: existing aggregator quota fixtures to the new spec shape.
  • node --check clean on every .mjs (including the new src/quota.mjs)
  • Backfilled 190,481 records and validated every one passes validateUsageRecord
  • llm-cost EPAC-1940 (transcripts) and llm-cost EPAC-1940 --from-usage <file> produce identical token totals, turn counts, model lists, wall-clock spans, AND quota readouts

Per-issue cost attribution loses too much signal if the bake-to-
usage.jsonl path drops Codex quota readouts, Codex reasoning-output,
and Claude cache-tier splits. The Symphony Coding-Agent Cost
Telemetry Extension spec allows OPTIONAL field additions without
bumping schemaVersion (per its own §6.3), so this PR adds them.

Spec additions (specs/symphony-cost-telemetry-extension/SPEC.md):

  §5.2.1 Input-Token Breakdown
    inputUncachedTokens, inputCachedReadTokens, inputCacheWriteTokens,
    inputCacheWriteEphemeral5mTokens, inputCacheWriteEphemeral1hTokens
    (the last two are Anthropic-only; non-Anthropic writers MUST omit)

  §5.2.2 Output-Token Breakdown
    outputVisibleTokens, outputReasoningTokens
    (reasoning is Codex/o-series-only; other providers MUST omit)

  §5.2.3 Quota Sample
    quota: { planType, windows: [{label, windowMinutes, usedPercent, resetsAt?}] }
    Generic shape — Codex uses `primary` (5h) and `secondary` (7d)
    labels but any provider with any number of windows fits the shape.

  §5.3 Semantics
    SHOULD relations for the breakdown sums.

Package implementation:

  - transcripts/codex.mjs now emits per-turn quota samples in the
    spec shape directly (no lossy reshape later).
  - transcript-to-usage.mjs emits every breakdown field the
    transcript carries plus the quota object.
  - usage-aggregator.mjs prefers breakdown fields when present and
    falls back to inputTokens/outputTokens totals when not.
  - bin/llm-cost.mjs's quota printer is now generic over windows
    (renders every label the provider reports, not hard-coded to
    primary/secondary).

Other fixes folded in (the broken "Symphony Telemetry Extension
Spec" framing for the workspace convention was still on main — the
per-issue-workspace requirement is in OpenAI Symphony's parent
SPEC.md §4.1.4, not an extension):

  - DEFAULT_CWD_PATTERN broadened to match both the spec-default
    `<system-temp>/symphony_workspaces/<ID>` and the common in-repo
    `<repo>/.symphony/workspaces/<ID>` workspace.root settings.
  - Issue-ID character class widened from [A-Z]+-\d+ (Linear-only)
    to [A-Za-z0-9._-]+ (matches the spec's workspace_key
    sanitization rule).
  - README sections rewritten: usage.jsonl bake is presented as a
    built-in feature of the package; spec interop is mentioned only
    as an optional side-benefit. No prose claims that any extension
    spec is required to use llm-cost.

End-to-end verified on real data (4,309 sessions / 5 GB transcripts):

  - Backfill: 190,481 spec-compliant records in a 125 MB file (was
    83 MB before the optional-field additions — the extra ~50% is
    the breakdown + quota payload). Still 40x smaller than the 5 GB
    source.
  - Read-back: `llm-cost EPAC-1940 --from-usage <backfilled-file>`
    now produces IDENTICAL output to the transcript-source path,
    including the Codex 5h/7d quota readout (58 -> 64% / 56 -> 57%),
    the cache-read 51M-token split, and the reasoning-output 18,649
    split.
  - 0.3s query time vs ~3min for transcript scan, unchanged.

33 tests pass (was 32) — adds Symphony-spec-default cwd test,
adds tests for each new breakdown/quota OPTIONAL field, updates
existing aggregator quota fixtures to the new spec shape.
@riddim-developer-bot riddim-developer-bot Bot enabled auto-merge (squash) May 27, 2026 20:52
@riddim-developer-bot riddim-developer-bot Bot merged commit d940e83 into main May 27, 2026
2 checks passed
@riddim-developer-bot riddim-developer-bot Bot deleted the sunny/cost-spec-token-breakdown-and-quota branch May 27, 2026 20:52