feat(inference): add "Measured J per Token" metric (input + output) #393

Merged
arygupt merged 1 commit into master from feat/measured-j-per-total-token
May 27, 2026


@arygupt (Collaborator) commented May 27, 2026

Summary

Adds a third option to the gated Measured Energy dropdown:

  • Measured J per Token — joules / (input_tokens + output_tokens). Workload-shape-fair: doesn't treat the prompt as free.

Existing Measured J per Output Token stays — it's still a useful framing for "what does it cost to generate a token." The new metric answers a different question: "what does it cost to handle a token end-to-end."

Why the distinction matters

For an 8k1k workload (8K input, 1K output), the same system energy is divided by 1,024 tokens (output) versus 9,216 tokens (total), so J/output-token comes out 9x higher than J/total-token despite identical real-world cost. Datacenter operators usually bill per total tokens handled, so J/total-token maps more cleanly to dollars-per-token.

For balanced workloads (1k1k) the ratio drops to 2x.
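
The arithmetic behind these ratios can be checked with a small sketch. The helper names below (`energyPerToken`, `outputToTotalRatio`) are hypothetical and do not come from the repository; the token counts assume 8k = 8,192 and 1k = 1,024.

```typescript
// Hypothetical helpers for illustration only; not part of the codebase.
interface EnergyPerToken {
  joulesPerOutputToken: number;
  joulesPerTotalToken: number;
}

// Divide the same measured energy by the two different denominators.
function energyPerToken(
  joules: number,
  inputTokens: number,
  outputTokens: number,
): EnergyPerToken {
  return {
    joulesPerOutputToken: joules / outputTokens,
    joulesPerTotalToken: joules / (inputTokens + outputTokens),
  };
}

// The ratio J/output-token : J/total-token depends only on workload shape,
// because the energy term cancels.
function outputToTotalRatio(inputTokens: number, outputTokens: number): number {
  return (inputTokens + outputTokens) / outputTokens;
}

const longPromptRatio = outputToTotalRatio(8192, 1024); // 9216 / 1024
const balancedRatio = outputToTotalRatio(1024, 1024); // 2048 / 1024
console.log(longPromptRatio, balancedRatio);
```

Because the energy cancels out of the ratio, the gap between the two metrics reflects the input/output mix of the workload rather than anything about the hardware.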

Companion runner change

semianalysisai/InferenceX@363e49c4 on PR #1558 emits joules_per_total_token in every agg_<run>.json alongside the existing avg_power_w and joules_per_output_token.

Wiring

Same pattern as the existing measured-power fields:

| File | Change |
| --- | --- |
| packages/constants/src/metric-keys.ts | Register joules_per_total_token |
| packages/app/src/lib/benchmark-transform.ts | Pass through (left undefined for legacy rows) |
| packages/app/src/components/inference/types.ts | Extend AggDataEntry, InferenceData, YAxisMetricKey, ChartDefinition |
| packages/app/src/lib/chart-utils.ts | Extend Y_AXIS_METRICS, createChartDataPoint, calculateRoofline/computeAllRooflines yKey union, markRooflinePoints |
| packages/app/src/components/inference/inference-chart-config.json | Add y_measuredJPerTotalToken to both chartTypes (roofline lower_right on interactivity, lower_left on e2e) |
| packages/app/src/components/inference/ui/ChartControls.tsx | Add to the Measured Energy gated group |

Graceful degradation

Same typeof === 'number' gate as the other measured-power fields:

  • Rows ingested before the runner-side change have joules_per_total_token absent → those rows don't show on the J/total chart (correct)
  • Rows from the next sweep onward will have it populated automatically
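
A minimal sketch of how a `typeof === 'number'` gate of this kind behaves. The `AggRow` and `ChartPoint` shapes and the `toChartPoint` function are simplified, hypothetical stand-ins, not the real `AggDataEntry` or `createChartDataPoint`.

```typescript
// Hypothetical, simplified shapes; the real types carry many more fields.
interface AggRow {
  joules_per_output_token?: number;
  joules_per_total_token?: number;
}

interface ChartPoint {
  measuredJPerOutputToken?: number;
  measuredJPerTotalToken?: number;
}

// Copy each measured field only when it is actually a number, so a missing
// value stays undefined instead of being coerced to 0.
function toChartPoint(row: AggRow): ChartPoint {
  const point: ChartPoint = {};
  if (typeof row.joules_per_output_token === 'number') {
    point.measuredJPerOutputToken = row.joules_per_output_token;
  }
  if (typeof row.joules_per_total_token === 'number') {
    point.measuredJPerTotalToken = row.joules_per_total_token;
  }
  return point;
}

// Only points that carry the new metric are eligible for the J/total chart.
function plottableOnTotalChart(points: ChartPoint[]): ChartPoint[] {
  return points.filter((p) => typeof p.measuredJPerTotalToken === 'number');
}

const legacyPoint = toChartPoint({ joules_per_output_token: 0.5 });
const newPoint = toChartPoint({
  joules_per_output_token: 0.9,
  joules_per_total_token: 0.1,
});
console.log(plottableOnTotalChart([legacyPoint, newPoint]).length);
```

Checking `typeof` rather than truthiness means a legitimate reading of `0` would still be plotted, while an absent field is skipped.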

Test plan

  • pnpm typecheck — clean
  • pnpm lint / pnpm fmt — clean (pre-commit hook passes)
  • pnpm test:unit — 1944/1944 passing (+3 new tests covering the new field's presence, independence from J/output-token, graceful absence on legacy rows)
  • After companion runner PR merges + ingest: verify production chart renders the new option with real data

Note on overlay path

Per CLAUDE.md's overlay requirement: the new metric works on the ?unofficialrun= overlay path automatically because transformBenchmarkRows is shared between the official and overlay code paths. Once runner PR #1558 merges and a sweep produces an artifact with the new field, the overlay URL will display the new metric immediately.
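
To illustrate why a shared transform gives the overlay path the new field without extra work, here is a sketch under stated assumptions. The real `transformBenchmarkRows` signature and row shape are not shown in this PR, so `transformRows`, `RawRow`, `Entry`, and both loader functions below are hypothetical.

```typescript
// Hypothetical stand-ins; not the repository's actual types or functions.
interface RawRow {
  metrics: Record<string, number>;
}

interface Entry {
  joules_per_output_token: number | undefined;
  joules_per_total_token: number | undefined;
}

// Single transform: a metric missing from the row simply stays undefined.
function transformRows(rows: RawRow[]): Entry[] {
  return rows.map((row) => ({
    joules_per_output_token: row.metrics['joules_per_output_token'],
    joules_per_total_token: row.metrics['joules_per_total_token'],
  }));
}

// Both code paths delegate to the same transform, so a field added there
// reaches official and overlay data alike.
function loadOfficialRows(rows: RawRow[]): Entry[] {
  return transformRows(rows);
}

function loadOverlayRows(rows: RawRow[]): Entry[] {
  return transformRows(rows);
}

const legacyRows: RawRow[] = [{ metrics: { joules_per_output_token: 0.9 } }];
const newRows: RawRow[] = [
  { metrics: { joules_per_output_token: 0.9, joules_per_total_token: 0.1 } },
];
console.log(loadOfficialRows(legacyRows), loadOverlayRows(newRows));
```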


Note

Low Risk
Additive UI and data plumbing for an optional telemetry field; no auth, billing, or breaking API changes. Main risk is empty charts until runner/ingest populates the new metric.

Overview
Adds Measured J per Token (joules_per_total_token → measuredJPerTotalToken) as a third option in the gated Measured Energy Y-axis group, alongside avg power and J per output token. The value uses total tokens handled (input + output), which better matches billing-style “cost per token” than output-only J/tok on long-prompt workloads.

The change threads the field from METRIC_KEYS through benchmark-transform, inference types, createChartDataPoint (only when typeof joules_per_total_token === 'number'), roofline unions/marking, and both interactivity/e2e entries in inference-chart-config.json. Legacy rows without the field stay off the new chart view; unit tests cover presence, independence from J/output, and graceful omission.

Depends on runner emitting joules_per_total_token in aggregated JSON; until ingest has that field, the UI option exists but most historical points won’t plot.

Reviewed by Cursor Bugbot for commit 42fdf80. Bugbot is set up for automated code reviews on this repo.

…nominator)

Adds a third option to the gated "Measured Energy" dropdown group:

  - Measured J per Token  (J/total-token: system energy / (input + output))

Distinct from the existing "Measured J per Output Token" which divides only
by output tokens (treating the prompt as free). For workload-shape-fair
comparisons — especially with prompt-heavy workloads like 8k/1k where
J/output-token is ~9x higher than J/total-token despite the same energy.

Wires the new field through the same plumbing as the existing measured-
power metrics:

  - packages/constants/src/metric-keys.ts:        register joules_per_total_token
  - packages/app/src/lib/benchmark-transform.ts:  pass through (left undefined for legacy rows)
  - packages/app/src/components/inference/types.ts: extend AggDataEntry, InferenceData, YAxisMetricKey, ChartDefinition
  - packages/app/src/lib/chart-utils.ts:          extend Y_AXIS_METRICS, createChartDataPoint, roofline union, markRooflinePoints
  - packages/app/src/components/inference/inference-chart-config.json: add y_measuredJPerTotalToken to both chartTypes (roofline lower_right / lower_left)
  - packages/app/src/components/inference/ui/ChartControls.tsx: add to the Measured Energy gated group

Companion runner-side change: SemiAnalysisAI/InferenceX@363e49c4 emits
joules_per_total_token in every agg_<run>.json.

Tests: +3 covering the new field (presence, parallel independence from
J/output-token, graceful absence on legacy rows). 1944/1944 vitest pass.

vercel Bot commented May 27, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| inferencemax-app | Ready | Preview, Comment | May 27, 2026 1:03am |



claude Bot commented May 27, 2026

Claude finished @arygupt's task in 1m 46s.


Claude Code Review

  • Read PR diff
  • Review changed files
  • Verify test coverage
  • Check overlay path support
  • Post review summary

🟢 LGTM — no blocking issues found

The new measuredJPerTotalToken metric is wired identically to the existing measuredJPerOutputToken:

  • Metric registration (packages/constants/src/metric-keys.ts): added with a clear comment explaining the denominator.
  • Pass-through (benchmark-transform.ts:57): m.joules_per_total_token lands in AggDataEntry as number | undefined (the metrics: Record<string, number> source gives the right narrowing).
  • Types (types.ts): AggDataEntry, InferenceData, YAxisMetricKey, ChartDefinition all extended consistently.
  • Chart utils (chart-utils.ts): Y_AXIS_METRICS, createChartDataPoint typeof === 'number' gate, calculateRoofline/computeAllRooflines yKey unions, and markRooflinePoints reset + assignment branches all updated. No code path left over.
  • Config (inference-chart-config.json): roofline direction matches measuredJPerOutputToken for both chartTypes (lower_right on interactivity, lower_left on e2e).
  • Controls (ChartControls.tsx:59): added inside the gated Measured Energy group — inherits the existing ↑↑↓↓ gate automatically.
  • Tests (chart-utils.test.ts): three new tests cover presence, independence from J/output (the 8k1k workload sanity check), and graceful absence on legacy rows. The legacy-row test is particularly nice — it asserts the new field is undefined, not 0, which is exactly the property the typeof === 'number' gate is defending.
  • Overlay path: transformBenchmarkRows is shared, so the ?unofficialrun= overlay gets the metric for free once the runner-side field is in the artifact. PR description correctly notes this.

No other code paths in packages/app/src/ branch on measuredAvgPower or measuredJPerOutputToken outside the touched files, so there's no risk of a forgotten switch statement.
Branch: feat/measured-j-per-total-token

@arygupt merged commit e929c3a into master on May 27, 2026
18 checks passed
@arygupt deleted the feat/measured-j-per-total-token branch on May 27, 2026 01:40