disagg

Disaggregation & heterogeneity explorer for datacenter LLM inference.

Forked from ~/transformer_math.html (preserved at reference/transformer_math.html). The goal: pick a model, a batch size, and a disaggregation axis (prefill/decode, attention/expert, …), sweep heterogeneous chip splits, and trace the throughput × interactivity × $/token Pareto frontier, which the original tool explicitly listed as "not modeled."
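As a rough illustration of what "tracing the Pareto frontier" means here, the sketch below keeps only the points that no other point beats on all three metrics. The point shape (`throughput`, `interactivity`, `costPerMtok`), the function names, and the sample numbers are assumptions made for this example; the shape actually used in src/engine/disagg.js may differ.

```javascript
// Hypothetical point shape, not taken from src/engine/disagg.js:
//   throughput    - tokens/s, higher is better
//   interactivity - tokens/s per user, higher is better
//   costPerMtok   - $ per million tokens, lower is better

// True when `a` is at least as good as `b` on every metric
// and strictly better on at least one.
function dominates(a, b) {
  const noWorse =
    a.throughput >= b.throughput &&
    a.interactivity >= b.interactivity &&
    a.costPerMtok <= b.costPerMtok;
  const strictlyBetter =
    a.throughput > b.throughput ||
    a.interactivity > b.interactivity ||
    a.costPerMtok < b.costPerMtok;
  return noWorse && strictlyBetter;
}

// O(n^2) filter: keep every point that no other point dominates.
function paretoFrontier(points) {
  return points.filter((p) => !points.some((q) => dominates(q, p)));
}

// Made-up sample points, for illustration only.
const samplePoints = [
  { label: "homo-A", throughput: 1000, interactivity: 40, costPerMtok: 2.0 },
  { label: "hetero-A/B", throughput: 1200, interactivity: 45, costPerMtok: 1.8 },
  { label: "homo-B", throughput: 800, interactivity: 60, costPerMtok: 2.5 },
  { label: "slow-and-pricey", throughput: 700, interactivity: 30, costPerMtok: 3.0 },
];

const frontierLabels = paretoFrontier(samplePoints).map((p) => p.label);
console.log(frontierLabels);
```

With these sample points, "hetero-A/B" beats "homo-A" on all three metrics, while "homo-B" survives because nothing matches its interactivity. The quadratic filter is adequate for sweeps of a few thousand points.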

Layout

src/engine/core.js — analytical engine extracted verbatim (ref 8179–8756): chip catalogue (chipPerfSpecs), model presets (modelPresets), FLOPs/bytes kernel (getStepCost), roofline + capacity + interconnect model (computeChipSummary), parallelism planner (deriveDeploymentMode). Only change vs original: export block at the bottom.
reference/ — the original 11k-line HTML, untouched.
test/anchors.mjs — reproduces published benchmark points to sanity-check the math. Run: npm run test:anchors
audit/AUDIT.md — full first-pass audit (math + benchmark data). Read this first.
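For orientation, a minimal roofline sketch: given the FLOPs and bytes for one step, the step time is taken as the larger of the compute time and the memory time, each derated by a sustained-efficiency factor. This is the textbook form only. The names `stepTimeSeconds`, `peakFlops`, `mfu`, `peakBandwidth`, and `bwEff` and all numbers are assumptions for this example, not the fields or values in chipPerfSpecs, and the real computeChipSummary also models capacity and interconnect.

```javascript
// Textbook roofline with sustained-efficiency derating.
// All field names and numbers are hypothetical, not from src/engine/core.js.
function stepTimeSeconds(cost, chip) {
  // Time if the step were limited only by arithmetic throughput.
  const computeTime = cost.flops / (chip.peakFlops * chip.mfu);
  // Time if the step were limited only by memory bandwidth.
  const memoryTime = cost.bytes / (chip.peakBandwidth * chip.bwEff);
  return {
    computeTime,
    memoryTime,
    stepTime: Math.max(computeTime, memoryTime),
    bound: computeTime >= memoryTime ? "compute" : "memory",
  };
}

// Illustrative chip: 1 PFLOP/s peak at 50% MFU, 2 TB/s peak at 80% efficiency.
const exampleChip = { peakFlops: 1e15, mfu: 0.5, peakBandwidth: 2e12, bwEff: 0.8 };
// Illustrative low-batch decode step: few FLOPs per byte moved.
const exampleStep = { flops: 2e12, bytes: 1.6e11 };

const roofline = stepTimeSeconds(exampleStep, exampleChip);
console.log(roofline.bound, roofline.stepTime.toFixed(3), "s");
```

In this made-up case the memory time (about 0.1 s) far exceeds the compute time (about 0.004 s), so the step is memory-bound, which is the usual situation for low-batch decode.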

Status

✅ Engine forked, audited, catalogue corrected w/ provenance (audit/AUDIT.md).
✅ Sustained-effective convention (per-chip MFU + BW-eff).
✅ MoE low-batch fix (M1) — DeepSeek-V3 B=1 now ~51 tok/s (was ~770).
✅ Two-tier memory (fast SRAM/HBM + cold LPDDR/CXL) — mem_*_cold, weight_tier. Drives d-Matrix.
✅ Disaggregation axes (src/engine/disagg.js) — three, all sharing one point shape + Pareto/UI:
- Prefill / Decode — phase split; KV shipped prefill→decode once per request.
- Attention / Expert (MoE) — per-layer hidden-state transfer between bandwidth-attention and capacity-expert pools (transfer is per-layer, both directions — the make-or-break cost).
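- Speculative decoding — draft/target split; expected tokens per target verification step = (1-α^(K+1))/(1-α), where α is the per-token acceptance rate and K is the number of drafted tokens; ×2–3 speedup.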
✅ UI (index.html) — axis selector, chip-pool pickers w/ provenance tags, throughput × interactivity × $/Mtok Pareto chart (◆ hetero / ○ homo), hetero-vs-homo callout. Self-contained ES modules, no build.
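The speculative-decoding formula in the list above can be checked numerically. The sketch below evaluates (1-α^(K+1))/(1-α), reading α as the per-token acceptance probability and K as the number of drafted tokens per step (the standard interpretation, assumed here). The function name is hypothetical and is not claimed to exist in src/engine/disagg.js.

```javascript
// Expected tokens emitted per target-model verification step,
// assuming each drafted token is accepted independently with probability alpha.
// Hypothetical helper, not the engine's own function.
function expectedTokensPerTargetStep(alpha, K) {
  if (!(alpha >= 0 && alpha <= 1)) {
    throw new RangeError("alpha must be in [0, 1]");
  }
  if (!Number.isInteger(K) || K < 0) {
    throw new RangeError("K must be a non-negative integer");
  }
  // Limit of the geometric sum as alpha -> 1: all K drafts plus one bonus token.
  if (alpha === 1) return K + 1;
  return (1 - Math.pow(alpha, K + 1)) / (1 - alpha);
}

// Illustrative values only: 80% acceptance, 4 drafted tokens per step.
const tokensPerStep = expectedTokensPerTargetStep(0.8, 4);
console.log(tokensPerStep.toFixed(4));
```

The result is an upper bound on the speedup rather than the speedup itself: the end-to-end gain also depends on the draft model's cost per step, which this sketch ignores.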

Run

npm run ui → http://localhost:5173 (needs a server, not file://, for ES-module imports)
npm test → validate (catalogue + closed-form + reality bands) and axis regression
npm run test:disagg → frontier demo

⚠ Confidentiality

The catalogue contains vendor-proprietary data (a d-Matrix slide marked "Proprietary"; Tensordyne deck-derived numbers). Do not make the repo public or serve it from a CDN without first scrubbing those rows.
