Fix typo on run_name for 4xv4-128 by michelle-yooh · Pull Request #6 · AI-Hypercomputer/maxtext

michelle-yooh · 2023-04-19T20:51:45Z

No description provided.

* Add dev composer environment
* Upgrade image version to fix vulnerability
* Update the env to the latest one to address vulnerability
* Add helper function to manage the schedule
* Update helper function to variable

- Switch benchmark checkpoint to mimo-v2-flash-fixed-ocdbt (scan_layers=false); scan_layers=true is broken for MIMO_V2_FLASH (missing layer_idx in scan branch)
- Add 2026-04-17 result: commit 1a6b957, ~575 tok/s / 55.7 ms (cold GCS cache)
- Retire stale scan_layers=true reference results (f42416a, 2ae1dc4) to footnote
- Add perf table row #6 for today's run; update next-steps to prioritise the scan_layers=true fix before further sparse-dispatch work

- Benchmark history: add row #7 (scan=true, 71.1 ms / 450 tok/s, +28% vs #6)
- Add 'scan_layers=true Analysis' section explaining the 15.4 ms overhead: root cause is loss of XLA inter-layer weight-prefetch pipelining (lax.while_loop prevents cross-iteration scheduling); ~0.32 ms/layer consistent with memory-bandwidth-bound workload losing prefetch overlap
- Quantify sparse dispatch break-even bar: must recover >22% to beat 55.7 ms dense baseline; rough estimate ~34 ms achievable with ragged_all_to_all
- Update 'Most Impactful Next Fixes' section: scan fix done, ragged_all_to_all is now #1 priority; update HEAD ref to 539cc04
- Re-rank optimisation table: ragged_all_to_all moved to rank 2 (highest unrun), scan_layers=true added as rank 3 (done, prerequisite)
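
The figures quoted in the two commit messages above can be cross-checked with a few lines of arithmetic. The following is only a sketch: the variable names are made up for illustration, the inputs are the rounded values from the messages (so the results are approximate), and the "implied layers" line assumes the overhead is spread evenly across layers, which the messages do not state.

```python
# Rounded figures copied from the commit messages; names are illustrative.
dense_ms = 55.7       # scan_layers=false step latency
scan_ms = 71.1        # scan_layers=true step latency
dense_tok_s = 575.0   # approximate throughput, scan_layers=false
scan_tok_s = 450.0    # throughput, scan_layers=true
per_layer_ms = 0.32   # approximate per-layer overhead

# Absolute and relative cost of scan_layers=true versus the dense baseline.
overhead_ms = scan_ms - dense_ms
slowdown_pct = 100 * overhead_ms / dense_ms

# Fraction of the scan_layers=true latency that must be recovered to tie
# the 55.7 ms baseline (the "break-even bar").
break_even_pct = 100 * overhead_ms / scan_ms

# Assumption: overhead is uniform per layer, so this is an implied count only.
implied_layers = overhead_ms / per_layer_ms

# Throughput times latency gives tokens produced per step.
tokens_per_step_dense = dense_tok_s * dense_ms / 1000
tokens_per_step_scan = scan_tok_s * scan_ms / 1000

print(f"overhead: {overhead_ms:.1f} ms ({slowdown_pct:.0f}% slower)")
print(f"break-even recovery: {break_even_pct:.1f}%")
print(f"implied layers: {implied_layers:.1f}")
print(f"tokens/step: {tokens_per_step_dense:.1f} vs {tokens_per_step_scan:.1f}")
```

Under these inputs the overhead, the relative slowdown, and the break-even bar come out close to the 15.4 ms, +28%, and roughly 22% stated in the messages, and both runs work out to about the same number of tokens per step, which suggests the two rows are comparable.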

Fix typo on run_name for 4xv4-128

1976ddf

rwitten approved these changes Apr 22, 2023

rwitten closed this Jun 6, 2023

shuningjin mentioned this pull request Jul 8, 2025

llama4 scanned checkpoint conversion #1910

Merged

4 tasks

michelle-yooh wants to merge 1 commit into AI-Hypercomputer:main from michelle-yooh:patch-1
