Fix typo on run_name for 4xv4-128#6

Closed
michelle-yooh wants to merge 1 commit into AI-Hypercomputer:main from michelle-yooh:patch-1
Conversation

@michelle-yooh
Collaborator

No description provided.

@rwitten rwitten closed this Jun 6, 2023
A9isha pushed a commit that referenced this pull request Apr 11, 2024
* Add dev composer environment

* Upgrade image version to fix vulnerability

* Update the env to the latest one to address vulnerability

* Add helper function to manage the schedule

* Update helper function to variable
geeningwang pushed a commit to geeningwang/maxtext that referenced this pull request Apr 17, 2026
- Switch benchmark checkpoint to mimo-v2-flash-fixed-ocdbt (scan_layers=false);
  scan_layers=true is broken for MIMO_V2_FLASH (missing layer_idx in scan branch)
- Add 2026-04-17 result: commit 1a6b957, ~575 tok/s / 55.7 ms (cold GCS cache)
- Retire stale scan_layers=true reference results (f42416a, 2ae1dc4) to footnote
- Add perf table row #6 for today's run; update next-steps to prioritise
  the scan_layers=true fix before further sparse-dispatch work
geeningwang pushed a commit to geeningwang/maxtext that referenced this pull request Apr 17, 2026
- Benchmark history: add row #7 (scan=true, 71.1 ms / 450 tok/s, +28% vs #6)
- Add 'scan_layers=true Analysis' section explaining the 15.4 ms overhead:
  root cause is loss of XLA inter-layer weight-prefetch pipelining (lax.while_loop
  prevents cross-iteration scheduling); ~0.32 ms/layer consistent with
  memory-bandwidth-bound workload losing prefetch overlap
- Quantify sparse dispatch break-even bar: must recover >22% to beat 55.7 ms
  dense baseline; rough estimate ~34 ms achievable with ragged_all_to_all
- Update 'Most Impactful Next Fixes' section: scan fix done, ragged_all_to_all
  is now #1 priority; update HEAD ref to 539cc04
- Re-rank optimisation table: ragged_all_to_all moved to rank 2 (highest unrun),
  scan_layers=true added as rank 3 (done, prerequisite)
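
The percentages quoted in the two commit messages above can be cross-checked with a few lines of arithmetic. This is a minimal sketch that uses only the quoted latencies; the variable names are illustrative, and the implied layer count is an inference from the ~0.32 ms/layer figure, not a number stated in the commits.

```python
# Latencies quoted in the commit messages (milliseconds per step).
dense_ms = 55.7  # scan_layers=false run
scan_ms = 71.1   # scan_layers=true run

# Extra latency introduced by scan_layers=true.
overhead_ms = scan_ms - dense_ms

# Slowdown relative to the scan_layers=false baseline (quoted as +28%).
slowdown_pct = 100 * overhead_ms / dense_ms

# Share of the 71.1 ms that must be recovered to get back to 55.7 ms
# (quoted as ">22%").
break_even_pct = 100 * overhead_ms / scan_ms

# Assumption: the overhead is spread evenly at ~0.32 ms per layer.
# The resulting layer count is inferred, not taken from the source.
implied_layers = overhead_ms / 0.32

print(f"overhead:   {overhead_ms:.1f} ms")
print(f"slowdown:   {slowdown_pct:.1f} %")
print(f"break-even: {break_even_pct:.1f} %")
print(f"implied layers: {implied_layers:.1f}")
```

Under these assumptions the 15.4 ms overhead, the +28% slowdown, and the 22% break-even bar are mutually consistent.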
2 participants