Cache already downloaded HuggingFace shards. by niting · Pull Request #3972 · AI-Hypercomputer/maxtext

niting · 2026-05-22T17:37:23Z

Description

Currently, shards appear to be re-downloaded every time they are needed, which slows down conversion. Running the conversion script with these changes shows a significant improvement.
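For readers unfamiliar with the idea, here is a minimal sketch of the caching pattern described above. It is not the code in this PR: `ShardCache`, `fake_download`, and the shard filenames are hypothetical names used only for illustration, and the real download call is replaced by a stand-in so the example runs offline.

```python
from typing import Callable, Dict


class ShardCache:
  """Memoizes shard downloads so each shard file is fetched at most once."""

  def __init__(self, download_fn: Callable[[str], str]):
    # download_fn maps a shard filename to a local path (hypothetical hook).
    self._download_fn = download_fn
    self._local_paths: Dict[str, str] = {}
    self.download_count = 0

  def get(self, shard_name: str) -> str:
    """Returns the local path for shard_name, downloading only on first use."""
    if shard_name not in self._local_paths:
      self._local_paths[shard_name] = self._download_fn(shard_name)
      self.download_count += 1
    return self._local_paths[shard_name]


def fake_download(shard_name: str) -> str:
  # Stand-in for a real network fetch; returns a made-up local path.
  return f"/tmp/shards/{shard_name}"


cache = ShardCache(fake_download)
# Two tensors stored in the same shard should trigger only one download.
requested = [
    "model-00001.safetensors",
    "model-00001.safetensors",
    "model-00002.safetensors",
]
for name in requested:
  cache.get(name)
print(cache.download_count)
```

With lazy loading, many tensors can resolve to the same shard file, so keying the cache on the shard filename rather than the tensor name is what avoids the repeated fetches in this sketch.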

Benchmark: 2-Layer Qwen3 MoE Checkpoint Conversion (Lazy Loading Enabled)

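| Metric | Baseline (Cached) | Optimized | Speedup |
|----------------------------|------------------|------------------|----------|
| Sharding (Materialization) | 81.6s (1.36 min) | 16.2s (0.27 min) | **5.0x** |
| Overall Elapse | 83.4s (1.39 min) | 17.4s (0.29 min) | **4.8x** |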

Integration Tests (tests/integration/checkpoint_conversion_test.py):

Baseline: 148.73s (2:28)
Optimized: 77.33s (1:17) -> 1.9x speedup overall (includes model download)
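
The speedup figures above are baseline time divided by optimized time. A short check, using only the timings reported in this description (the `speedup` helper and the dictionary labels are illustrative names, not part of the PR):

```python
def speedup(baseline_s: float, optimized_s: float) -> float:
  """Returns how many times faster the optimized run is than the baseline."""
  return baseline_s / optimized_s


# (baseline seconds, optimized seconds), copied from the numbers reported above.
timings = {
    "sharding": (81.6, 16.2),
    "overall": (83.4, 17.4),
    "integration_tests": (148.73, 77.33),
}

for label, (baseline_s, optimized_s) in timings.items():
  print(f"{label}: {speedup(baseline_s, optimized_s):.1f}x")
```

Rounded to one decimal place, the ratios match the 5.0x, 4.8x, and 1.9x quoted above.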

Tests

Ran the conversion script to confirm it works.

Checklist

Before submitting this PR, please make sure (put X in square brackets):

I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
I have necessary comments in my code, particularly in hard-to-understand areas.
I have run end-to-end tests and provided workload links above if applicable.
I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

khatwanimohit

LGTM

niting requested review from NicoGrande, RissyRan, bvandermoon, gagika, gobbleturk, hengtaoguo, jiangjy1982, parambole, richjames0, shralex, shuningjin and suexu1025 as code owners May 22, 2026 17:37

niting mentioned this pull request May 22, 2026

Cache already downloaded HuggingFace shards. #3965

Closed

3 tasks

khatwanimohit approved these changes May 22, 2026


niting assigned niting and unassigned niting May 22, 2026

niting wants to merge 1 commit into AI-Hypercomputer:main from niting:conversion_perf
