
Fix shape type mismatch in MaxText->HF ckpt conversion#2935

Merged
copybara-service[bot] merged 1 commit into main from hengtaoguo-conversion on Jan 13, 2026

Conversation

@hengtaoguo (Collaborator) commented on Jan 13, 2026

Description

We are now seeing errors when converting checkpoints from MaxText to Hugging Face, caused by comparing a list-typed expected shape against a tuple-typed array shape. This PR fixes the issue by converting the list to a tuple before the comparison.

  File "/home/hengtaoguo_google_com/projects/maxtext/src/MaxText/utils/ckpt_conversion/to_huggingface.py", line 211, in main
    processed_params = process_maxtext_param(key, weight, param_map, hook_fn_map, shape_map, config)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/hengtaoguo_google_com/projects/maxtext/src/MaxText/utils/ckpt_conversion/utils/utils.py", line 264, in process_maxtext_param
    _process(hf_path, maxtext_param_weight, output_weights, current_hook_fns, hf_shape_map)
  File "/home/hengtaoguo_google_com/projects/maxtext/src/MaxText/utils/ckpt_conversion/utils/utils.py", line 203, in _process
    raise ValueError(f"Shape mismatch for {hf_path}: Expect {target_hf_shape}, got {numpy_slice.shape}")
ValueError: Shape mismatch for model.embed_tokens.weight: Expect [128256, 4096], got (128256, 4096)
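The root cause is plain Python semantics: a list never compares equal to a tuple, even with identical elements, and a NumPy array's `.shape` is always a tuple. A minimal sketch of the bug and the fix, where `check_shape` is a hypothetical stand-in for the check in `_process`, not the actual MaxText code, and a small array stands in for the real `model.embed_tokens.weight`:

```python
import numpy as np

def check_shape(hf_path, target_hf_shape, numpy_slice):
  """Hypothetical stand-in for the shape check in the ckpt_conversion utils.

  numpy_slice.shape is always a tuple, so if target_hf_shape is stored as a
  list (e.g. loaded from a JSON/YAML shape map), a direct `==` comparison is
  False even when every dimension matches. Casting to tuple fixes it.
  """
  if tuple(target_hf_shape) != numpy_slice.shape:
    raise ValueError(
        f"Shape mismatch for {hf_path}: "
        f"expect {tuple(target_hf_shape)}, got {numpy_slice.shape}")

# Tiny stand-in for model.embed_tokens.weight (real shape: 128256 x 4096).
weight = np.zeros((8, 4), dtype=np.float32)

print([8, 4] == weight.shape)   # False: a list never equals a tuple
print((8, 4) == weight.shape)   # True: tuple vs tuple compares elementwise
check_shape("model.embed_tokens.weight", [8, 4], weight)  # passes after the cast
```

Before the fix, the equivalent of the first comparison was used directly, so every parameter with a list-typed expected shape raised the `Shape mismatch` error above even though the dimensions were identical.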

Tests

Bidirectional checkpoint conversion of Llama3.1-8b:

python3 -m MaxText.utils.ckpt_conversion.to_maxtext src/MaxText/configs/base.yml \
    model_name=llama3.1-8b \
    run_name=2026-01-13-11-19 \
    base_output_directory=gs://hengtaoguo-maxtext-logs/checkpoints/llama3.1-8b/unscanned \
    hf_access_token=<xxx> \
    scan_layers=false

python3 -m MaxText.utils.ckpt_conversion.to_huggingface src/MaxText/configs/base.yml \
    model_name=llama3.1-8b \
    load_parameters_path=gs://hengtaoguo-maxtext-logs/checkpoints/llama3.1-8b/unscanned/0/items \
    base_output_directory=/home/hengtaoguo_google_com/projects/hf_safetensor/llama31-8b \
    scan_layers=false \
    hf_access_token=<xxx> \
    weight_dtype=bfloat16

Logs:

I0113 20:36:23.102052 125316232400000 utils.py:497]    Saved model.safetensors.index.json to /home/hengtaoguo_google_com/projects/hf_safetensor/llama31-8b/model.safetensors.index.json
I0113 20:36:23.102197 125316232400000 utils.py:640] ✅ Model and tokenizer (if provided) successfully processed for /home/hengtaoguo_google_com/projects/hf_safetensor/llama31-8b
I0113 20:36:23.102242 125316232400000 to_huggingface.py:231] ✅ MaxText model successfully saved in HuggingFace format at /home/hengtaoguo_google_com/projects/hf_safetensor/llama31-8b
I0113 20:36:23.102272 125316232400000 to_huggingface.py:232] Elapse for save: 5.52 min
I0113 20:36:23.102293 125316232400000 to_huggingface.py:233] Overall Elapse: 6.87 min

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have run end-to-end tests and provided workload links above if applicable.
  • I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.


codecov Bot commented Jan 13, 2026

Codecov Report

❌ Patch coverage is 0% with 1 line in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/MaxText/utils/ckpt_conversion/utils/utils.py 0.00% 1 Missing ⚠️


@hengtaoguo hengtaoguo changed the title Fix shape type mismatch Fix shape type mismatch in MaxText->HF ckpt conversion Jan 13, 2026
@shuningjin (Collaborator) left a comment


Thank you for fixing this!

@copybara-service copybara-service Bot merged commit 9f4ce1c into main Jan 13, 2026
35 of 36 checks passed
@copybara-service copybara-service Bot deleted the hengtaoguo-conversion branch January 13, 2026 21:57
SurbhiJainUSC pushed a commit that referenced this pull request Jan 14, 2026