
Fix shape type mismatch in MaxText->HF ckpt conversion#2935

Merged
copybara-service[bot] merged 1 commit into main from hengtaoguo-conversion on Jan 13, 2026

Conversation

@hengtaoguo (Collaborator) commented on Jan 13, 2026

Description

We are now seeing errors when converting checkpoints from MaxText to Hugging Face, caused by comparing a list-typed expected shape against a tuple-typed array shape. This PR fixes the issue by converting the list to a tuple before the comparison.

  File "/home/hengtaoguo_google_com/projects/maxtext/src/MaxText/utils/ckpt_conversion/to_huggingface.py", line 211, in main
    processed_params = process_maxtext_param(key, weight, param_map, hook_fn_map, shape_map, config)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/hengtaoguo_google_com/projects/maxtext/src/MaxText/utils/ckpt_conversion/utils/utils.py", line 264, in process_maxtext_param
    _process(hf_path, maxtext_param_weight, output_weights, current_hook_fns, hf_shape_map)
  File "/home/hengtaoguo_google_com/projects/maxtext/src/MaxText/utils/ckpt_conversion/utils/utils.py", line 203, in _process
    raise ValueError(f"Shape mismatch for {hf_path}: Expect {target_hf_shape}, got {numpy_slice.shape}")
ValueError: Shape mismatch for model.embed_tokens.weight: Expect [128256, 4096], got (128256, 4096)
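The root cause is plain Python semantics: a list never compares equal to a tuple, even with identical elements, and a NumPy array's `.shape` is always a tuple. A minimal sketch of the bug and the fix, where `check_shape` is a hypothetical stand-in for the check in `_process`, not the actual MaxText code, and a small array stands in for the real `model.embed_tokens.weight`:

```python
import numpy as np

def check_shape(hf_path, target_hf_shape, numpy_slice):
  """Hypothetical stand-in for the shape check in the ckpt_conversion utils.

  numpy_slice.shape is always a tuple, so if target_hf_shape is stored as a
  list (e.g. loaded from a JSON/YAML shape map), a direct `==` comparison is
  False even when every dimension matches. Casting to tuple fixes it.
  """
  if tuple(target_hf_shape) != numpy_slice.shape:
    raise ValueError(
        f"Shape mismatch for {hf_path}: "
        f"expect {tuple(target_hf_shape)}, got {numpy_slice.shape}")

# Tiny stand-in for model.embed_tokens.weight (real shape: 128256 x 4096).
weight = np.zeros((8, 4), dtype=np.float32)

print([8, 4] == weight.shape)   # False: a list never equals a tuple
print((8, 4) == weight.shape)   # True: tuple vs tuple compares elementwise
check_shape("model.embed_tokens.weight", [8, 4], weight)  # passes after the cast
```

Before the fix, the equivalent of the first comparison was used directly, so every parameter with a list-typed expected shape raised the `Shape mismatch` error above even though the dimensions were identical.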

Tests

Bidirectional checkpoint conversion of Llama3.1-8b:

python3 -m MaxText.utils.ckpt_conversion.to_maxtext src/MaxText/configs/base.yml \
    model_name=llama3.1-8b \
    run_name=2026-01-13-11-19 \
    base_output_directory=gs://hengtaoguo-maxtext-logs/checkpoints/llama3.1-8b/unscanned \
    hf_access_token=<xxx> \
    scan_layers=false

python3 -m MaxText.utils.ckpt_conversion.to_huggingface src/MaxText/configs/base.yml \
    model_name=llama3.1-8b \
    load_parameters_path=gs://hengtaoguo-maxtext-logs/checkpoints/llama3.1-8b/unscanned/0/items \
    base_output_directory=/home/hengtaoguo_google_com/projects/hf_safetensor/llama31-8b \
    scan_layers=false \
    hf_access_token=<xxx> \
    weight_dtype=bfloat16

Logs:

I0113 20:36:23.102052 125316232400000 utils.py:497]    Saved model.safetensors.index.json to /home/hengtaoguo_google_com/projects/hf_safetensor/llama31-8b/model.safetensors.index.json
I0113 20:36:23.102197 125316232400000 utils.py:640] ✅ Model and tokenizer (if provided) successfully processed for /home/hengtaoguo_google_com/projects/hf_safetensor/llama31-8b
I0113 20:36:23.102242 125316232400000 to_huggingface.py:231] ✅ MaxText model successfully saved in HuggingFace format at /home/hengtaoguo_google_com/projects/hf_safetensor/llama31-8b
I0113 20:36:23.102272 125316232400000 to_huggingface.py:232] Elapse for save: 5.52 min
I0113 20:36:23.102293 125316232400000 to_huggingface.py:233] Overall Elapse: 6.87 min

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have run end-to-end tests and provided workload links above if applicable.
  • I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.


codecov Bot commented Jan 13, 2026

Codecov Report

❌ Patch coverage is 0% with 1 line in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/MaxText/utils/ckpt_conversion/utils/utils.py 0.00% 1 Missing ⚠️


@hengtaoguo hengtaoguo changed the title Fix shape type mismatch Fix shape type mismatch in MaxText->HF ckpt conversion Jan 13, 2026
@shuningjin (Collaborator) left a comment


Thank you for fixing this!

@copybara-service copybara-service Bot merged commit 9f4ce1c into main Jan 13, 2026
35 of 36 checks passed
@copybara-service copybara-service Bot deleted the hengtaoguo-conversion branch January 13, 2026 21:57
SurbhiJainUSC pushed a commit that referenced this pull request Jan 14, 2026