Improve forward_pass_logit_checker.py to perform mutual conversion check #1839

YixuanWang-99 · 2025-06-16T18:40:38Z

Description

This update to forward_pass_logit_checker.py enables direct comparisons between MaxText and Hugging Face model checkpoints.

Previously, the script could only compare a single checkpoint (either Hugging Face or MaxText) against a set of "golden logits." This was problematic for fine-tuned models, as their outputs often diverge from the original golden logits. Additionally, when converting models between MaxText and Hugging Face formats, it was difficult to verify the conversion's accuracy.

Added a new flag: --run_hf_model (defaults to False).
When --run_hf_model is set to True, forward_pass_logit_checker.py runs both the MaxText and Hugging Face models on the fly and compares their output logits. The comparison covers the logits for the last-token prediction, the top-k predicted tokens and their scores, and the KL divergence between the full logit distributions.
When --run_hf_model is not set (the default), the existing functionality is preserved. All existing shell scripts that rely on the original behavior remain unaffected and run without changes.
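As a rough illustration of the comparison described above, here is a minimal pure-Python sketch of a top-k and KL-divergence check on one token position. The function names, the toy logits, the value of k, and the direction of the KL term are assumptions for illustration only; the actual script may compute these differently.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def kl_divergence(ref_logits, test_logits):
    """KL(P_ref || P_test) for one token position, computed from raw logits."""
    log_p = log_softmax(ref_logits)
    log_q = log_softmax(test_logits)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def top_k(logits, k):
    """Indices of the k largest logits, highest first."""
    return sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]


def logits_match(ref_logits, test_logits, max_kl_div, k=3):
    """True when the top-k token ids agree and the KL divergence is below the threshold."""
    same_top_k = top_k(ref_logits, k) == top_k(test_logits, k)
    return same_top_k and kl_divergence(ref_logits, test_logits) < max_kl_div


# Toy last-token logits over a 5-token vocabulary (made-up numbers).
hf_logits = [2.0, 0.5, -1.0, 3.0, 0.0]
mt_logits = [2.01, 0.49, -1.0, 2.99, 0.0]
print(logits_match(hf_logits, mt_logits, max_kl_div=0.015))
```

Working from log-probabilities keeps the KL term stable when some tokens have very small probability, which is common with full-vocabulary logits.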

This enhancement is crucial for verifying that model conversions accurately preserve predictive behavior.

Tests

Tested on the Gemma-2b model. Example command comparing the MaxText and Hugging Face models:

python3 -m MaxText.tests.forward_pass_logit_checker MaxText/configs/base.yml tokenizer_path=assets/tokenizer.gemma load_parameters_path=gs://maxtext-model-checkpoints/gemma-2b/2025-01-23-19-20/unscanned/checkpoints/0/items run_name=forward_pass_test_gemma2b per_device_batch_size=1 model_name=gemma-2b max_prefill_predict_length=4 max_target_length=4 dataset_type=synthetic scan_layers=false attention=dot_product --max_kl_div=0.015 --run_hf_model=True --hf_model_path=google/gemma-2b
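The command above mixes a config file, key=value overrides, and `--` flags. The sketch below shows one way such a command line could be split with argparse. It is not taken from the script itself: the helper names `str2bool` and `split_args` are hypothetical, and the flag defaults other than `--run_hf_model` are assumed.

```python
import argparse


def str2bool(value):
    """Parse 'True'/'False' strings; plain bool() would treat 'False' as truthy."""
    text = value.strip().lower()
    if text in ("true", "1", "yes"):
        return True
    if text in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def split_args(argv):
    """Separate checker flags from the config file and key=value overrides."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--max_kl_div", type=float, default=None)  # default assumed
    parser.add_argument("--run_hf_model", type=str2bool, default=False)
    parser.add_argument("--hf_model_path", type=str, default="")  # default assumed
    # Unrecognized arguments (config path, key=value pairs) are returned untouched.
    flags, remaining = parser.parse_known_args(argv)
    return flags, remaining


flags, config_args = split_args([
    "MaxText/configs/base.yml",
    "model_name=gemma-2b",
    "scan_layers=false",
    "--max_kl_div=0.015",
    "--run_hf_model=True",
    "--hf_model_path=google/gemma-2b",
])
print(flags.run_hf_model, flags.max_kl_div, config_args)
```

Omitting `--run_hf_model` leaves the flag at False in this sketch, which mirrors the default behavior described in the PR.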

In a successful check between the Hugging Face and MaxText checkpoints, the similarity and KL-divergence checks complete with no errors.

Checklist

Before submitting this PR, please make sure (put X in square brackets):

I have performed a self-review of my code.
I have necessary comments in my code, particularly in hard-to-understand areas.
I have run end-to-end tests and provided workload links above if applicable.
I have made or will make corresponding changes to the doc if needed.

hengtaoguo

Excellent work!

MaxText/tests/mt_hf_mutual_conversion_check.py

MaxText/utils/ckpt_conversion/to_maxtext.py

hengtaoguo · 2025-06-20T18:35:53Z

Hi @gagika! I've heard this might be interesting to you for loading/saving HF checkpoints. Would you like to take a look when you get a chance? Thanks a lot for your time!

shralex

Thanks Yixuan! Added a few comments

MaxText/utils/ckpt_conversion/to_maxtext.py

MaxText/tests/mt_hf_mutual_conversion_check.py

MaxText/utils/ckpt_conversion/examples/convert_gemma2_to_mt.sh

MaxText/utils/ckpt_conversion/to_maxtext.py

MaxText/utils/ckpt_conversion/examples/convert_gemma2_to_mt.sh

shralex

Thanks for addressing the comments! I have one small comment and a question: did you test both directions, to and from HF? If so, can you add both to the testing section of the PR description? It currently includes one example. Thanks!

MaxText/utils/ckpt_conversion/examples/convert_gemma2_to_mt.sh

MaxText/tests/mt_hf_mutual_conversion_check.py

YixuanWang-99 · 2025-06-24T17:49:31Z

> Thanks for addressing the comments! I have one small comment and a question: did you test both directions, to and from HF? If so, can you add both to the testing section of the PR description? It currently includes one example. Thanks!

The from-HF conversion, with examples, was pushed in previous PRs #1785 and #1821. I have also revised the run name.

shralex · 2025-06-28T18:42:21Z

@YixuanWang-99 thank you for consolidating these files. Before merging this, let's make sure that the end-to-end tests using the forward logits checker still work. Can you please run a couple of these tests?
@khatwanimohit I believe you're familiar with the forward logits checker. Could you please also review?

hengtaoguo · 2025-06-30T18:30:20Z

> @YixuanWang-99 thank you for consolidating these files. Before merging this, let's make sure that the end-to-end tests using the forward logits checker still work. Can you please run a couple of these tests?
> @khatwanimohit I believe you're familiar with the forward logits checker. Could you please also review?

Thank you for the constructive feedback! The new flag run_hf_model adds functionality without affecting the existing nightly tests. We ran a local end-to-end test for the gemma-2b model, and it passed with kl_div < max_kl_div (0.015).

python3 -m MaxText.tests.forward_pass_logit_checker  MaxText/configs/base.yml tokenizer_path=assets/tokenizer.gemma load_parameters_path=gs://runner-maxtext-logs/unscanned_chkpt_2025-06-30-04-17/checkpoints/0/items run_name=forward_pass_test_gemma2b per_device_batch_size=1 model_name=gemma-2b max_prefill_predict_length=4 max_target_length=4 dataset_type=synthetic scan_layers=false attention=dot_product --max_kl_div=0.015

Workload that runs a full test_gemma.sh: link

YixuanWang-99 changed the title ~~Enable conversion from Huggingface to Maxtext~~ Enable Checkpoint Conversion from Huggingface to Maxtext Jun 16, 2025

hengtaoguo approved these changes Jun 20, 2025

hengtaoguo marked this pull request as ready for review June 20, 2025 18:43

hengtaoguo requested review from A9isha, RissyRan, SurbhiJainUSC, aireenmei, bvandermoon, gagika, gobbleturk, khatwanimohit, richjames0, shralex, vipannalla and yangyuwei as code owners June 20, 2025 18:43

shralex reviewed Jun 21, 2025

MaxText/utils/ckpt_conversion/examples/convert_gemma2_to_mt.sh (outdated, resolved)

MaxText/utils/ckpt_conversion/examples/convert_gemma2_to_mt.sh (outdated, resolved)

shralex reviewed Jun 22, 2025

MaxText/utils/ckpt_conversion/examples/convert_gemma2_to_mt.sh (outdated, resolved)

MaxText/utils/ckpt_conversion/to_maxtext.py (outdated, resolved)

MaxText/utils/ckpt_conversion/examples/convert_gemma2_to_mt.sh (outdated, resolved)

shralex reviewed Jun 23, 2025

MaxText/utils/ckpt_conversion/examples/convert_gemma2_to_mt.sh (outdated, resolved)

shralex reviewed Jun 24, 2025

MaxText/utils/ckpt_conversion/examples/convert_gemma2_to_mt.sh (outdated, resolved)

shralex reviewed Jun 24, 2025

MaxText/tests/mt_hf_mutual_conversion_check.py (outdated, resolved)

YixuanWang-99 changed the title ~~Enable Checkpoint Conversion from Huggingface to Maxtext~~ Improve forward_pass_logit_checker.py to perform mutual conversion check Jun 27, 2025

shralex approved these changes Jun 28, 2025

YixuanWang-99 force-pushed the yixuannwang-test2 branch from b2faae0 to 015c6cf Compare June 30, 2025 18:05

YixuanWang-99 requested review from gpolovets1, mitalisi and parambole as code owners June 30, 2025 18:05

YixuanWang-99 requested review from Lumosis, jrplatin, mailvijayasingh and patemotter as code owners June 30, 2025 18:05

YixuanWang-99 force-pushed the yixuannwang-test2 branch 2 times, most recently from b2faae0 to d8de947 Compare June 30, 2025 18:16

Improve forward_pass_logit_checker.py to perform HF vs MT conversion …

49da068

…check

YixuanWang-99 force-pushed the yixuannwang-test2 branch from a0725a2 to 49da068 Compare June 30, 2025 18:43

github-actions bot added the pull ready label Jun 30, 2025

copybara-service bot merged commit a832a34 into main Jun 30, 2025
18 checks passed

copybara-service bot deleted the yixuannwang-test2 branch June 30, 2025 20:54

YixuanWang-99 mentioned this pull request Jun 30, 2025

Dense Qwen3 support (0.6b, 4b, 8b) #1858

Merged

4 tasks
