Fix save_pretrained for quantized models with custom serialization #43096
Open
480284856 wants to merge 2 commits into huggingface:main
Conversation
When a quantizer provides the state_dict via `get_state_dict_and_metadata()`, the state_dict is already in the correct serialization format. However, `revert_weight_conversion()` was still being called, which failed for quantizers like mxfp4 whose `ConversionOps` don't implement `reverse_op`.

This fix skips `revert_weight_conversion()` when the quantizer has already provided the state_dict, since quantizers handle their own serialization logic in `get_state_dict_and_metadata()`.

Fixes the `NotImplementedError` raised when calling `save_pretrained()` on mxfp4 quantized models.
Collaborator
cc @SunMarc I think we plan to bring back serialization
Member
hmmm indeed, we need to fix this cc @MekkCyber
MekkCyber reviewed on Jan 7, 2026
MekkCyber (Contributor) left a comment:
Hey @480284856! Thanks for the contribution, but I'm not really sure we want to handle it this way. I think it makes more sense to have a reverse op for quantization too, instead of handling it differently through a quantizer-provided state dict.
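To make the reviewer's alternative concrete: a `reverse_op` would let `revert_weight_conversion()` treat quantization like any other weight conversion. The sketch below is purely illustrative (the class, method names, and toy int8 packing scheme are hypothetical, not the transformers `ConversionOps` API):

```python
# Hypothetical sketch of the reviewer's suggestion: give a quantization
# conversion op a reverse_op so the generic revert path works uniformly.
# All names and the toy int8 scheme below are illustrative only.

class QuantizeOp:
    """Forward op packs float weights; reverse_op unpacks them for saving."""

    def convert(self, tensor):
        # Toy symmetric int8 quantization: scale to [-127, 127] and round.
        scale = max(abs(x) for x in tensor) / 127 or 1.0
        return {"scale": scale, "data": [round(x / scale) for x in tensor]}

    def reverse_op(self, packed):
        # Reverse op: unpack back to the float serialization format.
        return [q * packed["scale"] for q in packed["data"]]


op = QuantizeOp()
packed = op.convert([0.5, -1.0, 0.25])
restored = op.reverse_op(packed)
```

With such a `reverse_op`, the mxfp4 path would not need a special-case flag in `save_pretrained()` at all.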
Problem
When calling `save_pretrained()` on mxfp4 quantized models, a `NotImplementedError` was raised because `revert_weight_conversion()` tried to reverse operations that don't implement `reverse_op`.

Environment
- Commit: a7f29523361b2cc12e51c1f5133d95f122f6f45c (main branch)
- Model: openai/gpt-oss-20b (mxfp4 quantized)

Reproduction Code
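The original reproduction requires downloading openai/gpt-oss-20b; the failure mode itself can be shown with a self-contained sketch. The stand-in classes below are simplified stand-ins, not the real transformers internals:

```python
# Self-contained sketch of the failure mode (stand-in classes, not the real
# transformers internals). A ConversionOps-style op that does not override
# reverse_op makes the generic revert path raise NotImplementedError.

class MiniConversionOp:
    def reverse_op(self, tensor):
        # mxfp4's ops do not implement this, which triggers the bug.
        raise NotImplementedError("reverse_op not implemented")


def revert_weight_conversion(state_dict, ops):
    # Simplified stand-in for the real helper: apply each op's reverse.
    return {key: op.reverse_op(value)
            for (key, value), op in zip(state_dict.items(), ops)}


try:
    revert_weight_conversion({"w": [1.0, 2.0]}, [MiniConversionOp()])
except NotImplementedError as err:
    print(f"NotImplementedError: {err}")
```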
Error Traceback
Solution
Skip `revert_weight_conversion()` when the quantizer has already provided the state_dict via `get_state_dict_and_metadata()`, since quantizers handle their own serialization logic.

Changes
- Added a `quantizer_provided_state_dict` flag to track when the quantizer provides the state_dict
- Skip `revert_weight_conversion()` when the quantizer already provided a serialized state_dict

Fixes the issue where mxfp4 quantized models cannot be saved due to the missing `reverse_op` implementation.
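The control flow of the change can be sketched as follows. The flag name follows the PR description; everything else (the toy model, quantizer, and `revert_weight_conversion` stand-in) is a simplified illustration, not the actual transformers implementation:

```python
# Minimal sketch of the fix. Only quantizer_provided_state_dict comes from
# the PR description; the surrounding classes are hypothetical stand-ins.

def revert_weight_conversion(model, state_dict):
    # Stand-in for the real helper; mxfp4 ops lack reverse_op, so it raises.
    raise NotImplementedError("reverse_op not implemented")


class ToyModel:
    def state_dict(self):
        return {"weight": [1.0, 2.0]}


class ToyMxfp4Quantizer:
    def get_state_dict_and_metadata(self, model):
        # The quantizer returns weights already in serialization format.
        return {"weight_packed": [17, 42]}, {"format": "mxfp4"}


def save_pretrained(model, quantizer=None):
    quantizer_provided_state_dict = False  # flag added by the PR
    state_dict = None

    if quantizer is not None:
        state_dict, metadata = quantizer.get_state_dict_and_metadata(model)
        quantizer_provided_state_dict = True

    if state_dict is None:
        state_dict = model.state_dict()

    # The fix: only revert the weight conversion when the quantizer did NOT
    # already provide a serialization-ready state_dict.
    if not quantizer_provided_state_dict:
        state_dict = revert_weight_conversion(model, state_dict)

    return state_dict


saved = save_pretrained(ToyModel(), ToyMxfp4Quantizer())
```

With the quantizer present, the revert step is skipped and saving succeeds; without it, the unquantized path behaves as before.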