
Rewriting state_dict in self.model.save_pretrained() causes the saved '_metadata' to be missing #14268

Closed
changwangss opened this issue Nov 4, 2021 · 2 comments · Fixed by #14276

Comments


changwangss commented Nov 4, 2021

Environment info

  • transformers version: 4.12.0.dev0
  • Platform: linux
  • Python version: 3.6
  • PyTorch version (GPU?): 1.9.0
  • Using GPU in script?: no
  • Using distributed or parallel set-up in script?: no

Who can help

function: self.model.save_pretrained() in trainer.py @sgugger
root cause: the state_dict rewriting code in modeling_utils.py, added by @stas00 in PR #8737 to ignore keys

Information

I am using Helsinki-NLP/opus-mt-en-ro for a translation task and quantizing it with Intel Neural Compressor (version 1.7).

I load the pre-trained model, fine-tune it, quantize it, and then save its state_dict. The issue happens when saving and then reloading this quantized version.

After dynamic quantization, DynamicQuantizedLinear (nn.quantized.Linear) stores its parameters under keys of the form
model.encoder.layers.0.self_attn.k_proj._packed_params._packed_params
which corresponds to version=3. However, saving with trainer.save_model() drops the state_dict's _metadata, so the checkpoint is treated as version=1 on reload, and loading the quantized model fails.
For more information about these versions, you can see here in the PyTorch repo.

    # Version 1
    #   self
    #   |--- weight : Tensor
    #   |--- bias : Tensor
    #
    # Version 2
    #   self
    #   |--- weight : Tensor
    #   |--- bias : Tensor
    #   |--- dtype : torch.dtype
    #
    # Version 3
    #   self
    #   |--- _packed_params : (Tensor, Tensor) representing (weight, bias)
    #                         of LinearPackedParams
    #   |--- dtype : torch.dtype
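
To make the key format and the version bookkeeping concrete, here is a small sketch using torch.quantization.quantize_dynamic on a single nn.Linear. The Sequential wrapper and layer sizes are illustrative only, and the exact key list and version number depend on the PyTorch release:

    import torch
    import torch.nn as nn

    # Wrap a single Linear so the state_dict keys get a module prefix ('0.'),
    # similar to how k_proj sits inside the Marian encoder layers.
    float_model = nn.Sequential(nn.Linear(8, 8))
    qmodel = torch.quantization.quantize_dynamic(float_model, {nn.Linear}, dtype=torch.qint8)

    sd = qmodel.state_dict()
    print(list(sd.keys()))
    # includes the packed key, e.g. '0._packed_params._packed_params'
    print(sd._metadata.get("0._packed_params"))
    # the serialization version lives here, e.g. OrderedDict([('version', 3)]) on PyTorch 1.9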

We found that the root cause is that state_dict is rewritten (rebuilt as a plain dict) in order to ignore keys, which drops the _metadata information that controls the version selection.

code link: https://github.com/huggingface/transformers/blob/master/src/transformers/modeling_utils.py#L1052
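
The effect is easy to see on any module, independent of quantization: a PyTorch state_dict is an OrderedDict that carries a _metadata attribute, and rebuilding it with a dict comprehension silently drops that attribute (a minimal sketch; the key name used for filtering is hypothetical):

    import torch.nn as nn

    sd = nn.Linear(4, 4).state_dict()
    print(hasattr(sd, "_metadata"))   # True: per-module version info travels with the OrderedDict

    # Mirror the filtering done in save_pretrained(): a new plain dict is built,
    # so the keys survive but the _metadata attribute is lost.
    keys_to_ignore = {"bias"}   # hypothetical, just to exercise the filter
    filtered = {k: v for k, v in sd.items() if k not in keys_to_ignore}
    print(hasattr(filtered, "_metadata"))   # False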

To reproduce

Steps to reproduce the behavior:

  1. Load the pre-trained model Helsinki-NLP/opus-mt-en-ro, fine-tune it, and quantize it dynamically.
  2. Save the quantized model and load it again; you will get the error below (a sketch of these steps follows this list).
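
A rough sketch of these steps (my reconstruction: it uses torch.quantization.quantize_dynamic in place of the Intel Neural Compressor calls, the output directory name is made up, and fine-tuning is omitted since it is not needed to trigger the error):

    import torch
    from transformers import AutoModelForSeq2SeqLM

    model = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-en-ro")
    qmodel = torch.quantization.quantize_dynamic(model, {torch.nn.Linear}, dtype=torch.qint8)

    # save_pretrained() (also called by trainer.save_model()) rewrites state_dict
    # to drop _keys_to_ignore_on_save, which also drops _metadata on the buggy version.
    qmodel.save_pretrained("./quantized-opus-mt-en-ro")

    # Re-create the quantized architecture and try to restore the weights.
    fresh = torch.quantization.quantize_dynamic(
        AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-en-ro"),
        {torch.nn.Linear},
        dtype=torch.qint8,
    )
    state_dict = torch.load("./quantized-opus-mt-en-ro/pytorch_model.bin")
    # Without _metadata the quantized Linear falls back to the version-1 loading
    # path and looks for '...k_proj.weight', producing the KeyError shown below.
    fresh.load_state_dict(state_dict)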

error

  File "/home2/changwa1/anaconda3/envs/inc_example/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1388, in load
    state_dict, prefix, local_metadata, True, missing_keys, unexpected_keys, error_msgs)
  File "/home2/changwa1/anaconda3/envs/inc_example/lib/python3.6/site-packages/torch/nn/quantized/dynamic/modules/linear.py", line 72, in _load_from_state_dict
    missing_keys, unexpected_keys, error_msgs)
  File "/home2/changwa1/anaconda3/envs/inc_example/lib/python3.6/site-packages/torch/nn/quantized/modules/linear.py", line 220, in _load_from_state_dict
    weight = state_dict.pop(prefix + 'weight')
KeyError: 'model.encoder.layers.0.self_attn.k_proj.weight'

  3. If you modify the code as follows, removing the unexpected keys from state_dict directly instead of rebuilding it, the reload succeeds.

modify

the state_dict rewriting code in modeling_utils.py, line 1052.
original

        if self._keys_to_ignore_on_save is not None:
            state_dict = {k: v for k, v in state_dict.items() if k not in self._keys_to_ignore_on_save}

change

        if self._keys_to_ignore_on_save is not None:
            for item in self._keys_to_ignore_on_save:
                del state_dict[item]
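
As a side note (my suggestion, not part of the original proposal): if there is any chance a key listed in _keys_to_ignore_on_save is absent from state_dict, a membership check avoids a KeyError while still keeping _metadata intact, since the dict is modified in place:

        if self._keys_to_ignore_on_save is not None:
            for item in self._keys_to_ignore_on_save:
                # only delete keys that are actually present
                if item in state_dict:
                    del state_dict[item]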

Expected behavior

You can modify it as I mentioned; it would be even better if you have a more effective solution.


sgugger commented Nov 4, 2021

I think your solution of avoiding the deletion of that _metadata attribute of the state dict is very good. Would you like to make a PR out of it, since you found the fix?

changwangss commented

The PR has been submitted, please review. @sgugger
