
model.config.to_diff_dict() delivers different result to model.save_pretrained() #35426

Closed
2 of 4 tasks
umarbutler opened this issue Dec 27, 2024 · 3 comments
Labels
bug · Core: Modeling (Internals of the library; Models.)

Comments

@umarbutler
Contributor

System Info

  • transformers version: 4.48.0.dev0
  • Platform: Linux-5.15.167.4-microsoft-standard-WSL2-x86_64-with-glibc2.35
  • Python version: 3.12.5
  • Huggingface_hub version: 0.25.1
  • Safetensors version: 0.4.5
  • Accelerate version: 0.34.2
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.5.1+cu124 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using distributed or parallel set-up in script?:
  • Using GPU in script?:
  • GPU type: NVIDIA GeForce RTX 4090

Who can help?

@ArthurZucker

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

I have a use case that requires model weights to always be encrypted in local storage and only ever decrypted in memory. As a result, using model.from_pretrained(dir) is not an option.

Instead, my workaround has been to do:

import msgspec
from pyfakefs.fake_filesystem_unittest import Patcher as ffspatcher
from transformers import AutoConfig, AutoModelForSequenceClassification, PreTrainedModel

weights = {...}  # Deserialized to `dict` from an encrypted file elsewhere.
config = {...}  # Deserialized to `dict` from an encrypted file elsewhere.

json_encoder = msgspec.json.encode

# Write the config to an in-memory fake filesystem so that
# AutoConfig.from_pretrained() can read it without anything touching disk.
with ffspatcher() as patcher:
    fakepath = 'FAKE_FILE_SYSTEM://config.json'
    patcher.fs.create_file(fakepath, contents=json_encoder(config))
    config = AutoConfig.from_pretrained(fakepath)

# Build the model from the config alone, then load the decrypted weights.
model: PreTrainedModel = AutoModelForSequenceClassification.from_config(config)
model.load_state_dict(weights)

The problem I've noticed, however, is that when I serialize my config like so:

config = model.config.to_diff_dict()

The resulting config includes the key _attn_implementation_autoset set to True, whereas the config written by save_pretrained() does not include that key. As a result, when I load the serialized config with AutoConfig.from_pretrained(), the model no longer ends up with its default attention implementation (SDPA), effectively giving me a different model with different logits.

My current hotfix is to just delete the key _attn_implementation_autoset from all of my configs. But is it really necessary for to_diff_dict() to add that key when save_pretrained() does not?
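
For reference, this is roughly what the hotfix looks like. It is only a sketch against my own setup above (json_encoder is the msgspec encoder defined earlier); deleting the key is the only part that matters:

# Serialize the config without the key that to_diff_dict() injects, so that
# AutoConfig.from_pretrained() later resolves the attention implementation
# (SDPA) the same way it does for a config written by save_pretrained().
config_dict = model.config.to_diff_dict()
config_dict.pop('_attn_implementation_autoset', None)  # delete if present
encoded_config = json_encoder(config_dict)  # encrypted elsewhere before hitting disk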

Expected behavior

I get the same model, in a reproducible way, whether I save the config with to_diff_dict() or with save_pretrained().

@umarbutler umarbutler added the bug label Dec 27, 2024
@LysandreJik LysandreJik added the Core: Modeling Internals of the library; Models. label Dec 29, 2024

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@Rocketknight1
Member

Rocketknight1 commented Jan 27, 2025

Hi @umarbutler, I'm sorry this slipped under the radar over the holiday break! It's an interesting bug because model.save_pretrained() calls config.save_pretrained() which calls config.to_diff_dict() directly, so I don't really see how their outputs could differ. The only changes it makes are sorting and indenting the keys when it's being formatted for JSON.

Can you confirm that this does not occur when the model or config is saved with save_pretrained()? And can you give us any code that reproduces the issue without including any of your sensitive data?
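
For example, a minimal check along these lines (using any small sequence-classification checkpoint as a stand-in; 'model-id' below is just a placeholder) should show whether the two outputs actually diverge:

import json
import tempfile

from transformers import AutoModelForSequenceClassification

# 'model-id' is a placeholder for any small sequence-classification checkpoint.
model = AutoModelForSequenceClassification.from_pretrained('model-id')

with tempfile.TemporaryDirectory() as tmp:
    model.save_pretrained(tmp)
    with open(f'{tmp}/config.json') as f:
        saved = json.load(f)

diffed = model.config.to_diff_dict()

# Keys present in one output but not the other.
print(sorted(set(diffed) - set(saved)))
print(sorted(set(saved) - set(diffed)))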


This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot closed this as completed Mar 1, 2025