System Info
- transformers version: 4.53.0.dev0
- Platform: Linux-5.10.0-34-cloud-amd64-x86_64-with-glibc2.31
- Python version: 3.9.2
- Huggingface_hub version: 0.32.2
- Safetensors version: 0.4.5
- Accelerate version: 1.7.0
- Accelerate config: not found
- DeepSpeed version: not installed
- PyTorch version (accelerator?): 2.6.0+cu124 (NA)
- Tensorflow version (GPU?): 2.15.1 (False)
- Flax version (CPU?/GPU?/TPU?): 0.7.0 (cpu)
- Jax version: 0.4.13
- JaxLib version: 0.4.13
- Using distributed or parallel set-up in script?:
Who can help?
@ArthurZucker
(I listed you because you appear to be the reviewer of the original PR; feel free to reassign this to whoever is in charge.)
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
Run the following script:
```python
import logging

from transformers.models.auto.modeling_auto import AutoModelForSeq2SeqLM

logger = logging.getLogger("transformers.modeling_utils")
logger.setLevel(logging.ERROR)

model = AutoModelForSeq2SeqLM.from_pretrained(
    "Helsinki-NLP/opus-mt-zh-en",
    revision="cf109095479db38d6df799875e34039d4938aaa6",
)
```
You see the following error:
```
Traceback (most recent call last):
  File "/home/kokiryu/transformers/test_tp_plan.py", line 15, in <module>
    model = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-zh-en", revision="cf109095479db38d6df799875e34039d4938aaa6")
  File "/home/kokiryu/.local/lib/python3.9/site-packages/transformers/models/auto/auto_factory.py", line 586, in from_pretrained
    return model_class.from_pretrained(
  File "/home/kokiryu/.local/lib/python3.9/site-packages/transformers/modeling_utils.py", line 316, in _wrapper
    return func(*args, **kwargs)
  File "/home/kokiryu/.local/lib/python3.9/site-packages/transformers/modeling_utils.py", line 4697, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/home/kokiryu/.local/lib/python3.9/site-packages/transformers/modeling_utils.py", line 5113, in _load_pretrained_model
    verify_tp_plan(expected_keys, getattr(model_to_load, "_tp_plan", None))
  File "/home/kokiryu/.local/lib/python3.9/site-packages/transformers/integrations/tensor_parallel.py", line 904, in verify_tp_plan
    param_name, _ = key.rsplit(".", 1) if "." in key else key
ValueError: too many values to unpack (expected 2)
```
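The error is easy to reproduce in isolation: when `key` contains no `"."`, the conditional expression evaluates to the bare string, and Python tries to unpack all of its characters into the two targets. The failing line below is copied from the traceback; the surrounding code is a simplified sketch, not the actual `verify_tp_plan` body.

```python
key = "final_logits_bias"  # a checkpoint key without a "."

# The `else` branch yields the bare string, so the tuple unpacking tries to
# spread its 17 characters across two names and fails:
param_name, _ = key.rsplit(".", 1) if "." in key else key
# ValueError: too many values to unpack (expected 2)
```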
Expected behavior
Keys without a '.', such as 'final_logits_bias', can be passed to `verify_tp_plan`. For such a key, the current code ends up executing `param_name, _ = key`, which raises a ValueError. It should instead assign `param_name = key` directly and only emit warnings, without raising an error.
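A minimal sketch of the suggested change (hypothetical; the exact surrounding code in `integrations/tensor_parallel.py` may differ):

```python
# Current (fails for keys without a "."):
#     param_name, _ = key.rsplit(".", 1) if "." in key else key

# Suggested: fall back to the full key when there is no "."
param_name = key.rsplit(".", 1)[0] if "." in key else key
```

This keeps the existing behavior for dotted keys and simply treats a dot-free key as its own parameter name, so the function can still produce its warnings instead of crashing.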