
Qwen2-1.5B eos_token NoneType Error prevents generation #31665

Closed · dcsuka opened this issue Jun 27, 2024 · 2 comments

dcsuka (Author) commented Jun 27, 2024

System Info

  • transformers version: 4.42.0
  • Platform: Linux-6.2.0-37-generic-x86_64-with-glibc2.35
  • Python version: 3.10.12
  • Huggingface_hub version: 0.23.4
  • Safetensors version: 0.4.3
  • Accelerate version: 0.31.0
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.2.0+cu121 (True)
  • Tensorflow version (GPU?): 2.13.1 (True)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using distributed or parallel set-up in script?: No
  • Using GPU in script?: Yes
  • GPU type: NVIDIA A100-SXM4-40GB

Who can help?

@ArthurZucker
@gante

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Run the following code using Qwen2-1.5B-Instruct downloaded from Hugging Face:

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_kwargs = dict(
    use_cache=False,
    trust_remote_code=True,
    attn_implementation="flash_attention_2",
    torch_dtype=torch.bfloat16,
    device_map=None
)
model = AutoModelForCausalLM.from_pretrained("/home/ubuntu/Qwen2-1.5B-Instruct", **model_kwargs).to('cuda')

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-1.5B-Instruct", trust_remote_code=True)
tokenizer.padding_side = "left"
model.generation_config.pad_token_id = tokenizer.pad_token_id

terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]
messages = [{"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "What are fun places to travel?"}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
    padding=True
).to(model.device)
outputs = model.generate(
    input_ids,
    max_new_tokens=16384,
    pad_token_id=tokenizer.pad_token_id,
    eos_token_id=terminators,
    do_sample=False
)
tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True)

Produces the following error:

TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_15236/399264495.py in <module>
      7     padding=True
      8 ).to(model.device)
----> 9 outputs = model.generate(
     10     input_ids,
     11     max_new_tokens=16384,

~/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py in decorate_context(*args, **kwargs)
    113     def decorate_context(*args, **kwargs):
    114         with ctx_factory():
--> 115             return func(*args, **kwargs)
    116 
    117     return decorate_context

~/.local/lib/python3.10/site-packages/transformers/generation/utils.py in generate(self, inputs, generation_config, logits_processor, stopping_criteria, prefix_allowed_tokens_fn, synced_gpus, assistant_model, streamer, negative_prompt_ids, negative_prompt_attention_mask, **kwargs)
   1662 
   1663         device = inputs_tensor.device
-> 1664         self._prepare_special_tokens(generation_config, kwargs_has_attention_mask, device=device)
   1665 
   1666         # decoder-only models must use left-padding for batched generation.

~/.local/lib/python3.10/site-packages/transformers/generation/utils.py in _prepare_special_tokens(self, generation_config, kwargs_has_attention_mask, device)
   1482             generation_config.bos_token_id, self.generation_config.bos_token_id, device=device
   1483         )
-> 1484         eos_token_id = _tensor_or_none(
   1485             generation_config.eos_token_id, self.generation_config.eos_token_id, device=device
   1486         )

~/.local/lib/python3.10/site-packages/transformers/generation/utils.py in _tensor_or_none(token_kwargs, token_self, device)
   1477             if token is None or isinstance(token, torch.Tensor):
   1478                 return token
-> 1479             return torch.tensor(token, device=device, dtype=torch.long)
   1480 
   1481         bos_token_id = _tensor_or_none(

TypeError: 'NoneType' object cannot be interpreted as an integer
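
The None here does not appear to be the model's eos token itself but the second entry of terminators: tokenizer.convert_tokens_to_ids returns None for a token that is not in the vocabulary (Qwen2's tokenizer defines no unk_token to fall back on), and <|eot_id|> is a Llama 3 token rather than a Qwen2 one. A quick diagnostic, assuming the tokenizer loaded above (the exact ids come from the Qwen2 vocabulary and are worth verifying locally):

# <|eot_id|> is absent from the Qwen2 vocabulary, so it maps to None;
# generate() later fails on torch.tensor(None, ...).
print(tokenizer.convert_tokens_to_ids("<|eot_id|>"))  # None
# Qwen2-Instruct marks the end of a turn with <|im_end|> instead.
print(tokenizer.convert_tokens_to_ids("<|im_end|>"))  # 151645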

Expected behavior

Proper generation would take place.
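
A possible workaround (a sketch, not an official fix): build the terminator list from Qwen2's own end-of-turn token and drop anything that resolved to None before calling generate:

# Filter out ids that did not resolve, keeping only real token ids.
eot_id = tokenizer.convert_tokens_to_ids("<|im_end|>")
terminators = [t for t in (tokenizer.eos_token_id, eot_id) if t is not None]

outputs = model.generate(
    input_ids,
    max_new_tokens=16384,
    pad_token_id=tokenizer.pad_token_id,
    eos_token_id=terminators,
    do_sample=False,
)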

dcsuka (Author) commented Jun 27, 2024

dcsuka closed this as completed Jun 27, 2024
ArthurZucker (Collaborator) commented:

thanks for updating!
