Run the following code using Qwen2-1.5B-Instruct downloaded from HuggingFace:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_kwargs = dict(
    use_cache=False,
    trust_remote_code=True,
    attn_implementation="flash_attention_2",
    torch_dtype=torch.bfloat16,
    device_map=None
)
# Model weights are loaded from a local copy; the tokenizer comes from the Hub.
model = AutoModelForCausalLM.from_pretrained("/home/ubuntu/Qwen2-1.5B-Instruct", **model_kwargs).to('cuda')
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-1.5B-Instruct", trust_remote_code=True)
tokenizer.padding_side = "left"
model.generation_config.pad_token_id = tokenizer.pad_token_id

# Token ids that should stop generation.
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

messages = [{"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "What are fun places to travel?"}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
    padding=True
).to(model.device)
outputs = model.generate(
    input_ids,
    max_new_tokens=16384,
    pad_token_id=tokenizer.pad_token_id,
    eos_token_id=terminators,
    do_sample=False
)
# Decode only the newly generated tokens of the first (and only) sequence.
tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True)
```
Produces the following error:
```
TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_15236/399264495.py in <module>
      7     padding=True
      8 ).to(model.device)
----> 9 outputs = model.generate(
     10     input_ids,
     11     max_new_tokens=16384,

~/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py in decorate_context(*args, **kwargs)
    113     def decorate_context(*args, **kwargs):
    114         with ctx_factory():
--> 115             return func(*args, **kwargs)
    116
    117     return decorate_context

~/.local/lib/python3.10/site-packages/transformers/generation/utils.py in generate(self, inputs, generation_config, logits_processor, stopping_criteria, prefix_allowed_tokens_fn, synced_gpus, assistant_model, streamer, negative_prompt_ids, negative_prompt_attention_mask, **kwargs)
   1662
   1663         device = inputs_tensor.device
-> 1664         self._prepare_special_tokens(generation_config, kwargs_has_attention_mask, device=device)
   1665
   1666         # decoder-only models must use left-padding for batched generation.

~/.local/lib/python3.10/site-packages/transformers/generation/utils.py in _prepare_special_tokens(self, generation_config, kwargs_has_attention_mask, device)
   1482             generation_config.bos_token_id, self.generation_config.bos_token_id, device=device
   1483         )
-> 1484         eos_token_id = _tensor_or_none(
   1485             generation_config.eos_token_id, self.generation_config.eos_token_id, device=device
   1486         )

~/.local/lib/python3.10/site-packages/transformers/generation/utils.py in _tensor_or_none(token_kwargs, token_self, device)
   1477             if token is None or isinstance(token, torch.Tensor):
   1478                 return token
-> 1479             return torch.tensor(token, device=device, dtype=torch.long)
   1480
   1481         bos_token_id = _tensor_or_none(

TypeError: 'NoneType' object cannot be interpreted as an integer
```
Expected behavior: generation should complete without raising an error.
Solution is here:
https://huggingface.co/Qwen/Qwen2-1.5B-Instruct/discussions/2
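The linked discussion is not reproduced here. As a hedged reading of the traceback: `torch.tensor` fails because one entry of the `eos_token_id` list is `None`, which is what `tokenizer.convert_tokens_to_ids("<|eot_id|>")` appears to return when the token is not in the Qwen2 vocabulary (`<|eot_id|>` is a Llama 3 token; Qwen2's chat template ends turns with `<|im_end|>`). A minimal sketch of a guard follows; `build_terminators` is a hypothetical helper, not part of `transformers`.

```python
from typing import Iterable, List


def build_terminators(tokenizer, extra_tokens: Iterable[str] = ()) -> List[int]:
    """Collect stop-token ids, skipping tokens the tokenizer cannot resolve.

    Hypothetical helper: it only assumes the tokenizer exposes
    `eos_token_id` and `convert_tokens_to_ids`, as in the reproduction.
    """
    candidates = [tokenizer.eos_token_id]
    candidates += [tokenizer.convert_tokens_to_ids(tok) for tok in extra_tokens]

    terminators: List[int] = []
    for token_id in candidates:
        # Drop unresolved tokens (None) and duplicates, keeping order.
        if token_id is not None and token_id not in terminators:
            terminators.append(token_id)

    if not terminators:
        raise ValueError("No valid terminator token ids were found.")
    return terminators
```

With the tokenizer from the reproduction, `terminators = build_terminators(tokenizer, ["<|im_end|>"])` could then be passed as `eos_token_id=terminators` to `model.generate`. Passing only `tokenizer.eos_token_id` should work as well.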
thanks for updating!
System Info
transformers version: 4.42.0
Who can help?
@ArthurZucker @gante