
[Help] How do I get output_attentions from Qwen-VL's model.generate method? #428

Open
2 tasks done
itsqyh opened this issue Jul 9, 2024 · 1 comment

Comments


itsqyh commented Jul 9, 2024

Is there an existing issue / discussion for this?

  • I have searched the existing issues / discussions

Is there an existing answer for this in the FAQ?

  • I have searched the FAQ

Current Behavior

tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_dir, device_map="auto", trust_remote_code=True, fp16=True).eval()
model.generation_config = GenerationConfig.from_pretrained(model_dir, trust_remote_code=True)

query = tokenizer.from_list_format([
    {'image': '/root/autodl-tmp/1.jpg'},
    {'text': '请描述该图片:'},
])
inputs = tokenizer(query, return_tensors='pt')
inputs = inputs.to(model.device)

with torch.inference_mode():
    outputs = model.generate(inputs, output_attentions=True, output_scores=True,
                             return_dict_in_generate=True)

How can I get output_attentions from model.generate()?

Expected Behavior

I want to get output_attentions from model.generate(), reading them via output['attentions']. How should I do this?
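One way to sidestep the crash shown in the traceback below, sketched under the assumption that the failure comes from passing the whole `BatchEncoding` to `generate()` (which then fails on `inputs_tensor.shape`): pass the `input_ids` tensor explicitly and read the `attentions` field of the returned output object. The helper name `generate_with_attentions` is mine, not part of Qwen-VL:

```python
def generate_with_attentions(model, inputs):
    """Call generate() with an explicit tensor, not the BatchEncoding.

    generate() reads inputs_tensor.shape[0] internally, so it needs a
    tensor; the BatchEncoding wrapper has no 'shape' key and raises
    AttributeError (see the traceback below).
    """
    return model.generate(
        inputs.input_ids,             # the tensor itself, not the wrapper
        output_attentions=True,       # collect attention weights
        output_scores=True,
        return_dict_in_generate=True  # return a structured output object
    )
```

With these flags, the returned object should carry the attention weights as outputs.attentions, typically one tuple per generated token with one tensor per layer; outputs['attentions'] indexing should also work, since generate returns a ModelOutput. I have not run this against Qwen-VL itself.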

Steps To Reproduce

tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_dir, device_map="auto", trust_remote_code=True, fp16=True).eval()
model.generation_config = GenerationConfig.from_pretrained(model_dir, trust_remote_code=True)
query = tokenizer.from_list_format([
    {'image': '/root/autodl-tmp/1.jpg'},
    {'text': '请描述该图片:'},
])
inputs = tokenizer(query, return_tensors='pt')
inputs = inputs.to(model.device)
outputs = model.generate(inputs, output_attentions=True, output_scores=True, return_dict_in_generate=True)

KeyError Traceback (most recent call last)
File ~/miniconda3/envs/qwen/lib/python3.10/site-packages/transformers/tokenization_utils_base.py:254, in BatchEncoding.__getattr__(self, item)
253 try:
--> 254 return self.data[item]
255 except KeyError:

KeyError: 'shape'

During handling of the above exception, another exception occurred:

AttributeError Traceback (most recent call last)
Cell In[17], line 33
31 print(inputs.keys())
32 with torch.inference_mode():
---> 33 outputs = model.generate(inputs,output_attentions=True,output_scores=True,
34 return_dict_in_generate=True,)
35 print(outputs.keys())
36 outputs_attention = []

File ~/.cache/huggingface/modules/transformers_modules/Qwen-VL/modeling_qwen.py:1058, in QWenLMHeadModel.generate(self, inputs, generation_config, logits_processor, stopping_criteria, prefix_allowed_tokens_fn, synced_gpus, assistant_model, streamer, **kwargs)
1055 else:
1056 logits_processor.append(stop_words_logits_processor)
-> 1058 return super().generate(
1059 inputs,
1060 generation_config=generation_config,
1061 logits_processor=logits_processor,
1062 stopping_criteria=stopping_criteria,
1063 prefix_allowed_tokens_fn=prefix_allowed_tokens_fn,
1064 synced_gpus=synced_gpus,
1065 assistant_model=assistant_model,
1066 streamer=streamer,
1067 **kwargs,
1068 )

File ~/miniconda3/envs/qwen/lib/python3.10/site-packages/torch/utils/_contextlib.py:115, in context_decorator.<locals>.decorate_context(*args, **kwargs)
112 @functools.wraps(func)
113 def decorate_context(*args, **kwargs):
114 with ctx_factory():
--> 115 return func(*args, **kwargs)

File ~/miniconda3/envs/qwen/lib/python3.10/site-packages/transformers/generation/utils.py:1449, in GenerationMixin.generate(self, inputs, generation_config, logits_processor, stopping_criteria, prefix_allowed_tokens_fn, synced_gpus, assistant_model, streamer, negative_prompt_ids, negative_prompt_attention_mask, **kwargs)
1441 # 3. Define model inputs
1442 # inputs_tensor has to be defined
1443 # model_input_name is defined if model-specific keyword input is passed
1444 # otherwise model_input_name is None
1445 # all model-specific keyword inputs are removed from model_kwargs
1446 inputs_tensor, model_input_name, model_kwargs = self._prepare_model_inputs(
1447 inputs, generation_config.bos_token_id, model_kwargs
1448 )
-> 1449 batch_size = inputs_tensor.shape[0]
1451 # 4. Define other model kwargs
1452 model_kwargs["output_attentions"] = generation_config.output_attentions

File ~/miniconda3/envs/qwen/lib/python3.10/site-packages/transformers/tokenization_utils_base.py:256, in BatchEncoding.__getattr__(self, item)
254 return self.data[item]
255 except KeyError:
--> 256 raise AttributeError

AttributeError:
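The two-stage exception above can be reproduced in miniature. BatchEncoding.__getattr__ (lines 253-256 of tokenization_utils_base.py in the traceback) looks the attribute up in its underlying data dict, so stored keys such as input_ids resolve, while anything else, including the shape that generate() asks for, raises AttributeError out of the KeyError. A minimal stand-in class, not the real transformers implementation:

```python
class MiniBatchEncoding:
    """Toy stand-in for BatchEncoding's attribute lookup."""

    def __init__(self, data):
        self.data = data

    def __getattr__(self, item):
        # Only called for attributes that do not exist on the instance.
        try:
            return self.data[item]   # stored keys ('input_ids', ...) resolve
        except KeyError:
            raise AttributeError     # anything else, e.g. 'shape', does not


enc = MiniBatchEncoding({"input_ids": [[1, 2, 3]]})
print(enc.input_ids)   # a stored key works

try:
    enc.shape          # what generate() does via inputs_tensor.shape
except AttributeError:
    print("AttributeError raised while handling KeyError, as in the traceback")
```

This is why passing the BatchEncoding positionally crashes, while passing the input_ids tensor (or unpacking with **inputs) does not hit this path.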

Environment

- OS: Ubuntu 22.04
- Python: 3.12
- Transformers: 4.32.0
- PyTorch: 2.3.0
- CUDA (`python -c 'import torch; print(torch.version.cuda)'`): 12.1

Anything else?

No response


void721 commented Jul 21, 2024

same question here
