
Does ChatGLM's generate method support embedding input? #18

Closed
bingwork opened this issue Oct 27, 2023 · 5 comments
Comments


bingwork commented Oct 27, 2023

(Screenshots 微信截图_20231027210229 / 微信截图_20231027210153: prepare_inputs_for_generation in llama vs. chatglm)

I couldn't find the code for the generate method itself, so I started by analyzing prepare_inputs_for_generation.
As the screenshots above show, llama's prepare_inputs_for_generation supports embedding input, but chatglm's does not.
Does this mean chatglm's generate method does not support embedding input?
Apologies if I have misunderstood.
@xunkai55 @davidlvxin @duzx16
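
For reference, the llama-side behaviour the screenshots point to follows roughly this pattern (paraphrased from what transformers' LlamaForCausalLM.prepare_inputs_for_generation did around that time; the exact code differs by version): inputs_embeds is only forwarded on the very first generation step, before any past_key_values exist, after which generation falls back to the sampled input_ids.

# Paraphrased sketch of the llama-side handling referred to above (not verbatim):
# inputs_embeds is only honoured on the first step, before any past_key_values
# exist; afterwards generation falls back to input_ids.
def prepare_inputs_for_generation(self, input_ids, past_key_values=None,
                                  attention_mask=None, inputs_embeds=None, **kwargs):
    if past_key_values is not None:
        input_ids = input_ids[:, -1:]  # only the newest token needs a fresh forward pass
    if inputs_embeds is not None and past_key_values is None:
        model_inputs = {"inputs_embeds": inputs_embeds}
    else:
        model_inputs = {"input_ids": input_ids}
    model_inputs.update({"past_key_values": past_key_values,
                         "attention_mask": attention_mask})
    return model_inputs

ChatGLM's prepare_inputs_for_generation has no equivalent inputs_embeds branch, which is consistent with the observation above.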

@LittleGreenYuan

Have you come up with anything new? I found a 'PrefixEncoder' class in the source file; it seems to be used for P-Tuning v2.

Defined at:

65: class PrefixEncoder(torch.nn.Module)

Used at:

736: class ChatGLMModel(ChatGLMPreTrainedModel):
789:      def forward():

In that code path, the official implementation feeds these embeddings into the model as past_key_values: Optional[Tuple[Tuple[torch.Tensor, torch.Tensor], ...]] = None.
I'm not sure whether this helps; I happen to be looking into the same thing.
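
For context, a minimal sketch of the P-Tuning v2 idea described above (shapes and names here are my own assumptions, not the actual ChatGLM code): a small prefix encoder turns learned prefix positions into per-layer key/value tensors, which are handed to the model as past_key_values, so the prefix never goes through the normal input_ids path.

import torch

# Illustrative P-Tuning v2-style prefix encoder; the real PrefixEncoder in the
# repo also has an optional MLP projection, and the exact tensor layout
# depends on the model version.
class TinyPrefixEncoder(torch.nn.Module):
    def __init__(self, prefix_len, num_layers, num_heads, head_dim):
        super().__init__()
        self.prefix_len = prefix_len
        self.num_layers = num_layers
        self.num_heads = num_heads
        self.head_dim = head_dim
        # One row per prefix position, projected to a key and a value per layer.
        self.embedding = torch.nn.Embedding(prefix_len, num_layers * 2 * num_heads * head_dim)

    def forward(self, batch_size):
        ids = torch.arange(self.prefix_len).unsqueeze(0).expand(batch_size, -1)
        prefix = self.embedding(ids)  # [batch, prefix_len, layers * 2 * heads * dim]
        prefix = prefix.view(batch_size, self.prefix_len, self.num_layers * 2,
                             self.num_heads, self.head_dim)
        prefix = prefix.permute(2, 0, 3, 1, 4)  # [layers * 2, batch, heads, prefix_len, dim]
        # Split into per-layer (key, value) pairs, matching the
        # past_key_values: Tuple[Tuple[Tensor, Tensor], ...] signature quoted above.
        return tuple((kv[0], kv[1]) for kv in prefix.split(2))

A prefix built this way, e.g. past = TinyPrefixEncoder(16, num_layers, num_heads, head_dim)(batch_size), would then be passed to the model as past_key_values alongside the normal inputs.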

@zRzRzRzRzRzRzR
Member

Not sure; I'll discuss it with the algorithm team.

@zRzRzRzRzRzRzR zRzRzRzRzRzRzR added the enhancement New feature or request label Nov 21, 2023
@Junjie-Chu

I'm trying to use GCG with ChatGLM3.

After I read the code carefully, I think generate() actually supports inputs_embeds, which may solve the issue.
I found that input_ids is only used to provide the shape for creating attention_mask and position_ids. When inputs_embeds is passed in, then according to the code

if inputs_embeds is None:
    inputs_embeds = self.embedding(input_ids)

the values in input_ids do not actually affect the inference results, right?

So in fact, to use inputs_embeds as input, we only need model(input_ids, inputs_embeds)

Not sure if my understanding is correct?

And I find that when running model(input_ids=input_ids.unsqueeze(0), inputs_embeds=full_embeds), the output dimensions of ChatGLM3 seem to be different from those of Llama2 or Vicuna? Do I need to use something like .permute(1, 0, 2)?

Not sure about my understanding, thanks a lot in advance for your support!
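
On the dimension question, a hedged sketch of the check I would do (not from the repo): only permute if the output really comes back sequence-first. Llama2/Vicuna-style code expects [batch, seq_len, ...], so compare the leading dimensions before swapping.

# Hedged helper (assumed names): move a seq-first tensor to batch-first only
# when the shapes indicate it is actually [seq_len, batch, ...].
def to_batch_first(tensor, batch_size):
    if tensor.shape[0] != batch_size and tensor.shape[1] == batch_size:
        tensor = tensor.permute(1, 0, 2)  # [seq, batch, ...] -> [batch, seq, ...]
    return tensor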

@LittleGreenYuan

The code below does pass the embedding as an input, but when using model.generate(), it raises an error: "You passed inputs_embeds to .generate(), but the model class ChatGLMForConditionalGeneration doesn't have its forwarding implemented. See the GPT2 implementation for an example (huggingface/transformers#21405), and feel free to open a PR with it!"

inputs = tokenizer(MutilTalk_Prompt, padding='max_length', max_length=99)
tensor_input_ids = torch.tensor(inputs['input_ids'] + [2])  # append token id 2 to the padded sequence
tensor_input_ids = tensor_input_ids.cuda()
print(tensor_input_ids)
input_embeds = model.transformer.embedding(tensor_input_ids.unsqueeze(0))

# Plain forward pass: passing inputs_embeds works here
outputs = model(input_ids=tensor_input_ids.unsqueeze(0), inputs_embeds=input_embeds)
logits_output = tokenizer.batch_decode(torch.argmax(outputs['logits'], -1).detach().cpu().numpy(), skip_special_tokens=True)
print(logits_output)

# error: generate() rejects inputs_embeds for this model class
outputs = model.generate(input_ids=tensor_input_ids.unsqueeze(0), inputs_embeds=input_embeds)
logits_output = tokenizer.batch_decode(torch.argmax(outputs['logits'], -1).detach().cpu().numpy(), skip_special_tokens=True)
print(logits_output)
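
The plain forward pass above does accept inputs_embeds, which also makes it easy to sanity-check the earlier claim that the values in input_ids stop mattering once inputs_embeds is supplied. A hedged sketch reusing tensor_input_ids and input_embeds from the snippet above (not from the repo):

# Hedged sanity check: same inputs_embeds, different input_ids of the same
# shape; if only the shape of input_ids matters, the logits should match.
with torch.no_grad():
    out_real = model(input_ids=tensor_input_ids.unsqueeze(0), inputs_embeds=input_embeds)
    dummy_ids = torch.ones_like(tensor_input_ids).unsqueeze(0)  # same shape, different values
    out_dummy = model(input_ids=dummy_ids, inputs_embeds=input_embeds)

print(torch.allclose(out_real['logits'], out_dummy['logits']))  # True if input_ids values are ignored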

@Junjie-Chu

Junjie-Chu commented Nov 24, 2023

(Quoting LittleGreenYuan's comment above: the model.generate() call with inputs_embeds raises the "doesn't have its forwarding implemented" error, while the plain model() forward pass works.)

Oh, I get what you mean now. Actually I do not use generate(); I just use model().logits, and in that case it runs well. But the output has a different dimension from that of Llama2 or Vicuna XD

@THUDM THUDM locked and limited conversation to collaborators Nov 24, 2023
@zRzRzRzRzRzRzR zRzRzRzRzRzRzR converted this issue into discussion #436 Nov 24, 2023
