Add seq2seq prompt tuning support #519
Conversation
@pacman100 I forgot to run …
prompts = self.get_prompt(batch_size=batch_size)
prompts = prompts.to(inputs_embeds.dtype)
inputs_embeds = torch.cat((prompts[:, : peft_config.num_virtual_tokens], inputs_embeds), dim=1)
Hello @thomas-schillaci,
Thank you so much for this PR. This leads to not using prompt tokens for the decoder, which might result in a decrease in model performance. This is the bottleneck that had kept me from doing something similar to this PR.
Hello @pacman100, thank you for the review.
If I'm not mistaken, prompt tuning only requires prompts on the encoder.
I went over the papers again. The Prefix-Tuning paper adds prefixes to both the encoder and the decoder, while the Prompt Tuning paper only adds them to the encoder input.
Prompt Tuning Paper:
Given a series of n tokens, {x_1, x_2, ..., x_n}, the first thing T5 does is embed the tokens, forming a matrix X_e ∈ ℝ^{n×e} where e is the dimension of the embedding space. Our soft-prompts are represented as a parameter P_e ∈ ℝ^{p×e}, where p is the length of the prompt. Our prompt is then concatenated to the embedded input forming a single matrix [P_e; X_e] ∈ ℝ^{(p+n)×e} which then flows through the encoder-decoder as normal. Our models are trained to maximize the probability of Y, but only the prompt parameters P_e are updated.
Therefore, you are correct. Thank you!
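For illustration, the concatenation [P_e; X_e] described in the quote is a single torch.cat along the sequence dimension. A minimal PyTorch sketch, with shapes and names following the paper's notation rather than this repository's code:

```python
import torch

n, p, e = 8, 4, 512  # input length, prompt length, embedding dimension
X_e = torch.randn(n, e)                      # embedded input tokens (frozen model's embeddings)
P_e = torch.randn(p, e, requires_grad=True)  # trainable soft prompt

# [P_e; X_e] has shape (p + n, e); only P_e receives gradient updates.
prompted_input = torch.cat((P_e, X_e), dim=0)
assert prompted_input.shape == (p + n, e)
```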
Could you please then remove the need for prompt tokens in the decoder for the P-Tuning approach too? This will lead to both of these being supported by generate for seq2seq tasks.
Also, remove point 3 from the caveats section of README.md, as this PR solves it:
- For encoder-decoder models, P_TUNING or PROMPT_TUNING doesn't support generate functionality of transformers because generate strictly requires decoder_input_ids but P_TUNING/PROMPT_TUNING appends soft prompt embeddings to input_embeds to create new input_embeds to be given to the model. Therefore, generate doesn't support this yet.
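For context, the workaround this PR describes can be sketched roughly as follows, assuming a T5-style checkpoint and a reasonably recent transformers version; the random soft prompt and the variable names are illustrative, not the PR's exact code:

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
tokenizer = AutoTokenizer.from_pretrained("t5-small")

num_virtual_tokens = 20
inputs = tokenizer("translate English to German: Hello", return_tensors="pt")

# Prepend a (here random) soft prompt to the embedded input_ids.
soft_prompt = torch.randn(1, num_virtual_tokens, model.config.d_model)
token_embeds = model.get_input_embeddings()(inputs["input_ids"])
inputs_embeds = torch.cat((soft_prompt, token_embeds), dim=1)

# Pre-compute encoder outputs from the prompt-augmented embeddings and
# hand them to generate(), sidestepping its need for input_ids.
encoder_outputs = model.get_encoder()(inputs_embeds=inputs_embeds)
output_ids = model.generate(encoder_outputs=encoder_outputs, max_new_tokens=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Note that a comment further down in this conversation points out that the attention_mask should be extended and passed along as well whenever padding is involved.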
Awesome! I'll be working on it on Monday 😉
@thomas-schillaci @pacman100 Thanks for your valuable contribution. What about point 2, "When using P_TUNING or PROMPT_TUNING with SEQ_2_SEQ task, remember to remove the num_virtual_token virtual prompt predictions from the left side of the model outputs during evaluations."? Could you please also update it?
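For reference, that caveat boils down to slicing the first num_virtual_tokens positions off the output logits before decoding or computing metrics. A hypothetical sketch with dummy shapes, not code from this PR:

```python
import torch

num_virtual_tokens = 20  # i.e. peft_config.num_virtual_tokens
batch, seq_len, vocab = 4, 50, 32128

# Model outputs contain num_virtual_tokens extra positions on the left.
logits = torch.randn(batch, num_virtual_tokens + seq_len, vocab)

# Drop the virtual-prompt predictions before evaluation.
logits = logits[:, num_virtual_tokens:, :]
assert logits.shape == (batch, seq_len, vocab)
```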
@thomas-schillaci @pacman100, thanks for your super useful code!
Regarding the lines encoder_outputs = self.base_model.get_encoder()(inputs_embeds=inputs_embeds) and kwargs["encoder_outputs"] = encoder_outputs, I am not sure we can call the encoder before updating the attention_mask, which should also be included when computing the encoder_outputs.
Additionally, it may not be necessary to get the encoder_outputs here. self.base_model.generate will handle it later with the method _prepare_encoder_decoder_kwargs_for_generation. We may only need to update the inputs_embeds and attention_mask.
Please correct me if I am wrong. Thanks a lot.
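Concretely, the simplification suggested here would look something like the following sketch, assuming the installed transformers version accepts inputs_embeds in generate for encoder-decoder models; the random soft prompt is again illustrative:

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
tokenizer = AutoTokenizer.from_pretrained("t5-small")

num_virtual_tokens = 20
inputs = tokenizer("translate English to German: Hello", return_tensors="pt")

soft_prompt = torch.randn(1, num_virtual_tokens, model.config.d_model)
token_embeds = model.get_input_embeddings()(inputs["input_ids"])
inputs_embeds = torch.cat((soft_prompt, token_embeds), dim=1)

# Extend the attention mask with ones for the virtual tokens; generate()
# then runs the encoder itself (internally via
# _prepare_encoder_decoder_kwargs_for_generation).
prompt_mask = torch.ones(1, num_virtual_tokens, dtype=inputs["attention_mask"].dtype)
attention_mask = torch.cat((prompt_mask, inputs["attention_mask"]), dim=1)

output_ids = model.generate(
    inputs_embeds=inputs_embeds, attention_mask=attention_mask, max_new_tokens=20
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```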
Thank you @thomas-schillaci for adding this and the efforts therein 🤗! Could you please go over the latest comments and see if those changes can be incorporated in this PR too?
@pacman100 I have incorporated the changes discussed above, and the suggestions from @ZhengxiangShi, thanks a lot!
Thank you so much @thomas-schillaci for iterating, LGTM! ✨
It would be great to add @ZhengxiangShi as a co-author on this PR with respect to simplifying generate. After that, I will merge the PR. Thank you!
Co-authored-by: ZhengxiangShi <michaelszx117@gmail.com>
Thank you for the review @pacman100!
Thanks for your help! @thomas-schillaci @pacman100
This commit adds prompt tuning and support for the generate method for encoder-decoders. Using generate for encoder-decoder models with prompt tuning was previously not supported, as you can't use generate with inputs_embeds. I address this issue by generating the encoder_outputs of the input_ids + prompt, and passing it to generate.
Also included are two example notebooks to showcase this feature.
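For readers arriving at this merged PR, the net effect is that something like the following now works end to end. A hedged sketch using the public PEFT API; exact behavior depends on your peft and transformers versions:

```python
from peft import PromptTuningConfig, TaskType, get_peft_model
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

base_model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
tokenizer = AutoTokenizer.from_pretrained("t5-small")

peft_config = PromptTuningConfig(task_type=TaskType.SEQ_2_SEQ_LM, num_virtual_tokens=20)
model = get_peft_model(base_model, peft_config)

inputs = tokenizer("translate English to German: Hello", return_tensors="pt")
# generate for prompt-tuned seq2seq models is what this PR enables.
output_ids = model.generate(
    input_ids=inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    max_new_tokens=20,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```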