Add custom stop token ids for generation #20727
Conversation
ed18fb2 to 6f0812d
The documentation is not available anymore as the PR was closed or merged.
cc @gante
Think we could actually allow `eos_token_id` to also accept a list of token ids here?
Hi @tokestermw 👋 Like my colleagues, I also think this would be a helpful feature! I also agree with @patrickvonplaten: allowing the existing argument (`eos_token_id`) to also accept a list of integers would result in a cleaner interface and fewer lines of code to maintain :) It is also easier to port to TF/FLAX, which do not use `StoppingCriterion`.

In a nutshell, if `eos_token_id` can be a list of integers, we can replace the existing check (https://github.com/huggingface/transformers/blob/26dd041c6e45379141302e2d293ab4cd9cf805d4/src/transformers/generation/utils.py#L2154) with

unfinished_sequences = unfinished_sequences.mul((sum(next_tokens == i for i in eos_token_id)).long())

as long as we always cast `eos_token_id` to a list before the generation loop. In other words, 2 lines of change (per generation method) would probably do the trick!

@tokestermw WDYT?
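For concreteness, a minimal runnable sketch of this idea, assuming PyTorch tensors as in the generation loop (the values are made up, and the check is written out as "a sequence stays alive only while its next token matches none of the eos ids"); this is an illustration, not the merged code:

```python
import torch

# Made-up values: two candidate stop ids and a batch of four next tokens.
eos_token_id = [198, 50256]  # may also arrive as a single int
next_tokens = torch.tensor([198, 50256, 42, 7])
unfinished_sequences = torch.ones(4, dtype=torch.long)

# Cast once before the generation loop so the check below always sees a list.
if isinstance(eos_token_id, int):
    eos_token_id = [eos_token_id]

# A sequence stays unfinished only if its next token matches none of the eos ids.
still_going = (sum(next_tokens == i for i in eos_token_id) == 0).long()
unfinished_sequences = unfinished_sequences.mul(still_going)
print(unfinished_sequences)  # tensor([0, 0, 1, 1]) -> first two sequences finished
```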
Got it, thanks for the suggestion! I can certainly make it so we use `eos_token_id`.

> It is also easier to port to TF/FLAX, which do not use StoppingCriterion.

ah good to know :) I can look at this again this weekend
Hi @gante,
You can see the changes here: https://github.com/tokestermw/transformers/pull/1/files If this change looks good, I can merge it into this PR and start polishing (fixing tests, docs, removing dead code, etc.). Thanks!
@tokestermw that's a comprehensive set of changes, it looks great to me! ❤️
LGTM! 👍
Thanks for working on this and nice new tests! Just make sure all the docstrings using `eos_token_id` are updated and we should be good to merge!
@@ -183,7 +183,7 @@ class GenerationConfig(PushToHubMixin):
         The id of the *padding* token.
     bos_token_id (`int`, *optional*):
         The id of the *beginning-of-sequence* token.
-    eos_token_id (`int`, *optional*):
+    eos_token_id (`Union[int, List[int]]`, *optional*):
         The id of the *end-of-sequence* token.
Can we adapt the text of the doc here? The docstrings also need to be updated in beam_search.py
FYI.
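For illustration only (not part of the diff), the updated type means a `GenerationConfig` can carry several end-of-sequence ids at once; the ids below are placeholders:

```python
from transformers import GenerationConfig

# eos_token_id may be a single id or a list of ids after this PR.
generation_config = GenerationConfig(max_new_tokens=50, eos_token_id=[198, 50256])
print(generation_config.eos_token_id)  # [198, 50256]
```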
@@ -395,11 +398,11 @@ class NoBadWordsLogitsProcessor(LogitsProcessor):
         List of list of token ids that are not allowed to be generated. In order to get the token ids of the words
         that should not appear in the generated text, use `tokenizer(bad_words, add_prefix_space=True,
         add_special_tokens=False).input_ids`.
-    eos_token_id (`int`):
+    eos_token_id (`Union[int, List[int]]`):
         The id of the *end-of-sequence* token.
Same comment here.
@@ -671,23 +684,26 @@ class ExponentialDecayLengthPenalty(LogitsProcessor):
     exponential_decay_length_penalty (`tuple(int, float)`, *optional*):
         This tuple shall consist of: `(start_index, decay_factor)` where `start_index` indicates where penalty
         starts and `decay_factor` represents the factor of exponential decay
-    eos_token_id (`int`):
+    eos_token_id (`Union[int, List[int]]`):
         The id of the *end-of-sequence* token.
Here too!
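To make the int-or-list handling behind these docstring changes concrete, here is a small standalone sketch (not the library's actual implementation) of a logits-processor-style class that normalizes `eos_token_id` up front and then treats it uniformly as a list:

```python
from typing import List, Union
import torch


class ForceEOSAtMaxLength:
    """Standalone illustration: once max_length is reached, only the eos ids stay allowed."""

    def __init__(self, max_length: int, eos_token_id: Union[int, List[int]]):
        self.max_length = max_length
        # Normalize to a list so the rest of the code never branches on the type.
        if isinstance(eos_token_id, int):
            eos_token_id = [eos_token_id]
        self.eos_token_id = eos_token_id

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor) -> torch.FloatTensor:
        cur_len = input_ids.shape[-1]
        if cur_len == self.max_length - 1:
            # Mask every vocabulary entry except the allowed eos ids.
            masked = torch.full_like(scores, float("-inf"))
            masked[:, self.eos_token_id] = scores[:, self.eos_token_id]
            return masked
        return scores
```

Casting to a list once in `__init__` is the same "cast before the loop" pattern suggested earlier in the thread.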
* Add StopIdStoppingCriteria
* add a working test for stop id criteria
* add to global scope
* add stop_ids to generate
* add pipeline test
* use tokenizer encode in test
* add test to generation utils
* reformat
* fixup
* make-fix-copies
* rename to stop_token_id
* use stop_tokens instead
* add to text to text generation
* make fixup
* make repo-consistency
* Add support for list of ints for eos_token_id inside generation/utils.py
* Instead of having if elses, cast the eos_token_id into a List[int]
* Add List[int] support for logits_process.py
* add List[int] for beam_search.py
* add List[int] for forced_eos_token_id
* revert stop token id stopping criteria changes
* make fixup
* fix tests
* add eos_token_id to generation/utils.py and added tests test_utils.py
* add eos_token_id type hints and fix for pad tokens
* add comments
* remove some prints and remove forced false test
* fix
* put back test_stop_sequence_stopping_criteria
* remove unused import and make fixup
* add a none check
* update docstring
* add more docstring for list ints
* make fixup
Ideally, generation should stop at '\n', but this feature is brand new on transformers (huggingface/transformers#20727)
Is this feature already available in the transformers version on pip (4.25.1)? I have tried enabling it and the generation continued even though I set the stop token ids. (I'm also not sure why 2 integers are returned when I tokenize the stop string.)

EDIT: Nevermind, I got it working.
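For anyone hitting the same thing, a hedged sketch of turning a stop string such as '\n' into token ids and passing them to `generate` once `eos_token_id` accepts a list (the checkpoint is a placeholder; depending on the tokenizer, a string like '\n' can map to more than one id, which would explain seeing 2 integers):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder checkpoint
model = AutoModelForCausalLM.from_pretrained("gpt2")

# A stop string may tokenize to one or several ids depending on the tokenizer.
stop_ids = tokenizer("\n", add_special_tokens=False).input_ids
print(stop_ids)  # GPT-2 gives a single id here; other tokenizers may give several

inputs = tokenizer("Q: What is the capital of France?\nA:", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30, eos_token_id=stop_ids)
print(tokenizer.decode(outputs[0]))
```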
Update (using eos_token_id instead): #20727 (comment)
What does this PR do?
Hi 🤗 team!
This adds stop token ids inside `generate`, e.g. `model.generate(..., stop_token_ids=[10, 25])`, and syntactic sugar for the generation pipelines, e.g. `pipeline(..., stop_tokens=['\n'])`. When the generation detects the specified token ids for all examples in the batch, it will stop.

Rationale

- Lets users stop generation at specific tokens rather than relying only on `max_new_tokens`, without digging into `StoppingCriterion`.
- Avoids requiring users to construct `StoppingCriteria` objects by hand.

Usage Example
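A minimal sketch of the intended usage. The description above proposes a new `stop_token_ids` argument, but the interface that was ultimately merged (see the update link above) extends `eos_token_id` to accept a list instead, so this sketch uses that; the checkpoint and ids are placeholders:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder checkpoint
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("Hello, my name is", return_tensors="pt")

# A sequence is marked finished as soon as it produces any of the listed ids.
outputs = model.generate(**inputs, max_new_tokens=20, eos_token_id=[10, 25])
print(tokenizer.decode(outputs[0]))
```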
How to Test
Related PR(s)
There is a `stop_sequence` argument for the `TextGeneration` pipeline: #18444. But it's limited to a single token, only available in the text generation pipeline, and overwrites `eos_token_id`. Instead, we use `StoppingCriteria` directly.

This PR overlaps a bit with the above, so please let me know if this approach is not optimal.
Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings. (☢️ noting I've tried to update the docs from the instructions, but they don't seem correct)
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@patrickvonplaten @Narsil