Fix multiple `eos_token_id`s in model.generate(...) #21461
Conversation
src/transformers/generation/utils.py (Outdated)

```diff
@@ -2226,7 +2227,7 @@ def greedy_search(
     # if eos_token was found in one sentence, set sentence to finished
     if eos_token_id is not None:
-        unfinished_sequences = unfinished_sequences.mul((sum(next_tokens != i for i in eos_token_id)).long())
+        unfinished_sequences = unfinished_sequences.mul((math.prod(next_tokens != i for i in eos_token_id)).long())
```
Before the fix, this value can go beyond 0 or 1, and the next `input_ids` gets corrupted.
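To make the failure concrete, here is a minimal sketch (the token ids are illustrative) showing how the old `sum` form produces a mask value of 2 when a token matches neither eos id, while a product stays within {0, 1}. As the later commits note, `math.prod` needs Python 3.8+, which is why the final fix moved to tensor ops:

```python
import math
import torch

eos_token_id = [797, 641]
next_tokens = torch.tensor([98])          # matches neither eos id
unfinished_sequences = torch.tensor([1])

# old form: the boolean comparisons are summed, so two non-matches give 2
buggy = unfinished_sequences.mul(sum(next_tokens != i for i in eos_token_id).long())
print(buggy)  # tensor([2]) -> corrupts the next input_ids

# fixed form: the product of the comparisons stays 0 or 1
fixed = unfinished_sequences.mul(math.prod(next_tokens != i for i in eos_token_id).long())
print(fixed)  # tensor([1])
```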
The documentation is not available anymore as the PR was closed or merged.
Hey @tokestermw 👋 Thank you for spotting the issues and adding a fix! One request, for two reasons: a) thin function wrappers are very undesirable, as they add another abstraction layer; b) tensor ops should ideally be done with torch tensor operations directly. Something along these lines:

```python
import torch

eos_token_id = torch.tensor([797, 641])
unfinished_sequences = torch.tensor([1, 1, 1])
next_tokens = torch.tensor([797, 641, 98])

# 1 where next_tokens differs from every eos id, 0 where it matches one
next_in_eos = next_tokens.tile((eos_token_id.shape[0], 1)).ne(eos_token_id.unsqueeze(1)).prod(dim=0)
unfinished_sequences = unfinished_sequences.mul(next_in_eos).long()
```
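For the example values above, `next_in_eos` comes out as `tensor([0, 0, 1])`: the first two sequences hit an eos id (797 and 641 respectively) and are marked finished, while the third (98) stays unfinished.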
I think I just found the same issue, and this is the code snippet I wanted to use for reporting the bug. Probably redundant as of now, but before throwing it away, maybe it helps another user find the issue. No further comment/processing required from my point of view:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig, pipeline

MODEL = "gpt2"
NUM_RETURN_SEQUENCES = 2
MAX_NEW_TOKENS = 64
CONFIG_DIR = "./generation_test"

model = AutoModelForCausalLM.from_pretrained(MODEL)
model.save_pretrained(CONFIG_DIR)

config = GenerationConfig(
    num_return_sequences=NUM_RETURN_SEQUENCES,
    max_new_tokens=MAX_NEW_TOKENS,
    return_full_text=True,
    do_sample=True,
    bos_token_id=50256,
    pad_token_id=50256,
    eos_token_id=[50000, 50256],  # the 50000 is just an example to prove the issue
)
config.save_pretrained(CONFIG_DIR)

model = AutoModelForCausalLM.from_pretrained(CONFIG_DIR)
tokenizer = AutoTokenizer.from_pretrained(MODEL)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
generated = pipe("As always this is a")
print(generated[0]["generated_text"])
```
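For context: before this PR, a list-valued `eos_token_id` could push the `unfinished_sequences` mask above 1 (the `input_ids` corruption noted in the review comment above), so a repro like this could produce corrupted continuations rather than stopping cleanly at token 50000 or 50256.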
Thanks @gante! Will make the change in a bit.

Another issue I just found with beam search + multiple `eos_token_id`s is that, on occasion, we get this error:

```
ValueError: At most 3 tokens in tensor([  198,   198,   198,     0,   628, 14373], device='cuda:0') can be equal to
`eos_token_id: [198, 628]`. Make sure tensor([  198,   198,   198,     0,   628, 14373], device='cuda:0') are corrected.
```

This is because we generate 2 * num_beams candidates, which can fail this check when we have more than one eos id. (I can post a separate issue if that's better.)
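To illustrate the failure mode, here is a sketch using the numbers from the error message above (the check is paraphrased for illustration, not the library's exact code):

```python
num_beams = 3
# beam search scores 2 * num_beams candidate tokens per batch item
candidates = [198, 198, 198, 0, 628, 14373]
eos_token_id = [198, 628]

# The old sanity check assumed at most num_beams candidates could match
# eos_token_id, which only reliably holds for a single eos id.
n_eos = sum(token in eos_token_id for token in candidates)
print(n_eos)  # 4 > num_beams, so the check raises ValueError
```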
@tokestermw if that is not breaking the existing tests, yes, let's move it to a new issue. In essence, we probably want to keep …
LGTM 👍
Thanks for your contribution!
Mmm, looks like a lot of tests have started failing @gante and @tokestermw
Fixed, though there is a seemingly unrelated test error.
Yes, this one has been fixed on main :-)
* add tests with multiple eos_token_ids
* make math.prod instead of sum
* make fixup
* fix long and also use np.prod since math.prod does not exist <python 3.8
* make fixup
* add prod util
* use prod util instead of np.prod
* make fixup
* previous .long location
* use tensor ops
* remove prod
* remove prod
* update device
* make fixup
* fix none
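Judging from the commit notes above ("use tensor ops", "remove prod"), the merged fix expresses the stopping condition purely in tensor operations, along the lines of @gante's suggestion. A sketch follows; the helper name `update_unfinished` and the `eos_token_id_tensor` argument (the eos ids moved onto the generation device) are illustrative assumptions, not necessarily the merged code:

```python
import torch

def update_unfinished(unfinished_sequences: torch.Tensor,
                      next_tokens: torch.Tensor,
                      eos_token_id_tensor: torch.Tensor) -> torch.Tensor:
    # hypothetical helper: compare each new token against every eos id at
    # once; the product over the eos dimension is 0 iff any id matched
    return unfinished_sequences.mul(
        next_tokens.tile((eos_token_id_tensor.shape[0], 1))
        .ne(eos_token_id_tensor.unsqueeze(1))
        .prod(dim=0)
    )
```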
Hi @tokestermw, thank you for working on this. After this PR was merged to `main`, …

To reproduce: we can check with a specific commit on …
What does this PR do?

Fixes #20727 for using multiple `eos_token_id`s.

Small repro

…

Error

If you run the repro, then it errors:

…
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.
@gante