model.generate with prefix_allowed_tokens_fn throws RuntimeError: probability tensor contains either inf, nan or element < 0 #15169

Closed
iamjanvijay opened this issue Jan 16, 2022 · 16 comments

iamjanvijay commented Jan 16, 2022

Environment info

  • transformers version: 4.15.0
  • Platform: Linux-5.4.0-90-generic-x86_64-with-debian-bullseye-sid
  • Python version: 3.7.12
  • PyTorch version (GPU?): 1.10.0+cu102 (True)
  • Tensorflow version (GPU?): 2.7.0 (True)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?: No
  • Using distributed or parallel set-up in script?: No

Who can help

@patrickvonplaten @Narsil

Information

Model I am using: T5ForConditionalGeneration

The problem arises when using my own modified scripts; a script to reproduce the error is included below.

The task I am working on uses my own task/dataset: it requires conditional generation from T5 such that the output vocabulary is restricted to a small set.

To reproduce

  1. Run the following script to reproduce the behaviour.
from transformers import T5Tokenizer, T5ForConditionalGeneration, T5Config

lm_model = 't5-small'
model = T5ForConditionalGeneration.from_pretrained(lm_model)
tokenizer = T5Tokenizer.from_pretrained(lm_model)

def restrict_decode_vocab(batch_idx, prefix_beam):
    # Once 3 tokens have been generated, only allow the ids for ' ';
    # otherwise allow the ids of a small fixed token set.
    if len(prefix_beam) == 3:
        restricted_vocab = tokenizer(' ', return_tensors="pt")['input_ids'].tolist()
    else:
        restricted_vocab = tokenizer('<extra_id_0> cute dog <extra_id_1> the <pad>', return_tensors="pt")['input_ids'].tolist()
    return restricted_vocab

source = ['The <extra_id_0> walks in <extra_id_1> park .']
source_encoding = tokenizer(source[:], padding='longest', return_tensors="pt")
input_ids, attention_mask = source_encoding['input_ids'], source_encoding['attention_mask']
decoded_beams = model.generate(input_ids=input_ids, attention_mask=attention_mask, do_sample=True, num_beams=2, prefix_allowed_tokens_fn=restrict_decode_vocab, min_length=4, max_length=4, remove_invalid_values=True)
print(decoded_beams)
  2. The above script produces the following stack trace.
/home/jsingh319/uploaded_venvs/venv-koala-torch-1.10-python-3.7.12/lib/python3.7/site-packages/transformers/generation_utils.py:2259: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  next_indices = next_tokens // vocab_size
Traceback (most recent call last):
  File "reproduce_error.py", line 17, in <module>
    decoded_beams = model.generate(input_ids=input_ids, attention_mask=attention_mask, do_sample=True, num_beams=2, prefix_allowed_tokens_fn=restrict_decode_vocab, min_length=4, max_length=4, remove_invalid_values=True)
  File "/home/jsingh319/uploaded_venvs/venv-koala-torch-1.10-python-3.7.12/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/home/jsingh319/uploaded_venvs/venv-koala-torch-1.10-python-3.7.12/lib/python3.7/site-packages/transformers/generation_utils.py", line 1220, in generate
    **model_kwargs,
  File "/home/jsingh319/uploaded_venvs/venv-koala-torch-1.10-python-3.7.12/lib/python3.7/site-packages/transformers/generation_utils.py", line 2253, in beam_sample
    next_tokens = torch.multinomial(probs, num_samples=2 * num_beams)
RuntimeError: probability tensor contains either `inf`, `nan` or element < 0

Expected behavior

No error.

Possible solution

The __call__ method of the InfNanRemoveLogitsProcessor class should include the following statement before returning scores:

scores[scores == float("-inf")] = torch.finfo(scores.dtype).min
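For concreteness, a minimal sketch of what the patched processor could look like (a sketch, not the actual implementation: the nan/+inf handling paraphrases the processor in transformers 4.15, and only the last masking line is the proposed addition):

import torch
from transformers import LogitsProcessor

class PatchedInfNanRemoveLogitsProcessor(LogitsProcessor):
    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor) -> torch.FloatTensor:
        scores[scores != scores] = 0.0                                   # nan -> 0.0
        scores[scores == float("inf")] = torch.finfo(scores.dtype).max  # +inf -> dtype max
        # Proposed addition: map -inf to the most negative finite value
        scores[scores == float("-inf")] = torch.finfo(scores.dtype).min
        return scores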
Narsil (Contributor) commented Jan 20, 2022

@patrickvonplaten Pinging you to get your input on this.

It seems the -inf values are explicitly set by prefix_allowed_tokens_fn, and remove_invalid_values doesn't remove float(-inf) specifically.

However, the script does currently fail.

I added a PR containing the "fix" to move things along, but given that everything is ingrained in tests and other logits processors actively use float(-inf), I am not sure this is the desired behavior.

Other options I consider viable:

  • Stop using float(-inf) directly and use torch.finfo(scores.dtype).min instead (since we would no longer introduce infinities, this should solve it).
  • Replace float(-inf) just before calling torch.multinomial (a sketch of this option follows this list).
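A self-contained sketch of that second option (the function name and placement are hypothetical; in transformers it would sit in beam_sample just before the multinomial call shown in the traceback above):

import torch

def sample_next_tokens(next_token_scores: torch.Tensor, num_beams: int) -> torch.Tensor:
    # Replace -inf with the smallest finite value so that softmax cannot
    # produce nan when an entire row has been masked out.
    safe = next_token_scores.masked_fill(
        next_token_scores == float("-inf"), torch.finfo(next_token_scores.dtype).min
    )
    probs = torch.softmax(safe, dim=-1)
    return torch.multinomial(probs, num_samples=2 * num_beams)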

patrickvonplaten (Contributor) commented:

I don't think this is related in any way to the InfNanRemoveLogitsProcessor processor. IMO, the reason for the error is that in the 3rd generation step, all values of next_token_scores are set to -inf (I think) due to the prefix_allowed_tokens_fn that you've added. This is IMO not a bug in transformers, but in the prefix_allowed_tokens_fn function, as it should not set all values to -inf.

A tip from my side, @iamjanvijay: create the PrefixConstrainedLogitsProcessor object with your function and play around with it locally (what happens at generation step 3?). I think you'll then see that it sets all values to -inf at some point, which it shouldn't do. A sketch of that experiment follows.
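A minimal sketch of that experiment, reusing model, tokenizer, and restrict_decode_vocab from the reproduction script (the dummy prefix and zero logits are assumptions, only there to simulate generation step 3):

import torch
from transformers import PrefixConstrainedLogitsProcessor

processor = PrefixConstrainedLogitsProcessor(restrict_decode_vocab, num_beams=2)

# Simulate generation step 3 for one input with 2 beams, using dummy ids/logits.
prefix = torch.ones((2, 3), dtype=torch.long)
scores = torch.zeros((2, model.config.vocab_size))

masked = processor(prefix, scores)
allowed = (~torch.isinf(masked[0])).nonzero().flatten()
print(allowed.tolist(), tokenizer.convert_ids_to_tokens(allowed.tolist()))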

iamjanvijay (Author) commented Jan 21, 2022

@patrickvonplaten @Narsil Thanks for your response. I was trying to check why this is happening. I found that if the restricted_vocab at any generation step only includes "</s>" (the end-of-sentence token), this error occurs. In other cases, the script doesn't encounter the error. I'll check whether all the elements at that generation step are set to -inf.
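One plausible mechanism for this finding (an assumption, not confirmed in this thread): the reproduction script passes min_length=4, and MinLengthLogitsProcessor masks </s> to -inf while the sequence is still shorter than min_length, so a step where only </s> is allowed would leave every token at -inf. A sketch:

import torch
from transformers import MinLengthLogitsProcessor

vocab_size, eos_id = 32128, 1            # t5-small config values
min_len = MinLengthLogitsProcessor(4, eos_token_id=eos_id)

# Suppose the prefix function left only </s> with a finite score at step 3.
scores = torch.full((2, vocab_size), float("-inf"))
scores[:, eos_id] = 0.0
prefix = torch.ones((2, 3), dtype=torch.long)   # cur_len == 3 < min_length

print(torch.isinf(min_len(prefix, scores)).all())   # tensor(True): nothing to sample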

Narsil (Contributor) commented Jan 21, 2022

I'll close my PR in the meantime. We can reopen it if needed, but I tend to agree with @patrickvonplaten that having everything set to float(-inf) can already be considered a bug.

mindojune commented:

Have you found what was causing the issue by any chance, @iamjanvijay? I'm encountering the same issue when using the generate function with BART, but I'm not using any prefix_allowed_tokens_fn, and the error usually happens after I've been training the model for a while. Like @iamjanvijay said, I suspect it has something to do with cases where some tokens are masked or filtered, but I haven't figured out where or why it's happening. I'd appreciate any pointers.

patrickvonplaten (Contributor) commented:

@mindojune - could you maybe open a new issue, as it's not related to prefix_allowed_tokens_fn?

github-actions bot commented Mar 4, 2022

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

hongyuntw commented:

@mindojune
Hi, I'm facing the same problem as you: the error usually happens after I've trained the model for a while, and I'm also using BART.
Do you have any idea why this is happening or how to fix it?
Thanks a lot.

patrickvonplaten (Contributor) commented:

Hey @hongyuntw, the reason is that BART forces the second generated token to be the id set in this config line: https://huggingface.co/facebook/bart-large/blob/main/config.json#L27. However, if you additionally use something like prefix_allowed_tokens_fn, which might not allow that id, then all ids are set to -inf, in which case the model cannot generate anything anymore. To solve this, I would probably set that config value to None.
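A minimal sketch of that workaround, assuming the id at config.json#L27 is forced_bos_token_id (its name in the facebook/bart-large config):

from transformers import BartForConditionalGeneration

model = BartForConditionalGeneration.from_pretrained("facebook/bart-large")
# Stop forcing the second token so it cannot collide with prefix_allowed_tokens_fn.
model.config.forced_bos_token_id = None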

paulbricman commented Dec 12, 2022

Was running into similar issues when using prefix_allowed_tokens_fn in tandem with beam-search multinomial sampling, and realized the top_k and top_p args were sometimes preventing all the allowed tokens from being used, because the allowed tokens fell outside both cutoffs. no_repeat_ngram_size can have a similar effect.

Consider removing top_k and top_p if only allowing certain tokens is more important; a sketch follows.
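A hedged sketch of that advice, reusing the generate call from the reproduction script (parameter values are illustrative):

decoded_beams = model.generate(
    input_ids=input_ids,
    attention_mask=attention_mask,
    do_sample=True,
    num_beams=2,
    prefix_allowed_tokens_fn=restrict_decode_vocab,
    top_k=0,      # disable top-k truncation so allowed tokens are never filtered out
    top_p=1.0,    # disable nucleus truncation for the same reason
    max_length=16,
)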

chentiao commented:

I also have this problem, but I don't know how to fix it.

Dhawgupta commented:

Same here, I am also running into this issue. Has there been any resolution for this?

chentiao commented May 5, 2023

I tried another way to avoid this problem. I was running MiniGPT-4 on a 3090: when I used the v0 weights, the problem happened. To resolve it, I tried many PyTorch versions, but it didn't work. Finally, I used the new model version, v1.1, and the problem went away. So I think the problem is related to the model; for MiniGPT-4, model decoding is related to fschat here.

Chanwhistle commented:

Did somebody solve this problem?

amyeroberts (Collaborator) commented:

cc @gante for reference

gante (Member) commented Feb 14, 2024

@Chanwhistle have a look at this comment

If you believe this comment does not apply to you, then a reproducer of the issue will be needed 🤗
