
Generation issues with seq2seq LMs #23413

Closed · abarbet opened this issue May 16, 2023 · 7 comments
@abarbet

abarbet commented May 16, 2023

System Info

  • transformers version: 4.27.1
  • Platform: Linux-5.19.0-41-generic-x86_64-with-glibc2.35
  • Python version: 3.9.12
  • Huggingface_hub version: 0.13.2
  • PyTorch version (GPU?): 2.0.0+cu117 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?: Yes
  • Using distributed or parallel set-up in script?: Yes, parallel (accelerate auto-mapping)

Who can help?

@ArthurZucker @gante

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

This has most recently arisen while using trlX to do reinforcement learning on flan-T5. I opened an issue on their repo, but there has been no response, and the problem is better suited to this repo since it stems from transformers code at its core.

The main issue is that calling generate with a seq2seq model, namely flan-t5, sometimes raises the following error: RuntimeError: probability tensor contains either `inf`, `nan` or element < 0. This has been well documented in other issues like this one, but the setup in that issue is more custom than calling generate in its standard configuration.

Here is a code example to reproduce:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

m = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-large", device_map="auto")
t = AutoTokenizer.from_pretrained("google/flan-t5-large")

in_text = """You are a highly intelligent and accurate HVAC domain Resource Description Framework (RDF) data model. You take Passage as input and convert it into HVAC domain RDF triples. A triple is a set of three entities that codifies a statement about semantic data in the form of subject–predicate–object expressions.
Your output format is only [[ subject, predicate, object ], ...] nothing else

Examples: 
Input: The HV123 heating unit can supply 50W of power
Output: [[HV123, powerSupply, 50W]]

Input: Unit: ft. (m)
Model | Cooling Mode | Heating Mode
ABC123 | 28.8 (8.8) | 19.0 (5.8)
ABC456 | 28.8 (8.8) | 19.0 (5.8)
ABC789 | 28.8 (8.8) | 21.3 (6.5)
ABC987 | 29.0 (8.9) | 22.9 (7.0)
Output:"""

ins = t(in_text, return_tensors="pt").input_ids.to("cuda")
# Sampling combined with beam search; this call intermittently raises
# RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
outs = m.generate(ins, do_sample=True, max_length=512, top_k=0, temperature=0.7, num_beams=2)

NB: temperature seems to be one of the main causes of this issue, as removing that kwarg from the generate call makes the error disappear in the above case. However, that is not true of all cases: I have also seen the error in my trlX training loops with kwargs as simple as {"max_new_tokens": 512, "do_sample": True, "top_k": 0, "top_p": 1}, so the error is not always related to temperature.
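For reference, a minimal sketch of how such kwargs would be passed to generate (illustrative only; m and ins reuse the objects from the snippet above, standing in for the trlX-managed model and inputs):

# Plain sampling, no beam search -- the error still appeared intermittently
# inside trlX training loops with kwargs like these.
gen_kwargs = {"max_new_tokens": 512, "do_sample": True, "top_k": 0, "top_p": 1}
outs = m.generate(ins, **gen_kwargs)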

Expected behavior

The expected behavior in this case would be for the sampling to work every time instead of having strange edge cases where tokens are unreachable.

@gante
Member

gante commented May 16, 2023

Hey @abarbet 👋

This issue may arise when beam search, sampling, and long outputs are used together. A potential bug in PyTorch itself compounds it. You can read the full story in this issue.

TL;DR -- my immediate suggestion would be to avoid using num_beams and do_sample together. If you want to use them both, you'll have to read the issue linked above, which describes the problem and solutions :)
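As a minimal sketch of that suggestion (reusing m and ins from the reproduction snippet above; the exact kwargs are illustrative):

# Option 1: sampling only, no beam search
outs = m.generate(ins, do_sample=True, max_length=512, top_k=0, temperature=0.7)

# Option 2: beam search only, no sampling
outs = m.generate(ins, num_beams=2, max_length=512)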

@abarbet
Author

abarbet commented May 16, 2023

Ah, thank you, that issue is very helpful! Do you have any idea why we would see a similar error in trlX training despite not using beam sampling? I know you don't have access to my training script and are most likely not familiar with their codebase, so this is a complete long shot.

The only thing I can think of, if it's not caused by a sampling bug, is some kind of destructive learning in the PPO step that throws the token distributions completely out of whack.

@gante
Member

gante commented May 17, 2023

@abarbet It may be due to this PyTorch issue, where the sampling step may pick very low probability tokens that it shouldn't and, in turn, cause computations to derail.

Try running your script with PT 1.x instead of 2.0!
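For context, a minimal sketch of what that failure mode can look like in isolation (illustrative only, not from the thread; it assumes a CUDA device and a torch build affected by the linked bug, and the vocabulary size is just a placeholder):

import torch

vocab_size = 32128  # placeholder vocabulary size
probs = torch.zeros(vocab_size, device="cuda")
probs[:10] = 0.1  # only the first 10 tokens should ever be drawn

# On an affected build, torch.multinomial can occasionally return an index
# whose probability is exactly zero, which then derails generation downstream.
for step in range(10_000):
    idx = torch.multinomial(probs, num_samples=1)
    if probs[idx] == 0:
        print(f"step {step}: sampled a zero-probability token {idx.item()}")
        break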

@Daryl149

> @abarbet It may be due to this PyTorch issue, where the sampling step may pick very low probability tokens that it shouldn't and, in turn, cause computations to derail.
>
> Try running your script with PT 1.x instead of 2.0!

For me, this issue also occurs with PyTorch 1.13.1: #22914 (comment)
@github-actions

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@yungsinatra0

yungsinatra0 commented Aug 23, 2023

Hello, has a fix been found for this issue? I'm using the latest version of transformers and can confirm that running inference with model.generate() and parameters such as temperature and do_sample causes this issue.

  summary_ids = model.generate(
      inputs["input_ids"],
      max_length=max_length,
      min_length=128,
      temperature=0.1,
      do_sample=True,
      # top_p=0.3
      )

edit: I can now confirm that do_sample combined with temperature is the cause of the issue, as top_p works fine for me.
edit2: I forgot to mention that the model I'm using is BRIO, loading pre-trained weights from HF.
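A sketch of the configuration that reportedly worked (same variable names as the snippet above; the parameter values are just the ones mentioned):

# Keeping do_sample with top_p and dropping temperature avoided the error here.
summary_ids = model.generate(
    inputs["input_ids"],
    max_length=max_length,
    min_length=128,
    do_sample=True,
    top_p=0.3,
)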

@gante
Member

gante commented Aug 23, 2023

@yungsinatra0 The issue should only be gone with the next PT release (i.e. torch>2.0)
