Optimized bad word ids #13433

Merged


guillaume-be
Contributor

What does this PR do?

This PR optimizes the generation routines when bad_words_ids are provided by the user. Currently, two inefficiencies significantly slow down text generation when these are passed:

  • Single-token bad words are looped over at every generation step. This is unnecessary: a single-token bad word is always banned, regardless of the tokens generated so far.
  • The token ids are read element by element inside nested for loops, which causes unnecessary and inefficient cross-device communication when they live on the GPU.

The issue was raised in https://discuss.huggingface.co/t/gpt2-many-bad-words-ids-leading-to-slow-text-generation/9721
I reproduced the issue with ~2000 bad word ids (see the gist at https://gist.github.com/guillaume-be/2a3e91951869414b6f1f8ab8c2cd642f). With the bad words on a GPU, generation slowed down by ~20x, from ~1.7s to >25s per generation.

This PR fixes the issue by:

  • Moving all of the current token ids to a Python list once, before iterating over them multiple times, leading to a 20x speedup.
  • Splitting the bad word ids into 1-element bad words and bad words made of multiple sub-tokens. For the 1-element bad words, a static bad word mask is pre-computed and re-used at each generation step. This accelerates generation by a further ~10%.
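
The two changes above can be sketched in plain Python. This is only an illustration, not the code merged in this PR: the function names (`split_bad_words`, `build_static_mask`, `dynamic_banned_tokens`, `apply_bans`) are hypothetical, and scores and token ids are plain lists standing in for tensors so that the example runs without a GPU.

```python
from typing import List, Tuple

NEG_INF = float("-inf")


def split_bad_words(bad_words_ids: List[List[int]]) -> Tuple[List[int], List[List[int]]]:
    """Separate single-token bad words from multi-token ones (hypothetical helper)."""
    single = [seq[0] for seq in bad_words_ids if len(seq) == 1]
    multi = [seq for seq in bad_words_ids if len(seq) > 1]
    return single, multi


def build_static_mask(single_token_ids: List[int], vocab_size: int) -> List[bool]:
    """Pre-compute once which vocabulary entries are banned at every step."""
    mask = [False] * vocab_size
    for token_id in single_token_ids:
        mask[token_id] = True
    return mask


def dynamic_banned_tokens(prev_tokens: List[int], multi: List[List[int]]) -> List[int]:
    """Ban the last token of a multi-token bad word whose prefix was just generated.

    prev_tokens is already a Python list here. With a tensor on the GPU, the
    assumption is that it is converted once (for example with .tolist())
    before this loop, instead of being indexed element by element.
    """
    banned = []
    for seq in multi:
        prefix = seq[:-1]  # never empty, since multi-token words have length > 1
        if len(prefix) <= len(prev_tokens) and prev_tokens[-len(prefix):] == prefix:
            banned.append(seq[-1])
    return banned


def apply_bans(
    scores: List[float],
    prev_tokens: List[int],
    static_mask: List[bool],
    multi: List[List[int]],
) -> List[float]:
    """Combine the pre-computed static mask with the per-step dynamic bans."""
    banned_now = set(dynamic_banned_tokens(prev_tokens, multi))
    return [
        NEG_INF if static_mask[i] or i in banned_now else score
        for i, score in enumerate(scores)
    ]


# Toy usage with a vocabulary of 6 tokens.
bad_words_ids = [[3], [1, 2], [5], [0, 4, 2]]
single, multi = split_bad_words(bad_words_ids)
static_mask = build_static_mask(single, vocab_size=6)  # built once, reused every step
new_scores = apply_bans([0.0] * 6, prev_tokens=[0, 1], static_mask=static_mask, multi=multi)
```

In this toy run, tokens 3 and 5 are banned by the static mask at every step, while token 2 is banned only because the previous token, 1, completes the prefix of the bad word `[1, 2]`.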

Fixes https://discuss.huggingface.co/t/gpt2-many-bad-words-ids-leading-to-slow-text-generation/9721

Before submitting

Who can review?

@patrickvonplaten - maybe you would like to have a look?

Contributor

patrickvonplaten left a comment


Thanks a lot for the improved generation processor here!

patrickvonplaten merged commit 63b90a5 into huggingface:master on Sep 7, 2021

2 participants