Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generate: FLAX infers pad token in its absence and has functional example #21009

Merged
merged 2 commits into from
Jan 5, 2023

Conversation

gante
Copy link
Member

@gante gante commented Jan 4, 2023

What does this PR do?

Some bug fixing in advance of #21007 (PR that adds generation config to Flax), to ensure we start from a functional flax generate codebase.

In particular:

  1. Flax now assumes the value pad_token_id when it is None and eos_token_id is not None, like TF and PT do. This is very helpful for open text generation examples, like with GPT2, was an open request (Generating with Flax fails when using Causal Language models #18884), and was one of the causes for failure in the existing example. This also includes the recent changes of Add custom stop token ids for generation #20727, where eos_token_id can be a list of tokens.
  2. An int32 type specification was missing in the special tokens -- when converted to JAX variables, JAX assumed they were float32;
  3. The existing flax generate example is now part of our doctests, and runs.

Comment on lines +705 to +709
eos_token_id = generation_config.eos_token_id
if isinstance(eos_token_id, list):
eos_token_id = eos_token_id[0]
logger.warning(f"Setting `pad_token_id` to `eos_token_id`:{eos_token_id} for open-end generation.")
generation_config.pad_token_id = eos_token_id
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took the opportunity also to copy the logic to TF, so it can also handle eos_token_id as a list 👀

Copy link
Collaborator

@sgugger sgugger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix!

@HuggingFaceDocBuilderDev
Copy link

HuggingFaceDocBuilderDev commented Jan 4, 2023

The documentation is not available anymore as the PR was closed or merged.

Copy link
Contributor

@patrickvonplaten patrickvonplaten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@gante gante merged commit beb24f2 into huggingface:main Jan 5, 2023
Copy link
Contributor

@sanchit-gandhi sanchit-gandhi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix @gante!

silverriver pushed a commit to silverriver/transformers that referenced this pull request Jan 6, 2023
venkat-natchi pushed a commit to venkat-natchi/transformers that referenced this pull request Jan 22, 2023
miyu386 pushed a commit to miyu386/transformers that referenced this pull request Feb 9, 2023
@gante gante deleted the flax_generate_example branch May 18, 2023 15:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants