Generation doc #6470
Conversation
Codecov Report
@@ Coverage Diff @@
## master #6470 +/- ##
==========================================
- Coverage 80.55% 80.37% -0.18%
==========================================
Files 153 156 +3
Lines 28001 28058 +57
==========================================
- Hits 22556 22552 -4
- Misses 5445 5506 +61
Continue to review full report at Codecov.
Great, LGTM!
bos_token_id (:obj:`int`, `optional`):
    The id of the `beginning-of-stream` token.
eos_token_id (:obj:`int`, `optional`):
    The id of the `end-of-stream` token.
Should these be `beginning-of-sequence` and `end-of-sequence`?
* add MBartForConditionalGeneration
* style
* rebase and fixes
* add mbart test in TEST_FILES_WITH_NO_COMMON_TESTS
* fix docs
* don't ignore mbart
* doc
* fix mbart fairseq link
* put mbart before bart
* apply doc suggestions
* Use hash to clean the test dirs
* Use hash to clean the test dirs
* Use hash to clean the test dirs
* fix
* add cross attention layers for gpt2
* make gpt2 cross attention work
* finish bert2gpt2
* add explicit comments
* remove attention mask since not yet supported
* revert attn mask in pipeline
* Update src/transformers/modeling_gpt2.py
* Update src/transformers/modeling_encoder_decoder.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Adapted in part from `Facebook's XLM beam search code`_.

r"""
Generates sequences for models with a language modeling head. The method currently supports greedy decoding,
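For readers of this thread: a minimal pure-Python sketch of what greedy decoding does at each step (illustrative only; `next_token_logits` is a hypothetical stand-in for a model's forward pass, and the real `generate` implementation additionally handles batching, caching, attention masks, etc.):

```python
def next_token_logits(tokens):
    # Toy "model": always prefers the token (last_token + 1) % 5.
    scores = [0.0] * 5
    scores[(tokens[-1] + 1) % 5] = 1.0
    return scores

def greedy_decode(bos_token_id, eos_token_id, max_length):
    tokens = [bos_token_id]
    while len(tokens) < max_length:
        logits = next_token_logits(tokens)
        # Greedy decoding: pick the single highest-scoring token at each step.
        next_id = max(range(len(logits)), key=lambda i: logits[i])
        tokens.append(next_id)
        if next_id == eos_token_id:
            break
    return tokens

print(greedy_decode(bos_token_id=0, eos_token_id=4, max_length=10))
# → [0, 1, 2, 3, 4]
```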
We could maybe add here that this applies to all AUTO_MODEL_FOR_CAUSAL_LM
and AUTO_MODEL_FOR_SEQ2SEQ models
-> not sure though whether this is very useful for the user.
Those are not documented so it's not really useful inside the documentation.
top_p (:obj:`float`, `optional`, defaults to 1.0):
    If set to float < 1, only the most probable tokens with probabilities that add up to ``top_p`` or
    higher are kept for generation.
repetition_penalty (:obj:`float`, `optional`, defaults to 1.0):
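To make the ``top_p`` wording concrete, here is a minimal sketch of nucleus (top-p) filtering over a probability distribution (illustrative only, not the library's actual tensor implementation): keep the smallest set of most-probable tokens whose cumulative probability reaches ``top_p``, zero out the rest, and renormalize.

```python
def top_p_filter(probs, top_p):
    # Sort token indices by probability, highest first.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break  # smallest set reaching top_p
    total = sum(probs[i] for i in kept)
    # Zero out everything else and renormalize the kept mass.
    return [probs[i] / total if i in kept else 0.0 for i in range(len(probs))]

print(top_p_filter([0.5, 0.3, 0.1, 0.1], top_p=0.9))
```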
Maybe we can even link to the original paper here, where it is explained: https://arxiv.org/pdf/1909.05858.pdf (page 5, first equation), since it might be hard to understand what the penalty does exactly. But maybe this goes too far here... most of these params are explained in https://huggingface.co/blog/how-to-generate, so maybe we can add a link above "Parameters".
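For what the penalty does, a minimal sketch of the CTRL-style repetition penalty from the paper linked above (illustrative only; the real implementation operates on logit tensors): each already-generated token's logit is pushed away from being picked again, dividing positive logits and multiplying negative ones by the penalty.

```python
def apply_repetition_penalty(logits, generated_ids, penalty):
    out = list(logits)
    for tok in set(generated_ids):
        # Divide positive logits, multiply negative ones, so the score
        # always moves away from re-selecting the token.
        out[tok] = out[tok] * penalty if out[tok] < 0 else out[tok] / penalty
    return out

print(apply_repetition_penalty([2.0, -1.0, 0.5], generated_ids=[0, 1], penalty=1.2))
```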
eos_token_id (:obj:`int`, `optional`):
    The id of the `end-of-stream` token.
length_penalty (:obj:`float`, `optional`, defaults to 1.0):
    Exponential penalty to the length. 1.0 means no penalty.
This param can be confusing, as a penalty < 1.0 penalizes length whereas a penalty > 1.0 encourages longer sequences. Maybe we can add a sentence like:
Note: to encourage the model to generate shorter sequences, set `length_penalty` < 1.0; to encourage it to produce longer sequences, set it > 1.0.
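A small sketch of why the direction is the way it is (assuming the common beam-scoring convention of dividing the summed log-probabilities by length raised to `length_penalty`): since log-probabilities are negative, a larger exponent shrinks the magnitude of long hypotheses' scores more, making them rank higher.

```python
def beam_score(log_probs, length_penalty):
    # Rank a beam hypothesis by total log-probability, normalized by
    # length ** length_penalty.
    return sum(log_probs) / (len(log_probs) ** length_penalty)

short = [-1.0, -1.0]                # 2 tokens, per-token log-prob -1.0
long_ = [-1.0, -1.0, -1.0, -1.0]    # 4 tokens, same per-token log-prob

# With penalty > 1.0 the longer hypothesis scores higher (less negative);
# with penalty < 1.0 the shorter one wins.
print(beam_score(short, 2.0), beam_score(long_, 2.0))
print(beam_score(short, 0.5), beam_score(long_, 0.5))
```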
See #4915
Looks great!
Would add one sentence for `length_penalty` to avoid confusion, as noted in issue #4915.
* change unique_no_split_tokens's type to set * use sorted list instead of set * style
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
This reverts commit 883b0ea.
Add documentation (and clean docstrings) of `GenerationMixin` and `TFGenerationMixin`.