
Fix Max New Tokens in HF's Generation Config #257

Open · wants to merge 1 commit into main

Conversation

mostafaelhoushi

HuggingFace's `max_length` generation setting corresponds to the total length of the prompt plus the generated output, while `max_new_tokens` corresponds to the length of the generated output only.

Using `args.max_length_generation` to set `max_length` led to runtime errors because the total length of prompt + generation would exceed the intended value. Using `args.max_length_generation` to set `max_new_tokens` instead fixed those runtime errors for me.
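A minimal sketch of the difference (illustrative only, not the actual diff; `max_length_generation` stands in for `args.max_length_generation`, and `gpt2` is just a small placeholder model):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Stand-in for args.max_length_generation from the harness CLI.
max_length_generation = 512

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = tokenizer("def fibonacci(n):", return_tensors="pt")

# Before: max_length budgets prompt + generation together, so a long
# prompt eats into (or exhausts) the room left for new tokens.
out_total = model.generate(**inputs, max_length=max_length_generation)

# After: max_new_tokens budgets only the generated tokens, so the
# prompt length no longer reduces the generation budget.
out_new = model.generate(**inputs, max_new_tokens=max_length_generation)
```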


kbmlcoding commented Jul 23, 2024

Thanks for fixing it. This is the message I am seeing in the logs as well when running humaneval against the llama2-7b-chat-hf model:

```
bigcode-evaluation-harness/bigcode_eval/utils.py:361: UserWarning: An error with the following message was thrown: Input length of input_ids is 1000, but max_length is set to 1000. This can lead to unexpected behavior. You should consider increasing max_length or, better yet, setting max_new_tokens.. Returning the input as the generation, for higher scores consider using a larger max_length
2024-07-23 11:50:32 EDT code_eval line: 74: [INFO] warnings.warn(f"An error with the following message was thrown: {e}. Returning the input as the generation, for higher scores consider using a larger max_length")
```

Adding more details for clarity, per the official API doc from HF: https://huggingface.co/docs/transformers/en/main_classes/text_generation

> `max_length` (`int`, *optional*, defaults to 20) — The maximum length the generated tokens can have. Corresponds to the length of the input prompt + `max_new_tokens`. Its effect is overridden by `max_new_tokens`, if also set.
>
> `max_new_tokens` (`int`, *optional*) — The maximum numbers of tokens to generate, ignoring the number of tokens in the prompt.
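A small illustration of the override rule described above (hypothetical values, not taken from the harness):

```python
from transformers import GenerationConfig

# If both are set, max_new_tokens takes precedence over max_length.
config = GenerationConfig(max_length=1000, max_new_tokens=128)

# With a 1000-token prompt:
#   max_length=1000 alone -> the prompt already fills the budget, leaving
#                            0 new tokens (the UserWarning quoted above)
#   max_new_tokens=128    -> 128 tokens are generated regardless of the
#                            prompt length
print(config.max_length, config.max_new_tokens)
```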
