
Fix llama3 urls + chat completion termination + nightlies in readme #443

Merged
@mreso merged 6 commits into main from fix/urls_readme_chat on Apr 19, 2024

Conversation

@mreso (Contributor) commented on Apr 19, 2024

What does this PR do?

This PR:

  • Fixes the llama3 identifiers in the tests
  • Corrects the termination condition in the chat completion example
  • Removes the nightlies installation from the readme

Fixes # (issue)

Feature/Issue validation/testing

Please describe the tests that you ran to verify your changes and summarize the relevant results. Provide instructions so they can be reproduced.
Please also list any relevant details of your test configuration.

  • Test A
    Logs for Test A

  • Test B
    Logs for Test B

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

Thanks for contributing 🎉!

@mreso requested a review from @subramen on April 19, 2024 00:49
README.md (Outdated)
@@ -23,10 +23,24 @@ The 'llama-recipes' repository is a companion to the [Meta Llama 2](https://gith
>
> {{ user_message_2 }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
> ```
> More details on the new tokenizer and prompt template: <PLACEHOLDER_URL>
>
> To signal the end of the current message the model emits the `<\|eot_id\|>` token. To terminate the generation we need to call the model's generate function as follows:
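
The generate call itself is truncated in this diff view. Below is a minimal sketch of what such a call might look like, assuming the Hugging Face transformers API; the model id, prompt, and max_new_tokens value are illustrative, not taken from this PR:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # illustrative checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

chat = [{"role": "user", "content": "Hello!"}]  # illustrative prompt
input_ids = tokenizer.apply_chat_template(
    chat, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Stop on either terminator: the final <|end_of_text|> (the tokenizer's
# default eos_token) or the per-message <|eot_id|>.
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]
outputs = model.generate(input_ids, max_new_tokens=256, eos_token_id=terminators)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```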
Contributor commented:

message the model => message, the model

Contributor commented:

generation we => generation, we

Contributor commented:

@mreso is it that the two EOS terminators should be used to stop generation early? Wondering if that matches our description here, or if we need a bit of lingo?

@subramen (Contributor) commented on Apr 19, 2024:
Is the eos_token_id arg in model.generate specifying the stop sequence for generation?

I think some lingo around understanding the difference between eot_id and end_of_text usage would be helpful

@mreso (Contributor, Author) commented:

Thanks for the comments! Yes, the eos_token_id is the one that's checked in the stopping criteria, and usually it's set to <|end_of_text|>. But for dialog-style prompts the model is trained to use <|eot_id|> (probably to distinguish it from the more final end of sequence). That's why we need to replace the eos_token_id with the latter id. Otherwise generate rambles on, as in this example:

Model output:
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

Always answer with emojis<|eot_id|><|start_header_id|>user<|end_header_id|>

How to go from Beijing to NY?<|eot_id|><|start_header_id|>assistant<|end_header_id|>

🛫️🚀🛬<|eot_id|><|start_header_id|>assistant<|end_header_id|>

🏨🛬🇨🇳 🕰️ 12+hours o
💺Business Class
[...]

The model learned that after an <|eot_id|> comes another header, so it adds <|start_header_id|>assistant<|end_header_id|> and then another response follows. (The header is usually appended by the chat template, not the model.)
If we exchange the eos_token_id in generate, it stops after the model emits the first <|eot_id|>:

Model output:
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

Always answer with emojis<|eot_id|><|start_header_id|>user<|end_header_id|>

How to go from Beijing to NY?<|eot_id|><|start_header_id|>assistant<|end_header_id|>

✈️ 🗼️🛬<|eot_id|>

Will rework the text accordingly before merging.
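
A short sketch of the contrast described above, mirroring the emoji example; it assumes `model` and `tokenizer` are loaded as in the earlier sketch, and is a hypothetical repro rather than the exact code from this PR:

```python
# Hypothetical repro of the behaviour above; assumes `model` and `tokenizer`
# were loaded for a Llama 3 Instruct checkpoint as in the earlier sketch.
chat = [
    {"role": "system", "content": "Always answer with emojis"},
    {"role": "user", "content": "How to go from Beijing to NY?"},
]
input_ids = tokenizer.apply_chat_template(
    chat, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Default stopping criterion (<|end_of_text|> only): the model keeps
# appending new assistant headers and "rambles on".
rambling = model.generate(input_ids, max_new_tokens=128)

# Swapped eos_token_id: generation stops at the first <|eot_id|>.
eot_id = tokenizer.convert_tokens_to_ids("<|eot_id|>")
concise = model.generate(input_ids, max_new_tokens=128, eos_token_id=eot_id)
```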

Contributor commented:
So this should have been addressed in this PR, which has been merged: https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct/discussions/4/files
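
With that Hub PR merged, the checkpoint's generation_config.json should list both terminators, so a plain generate() call picks them up by default. A quick way to verify, assuming access to the gated checkpoint (the ids shown are what that PR sets for Llama 3: <|end_of_text|> = 128001, <|eot_id|> = 128009):

```python
from transformers import AutoModelForCausalLM

# Loads the checkpoint's generation config along with the weights.
model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
print(model.generation_config.eos_token_id)  # expected: [128001, 128009]
```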

@HamidShojanazeri (Contributor) reviewed:
Thanks @mreso, just a minor comment on whether we need to tune the language a bit; if not, I am fine with it.


@subramen (Contributor) reviewed:
LGTM! Only added a minor suggestion to include detail on how developers should use eos_tokens.

@HamidShojanazeri (Contributor) reviewed:
Just left minor comments we discussed offline; otherwise LGTM, thanks!

@mreso merged commit 79aa704 into main on Apr 19, 2024
3 checks passed
@mreso deleted the fix/urls_readme_chat branch on April 19, 2024 23:15
