Add LLaMA Causal LM with 7B presets #1526

Merged
merged 7 commits into from
Mar 28, 2024

Conversation

tirthasheshpatel
Contributor

@tirthasheshpatel tirthasheshpatel commented Mar 27, 2024

This PR adds the LLaMA Causal LM along with a weight conversion script for the 7B presets (LLaMA 7B and LLaMA Chat 7B).

Tested that the outputs of the models match within a tolerance of 1e-4 on CPU in float32.
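A tolerance check of this kind can be sketched as follows. This is only an illustration of comparing two sets of outputs against an absolute tolerance of 1e-4; the helper names and the sample values are hypothetical and are not taken from the PR's test suite or conversion script.

```python
def max_abs_diff(reference, candidate):
    """Return the largest element-wise absolute difference between two sequences."""
    if len(reference) != len(candidate):
        raise ValueError("sequences must have the same length")
    return max((abs(r - c) for r, c in zip(reference, candidate)), default=0.0)


def outputs_match(reference, candidate, atol=1e-4):
    """True when every element differs by at most `atol` (absolute tolerance)."""
    return max_abs_diff(reference, candidate) <= atol


# Hypothetical float32-style logits from a reference model and a converted model.
reference_logits = [0.12345, -1.50000, 2.71828]
converted_logits = [0.12349, -1.50003, 2.71825]

print(outputs_match(reference_logits, converted_logits))
```

In practice the same comparison is usually done on whole arrays with a library routine; the plain-Python version above just makes the criterion explicit.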

TODO:

  • Upload the presets on Kaggle.
  • Look into why Hugging Face offers different versions of the rotary embedding layers.
    • Looks like it's just an opt-in if the user wants to experiment or train a model from scratch. Might not be something we want at this stage.
  • Implement a fixed-size cache and cache update. Most likely, this will be left for a follow-up PR.
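For context on the last item, the idea of a fixed-size cache with an in-place update can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the implementation planned for the follow-up PR: the class name `FixedSizeCache` and its methods are hypothetical, and plain floats stand in for the per-token key and value tensors.

```python
class FixedSizeCache:
    """Preallocated key/value cache whose size never changes after creation."""

    def __init__(self, max_length, fill=0.0):
        self.max_length = max_length
        # Allocate every slot up front so updates never grow the buffers.
        self.keys = [fill] * max_length
        self.values = [fill] * max_length
        self.length = 0  # number of slots written so far

    def update(self, start_index, new_keys, new_values):
        """Write new entries into the slots starting at `start_index`."""
        if len(new_keys) != len(new_values):
            raise ValueError("keys and values must have the same length")
        end = start_index + len(new_keys)
        if start_index < 0 or end > self.max_length:
            raise IndexError("update does not fit in the fixed-size cache")
        # Equal-length slice assignment overwrites in place without resizing.
        self.keys[start_index:end] = new_keys
        self.values[start_index:end] = new_values
        self.length = max(self.length, end)
        return self.keys, self.values


cache = FixedSizeCache(max_length=4)
cache.update(0, [1.0, 2.0], [10.0, 20.0])  # prompt tokens
cache.update(2, [3.0], [30.0])             # one newly generated token
print(cache.keys, cache.length)
```

Keeping the buffers at a constant size is what lets the shapes stay static across generation steps; only the write position changes.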

Member

@mattdangerw mattdangerw left a comment

Thanks! One nit.

keras_nlp/models/llama/llama_causal_lm.py (outdated, resolved)
@mattdangerw mattdangerw added the kokoro:force-run Runs Tests on GPU label Mar 27, 2024
@kokoro-team kokoro-team removed the kokoro:force-run Runs Tests on GPU label Mar 27, 2024
@tirthasheshpatel tirthasheshpatel added the kokoro:force-run Runs Tests on GPU label Mar 27, 2024
@kokoro-team kokoro-team removed the kokoro:force-run Runs Tests on GPU label Mar 27, 2024
@tirthasheshpatel
Contributor Author

Changes on master are breaking the tests in this PR. Will sync and push the final changes.

@tirthasheshpatel tirthasheshpatel added the kokoro:force-run Runs Tests on GPU label Mar 28, 2024
@kokoro-team kokoro-team removed the kokoro:force-run Runs Tests on GPU label Mar 28, 2024
@mattdangerw
Member

Thanks!

@mattdangerw mattdangerw merged commit 62fbbff into keras-team:master Mar 28, 2024
10 checks passed
abuelnasr0 pushed a commit to abuelnasr0/keras-nlp that referenced this pull request Apr 2, 2024
* Add LLaMA Causal LM

* Add causal lm to the public API

* Update preset names and fix checkpoint script

* Fix discrepancies and add tests

* Add tests for CausalLM

* end_token -> stop_token_ids
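The last commit message suggests a move from a single end token to a collection of stop token IDs. The sketch below only illustrates that general idea, cutting a generated sequence at the first occurrence of any stop ID; the function name `truncate_at_stop` and the token values are hypothetical and do not reproduce the library's generation code.

```python
def truncate_at_stop(token_ids, stop_token_ids):
    """Return the tokens that precede the first stop token, if any."""
    stop_set = set(stop_token_ids)  # set lookup keeps the scan linear
    for position, token_id in enumerate(token_ids):
        if token_id in stop_set:
            return list(token_ids[:position])
    # No stop token was produced, so keep the whole sequence.
    return list(token_ids)


generated = [5, 9, 2, 7, 1]
print(truncate_at_stop(generated, stop_token_ids=[2, 1]))
```

Accepting several IDs covers models that can end a turn with more than one special token, while a single-element list behaves like the earlier single end token.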
3 participants