
Conversation

@martin-gorner
Contributor

Starting from Llama 3.2, the models use tied embeddings, which means that the checkpoints no longer have a separate set of weights for reverse embeddings. This change allows the Transformers "tie_word_embeddings" setting to be read from config.json and instantiates the Llama3 ReversibleEmbedding class with the correct tie_weights setting.

Without this, loading Llama 3.2 errors out with the following error message:
"SafetensorError: File does not contain tensor lm_head.weight"

@google-cla

google-cla bot commented Sep 30, 2024

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@martin-gorner
Contributor Author

martin-gorner commented Sep 30, 2024

@osanseviero

Hi 👋 The Llama 2 model you link to (https://huggingface.co/meta-llama/Llama-2-70b-chat) is not a transformers-compatible repository but rather the original research checkpoints that were released. The transformers-compatible repo is https://huggingface.co/meta-llama/Llama-2-70b-chat-hf.

Since Llama 3, Meta has released the models with the transformers-compatible weights as the primary release artifact, with the original research checkpoints in an original repository on the Hub.
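
For reference, the transformers-compatible repo is the one that loads directly with the Transformers API. A rough sketch (assuming access to the gated meta-llama repo):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Loads the "-hf" repo, which contains transformers-compatible weights.
# The plain meta-llama/Llama-2-70b-chat repo holds only the original
# research checkpoints and cannot be loaded this way.
model_id = "meta-llama/Llama-2-70b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
```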

Collaborator

@divyashreepathihalli left a comment


Thanks for the update, Martin!

@divyashreepathihalli merged commit eb13900 into keras-team:master on Sep 30, 2024