Add support for Cerebras-GPT for training #2276

Merged: 4 commits into LAION-AI:main on Mar 31, 2023

Conversation

olliestanley (Collaborator)

This adds configs for the 13B and 6.7B Cerebras-GPT models; with the necessary code changes in place, it should be easy enough to add configs for the smaller models too if desired. The tokenizer appears to be HuggingFace's GPT-2 fast tokenizer, so the special tokens have been configured accordingly.
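For reference, a minimal sketch of what configuring special tokens on the GPT-2 fast tokenizer can look like; the specific token choices below are illustrative assumptions, not taken from this PR.

```python
# A minimal sketch, assuming the Cerebras checkpoints reuse HuggingFace's
# GPT-2 fast tokenizer. The special-token choices here are placeholders.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("cerebras/Cerebras-GPT-13B", use_fast=True)

# GPT-2 only defines <|endoftext|>, so tokens such as a pad token have to be
# set explicitly before training; these values are examples, not the PR's.
tokenizer.add_special_tokens({"pad_token": "<|endoftext|>", "sep_token": "<|endoftext|>"})

print(tokenizer.eos_token, tokenizer.pad_token, tokenizer.sep_token)
```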

@@ -189,6 +190,9 @@ def get_tokenizer(conf) -> transformers.AutoTokenizer:
# explicitly specify LLaMATokenizer class until AutoTokenizer works
# assumes that the tokenizer config is stored in the same directory as the model weights
tokenizer = LLaMATokenizer.from_pretrained(conf.model_name)
elif "cerebras" in conf.model_name:
# Cerebras tokenizer for 13B is the tokenizer for all sizes
Collaborator
Does this have to be specified? Are the other models released without a tokenizer config? Otherwise this could be removed; the special handling for LLaMA was also removed already.

Collaborator (Author)
13B is the only model which has an accompanying tokenizer uploaded to HuggingFace, so if we remove this line and run with any size other than 13B we get an error saying, e.g., that the tokenizer cerebras/Cerebras-GPT-6.7B could not be found.
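A sketch of the workaround being described, assuming the branch simply points every Cerebras-GPT size at the 13B tokenizer; the exact code in the PR may differ.

```python
# Sketch (not a verbatim copy of the PR): only cerebras/Cerebras-GPT-13B ships
# tokenizer files on HuggingFace, so all Cerebras-GPT sizes reuse that tokenizer.
import transformers


def get_cerebras_tokenizer(model_name: str) -> transformers.PreTrainedTokenizerBase:
    if "cerebras" in model_name.lower():
        # e.g. cerebras/Cerebras-GPT-6.7B has no tokenizer files of its own
        return transformers.AutoTokenizer.from_pretrained("cerebras/Cerebras-GPT-13B")
    return transformers.AutoTokenizer.from_pretrained(model_name)
```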

Collaborator (Author)
As part of resolving merge conflicts with the LLaMA change, I have tweaked this and added a comment explaining why it is done.

Collaborator

@andreaskoepf left a comment

LGTM!

@andreaskoepf andreaskoepf merged commit 05d2895 into LAION-AI:main Mar 31, 2023
yk pushed a commit that referenced this pull request Apr 2, 2023