Add support for Cerebras-GPT for training #2276
Conversation
model/model_training/utils.py
@@ -189,6 +190,9 @@ def get_tokenizer(conf) -> transformers.AutoTokenizer:
         # explicitly specify LLaMATokenizer class until AutoTokenizer works
         # assumes that the tokenizer config is stored in the same directory as the model weights
         tokenizer = LLaMATokenizer.from_pretrained(conf.model_name)
+    elif "cerebras" in conf.model_name:
+        # Cerebras tokenizer for 13B is the tokenizer for all sizes
Does this have to be specified? Are the other models released without a tokenizer config? Otherwise this could be removed; the special handling for LLaMA was also removed already.
13B is the only size with an accompanying tokenizer uploaded to HuggingFace, so if we remove this line and run with any size other than 13B we get an error along the lines of: could not find tokenizer cerebras/Cerebras-GPT-6.7B
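The fallback described above can be sketched as a small helper that maps every Cerebras-GPT size to the 13B tokenizer (all sizes share one vocabulary, but only 13B ships with a tokenizer on the Hub). The function name `resolve_tokenizer_name` is hypothetical, not part of the PR:

```python
def resolve_tokenizer_name(model_name: str) -> str:
    """Return the Hub repo to load a tokenizer from.

    All Cerebras-GPT sizes share one vocabulary, but only the 13B
    checkpoint has a tokenizer uploaded to HuggingFace, so every
    Cerebras size is mapped to the 13B tokenizer repo.
    """
    if "cerebras" in model_name.lower():
        # Hypothetical fallback mirroring the PR's special-casing.
        return "cerebras/Cerebras-GPT-13B"
    return model_name
```

The result would then be passed to `AutoTokenizer.from_pretrained(...)`, so loading `cerebras/Cerebras-GPT-6.7B` no longer fails with a missing-tokenizer error.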
As part of resolving merge conflicts with the LLaMA change, I have tweaked this and added a comment explaining why it is done.
LGTM!
This adds configs for the 13B and 6.7B Cerebras models; with the necessary code changes in place, it should be easy enough to add configs for the smaller models too if desired. The tokenizer appears to be the GPT-2 fast tokenizer from HuggingFace, so the special tokens have been configured for that.
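For illustration, a training config for one of the new sizes might look roughly like the fragment below. This is a hedged sketch only: the key names (`model_name`, `special_tokens`, etc.) are illustrative and may not match the repo's actual config schema; the `<|endoftext|>` token is the GPT-2 tokenizer's standard end-of-text marker.

```yaml
# Hypothetical config sketch for a Cerebras-GPT model entry.
cerebras-gpt-6.7b:
  model_name: cerebras/Cerebras-GPT-6.7B
  # GPT-2 fast tokenizer defines a single special token; reusing it
  # for padding is a common choice when no dedicated pad token exists.
  special_tokens:
    eos_token: "<|endoftext|>"
    pad_token: "<|endoftext|>"
```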