SmolGPT (49M) does not perform very well. I gave it the following prompt:
def add_one(x):
It completed it as follows:
def add_one(x):
    return 10
I think that there are a number of issues.
First, the model could of course be bigger. With more optimised kernels, modern techniques like rotary embeddings, and multi-GPU support, we will hopefully be able to train a larger model in a reasonable amount of time.
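For anyone unfamiliar with rotary embeddings, here is a minimal pure-Python sketch of the idea. It is not SmolGPT code: `rotary_embed`, `dot`, and the `base=10000.0` default are illustrative assumptions, and a real implementation would operate on batched tensors. Each consecutive pair of vector components is rotated by an angle proportional to the token position, so the dot product between a rotated query and a rotated key depends only on their relative offset.

```python
import math


def rotary_embed(vec, position, base=10000.0):
    """Rotate consecutive pairs of `vec` by position-dependent angles.

    Hypothetical helper for illustration; `base` is an assumed default.
    """
    if len(vec) % 2 != 0:
        raise ValueError("vector length must be even")
    dim = len(vec)
    out = []
    for i in range(0, dim, 2):
        # Lower-index pairs rotate quickly, higher-index pairs slowly.
        theta = position * base ** (-i / dim)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        # Standard 2D rotation of the pair (x, y) by angle theta.
        out.extend([x * c - y * s, x * s + y * c])
    return out


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


q = [1.0, 2.0, 3.0, 4.0]
k = [0.5, -1.0, 2.0, 0.25]

# Both pairs of positions are 2 apart, so the two scores should match
# up to floating-point error.
score_a = dot(rotary_embed(q, 3), rotary_embed(k, 1))
score_b = dot(rotary_embed(q, 7), rotary_embed(k, 5))
print(math.isclose(score_a, score_b, rel_tol=1e-9, abs_tol=1e-9))
```

Because the rotation carries the position information, no separate learned position table is needed in this scheme.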
Second, the dataset can probably be improved. More evaluation is needed, but I think adding some web text has the potential to make the model easier to use.
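As a starting point for that evaluation, a functional check along these lines could catch completions like the one above. This is only a sketch under stated assumptions: `passes_check` and the test cases are hypothetical and not part of the repository, and `exec` on model output should only ever be run in a sandbox.

```python
def passes_check(source, func_name, cases):
    """Return True if the function defined in `source` passes every case.

    Hypothetical helper. `cases` is a list of (args, expected) tuples.
    Any exception (syntax error, missing function, runtime error)
    counts as a failure.
    """
    namespace = {}
    try:
        # WARNING: executes untrusted code; sandbox this in practice.
        exec(source, namespace)
        func = namespace[func_name]
        return all(func(*args) == expected for args, expected in cases)
    except Exception:
        return False


# The completion reported above, with whitespace restored.
completion = "def add_one(x):\n    return 10\n"

# Assumed test cases. Note that add_one(9) == 10 would pass on its own,
# which is why more than one case is needed.
cases = [((1,), 2), ((9,), 10), ((-1,), 0)]

print(passes_check(completion, "add_one", cases))  # prints False
```

Running a set of prompts like this before and after a dataset change would give a rough pass rate to compare.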