❓ The question
This is a cross-post from allenai/OLMo-Eval#31 for visibility.
I ran `olmo_eval` with allenai/OLMo-1B on the Paloma dataset and noticed two issues:
1. The evaluation metrics seem worse than I anticipated, especially `ppl_token`, which seems too high. I wonder whether that is an averaged or a summed measurement. I added a screenshot below, and you can see my full results in this Google Sheet.
2. This may be related to 1. I noticed that when allenai/OLMo-1B is initialized, there is a warning that many of the weights (if not all) are not initialized correctly. From the log below, it seems to be trying to initialize with the class `OlmoForCausalLM`.

The configuration and environment I used to reproduce these results can be found in allenai/OLMo-Eval#31.
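For context on issue 1, here is a minimal sketch of the distinction I am asking about (this is my own illustration of the conventional definition, not the OLMo-Eval implementation): per-token perplexity is usually the exponential of the *average* negative log-likelihood over all scored tokens, whereas exponentiating a *summed* NLL grows with sequence length and would produce implausibly large values.

```python
import math

def perplexity_per_token(token_nlls):
    """Conventional per-token perplexity: exp of the mean negative
    log-likelihood (in nats) over all scored tokens."""
    return math.exp(sum(token_nlls) / len(token_nlls))

# Hypothetical per-token NLLs for a short sequence (illustrative values only):
nlls = [2.1, 1.8, 2.4, 2.0]

averaged = perplexity_per_token(nlls)  # exp(mean NLL): a plausible ppl
summed = math.exp(sum(nlls))           # exp(summed NLL): inflated by length
print(averaged, summed)
```

If the reported `ppl_token` behaves like the second quantity (growing with document length), that would explain the numbers in my sheet.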