Is there something wrong with 'google/gemma-1.1-2b-it'? #1854
Comments
Hi! Could you rerun with `add_bos_token=True`? For reasons unclear to me, Gemma performance is dramatically lower when it does not receive a BOS token.
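The BOS sensitivity described above boils down to whether a `<bos>` id is prepended to the prompt's token ids before scoring. A minimal sketch of that prepending logic (this is an illustration, not the harness's actual code; the id `2` for Gemma's `<bos>` and the sample ids are assumptions):

```python
GEMMA_BOS_ID = 2  # assumed <bos> id for Gemma tokenizers

def add_bos(ids, bos_token_id=GEMMA_BOS_ID):
    """Prepend the BOS token id unless it is already present."""
    if not ids or ids[0] != bos_token_id:
        return [bos_token_id] + ids
    return ids

# Hypothetical prompt ids, with and without an existing BOS:
print(add_bos([105, 2364, 235336]))     # BOS gets prepended
print(add_bos([2, 105, 2364, 235336]))  # already has BOS; unchanged
```

With `add_bos_token=True` in `--model_args`, the harness arranges for the tokenizer to produce the BOS-prefixed form, which is what Gemma was trained to expect at the start of every sequence.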
Thanks, happy to know about this flag; I will try it later. But I still have some questions about Gemma. One is that the it (instruction-tuned) version is dramatically worse than the base version (10 points lower on GSM8K); the other is that the Gemma model gets a lower score on gsm8k-cot than on gsm8k. However, I have tried a GSM8K script from the official Gemma GitHub, which does show considerable benefit from the CoT prompt. I tried to align the HF eval and the DeepMind script, but in vain. Would you be willing to check these two strange issues? Thanks for your brilliant work here.
Hi @haileyschoelkopf, I have tried `add_bos_token=True` like this: `lm_eval --model vllm --model_args pretrained=gemma-2b,add_bos_token=True --tasks gsm8k-cot --batch_size auto`.
Using the lm_eval harness `hf` model to test gsm8k_cot gets 16.76 (19.26), which is quite similar to my result.
I tested google/gemma-1.1-2b-it on gsm8k with the following command:

```shell
CUDA_VISIBLE_DEVICES=3 lm_eval --model vllm \
    --model_args pretrained=gemma-1.1-2b-it,dtype=auto,gpu_memory_utilization=0.8 \
    --tasks gsm8k \
    --batch_size auto
```
My result is:
I think this is a weird result, since gemma-2b reportedly scores close to 18 while google/gemma-1.1-2b-it here gets less than 10...
Any ideas? 😢