Description
Hello. Following an exchange with u/janwas_, I'm opening this issue with a description of the problem and the steps to reproduce it.
The issue is that gemma.cpp produces much worse results with gemma-2-27b than other implementations: Gemma 2 in AI Studio and chatllm.cpp.
Here is the simplest prompt that breaks the model in gemma.cpp, an Italian proverb completion ("the cat goes to the lard so often that..."):
Completa la frase: tanto va la gatta al lardo che...
Gemma 2 on AI Studio and chatllm.cpp (at Q8_0) both reply with the only correct completion of the proverb ("she leaves her paw in it"):
ci lascia lo zampino
Instead, gemma.cpp, with the weights downloaded from Kaggle, replies with a string of Italian words that do not even form a grammatically correct sentence:
> Completa la frase: tanto va la gatta al lardo che...
[ Reading prompt ] ........................
...**ci si lascia un dente.**
Here is the launch command used for gemma.cpp (also tested with --temperature 0.01):
./gemma --tokenizer gemma-tokenizer.spm --model 27b-it --compressed_weights ./gemma-2-27b-it-sfp.sbs
Here is another simple problem that is easily solved by Gemma 2 on AI Studio and chatllm.cpp but not by gemma.cpp (the correct answer is 7 or 8; see the quick arithmetic check after the transcript):
> Matteo has 20 apples, he buys 20 oranges. Then he discards half of his fruits equally. Then he discards a quarter of his fruits equally between apples and oranges. How many apples remain?
[ Reading prompt ] .....................................................
Here's how to solve this problem step-by-step:
1. **Start with the total:** Matteo begins with 20 apples + 20 oranges = 40 fruits.
2. **First discard:** After discarding half, he has 40 / 2 = 20 apples left.
3. **Second discard:** He had 40 fruits, so after the first discard, he has 40 / 2 = 20 fruits left.
**Therefore, after discarding half of his apples and a quarter of his oranges, Matteo will have 20 apples remaining.**
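For reference, here is a quick check of the intended arithmetic. This is a minimal sketch based on my own reading of the puzzle (each discard split evenly between apples and oranges), which is the interpretation that yields the expected answer of 7 or 8:

```python
# Assumed interpretation of the puzzle, not part of the original report:
# each discard is split evenly between apples and oranges.
apples, oranges = 20, 20

# First discard: half of the 40 fruits (20), split evenly -> 10 of each removed.
removed_per_kind = (apples + oranges) // 2 // 2
apples -= removed_per_kind
oranges -= removed_per_kind        # 10 apples and 10 oranges remain

# Second discard: a quarter of the remaining 20 fruits (5), split evenly
# -> 2.5 of each removed, so 10 - 2.5 = 7.5 apples.
removed_per_kind = (apples + oranges) / 4 / 2
apples -= removed_per_kind

print(apples)  # 7.5 -> "7 or 8" depending on how the half fruit is rounded
```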
All tests were run against the Gemma 2 27B model.
The gemma.cpp commit is 8ac5d66.