ollama run qwen:0.5B, Reply exception, stuck in a loop. #2405
Comments
I had the same behavior with the phi-2 model. I noticed that the model gives the expected answer before starting a new line (`\n`), so I added `"\n"` to the stop list:

```js
const stream = await generate({
  model: "phi",
  prompt: text,
  stream: true,
  options: {
    num_predict: 70,
    temperature: 0.65,
    penalize_newline: true,
    top_p: 0.9,
    // presence_penalty: 0.6,
    stop: ["\n", "User:", "Assistant:"] // was: ["\n"]
  }
})
```

It still cuts off in the wrong place sometimes, but I can manage by removing the words after the last punctuation mark (`.` or `,`).
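That trimming step could be sketched like this. This is a minimal illustration, not part of the Ollama API; `trimToLastPunctuation` is a hypothetical helper name:

```javascript
// Hypothetical helper: trim a streamed completion back to the last
// punctuation mark, dropping any dangling partial words after it.
function trimToLastPunctuation(text) {
  // Find the last occurrence of any sentence- or clause-ending mark.
  const idx = Math.max(
    text.lastIndexOf("."),
    text.lastIndexOf(","),
    text.lastIndexOf("!"),
    text.lastIndexOf("?")
  );
  // If no punctuation is found, return the text unchanged.
  return idx === -1 ? text : text.slice(0, idx + 1);
}

console.log(trimToLastPunctuation("The answer is 42. And then some dangl"));
// → "The answer is 42."
```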
The infinite generation should be fixed now. As for the poor responses from smaller models, this may stem from the prompt template, the prompt itself, or other factors, all of which we are trying to improve.
@jmorganca So we should update the Ollama binary version then, right?
```
$ uname -m -s -r
Darwin 23.3.0 arm64
```
[screen recording attachment: 20240208104056.mp4]
/label bug