Description
I'm using your library with phi-2 on an Android device (after updating the llama.cpp version). I've noticed that generation somehow seems to ignore or skip end-of-stream tokens. For example, here's the output from llama.cpp itself:
prompt:
<|im_start|>system
You are a helpful assistant<|im_end|>
<|im_start|>user
hello<|im_end|>
<|im_start|>assistant
output:
<|im_start|>system
You are a helpful assistant
<|im_start|>user
hello
<|im_start|>assistant
Hello! How can I assist you today? [end of text]
When using jllama, it looks like this:
<|im_start|>system
You are a helpful assistant
<|im_start|>user
hello
<|im_start|>assistant
Hello! How can I assist you today?<|im_end|>
<|im_start|>user
[more output]
Note that the output includes the <|im_end|> token as plain text, followed by the start token for the next user prompt, which the model then goes on to fill in itself ;)
Looking at the llama.cpp source, it seems to stop here: https://github.com/ggerganov/llama.cpp/blob/master/examples/main/main.cpp#L896. There's a similar condition in jllama here: https://github.com/kherud/java-llama.cpp/blob/master/src/main/cpp/jllama.cpp#L831, but since I'm not that familiar with these bindings or with llama.cpp internals, I haven't figured out how the two conditions differ.
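For reference, the check in main.cpp around that line looks roughly like this (a simplified sketch; the exact variable names and API differ between llama.cpp versions), and it's also where the "[end of text]" marker in the output above appears to come from:

```cpp
// simplified sketch of llama.cpp's stop condition near main.cpp#L896
// (embd holds the tokens from the current decode step; names vary by version)
if (!embd.empty() && embd.back() == llama_token_eos(model)) {
    LOG_TEE(" [end of text]\n");
    break;
}
```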
Debugging this some more, it seems like the token generated when the "faulty" <|im_end|> is emitted is not the end-of-stream token, but the token for <. In other words, the model appears to be spelling the tag out as ordinary text, token by token, so a comparison against a single end-of-stream token id would never match.
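One quick way to check this would be to tokenize the tag directly and compare the result against the model's EOS id. A minimal sketch, assuming the llama.cpp C API of roughly that era (exact signatures vary between versions, and dump_im_end_tokens is just a hypothetical helper name):

```cpp
#include <cstdio>
#include <cstring>
#include <vector>
#include "llama.h"

// Diagnostic sketch: print the model's EOS id and the tokens that
// "<|im_end|>" actually tokenizes to. If the tag splits into several
// ordinary tokens (the first being "<"), the EOS comparison in the
// stop condition can never match what the model generates.
static void dump_im_end_tokens(const llama_model * model) {
    const char * tag = "<|im_end|>";
    std::vector<llama_token> toks(16);
    int n = llama_tokenize(model, tag, (int) strlen(tag),
                           toks.data(), (int) toks.size(),
                           /*add_bos=*/false, /*special=*/true);
    printf("eos id: %d\n", llama_token_eos(model));
    for (int i = 0; i < n; i++) {
        printf("tok[%d] = %d\n", i, toks[i]);
    }
}
```

If the GGUF vocab for phi-2 doesn't map <|im_end|> to a single special token (or special-token parsing is disabled), the tag would come out as several plain tokens here, which would be consistent with the behavior above.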