Completion seems to ignore EOS token #45

@hvisser

I'm using your library with phi-2 on an Android device (after updating the llama.cpp version). I've noticed that generation seems to ignore or skip end-of-stream (EOS) tokens. For example, here's the output from llama.cpp itself:

prompt:

<|im_start|>system
You are a helpful assistant<|im_end|>
<|im_start|>user
hello<|im_end|>
<|im_start|>assistant

output:

<|im_start|>system
You are a helpful assistant
<|im_start|>user
hello
<|im_start|>assistant
Hello! How can I assist you today? [end of text]

When using jllama it looks like this:

<|im_start|>system
You are a helpful assistant
<|im_start|>user
hello
<|im_start|>assistant
Hello! How can I assist you today?<|im_end|>
<|im_start|>user
[more output]

Note that the text includes the <|im_end|> token as well as the start token for the next user prompt, which the model is generating by itself ;)

Looking at the llama.cpp source, generation seems to stop here: https://github.com/ggerganov/llama.cpp/blob/master/examples/main/main.cpp#L896. There's a similar condition in jllama here: https://github.com/kherud/java-llama.cpp/blob/master/src/main/cpp/jllama.cpp#L831. Since I'm not very familiar with these bindings or with llama.cpp, I haven't figured out what is different about that condition.

Debugging this some more, it seems that when the "faulty" <|im_end|> is emitted, the token generated for the opening angle bracket is not the end-of-stream token but the ordinary token for <.
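
If the marker really arrives as a run of ordinary tokens, a check against the EOS token id can never fire. One possible client-side workaround is to watch the decoded text for the stop string instead. The following is only a minimal sketch of that idea: `StopStringFilter` and its methods are hypothetical names, not part of jllama or llama.cpp, and the token pieces in the usage example are made up for illustration (the real split depends on the tokenizer).

```java
import java.util.List;

/**
 * Hypothetical helper (not part of jllama or llama.cpp): accumulates decoded
 * token pieces and cuts the text at the first occurrence of a stop string.
 */
final class StopStringFilter {
    private final String stop;
    private final StringBuilder text = new StringBuilder();
    private boolean stopped = false;

    StopStringFilter(String stop) {
        this.stop = stop;
    }

    /** Appends one decoded piece; returns true once the stop string has been seen. */
    boolean accept(String piece) {
        if (stopped) {
            return true;
        }
        text.append(piece);
        // The stop string may be spread over several pieces, so search the buffer.
        int idx = text.indexOf(stop);
        if (idx >= 0) {
            text.setLength(idx); // drop the stop string and anything after it
            stopped = true;
        }
        return stopped;
    }

    /** Text collected so far, without the stop string. */
    String text() {
        return text.toString();
    }

    /** Feeds pieces in order and stops consuming at the first stop-string match. */
    static String generate(List<String> pieces, String stop) {
        StopStringFilter filter = new StopStringFilter(stop);
        for (String piece : pieces) {
            if (filter.accept(piece)) {
                break;
            }
        }
        return filter.text();
    }
}
```

For example, `StopStringFilter.generate(List.of("Hello!", "<", "|", "im", "_end", "|", ">", "more"), "<|im_end|>")` returns `Hello!`. This only hides the symptom on the caller's side; it does not explain why the EOS token itself is never sampled.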
