
Error with special tokens tokenization #838

Closed

earzamastsev opened this issue Oct 23, 2023 · 2 comments
earzamastsev commented Oct 23, 2023

I'm trying to use the OpenAI-like API with a Vicuna LLM: `python3 -m llama_cpp.server --n_gpu_layers 43 --model ./models/vicuna-13b-v1.5.Q8_0.gguf --port 8010 --host 0.0.0.0 --chat_format vicuna`

Then I send this request to the /v1/chat/completions endpoint:
{
  "max_tokens": 1024,
  "temperature": 0.1,
  "messages": [
    {
      "content": "Hello, what is your name?",
      "role": "user"
    },
    {
      "content": "My name is AI-asisstant",
      "role": "assistant"
    },
    {
      "content": "Can you repeat your name please?",
      "role": "user"
    }
  ]
}
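
For reference, here is the same request as a minimal Python sketch (assuming the server from the command above is reachable on localhost:8010 and the `requests` package is available):

```python
import requests

# Reproduction sketch: POST the payload above to the local
# OpenAI-compatible server started with llama_cpp.server.
payload = {
    "max_tokens": 1024,
    "temperature": 0.1,
    "messages": [
        {"role": "user", "content": "Hello, what is your name?"},
        {"role": "assistant", "content": "My name is AI-asisstant"},
        {"role": "user", "content": "Can you repeat your name please?"},
    ],
}

response = requests.post(
    "http://localhost:8010/v1/chat/completions",
    json=payload,
    timeout=120,
)
print(response.json())
```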

Then I check the final prompt and the final tokens, and I see this:

PROMPT: A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: Hello, what is your name? ASSISTANT: My name is AI-asisstant </s> USER: Can you repeat your name please? ASSISTANT:

PROMPT TOKENS: [1, 319, 13563, 1546, 263, 12758, 1404, 322, 385, 23116, 21082, 20255, 29889, 450, 20255, 4076, 8444, 29892, 13173, 29892, 322, 1248, 568, 6089, 304, 278, 1404, 29915, 29879, 5155, 29889, 3148, 1001, 29901, 15043, 29892, 825, 338, 596, 1024, 29973, 319, 1799, 9047, 13566, 29901, 1619, 1024, 338, 319, 29902, 29899, 25101, 303, 424, 829, 29879, 29958, 11889, 29901, 1815, 366, 12312, 596, 1024, 3113, 29973, 319, 1799, 9047, 13566, 29901]

I see that the special token </s> was not correctly converted to its token id (it should be token id 2) but was instead tokenized as plain text, producing [829, 29879, 29958]. Is this a bug?
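
The behaviour can be reproduced with the tokenizer directly (a minimal sketch; note that the `special` keyword on `Llama.tokenize` is an assumption here, since it only exists in versions that expose the newer llama.cpp tokenizer flag discussed below):

```python
from llama_cpp import Llama

# Load only the vocabulary; enough to inspect tokenization.
llm = Llama(model_path="./models/vicuna-13b-v1.5.Q8_0.gguf", vocab_only=True)

# By default the text is tokenized as plain text, so "</s>" is split
# into the literal pieces "</", "s", ">".
print(llm.tokenize(b"</s>", add_bos=False))  # [829, 29879, 29958]

# With special-token parsing it should map to the single EOS id instead.
# NOTE: the `special` keyword is an assumption; older versions don't have it.
print(llm.tokenize(b"</s>", add_bos=False, special=True))  # expected: [2]
```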

I saw a similar issue discussed in the llama.cpp GitHub repo: ggerganov/llama.cpp#1812

antoine-lizee (Contributor) commented Oct 30, 2023

@abetlen I second the above.

I'm not sure where to share this given that it impacts a few different things. In particular, proper handling of special tokens, especially </s>, is key for any conversational application. Indeed, a few models (including the top ones: Llama 2, Mistral, etc.) rely on reserved tokens throughout the conversation; those are not "just" strings.

llama.cpp solved the problem only recently (in ggerganov/llama.cpp#3538), and it now works.

I think porting the change to llama-cpp-python will solve a few other problems that have been reported, e.g. chat formats that don't seem to work (#711).

Note: it might already be on your radar! Sorry if this is stating the obvious.

antoine-lizee pushed a commit to antoine-lizee/llama-cpp-python that referenced this issue Oct 30, 2023
antoine-lizee (Contributor) commented Oct 30, 2023

OK, after checking out the latest version of the repo: we do have the right bindings (per this commit), but we don't have the correct default behaviour. As a result, llama-cpp-python behaves differently from llama.cpp and doesn't treat special tokens as it should.

I've opened PR #850 to propose a fix.
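
Until the default behaviour changes, one workaround is to build the prompt tokens manually and splice in the real EOS id (a rough sketch, reusing the `llm` instance from the tokenization example above; the system prompt is omitted for brevity):

```python
# Workaround sketch: tokenize the text around the special token and
# insert the real EOS id by hand instead of embedding "</s>" as text.
prefix = llm.tokenize(b"USER: Hello, what is your name? ASSISTANT: My name is AI-asisstant ")
suffix = llm.tokenize(b" USER: Can you repeat your name please? ASSISTANT:", add_bos=False)
tokens = prefix + [llm.token_eos()] + suffix  # token_eos() -> 2 for Llama-family vocabularies
```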
