
feat: Configurable EOS token when using the server #2758

Open · Tracked by #2781
Propheticus opened this issue Apr 19, 2024 · 4 comments
Labels: type: bug Something isn't working

@Propheticus

To run Llama 3 models properly, you need to set the stop token <|eot_id|>.
This is currently not configurable when running Jan in API server mode.
The model is automatically loaded by llama.cpp with:

llm_load_print_meta: BOS token        = 128000 '<|begin_of_text|>'
llm_load_print_meta: EOS token        = 128001 '<|end_of_text|>'

As a result, the model does not stop generating when it should: it emits <|eot_id|>assistant\n\n in its output and keeps generating several responses/turns.

Of course, a fix for llama.cpp is already being made, and it will surely come to Nitro Cortex.
Still, having this configurable would be nice.

@Propheticus Propheticus added the type: bug Something isn't working label Apr 19, 2024
@Propheticus (Author)

Sorry, this started out as a bug report (Llama 3 not working via the server), but I chose a more positive approach... That did not change the label, though 😞

@Propheticus (Author)

Right, so I found that you can actually specify the stop token in the API call, below the messages array:
"stop": ["<|eot_id|>"]
Too bad not all apps that make RESTful (OpenAI) API calls allow this to be set.
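For anyone else hitting this, a minimal request body sketch is below (the model id and message are just placeholders; this assumes an OpenAI-compatible /v1/chat/completions endpoint):

    {
      "model": "llama3-8b-instruct",
      "messages": [
        { "role": "user", "content": "Hello!" }
      ],
      "stop": ["<|eot_id|>"]
    }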

@louis-jan (Contributor)

Yes, it is quite confusing right now. We will work on the API server so it communicates directly with model.json; that way the stop token can be set by default, and the chat/completion message settings can override it.
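For illustration, such a default could look something like the sketch below in model.json (the field names here are assumptions based on how Jan groups inference parameters, not a confirmed schema):

    {
      "id": "llama3-8b-instruct",
      "parameters": {
        "stop": ["<|eot_id|>"]
      }
    }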

@louis-jan (Contributor)

Will be addressed when Jan x Integration is done @Van-QA

Projects: Status: Planned