To properly run Llama 3 models, you need to set the stop token `<|eot_id|>`.
This is currently not configurable when running Jan in API server mode.
The model is automatically loaded by llama.cpp with
This causes the model not to stop generating when it should: it emits `<|eot_id|>assistant\n\n` in its output and keeps generating several responses/turns.
Of course, a fix for llama.cpp is already in the works and will surely make its way into Nitro Cortex.
Still, having this configurable would be nice.
Right, so I found that you can actually specify the stop token in the API call, below the messages array: `"stop": ["<|eot_id|>"]` (see the example below).
Too bad not all apps that use RESTful (OpenAI-compatible) API calls allow this to be set.
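For reference, a minimal chat/completions request body with the stop sequence set might look like the sketch below. The model ID is just a placeholder; use whatever your local API server actually exposes.

```json
{
  "model": "llama3-8b-instruct",
  "messages": [
    { "role": "user", "content": "Hello!" }
  ],
  "stop": ["<|eot_id|>"]
}
```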
Yes, it is quite confusing right now. We will work on the API server so it communicates directly with model.json; that way the stop token can be set by default there, and the chat/completions request settings can still override it.
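As a rough sketch of what that could look like, assuming the stop sequence sits alongside the other inference parameters in model.json (the field names here are illustrative, not a confirmed schema):

```json
{
  "id": "llama3-8b-instruct",
  "parameters": {
    "stop": ["<|eot_id|>"]
  }
}
```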