
Server: Handle n_keep parameter in the request #6174

Merged
phymbert merged 1 commit into ggml-org:master from get-wrecked:server_n_keep
Mar 20, 2024

Conversation

@jkarthic
Contributor

No description provided.

Comment thread examples/server/utils.hpp
llama_params["repeat_last_n"] = json_value(body, "repeat_last_n", default_sparams.penalty_last_n);
llama_params["ignore_eos"] = json_value(body, "ignore_eos", false);
llama_params["tfs_z"] = json_value(body, "tfs_z", default_sparams.tfs_z);
llama_params["n_keep"] = json_value(body, "n_keep", 0);
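The added line follows the same pattern as its neighbours: read an optional field from the request body and fall back to a default when the client omits it. Below is a minimal sketch of that pattern using only the standard library. `RequestBody`, `value_or_default`, and `forward_n_keep` are hypothetical names introduced for illustration, and the `std::map` is a stand-in for the JSON body; this is not the server's actual `json_value` implementation.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the parsed JSON request body:
// maps a field name to an integer value.
using RequestBody = std::map<std::string, int>;

// Return the value stored under `key` if the client supplied it,
// otherwise return `default_value`.
// This mirrors the intent of json_value(body, "n_keep", 0),
// not its actual implementation.
int value_or_default(const RequestBody & body, const std::string & key, int default_value) {
    const auto it = body.find(key);
    return it != body.end() ? it->second : default_value;
}

// Copy n_keep from the request into the forwarded parameters,
// defaulting to 0 when the request does not mention it.
RequestBody forward_n_keep(const RequestBody & body) {
    RequestBody llama_params;
    llama_params["n_keep"] = value_or_default(body, "n_keep", 0);
    return llama_params;
}
```

With this sketch, a request that omits `n_keep` forwards 0, while a request that sets it forwards the client's value unchanged.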
Collaborator


Hello, thanks, but @ggerganov @ngxson, I worry this is actually not OAI-compatible?

Contributor


We can consider it an "extension" to OAI. For example, tfs_z and mirostat, which we already support, are not available on OAI either.

Contributor

@ngxson ngxson Mar 20, 2024


In fact, this code duplicates the logic inside launch_slot_with_task. I plan to refactor all of the OAI-related logic into one place; maybe I'll do this over the weekend.

Contributor

@ngxson ngxson left a comment


LGTM. It's quite surprising that the server does not have an --n-keep argument; maybe we need to add that in the future.

@phymbert phymbert merged commit 47cc7a7 into ggml-org:master Mar 20, 2024
@jkarthic jkarthic deleted the server_n_keep branch March 20, 2024 13:26
hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 3, 2024
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026
phuongncn pushed a commit to phuongncn/llama.cpp-gx10-dgx-sparks-deepseekv4 that referenced this pull request Apr 28, 2026
ljubomirj pushed a commit to ljubomirj/llama.cpp that referenced this pull request May 6, 2026
my-other-github-account pushed a commit to my-other-github-account/llama.cpp that referenced this pull request May 15, 2026
AlexiAlp pushed a commit to minghaop/llama.cpp that referenced this pull request Jun 2, 2026

3 participants