A while ago, we introduced the notion of "preset" in #10932, which was implemented using the existing arg.cpp infrastructure.
I'm now thinking about taking this to a new level. As the title says, I'm proposing an extension on top of arg.cpp that allows importing/exporting arguments from/to an INI file.
Use cases for this feature are:
- Allow users to write their own presets, as suggested in arg : add model catalog #13385
- Allow users to share their configs more easily
- Provide a way to specify per-model config in the newly added llama-server router mode (server: introduce API for serving / loading / unloading multiple models #17470)
- Allow the new CLI experience to save certain user preferences, just like how the "Settings" on the web UI works; ref: cli: new CLI experience #17824
Why choose INI?
- It is simple to understand and simple to edit
- It provides multiple "sections" - each section can be used as a separate preset
- The parser/writer can be simple. For example, this parser/writer is just ~700 LOC. We can do even better because we only need a subset of the format (string values only, no nested config) - see the sketch after this list
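To illustrate how small this can be, here is a rough sketch (just an illustration, not an actual implementation proposal) of a parser/writer for the subset we need: sections, string key/value pairs and `;` comments. The names `ini_data`, `ini_parse` and `ini_write` are made up for this example:

```cpp
#include <istream>
#include <map>
#include <ostream>
#include <string>

// section name -> (key -> value); "" holds top-level keys like LLAMA_CONFIG_VERSION
using ini_data = std::map<std::string, std::map<std::string, std::string>>;

static std::string trim(const std::string & s) {
    const size_t b = s.find_first_not_of(" \t\r");
    const size_t e = s.find_last_not_of(" \t\r");
    return b == std::string::npos ? "" : s.substr(b, e - b + 1);
}

static ini_data ini_parse(std::istream & in) {
    ini_data out;
    std::string line;
    std::string section; // current section, "" until the first [header]
    while (std::getline(in, line)) {
        line = trim(line);
        if (line.empty() || line[0] == ';') {
            continue; // skip blank lines and comments
        }
        if (line.front() == '[' && line.back() == ']') {
            section = trim(line.substr(1, line.size() - 2));
            continue;
        }
        const size_t eq = line.find('=');
        if (eq != std::string::npos) {
            out[section][trim(line.substr(0, eq))] = trim(line.substr(eq + 1));
        }
    }
    return out;
}

static void ini_write(std::ostream & out, const ini_data & data) {
    for (const auto & [section, kv] : data) {
        if (!section.empty()) {
            out << "[" << section << "]\n";
        }
        for (const auto & [k, v] : kv) {
            out << k << " = " << v << "\n";
        }
        out << "\n";
    }
}
```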
Example of a config file (each key is the env var name already defined inside arg.cpp):
```ini
LLAMA_CONFIG_VERSION = 1

[gemma-4b-vision]
LLAMA_ARG_HF_REPO = ggml-org/gemma-3-4b-it-qat-GGUF
; this is a small model, I can offload everything to VRAM
LLAMA_ARG_N_GPU_LAYERS = 99999
LLAMA_ARG_N_CTX = 0
LLAMA_ARG_JINJA = true

[gemma-12b-vision]
LLAMA_ARG_HF_REPO = ggml-org/gemma-3-12b-it-qat-GGUF
; my system doesn't have enough VRAM, only offload some
LLAMA_ARG_N_GPU_LAYERS = 8
LLAMA_ARG_N_CTX = 0
LLAMA_ARG_JINJA = true
```
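Since the keys are exactly the env var names that arg.cpp already reads, one possible wiring (a hypothetical, POSIX-only sketch, not how it necessarily has to be implemented) is to export the selected section into the process environment before the existing argument parsing runs. `apply_preset` below reuses the made-up `ini_data`/`ini_parse` helpers from the sketch above:

```cpp
#include <cstdlib>   // setenv (POSIX)
#include <fstream>
#include <stdexcept>
#include <string>

// ini_data / ini_parse: hypothetical helpers from the sketch above
void apply_preset(const std::string & path, const std::string & preset_name) {
    std::ifstream f(path);
    const ini_data cfg = ini_parse(f);

    const auto it = cfg.find(preset_name);
    if (it == cfg.end()) {
        throw std::runtime_error("preset not found: " + preset_name);
    }
    for (const auto & [key, value] : it->second) {
        // overwrite = 0: env vars explicitly set by the user keep priority
        setenv(key.c_str(), value.c_str(), /*overwrite=*/0);
    }
}
```

With `overwrite = 0`, a value from the preset file never shadows an environment variable the user has already set, so the usual precedence of explicit settings is preserved.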