This repository was archived by the owner on Jul 4, 2025. It is now read-only.

bug: some of the engine parameters in the model load request are ignored #1824

@louis-jan

Description

Cortex version

1.0.6

Describe the issue and expected behaviour

When starting a model, a number of engine parameters can be configured, as described at https://github.com/janhq/cortex.llamacpp. However, when these parameters are sent through the cortex.cpp server, most of them are filtered out because the new model.yaml implementation hardcodes a limited set of accepted parameters.

After reviewing the model.yaml implementation, I noticed that the following settings are never applied because their declarations are missing, so they all fall back to default values:

  • cpu_threads
  • n_batch
  • caching_enabled
  • grp_attn_n
  • grp_attn_w
  • mlock
  • grammar_file
  • model_type
  • model_alias
  • flash_attn
  • cache_type
  • use_mmap
  • llama_model_path
  • embedding
  • cont_batching
  • user_prompt
  • ai_prompt
  • system_prompt
  • pre_prompt
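
For illustration, a model.yaml that explicitly declared some of these settings might look like the sketch below. The key names are taken from the list above; the values are hypothetical placeholders, not recommended defaults:

```yaml
# Hypothetical model.yaml sketch: keys that would need to be declared
# for the corresponding request parameters to take effect.
# Values below are placeholders only.
cpu_threads: 8
n_batch: 512
caching_enabled: true
grp_attn_n: 1
grp_attn_w: 512
mlock: false
flash_attn: true
cache_type: f16
use_mmap: true
embedding: false
cont_batching: true
```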

Steps to Reproduce

  1. Start cortex server
  2. Start a model by sending a request with cpu_threads or n_batch settings
  3. Observe cortex.log
  4. See the error
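
The suspected behavior can be illustrated with a minimal sketch. This is NOT cortex.cpp's actual code; it only models the effect described above, where model.yaml acts as a whitelist and undeclared request parameters are silently dropped:

```python
# Hypothetical illustration of the suspected filtering: parameters not
# declared in model.yaml never reach the engine and fall back to defaults.
# The declared set and request payload below are made up for this example.
DECLARED_IN_MODEL_YAML = {"model", "ngl", "ctx_len", "prompt_template"}

def filter_load_params(request_params: dict) -> dict:
    """Keep only parameters declared in model.yaml; drop the rest."""
    return {k: v for k, v in request_params.items()
            if k in DECLARED_IN_MODEL_YAML}

request = {"model": "llama3", "cpu_threads": 8, "n_batch": 512,
           "ctx_len": 4096}
loaded = filter_load_params(request)
# cpu_threads and n_batch are silently dropped:
print(loaded)  # → {'model': 'llama3', 'ctx_len': 4096}
```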

Screenshots / Logs

No response

What is your OS?

  • Windows
  • Mac Silicon
  • Mac Intel
  • Linux / Ubuntu

What engine are you running?

  • cortex.llamacpp (default)
  • cortex.tensorrt-llm (Nvidia GPUs)
  • cortex.onnx (NPUs, DirectML)

Hardware Specs eg OS version, GPU

No response

Metadata

Labels

P1: important (Important feature / fix), type: bug (Something isn't working)
