This repository was archived by the owner on Jul 4, 2025. It is now read-only.

bug: Concurrent chat doesn't work on Mac Silicon #1569

@gabrielle-ong

Description

Cortex version

1.0.1-203

Describe the Bug

Mac: concurrent chats against the same model are queued up rather than run in parallel

  • Models tested: tinyllama, llama3.2
  • Expected: opening 2 CLI windows / Postman windows gives concurrent chats (see the reproduction sketch after this list)
  • Works well with separate models (e.g. a tinyllama chat alongside a llama3.2 chat)
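A minimal sketch of the concurrency check, assuming the Cortex server is running locally and exposing its OpenAI-compatible /v1/chat/completions endpoint; the port (39281) and model name below are assumptions, so adjust to your setup. If the server queues requests, one duration will be roughly the sum of both:

```python
# Concurrency probe: fire two chat requests at the same model at once.
# Assumptions: Cortex is listening on localhost:39281 (adjust if different)
# and the model "tinyllama" is already started.
import time
from concurrent.futures import ThreadPoolExecutor

import requests

URL = "http://127.0.0.1:39281/v1/chat/completions"  # assumed default port


def chat(prompt: str) -> float:
    """Send one chat completion and return its wall-clock latency."""
    start = time.time()
    resp = requests.post(
        URL,
        json={
            "model": "tinyllama",
            "messages": [{"role": "user", "content": prompt}],
        },
        timeout=120,
    )
    resp.raise_for_status()
    return time.time() - start


with ThreadPoolExecutor(max_workers=2) as pool:
    durations = list(pool.map(chat, ["Tell me a joke.", "Count to twenty."]))

# Parallel handling: both durations close to a single request's latency.
# Queued handling (the bug seen on macOS): one duration roughly doubles.
print(durations)
```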

This may be related to the n_parallel parameter in model.yaml.
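If the queuing does come from n_parallel, the relevant knob would sit in model.yaml roughly as below; the default value and the exact semantics shown here are assumptions based on typical cortex.llamacpp model configs, so check the generated file:

```yaml
# model.yaml (excerpt), a sketch rather than the full file.
# Assumption: n_parallel controls how many sequences the llama.cpp engine
# decodes at once; if it stays at 1, a second chat on the same model waits.
n_parallel: 2
```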

Windows, Ubuntu: Working as expected

Steps to Reproduce

No response

Screenshots / Logs

No response

What is your OS?

  • MacOS

What engine are you running?

  • cortex.llamacpp (default)

Metadata

Labels

category: model running (Inference ux, handling context/parameters, runtime)
type: bug (Something isn't working)
