This repository was archived by the owner on Jul 4, 2025. It is now read-only.

bug: Failed to Load Model with AMD GPU and Vulkan #1525

Description

@hantran-co

Cortex version

Jan v0.5.4

Describe the Bug

https://discord.com/channels/1107178041848909847/1296496734901375146

Hi, when I try to use my AMD GPU with Vulkan, I get a "failed to load model" error and the model just stays inactive.

App log:

```
2024-10-17T15:33:09.267Z [CORTEX]::Error: Vulkan0: AMD Radeon R9 200 Series (AMD proprietary driver) | uma: 0 | fp16: 0 | warp size: 64

2024-10-17T15:33:10.401Z [CORTEX]::Error: llama_model_load: error loading model: vk::Device::createComputePipeline: ErrorUnknown
llama_load_model_from_file: failed to load model

2024-10-17T15:33:10.403Z [CORTEX]::Error: llama_init_from_gpt_params: error: failed to load model 'C:\Users\mauro\AppData\Roaming\Jan\data\models\tinyllama-1.1b\tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf'

2024-10-17T15:33:10.403Z [CORTEX]:: {"timestamp":1729179190,"level":"ERROR","function":"LoadModel","line":185,"message":"llama.cpp unable to load model","model":"C:\\Users\\mauro\\AppData\\Roaming\\Jan\\data\\models\\tinyllama-1.1b\\tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf"}
20241017 15:33:10.403000 UTC 39488 ERROR Error loading the model - llama_engine.cc:423

2024-10-17T15:33:10.406Z [CORTEX]:: Validating model tinyllama-1.1b
2024-10-17T15:33:10.406Z [CORTEX]:: Load model success with response {}
2024-10-17T15:33:10.408Z [CORTEX]:: Validate model state with response 409
2024-10-17T15:33:10.409Z [CORTEX]:: Validate model state failed with response {"message":"Model has not been loaded, please load model into cortex.llamacpp"} and status is "Conflict"
2024-10-17T15:33:10.409Z [CORTEX]::Error: Validate model status failed
```
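
The first log line already points at the likely culprit: the R9 200 is a GCN 1.0 part on a legacy proprietary driver, and it reports fp16: 0 (no shaderFloat16); vk::Device::createComputePipeline: ErrorUnknown typically means the driver failed to build one of llama.cpp's compute shaders. As a diagnostic, something like the following sketch prints the same device/fp16 information the log shows. This is not part of Jan or cortex, just the plain Vulkan API, and it assumes the Vulkan SDK is installed and the driver exposes Vulkan 1.1+ so the features2 query is valid:

```cpp
// vk_fp16_check.cpp - prints roughly what the "Vulkan0: ... | fp16: 0"
// log line reports. Hypothetical diagnostic, not part of Jan/cortex.
// Build (assumption): g++ vk_fp16_check.cpp -lvulkan -o vk_fp16_check
#include <vulkan/vulkan.h>
#include <cstdio>
#include <vector>

int main() {
    VkApplicationInfo app = {};
    app.sType      = VK_STRUCTURE_TYPE_APPLICATION_INFO;
    app.apiVersion = VK_API_VERSION_1_2;

    VkInstanceCreateInfo ici = {};
    ici.sType            = VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO;
    ici.pApplicationInfo = &app;

    VkInstance instance;
    if (vkCreateInstance(&ici, nullptr, &instance) != VK_SUCCESS) {
        fprintf(stderr, "vkCreateInstance failed\n");
        return 1;
    }

    uint32_t count = 0;
    vkEnumeratePhysicalDevices(instance, &count, nullptr);
    std::vector<VkPhysicalDevice> devices(count);
    vkEnumeratePhysicalDevices(instance, &count, devices.data());

    for (VkPhysicalDevice dev : devices) {
        VkPhysicalDeviceProperties props;
        vkGetPhysicalDeviceProperties(dev, &props);

        // shaderFloat16 lives in the Vulkan 1.2 feature struct; very old
        // drivers may only expose it via VK_KHR_shader_float16_int8.
        VkPhysicalDeviceVulkan12Features f12 = {};
        f12.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VULKAN_1_2_FEATURES;
        VkPhysicalDeviceFeatures2 f2 = {};
        f2.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FEATURES_2;
        f2.pNext = &f12;
        vkGetPhysicalDeviceFeatures2(dev, &f2);

        printf("%s | fp16: %u\n", props.deviceName, f12.shaderFloat16);
    }
    vkDestroyInstance(instance, nullptr);
    return 0;
}
```

If shaderFloat16 comes back 0 here too, the card genuinely lacks the feature the log reports, and the pipeline-creation failure is a driver/backend limitation rather than a problem with the model file.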

Steps to Reproduce

1. Attempt to load the model using an AMD Radeon R9 200 Series GPU.
2. Ensure Vulkan is the selected backend.
3. Load the model located at:

C:\Users\mauro\AppData\Roaming\Jan\data\models\tinyllama-1.1b\tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf
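
To take Jan out of the equation, the failing call can also be reproduced directly against llama.cpp. This is a minimal sketch, assuming a llama.cpp build from around this release with the Vulkan backend enabled (the CMake flag was GGML_VULKAN=ON at the time; treat the build details as an assumption). It exercises only llama_load_model_from_file, the exact call that fails in the log:

```cpp
// load_test.cpp - minimal repro for the failing model load.
// Build/link against a Vulkan-enabled llama.cpp (paths are placeholders).
#include "llama.h"
#include <cstdio>

int main(int argc, char ** argv) {
    if (argc < 2) {
        fprintf(stderr, "usage: %s <model.gguf>\n", argv[0]);
        return 1;
    }
    llama_backend_init();

    llama_model_params params = llama_model_default_params();
    params.n_gpu_layers = 99; // force full offload onto the Vulkan device

    llama_model * model = llama_load_model_from_file(argv[1], params);
    if (model == nullptr) {
        fprintf(stderr, "GPU-offloaded load failed, as in the Jan log\n");
        // Retry on CPU to confirm the GGUF file itself is intact.
        params.n_gpu_layers = 0;
        model = llama_load_model_from_file(argv[1], params);
        fprintf(stderr, "CPU-only load: %s\n", model ? "ok" : "failed");
    }

    bool ok = model != nullptr;
    if (model) {
        llama_free_model(model);
    }
    llama_backend_free();
    return ok ? 0 : 1;
}
```

If the CPU-only retry succeeds while the offloaded load fails, the GGUF is fine and the failure is isolated to the Vulkan backend on this driver; disabling GPU acceleration in Jan's settings should then work around the error in the meantime.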

Expected Outcome:

The model should load successfully, and GPU acceleration should work through Vulkan.

Screenshots / Logs

Environment:

•	GPU: AMD Radeon R9 200 Series
•	Driver: AMD proprietary driver
•	Vulkan: Enabled
•	Model Path: C:\Users\mauro\AppData\Roaming\Jan\data\models\tinyllama-1.1b\tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf

What is your OS?

  • Windows

What engine are you running?

  • cortex.llamacpp (default)

Metadata

Labels: type: bug (Something isn't working)

Project status: Eng Planning
Milestone: none