LocalAI version:
LocalAI v4.0.0 (8e8b7df)
Environment, CPU architecture, OS, and Version:
AMD Strix Halo (Ryzen AI Max+ 395)
128 GB unified memory (96 GB allocated as VRAM)
Fedora 42
Describe the bug
I am unable to load either qwen3-tts or qwen3-asr; both fail with an error saying FlashAttention2 is not found.
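FlashAttention2 ships as the separate flash_attn package, which typically lacks prebuilt ROCm wheels, so it is plausibly absent from the hipblas backend image. A minimal sketch of the kind of fallback a transformers-based backend could apply (the function name is hypothetical; attn_implementation is the real Hugging Face transformers from_pretrained parameter):

```python
import importlib.util

def pick_attn_implementation() -> str:
    """Choose an attention backend to pass as
    from_pretrained(..., attn_implementation=...).

    Returns "flash_attention_2" only when the flash_attn package is
    importable; otherwise falls back to PyTorch's built-in SDPA kernel,
    which works on ROCm without any extra dependency.
    """
    if importlib.util.find_spec("flash_attn") is not None:
        return "flash_attention_2"
    return "sdpa"

print(pick_attn_implementation())
```

On a ROCm host without flash_attn installed this prints "sdpa", i.e. the model would still load instead of aborting.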
To Reproduce
Docker Compose file (note: the environment block originally mixed `KEY=value` and `KEY: value` styles; shown here in consistent mapping form):

```yaml
localai:
  container_name: localai
  image: localai/localai:latest-gpu-hipblas
  environment:
    DEBUG: "true"
    MODELS_PATH: /models
  ports:
    - "8080:8080"
  volumes:
    - ./backends:/backends
    - ./config:/config
    - ./models:/models
  devices:
    - /dev/dri:/dev/dri
    - /dev/kfd:/dev/kfd
  group_add:
    - "video"
  restart: unless-stopped
```
Then install qwen3-tts and try to run the model.
The backend downloads successfully but fails to load.
Expected behavior
Models should load without errors.
Logs
Additional context