sleeepss

Popular repositories

README (Public)

vllm-rdna4-container-patches (Public)

Runtime patches + CPU-offload flag to run vLLM on Radeon RX 9070 XT (gfx1201/RDNA 4) in a container, including 14B Q4_K_M GGUF models on 16 GB of VRAM. Layered on top of bluefalcon13/vllm-rocm.

Python