AMD RX 7800XT LLM Setup
Guide is provided by Ian Wirtz
Recommended: LM Studio (lms) tends to give the best results, but Ollama is a viable alternative.
Before LM Studio or Ollama can use your GPU via ROCm, the system needs the base user-space HIP libraries to communicate with the kernel driver.
Arch Linux / CachyOS:
sudo pacman -S rocm-hip-runtime
Ubuntu / Debian:
sudo apt install rocm-hip-runtime
Fedora:
sudo dnf install rocm-hip-runtime
GPU access permissions: Your user must belong to the render and video groups. Check with:
groups
If either group is missing, add yourself:
sudo usermod -aG video,render $USER
You must fully log out and back in (or reboot) for the group change to take effect.
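The membership check above can also be scripted instead of read by eye. The following is a minimal sketch, not part of the original guide: `missing_groups` is a hypothetical helper name, and it only compares two space-separated lists, so it makes no changes to the system.

```shell
# Print each required group that is absent from a user's group list.
# $1: space-separated required groups
# $2: space-separated groups the user currently belongs to
missing_groups() {
    required=$1
    have=" $2 "
    for g in $required; do
        case "$have" in
            *" $g "*) ;;             # already a member, print nothing
            *) printf '%s\n' "$g" ;; # not a member, report it
        esac
    done
}

# Compare against the current user's groups (id -nG lists them by name).
missing=$(missing_groups "render video" "$(id -nG)")
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names join on one line.
    echo "Not yet a member of:" $missing
else
    echo "render and video membership looks fine"
fi
```

Note that a shell opened before running usermod keeps reporting the old group list until you log out and back in.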
On fast-moving distributions like Arch, the management tools are not always pulled in as a strict dependency of rocm-hip-runtime. Install them explicitly to avoid library version mismatches.
Arch Linux / CachyOS:
sudo pacman -S rocm-smi-lib
Ubuntu / Debian:
Debian splits the CLI tool and its runtime library into separate packages.
# Command-line monitoring tool
sudo apt update && sudo apt install rocm-smi
# Runtime libraries
sudo apt update && sudo apt install librocm-smi64-1
Tip: If the exact package name is uncertain, type sudo apt install librocm-smi64 and press Tab to autocomplete the current version suffix.
Fedora:
# CLI tool
sudo dnf install rocm-smi
# C/C++ development libraries and headers (equivalent to rocm-smi-lib on Arch)
sudo dnf install rocm-smi-devel
When invoking lms load, pass the hardware acceleration flags explicitly. The --gpu max flag instructs the runtime to load the entire model into VRAM.
HSA_OVERRIDE_GFX_VERSION=11.0.0 lms load tulu-3.1-8b-supernova --context-length 8192 --gpu max
The HSA_OVERRIDE_GFX_VERSION=11.0.0 prefix tells the ROCm stack to treat the RX 7800 XT as a natively supported compute target, bypassing silent fallback failures.
To avoid prefixing every command with the environment variable, add it to your shell profile.
Bash:
echo 'export HSA_OVERRIDE_GFX_VERSION=11.0.0' >> ~/.bashrc
source ~/.bashrc
Fish (CachyOS default) - Option A: Universal Variable (recommended)
Set it once; Fish persists it automatically across reboots with no further configuration needed:
set -Ux HSA_OVERRIDE_GFX_VERSION 11.0.0
Fish - Option B: Explicit config file entry
echo 'set -gx HSA_OVERRIDE_GFX_VERSION 11.0.0' >> ~/.config/fish/config.fish
source ~/.config/fish/config.fish
Ollama (systemd service):
Because Ollama runs under its own ollama system user, the variable must be injected via a systemd drop-in:
sudo mkdir -p /etc/systemd/system/ollama.service.d
sudo nano /etc/systemd/system/ollama.service.d/override.conf
Paste the following, then save and exit (Ctrl+O, Enter, Ctrl+X):
[Service]
Environment="HSA_OVERRIDE_GFX_VERSION=11.0.0"
Then reload and restart the service:
sudo systemctl daemon-reload
sudo systemctl restart ollama
If a command hangs, the kernel compute layer (amdkfd) may not be initialised. Check whether the system exposes your GPU as a ROCm compute platform:
rocminfo
Scroll to the top of the output. If you see Can't open /dev/kfd or a crash, the Linux kernel is not exposing the compute interface to user space. If you are running a custom or bleeding-edge kernel, try booting into the stock or LTS kernel (linux-lts) to rule out a driver regression.
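Before switching kernels, it can help to tell a missing compute node apart from a permissions problem. The sketch below is not part of the original guide: `check_node` is a hypothetical helper, and it assumes the usual layout of /dev/kfd for compute plus /dev/dri/renderD* for the render nodes.

```shell
# Report whether a device node exists and is readable and writable
# by the current user.
# $1: path to the node, e.g. /dev/kfd
check_node() {
    if [ ! -e "$1" ]; then
        echo "$1: missing"
    elif [ -r "$1" ] && [ -w "$1" ]; then
        echo "$1: ok"
    else
        echo "$1: no access"
    fi
}

check_node /dev/kfd
# If no render node exists, the unexpanded pattern is reported as missing.
for node in /dev/dri/renderD*; do
    check_node "$node"
done
```

A "missing" result for /dev/kfd points at the kernel side described above, while "no access" more likely means the render and video group change has not taken effect yet. Running the check as root hides permission problems, so run it as your normal user.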
LM Studio:
lms server start
Ollama:
ollama serve
Then confirm the model is loaded into VRAM:
rocm-smi
Idle (no model loaded):

At idle the GPU draws minimal power (~9W), clocks are near-floor, and VRAM usage is low (~44%).
Under load (model + game running simultaneously):

Under combined load you should see VRAM usage climb significantly (71% in this example), GPU utilisation rise, and power draw increase accordingly (~147W). This confirms the model is resident in VRAM and inference is running on the GPU.
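To compare idle and loaded VRAM usage without reading the rocm-smi table by eye, the amdgpu driver's sysfs counters can be read directly. This is a rough sketch rather than part of the original guide: `vram_percent` is a hypothetical helper, and the card0 index is an assumption that may differ on systems with an integrated GPU or more than one card.

```shell
# Print VRAM usage as a whole-number percentage (rounded down).
# $1: file holding the number of bytes in use
# $2: file holding the total number of bytes
vram_percent() {
    used=$(cat "$1")
    total=$(cat "$2")
    echo $(( used * 100 / total ))
}

# Assumption: the RX 7800 XT is card0; adjust the index if it is not.
dev=/sys/class/drm/card0/device
if [ -r "$dev/mem_info_vram_used" ]; then
    echo "VRAM in use: $(vram_percent "$dev/mem_info_vram_used" "$dev/mem_info_vram_total")%"
fi
```

Running it once before and once after loading a model should show the same kind of jump as the two rocm-smi readings described above.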