AMD RX 7800XT LLM Setup

Running LLMs on AMD RX 7800 XT (ROCm Guide)

Guide is provided by Ian Wirtz

Recommended: LM Studio (lms) tends to give the best results, but Ollama is a viable alternative.

Prerequisites

Step 1 - Install `rocm-hip-runtime`

Before LM Studio or Ollama can use your GPU via ROCm, the system needs the base user-space HIP libraries to communicate with the kernel driver.

Arch Linux / CachyOS:

sudo pacman -S rocm-hip-runtime

Ubuntu / Debian:

sudo apt install rocm-hip-runtime

Fedora:

sudo dnf install rocm-hip-runtime

GPU access permissions: Your user must belong to the render and video groups. Check with:
groups
If either group is missing, add yourself:
sudo usermod -aG video,render $USER
You must fully log out and back in (or reboot) for the group change to take effect.

Step 2 - Install `rocm-smi` (may be optional)

On fast-moving distributions like Arch, the management tools are not always pulled in as a strict dependency of rocm-hip-runtime. Install them explicitly to avoid library version mismatches.

Arch Linux / CachyOS:

sudo pacman -S rocm-smi-lib

Ubuntu / Debian:

Debian splits the CLI tool and its runtime library into separate packages.

# Command-line monitoring tool
sudo apt update && sudo apt install rocm-smi

# Runtime libraries
sudo apt update && sudo apt install librocm-smi64-1

Tip: If the exact package name is uncertain, type sudo apt install librocm-smi64 and press Tab to autocomplete the current version suffix.

Fedora:

# CLI tool
sudo dnf install rocm-smi

# C/C++ development libraries and headers (equivalent to rocm-smi-lib on Arch)
sudo dnf install rocm-smi-devel

Running a Model

Step 3 - Load a Model with ROCm Acceleration

When invoking lms load, pass the hardware acceleration flags explicitly. The --gpu max flag instructs the runtime to load the entire model into VRAM.

HSA_OVERRIDE_GFX_VERSION=11.0.0 lms load tulu-3.1-8b-supernova --context-length 8192 --gpu max

The HSA_OVERRIDE_GFX_VERSION=11.0.0 prefix tells the ROCm stack to treat the RX 7800 XT as a natively supported compute target, bypassing silent fallback failures.

Step 4 - Make the Configuration Permanent

To avoid prefixing every command with the environment variable, add it to your shell profile.

Bash:

echo 'export HSA_OVERRIDE_GFX_VERSION=11.0.0' >> ~/.bashrc
source ~/.bashrc

Fish (CachyOS default) - Option A: Universal Variable (recommended)

Set it once; Fish persists it automatically across reboots with no further configuration needed:

set -Ux HSA_OVERRIDE_GFX_VERSION 11.0.0

Fish - Option B: Explicit config file entry

echo 'set -gx HSA_OVERRIDE_GFX_VERSION 11.0.0' >> ~/.config/fish/config.fish
source ~/.config/fish/config.fish

Ollama (systemd service):

Because Ollama runs under its own ollama system user, the variable must be injected via a systemd drop-in:

sudo mkdir -p /etc/systemd/system/ollama.service.d
sudo nano /etc/systemd/system/ollama.service.d/override.conf

Paste the following, then save and exit (Ctrl+O, Enter, Ctrl+X):

[Service]
Environment="HSA_OVERRIDE_GFX_VERSION=11.0.0"

Then reload and restart the service:

sudo systemctl daemon-reload
sudo systemctl restart ollama

Verification

Step 5 - Confirm the Kernel Compute Driver is Loaded

If a command hangs, the kernel compute layer (amdkfd) may not be initialised. Check whether the system exposes your GPU as a ROCm compute platform:

rocminfo

Scroll to the top of the output. If you see Can't open /dev/kfd or a crash, the Linux kernel is not exposing the compute interface to user space. If you are running a custom or bleeding-edge kernel, try booting into the stock or LTS kernel (linux-lts) to rule out a driver regression.

Step 6 - Start the Server and Verify VRAM Usage

LM Studio:

lms server start

Ollama:

ollama serve

Then confirm the model is loaded into VRAM:

rocm-smi

Idle (no model loaded):

rocm-smi output with no model running

At idle the GPU draws minimal power (~9W), clocks are near-floor, and VRAM usage is low (~44%).

Under load (model + game running simultaneously):

rocm-smi output with Elite Dangerous and EliteIntel running

Under combined load you should see VRAM usage climb significantly (71% in this example), GPU utilisation rise, and power draw increase accordingly (~147W). This confirms the model is resident in VRAM and inference is running on the GPU.

AMD RX 7800XT LLM Setup

Running LLMs on AMD RX 7800 XT (ROCm Guide)

Prerequisites

Step 1 - Install rocm-hip-runtime

Step 2 - Install rocm-smi (may be optional)

Running a Model

Step 3 - Load a Model with ROCm Acceleration

Step 4 - Make the Configuration Permanent

Verification

Step 5 - Confirm the Kernel Compute Driver is Loaded

Step 6 - Start the Server and Verify VRAM Usage

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Clone this wiki locally

Step 1 - Install `rocm-hip-runtime`

Step 2 - Install `rocm-smi` (may be optional)