feat(agents): multi-device support — CPU, GPU, NPU per-agent selection by kovtcharov-amd · Pull Request #1252 · amd/gaia

kovtcharov-amd · 2026-05-29T07:31:28Z

GAIA agents couldn't leverage AMD Ryzen AI NPU hardware — inference defaulted to GPU via llamacpp with no way to select an alternative device. Users with XDNA2 NPUs had no path to power-efficient local inference, and there was no framework for per-agent device selection across CPU, GPU, and NPU.

Now each agent declares which devices it supports via DeviceConfig tuples. Users pick a device per-agent — dropdown in Agent UI, --device {cpu,gpu,npu} on CLI. GPU remains the default. NPU uses the FLM backend (gemma4-it-e2b-FLM); CPU falls back automatically with a latency warning. gaia init --profile npu handles NPU detection, FLM backend installation, and model download. Eval-verified on Ryzen AI MAX+ PRO 395: NPU matches or exceeds GPU quality (personality 3/3 @ 9.5/10, context_retention 4/4 @ 9.8/10) at ~24 tok/s.
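To make the selection rule above concrete, here is a minimal sketch of per-agent device resolution. It is not the code from this PR: the field names follow the (device, model, recipe, backend) tuple the commit message describes, but `EXAMPLE_CONFIGS`, `select_config`, the recipe/backend strings, and the GPU and CPU model names are hypothetical placeholders. Only the NPU model name (`gemma4-it-e2b-FLM`) and the GPU default come from the description.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class DeviceConfig:
    """One supported (device, model, recipe, backend) combination for an agent."""
    device: str
    model: str
    recipe: str
    backend: str


# Hypothetical table for a single agent. Only the NPU model name is taken
# from the PR description; every other value is a placeholder.
EXAMPLE_CONFIGS = (
    DeviceConfig("gpu", "example-gpu-model", "llamacpp", "llamacpp"),
    DeviceConfig("npu", "gemma4-it-e2b-FLM", "flm", "flm"),
    DeviceConfig("cpu", "example-cpu-model", "llamacpp", "llamacpp"),
)


def select_config(
    configs: Sequence[DeviceConfig],
    requested: Optional[str] = None,
    default: str = "gpu",
) -> DeviceConfig:
    """Return the config for the requested device, or for the default (GPU)."""
    wanted = requested or default
    for cfg in configs:
        if cfg.device == wanted:
            return cfg
    supported = ", ".join(c.device for c in configs)
    raise ValueError(f"device {wanted!r} is not supported; choose one of: {supported}")
```

Calling `select_config(EXAMPLE_CONFIGS)` resolves to the GPU entry, while `select_config(EXAMPLE_CONFIGS, "npu")` resolves to the FLM model. An unsupported name raises instead of silently falling back.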

Test plan

python -m pytest tests/unit/test_npu_device_support.py -xvs — 25 tests
gaia chat --device gpu — announces device, loads correct model
gaia chat --device npu — loads FLM model on NPU hardware
gaia chat --device cpu — warns about slow response times
gaia init --profile npu — detects NPU, installs FLM backend, downloads model
gaia init --profile npu on non-NPU hardware — fails loudly
gaia eval agent --device npu --category personality — all scenarios pass
Agent UI: device dropdown visible on agent cards, filtered by detected hardware

#1220) Agents now declare which (device, model, recipe, backend) tuples they support via DeviceConfig. Users pick a device per-agent in the Agent UI dropdown or via `--device {cpu,gpu,npu}` on CLI. GPU is the default. NPU uses the FLM backend on Ryzen AI XDNA2 hardware; CPU falls back with a latency warning.

Backend: DeviceConfig dataclass, GaiaConfig persistence (~/.gaia/config.json), LemonadeClient.install_backend/uninstall_backend/get_recipe_status, LemonadeManager device parameter, session-level device column + migration.

CLI: `gaia init --profile npu` (NPU detection, FLM backend install, recipe-aware model download), `gaia chat --device npu`, `gaia eval agent --device npu`.

Frontend: DeviceConfig type, activeDevice/detectedDevices store state, per-agent device dropdown with verified/unverified badges.

Eval-verified on Ryzen AI MAX+ PRO 395: NPU matches or exceeds GPU quality (personality 3/3 9.5avg, context_retention 4/4 9.8avg) at ~24 tok/s.
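The session-level device column and its migration could look roughly like the sketch below. It assumes a SQLite database, a table named `sessions`, and a column named `device` defaulting to `gpu`. The function name `ensure_device_column` is hypothetical, and the actual schema and migration in the PR may differ.

```python
import sqlite3


def ensure_device_column(conn: sqlite3.Connection) -> bool:
    """Add a `device` column to `sessions` if it is missing.

    Returns True when the column was added and False when it already existed,
    so the migration is safe to run on every startup.
    """
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = {row[1] for row in conn.execute("PRAGMA table_info(sessions)")}
    if "device" in columns:
        return False
    # Existing rows pick up the default, matching "GPU remains the default".
    conn.execute("ALTER TABLE sessions ADD COLUMN device TEXT NOT NULL DEFAULT 'gpu'")
    conn.commit()
    return True
```

Checking the column list first keeps the migration idempotent: re-running it against an already-migrated database is a no-op rather than an error.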

- Black: reformat cli.py
- Pylint: remove reimport of LemonadeClient (already at module level)
- Flake8: remove unused imports (tempfile, Path) in test file

kovtcharov-amd · 2026-05-29T09:52:46Z

@itomek-amd, this feature requires manual testing.

itomek

Solid, well-layered feature: the device abstraction threads cleanly from DeviceConfig/DEFAULT_DEVICE_CONFIGS in registry.py through the CLI --device, the sessions DB migration, the agents API response, and the UI dropdown, with docs (npu.mdx + docs.json) and a 382-line test file. GPU stays the default and explicit-device requests fail loudly when hardware is missing — the right shape. One process note: this is an LLM-affecting change (it swaps the model per device), and CLAUDE.md asks for a gaia eval agent run compared to the committed baseline. The description cites strong NPU numbers but no baseline scorecard / --compare output is included — worth attaching so a reviewer can see GPU-vs-NPU parity rather than take the numbers on faith, especially since the NPU path runs at a much smaller ctx (4096 vs 32768). Two small inline notes on the device-probe except: pass and the _DEVICE_TO_MIN cpu omission. Approving on the understanding the eval evidence gets attached.

Generated by Claude Code

…untime

Multi-device support (#1252) shipped non-functional: the Agent UI device dropdown and CLI --device flag never changed the model, and an unavailable device degraded silently. Fixes v0.20.0 release-review blockers B1/B2/H1 plus a GPU-tier validation bug.

B1: resolve the device's DeviceConfig.model (and ctx_size) at every UI agent-build site instead of always using the session/GPU model; rewrite the session model on a device switch so eviction/recreation picks it up. A guard keeps an agent's own pinned model from being clobbered on the default GPU.

B2: add a device field to ChatAgentConfig, thread it through the base Agent into LemonadeManager.ensure_ready(device=...), and fail loudly with an actionable HardwareRequirementError when the requested device is absent.

H1: narrow the CLI device-probe except clause to only swallow connection/timeout errors; a reachable-but-broken Lemonade now surfaces. Make the GPU->CPU fallback state its reason.

Medium: map the gpu selector to amd_dgpu (lowest GPU tier) so a discrete-Radeon-only host satisfies an explicit GPU requirement.
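The H1 and B2 behaviour described above can be sketched as follows. This is an illustration under assumptions, not the follow-up PR's code: `HardwareRequirementError` is defined here as a local stand-in for the project's exception, and `probe_devices`, `require_device`, and the `fetch` callable (a stand-in for whatever queries Lemonade for detected devices) are hypothetical names.

```python
from typing import Callable, Iterable, Optional, Set


class HardwareRequirementError(RuntimeError):
    """Local stand-in for the project's error of the same name."""


def probe_devices(fetch: Callable[[], Iterable[str]]) -> Optional[Set[str]]:
    """Return the detected devices, or None when the server is unreachable.

    Only connection and timeout errors are swallowed. Any other exception
    (for example a malformed response) propagates, so a reachable-but-broken
    server is not hidden behind a bare `except: pass`.
    """
    try:
        return set(fetch())
    except (ConnectionError, TimeoutError):
        return None


def require_device(requested: str, fetch: Callable[[], Iterable[str]]) -> str:
    """Fail loudly when an explicitly requested device was not detected."""
    detected = probe_devices(fetch)
    if detected is not None and requested not in detected:
        available = ", ".join(sorted(detected)) or "none"
        raise HardwareRequirementError(
            f"Requested device '{requested}' was not detected "
            f"(available: {available}). Pick a detected device or omit "
            "--device to use the default."
        )
    return requested
```

In this sketch an unreachable server yields `None` and the request passes through unvalidated, which is one possible policy; the real code may choose to reject in that case instead.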

github-actions bot added the documentation (Documentation changes), llm (LLM backend changes), cli (CLI changes), tests (Test changes), electron (Electron app changes), performance (Performance-critical changes), and agents labels May 29, 2026

kovtcharov-amd requested a review from itomek-amd May 29, 2026 07:33

kovtcharov-amd self-assigned this May 29, 2026

kovtcharov-amd modified the milestones: v0.19 — Test & CI Hardening [OSS], v0.20 Email Agent & Platform Foundations May 29, 2026

fix(ci): resolve lint failures in NPU support PR

8594793


itomek approved these changes May 29, 2026


Comment thread src/gaia/cli.py

Comment thread src/gaia/llm/lemonade_manager.py

kovtcharov-amd enabled auto-merge May 29, 2026 16:26

kovtcharov-amd added this pull request to the merge queue May 29, 2026

Merged via the queue into main with commit d12c79f May 29, 2026
48 checks passed

kovtcharov-amd deleted the kalin/npu-flm-profile branch May 29, 2026 16:26

This was referenced Jun 1, 2026

Release v0.20.0 #1334 (Open)

fix(agents): wire multi-device selection end-to-end and validate at runtime #1338 (Merged)

kovtcharov-amd merged 2 commits into main from kalin/npu-flm-profile

2 participants
