feat: guard Agent UI against unsupported devices #593

Merged
kovtcharov-amd merged 12 commits into main from feature/strix-halo-device-guard
Mar 19, 2026

@kovtcharov kovtcharov commented Mar 19, 2026

Summary

  • Guards gaia --ui / gaia chat --ui against devices that lack the memory to run large local LLMs
  • Supported devices: AMD Ryzen AI Max (Strix Halo, unified memory) and AMD Radeon GPUs with ≥ 24 GB VRAM
  • Adds comprehensive UX guidance so first-time users know exactly what to run before opening the UI
  • Unsupported devices get a clear banner in the UI with the detected processor name and a GitHub feature-request link
  • Bypass options: --base-url (remote Lemonade server) or GAIA_SKIP_DEVICE_CHECK=1

Changes

New module: src/gaia/device.py

  • get_processor_name() — reads CPU name from Windows registry (instant, no subprocess); falls back to platform.processor() on non-Windows
  • get_gpu_info() — reads GPU name + VRAM via the HardwareInformation.qwMemorySize QWORD registry key (accurate; bypasses the Win32_VideoController 4 GB cap)
  • check_device_supported() — checks CPU first (Ryzen AI Max), then AMD Radeon GPU ≥ 24 GB VRAM; fail-open on unknown hardware
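
The gating order described above can be sketched as a pure function. This is a minimal illustration, not the code in src/gaia/device.py: the function name, the `GpuInfo` tuple, and the substring matching on "Ryzen AI Max" and "Radeon" are assumptions. Only the CPU-first order, the 24 GB threshold, and the fail-open behavior come from the description.

```python
from typing import NamedTuple, Optional, Sequence

MIN_VRAM_GB = 24.0  # threshold stated in the PR description


class GpuInfo(NamedTuple):
    """Hypothetical container; the real get_gpu_info() return type is not shown."""

    name: str
    vram_gb: float


def is_device_supported(processor_name: Optional[str], gpus: Sequence[GpuInfo]) -> bool:
    """Return True when the device should be allowed to run the Agent UI."""
    # Fail open: if nothing could be detected, do not block the user.
    if not processor_name and not gpus:
        return True
    # 1. CPU check first: Ryzen AI Max (Strix Halo).
    if processor_name and "ryzen ai max" in processor_name.lower():
        return True
    # 2. Then any AMD Radeon GPU with at least 24 GB of VRAM.
    for gpu in gpus:
        if "radeon" in gpu.name.lower() and gpu.vram_gb >= MIN_VRAM_GB:
            return True
    return False
```

Keeping the decision separate from the registry reads makes the VRAM boundary cases (23.9 / 24.0 / 24.1 GB) easy to test without mocking hardware.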

CLI (src/gaia/cli.py)

  • _launch_agent_ui() — prints a prerequisites checklist on startup (gaia init --profile chat + lemonade-server serve)
  • Specific OSError handler for port conflicts: friendly message with --ui-port 8080 suggestion
  • --base-url flag on top-level parser (bypasses device check — inference runs remotely)
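
A port-conflict handler of the kind described above could be structured as follows. This is a sketch under stated assumptions: `describe_bind_error` is a hypothetical helper, and the message wording is illustrative. Only the `--ui-port 8080` suggestion comes from the PR text.

```python
import errno


def describe_bind_error(exc: OSError, port: int) -> str:
    """Turn a server bind failure into a user-facing message (hypothetical helper)."""
    if exc.errno == errno.EADDRINUSE:
        # Port conflict: suggest an alternative port instead of a traceback.
        return f"Port {port} is already in use. Try a different port, e.g. --ui-port 8080"
    # Anything else is not a port conflict; surface the original error text.
    return f"Could not start the Agent UI server: {exc}"
```

Checking `errno` rather than the error string keeps the handler specific to port conflicts, so unrelated `OSError`s are not mislabeled.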

Backend API (src/gaia/ui/)

  • models.py: SystemStatus gains processor_name, device_supported, initialized, and extended Lemonade fields (lemonade_version, model_size_gb, model_device, model_context_size, model_labels, gpu_name, gpu_vram_gb, tokens_per_second, time_to_first_token); also restores ModelStatus/SettingsResponse/SettingsUpdateRequest (accidentally stripped by PR #566, "Fix Agent UI Round 5: hide post-tool thinking, FileListView, text spacing")
  • routers/system.py: /api/system/status populates all fields; GAIA_SKIP_DEVICE_CHECK env var and remote --base-url both bypass the device check for UI banner consistency

Frontend (src/gaia/apps/webui/)

  • ConnectionBanner.tsx: four-case banner system (in priority order):
    1. Backend unreachable → gaia chat --ui
    2. Device unsupported → processor name + GitHub feature-request link (dismissible)
    3. Lemonade not running → gaia init --profile chat (first time) or lemonade-server serve (already set up)
    4. Lemonade running but no model loaded → gaia init --profile chat + disk space warning if < 30 GB free
  • WelcomeScreen.tsx: reads systemStatus from store; shows an amber hint box when initialized === false or no model is loaded, with the exact command to run
  • SettingsModal.tsx: inline fix hints on failing status rows — Lemonade not running, model not loaded, disk space warning; all pointing to gaia init --profile chat
  • ConnectionBanner.css, WelcomeScreen.css, SettingsModal.css: styles for new UI elements (light + dark mode)
  • types/index.ts: SystemStatus interface updated with all new and restored fields

Docs

  • docs/guides/agent-ui.mdx:
    • New "Before You Start" section with explicit 4-step setup flow: install → gaia init --profile chat → lemonade-server serve → gaia chat --ui
    • Hardware requirements Warning block updated with remote server bypass option
    • Troubleshooting accordions expanded: "No model loaded" uses gaia init --profile chat, "Unsupported device" documents --base-url and GAIA_SKIP_DEVICE_CHECK
  • docs/reference/cli.mdx: --base-url documented in top-level flag table

Tests

  • tests/unit/test_device_check.py: 27 unit tests covering CPU detection, GPU VRAM boundary (23.9 / 24.0 / 24.1 GB), multi-GPU, unknown device, NVIDIA rejection, registry fallback, and non-Windows platform
  • tests/unit/chat/ui/test_server.py: system status endpoint tests including initialized and device_supported fields

Command Consistency

All user-facing surfaces consistently use gaia init --profile chat (the profile that downloads the LLM + embedding + VLM models the Agent UI needs):

| Surface | Condition | Guidance shown |
|---|---|---|
| ConnectionBanner | !initialized && !lemonade_running | gaia init --profile chat |
| ConnectionBanner | initialized && !lemonade_running | lemonade-server serve |
| ConnectionBanner | lemonade_running && !model_loaded | gaia init --profile chat |
| WelcomeScreen hint | !initialized | gaia init --profile chat |
| WelcomeScreen hint | !model_loaded | gaia init --profile chat |
| Settings modal | Lemonade not running (first time) | gaia init --profile chat |
| Settings modal | Lemonade not running (set up) | lemonade-server serve |
| Settings modal | Model not loaded | gaia init --profile chat |
| CLI startup | Always | Prerequisites checklist |
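
The banner priority and the per-condition guidance above amount to a small decision function. The sketch below expresses that logic in Python for illustration only; the actual component is ConnectionBanner.tsx, and the `Status` class, `pick_banner` name, and case labels are hypothetical. The field names follow the conditions listed above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Status:
    """Subset of SystemStatus fields that drive the banner (illustrative)."""

    device_supported: bool = True
    initialized: bool = True
    lemonade_running: bool = True
    model_loaded: bool = True


def pick_banner(status: Optional[Status]) -> Optional[Tuple[str, str]]:
    """Return (case, guidance) for the highest-priority problem, or None."""
    if status is None:  # backend unreachable: no status could be fetched
        return ("backend_unreachable", "gaia chat --ui")
    if not status.device_supported:
        return ("device_unsupported", "GitHub feature-request link")
    if not status.lemonade_running:
        # First-time users need the full setup; others only need the server.
        cmd = "lemonade-server serve" if status.initialized else "gaia init --profile chat"
        return ("lemonade_not_running", cmd)
    if not status.model_loaded:
        return ("no_model_loaded", "gaia init --profile chat")
    return None
```

The sketch ignores the dismissed state of the device banner and the disk-space warning, both of which the real component handles.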

Test plan

  • gaia --ui on a Strix Halo machine → launches normally, no banner
  • gaia --ui on an unsupported machine → blue device banner with processor name + GitHub link
  • gaia --ui --base-url http://remote:8000/api/v1 → bypasses device check, launches
  • set GAIA_SKIP_DEVICE_CHECK=1 && gaia --ui → bypasses device check
  • Stop Lemonade → yellow banner: "LLM server is not running" + lemonade-server serve
  • Stop Lemonade on uninitialized machine → banner shows gaia init --profile chat instead
  • Lemonade running, no model loaded → yellow banner: "No model loaded" + gaia init --profile chat
  • Disk < 30 GB free + no model → inline disk space warning in banner
  • Welcome Screen with initialized=false → amber hint box with gaia init --profile chat
  • Settings modal with Lemonade down → hint row shows fix command
  • Port 4200 in use → friendly error with --ui-port 8080 suggestion
  • pytest tests/unit/test_device_check.py → 27 passed
  • pytest tests/unit/chat/ui/test_server.py → passed

@github-actions github-actions bot added the cli, tests, and electron labels Mar 19, 2026
@kovtcharov kovtcharov force-pushed the feature/strix-halo-device-guard branch from 2604363 to 5f9612b Compare March 19, 2026 16:26
@github-actions github-actions bot added the documentation label Mar 19, 2026
@kovtcharov kovtcharov force-pushed the feature/strix-halo-device-guard branch from 95ec6df to 5ff6124 Compare March 19, 2026 18:31
kovtcharov and others added 3 commits March 19, 2026 13:19
Add a device compatibility check before starting the Agent UI that
allows only AMD Ryzen AI Max (Strix Halo) CPUs and AMD Radeon GPUs
with >= 24 GB VRAM — the minimum required to run Qwen3-Coder-30B.

- CLI: new _get_processor_name() reads the CPU name from the Windows
  registry (instant, no subprocess).  _get_gpu_info() reads GPU name
  and accurate VRAM via the HardwareInformation.qwMemorySize QWORD
  registry key (bypasses the Win32_VideoController 4 GB cap).
  _check_device_supported() returns (bool, device_name) and is called
  inside _launch_agent_ui() before the server starts.
- Bypass options: --skip-device-check flag (top-level + chat subcommand),
  --base-url (remote Lemonade — inference on remote machine), or
  GAIA_SKIP_DEVICE_CHECK=1 env var.
- API: SystemStatus model gains processor_name and device_supported
  fields populated by the system/status endpoint.
- Frontend: ConnectionBanner shows a dismissible info banner on
  unsupported devices with a pre-filled GitHub feature-request link.
  SettingsModal shows the detected processor in the System Status grid.
- Tests: 27 unit tests cover processor detection, GPU VRAM gating,
  edge cases (unknown device, multi-GPU, NVIDIA rejection, etc.).
- Extract device detection logic (_get_processor_name, _get_gpu_info,
  _check_device_supported) into lightweight gaia.device module to
  eliminate circular import risk from gaia.ui.routers.system → gaia.cli
- Fix winreg key handle leak: use context-manager (with) in all registry
  reads so keys are closed even if QueryValueEx raises
- Fix CR1: add --base-url to the top-level parser so gaia --ui --base-url
  works and the printed bypass instructions are correct; forward base_url
  to _launch_agent_ui at both top-level call sites
- Fix copyright year in system.py (2024-2025 → 2024-2026)
- Respect GAIA_SKIP_DEVICE_CHECK env var in system/status endpoint so the
  UI banner is consistent with the CLI bypass decision
- Simplify ConnectionBanner dismissed-state logic (remove dead early return)
- Update unit tests to import from gaia.device and patch at the correct
  module path
- Update agent-ui.mdx Warning block to list both supported hardware
  configurations (Ryzen AI Max and Radeon >= 24 GB VRAM)
- Add troubleshooting accordion for unsupported device error with
  three bypass options (--base-url, --skip-device-check, env var)
- Add --skip-device-check and --base-url to cli.mdx top-level flags
  table and gaia chat flags table
- Add Note clarifying GAIA_SKIP_DEVICE_CHECK env var bypass
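
The key-handle fix noted in this commit (context-manager registry reads) might look like the sketch below. It is an illustration rather than the project's code: the function name is hypothetical, and the registry path and value name are the commonly used `CentralProcessor\0` / `ProcessorNameString` pair, assumed here because the PR does not state which key it reads.

```python
import platform
import sys


def read_processor_name() -> str:
    """Best-effort CPU name lookup (illustrative, not the project's function)."""
    if sys.platform == "win32":
        import winreg  # only available on Windows

        key_path = r"HARDWARE\DESCRIPTION\System\CentralProcessor\0"
        try:
            # The with-block closes the key handle even if QueryValueEx raises.
            with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
                value, _value_type = winreg.QueryValueEx(key, "ProcessorNameString")
                return str(value).strip()
        except OSError:
            pass  # fall through to the portable fallback
    # Non-Windows platforms (or a failed registry read) use the stdlib fallback.
    return platform.processor()
```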

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@kovtcharov kovtcharov force-pushed the feature/strix-halo-device-guard branch from 5ff6124 to 2c1cf93 Compare March 19, 2026 20:20
kovtcharov and others added 4 commits March 19, 2026 13:25
PR #566 (Fix Agent UI Round 5) accidentally stripped the extended
Lemonade info fields and Settings endpoints from the Agent UI:

- Restore SystemStatus fields: lemonade_version, model_size_gb,
  model_device, model_context_size, model_labels, gpu_name, gpu_vram_gb,
  tokens_per_second, time_to_first_token
- Restore ModelStatus, SettingsResponse, SettingsUpdateRequest models
- Restore _check_model_status(), /api/settings GET and PUT endpoints
  in system.py, and the 10s Lemonade timeout
- Restore extended status rows in SettingsModal.tsx
- Restore extended fields in types/index.ts

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Move `from gaia.device import ...` from mid-file (line 686, after
  function definitions) to the top of cli.py to satisfy Pylint C0413
  (wrong-import-position)
- Remove unused `_get_gpu_info` and `_get_processor_name` aliases from
  cli.py — only `check_device_supported` and the URL constant are used
  directly in cli.py; the gpu/processor helpers are called internally
  by check_device_supported inside gaia.device
- Remove unused `_get_gpu_info` import from test_device_check.py
- Auto-fix Black formatting and isort order

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
tempfile creates files in /tmp on Linux CI, which is outside the user's
home directory. The safe_open_document() security check correctly
rejects paths outside home with 403, causing the tests to fail in CI
while passing locally on Windows (where temp files land under the home
directory).

Fix: pass dir=Path.home() to NamedTemporaryFile and TemporaryDirectory
in the three affected tests so they always land inside home on all
platforms.
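
A minimal sketch of this fix is shown below. The helper names are hypothetical; in the affected tests the base directory would be `Path.home()`, while the helper accepts any directory so it can be exercised elsewhere. `Path.is_relative_to` requires Python 3.9 or later.

```python
import tempfile
from pathlib import Path


def make_temp_doc(base_dir: Path, text: str) -> Path:
    """Create a temporary .txt file inside base_dir and return its path."""
    with tempfile.NamedTemporaryFile(
        mode="w", suffix=".txt", dir=base_dir, delete=False, encoding="utf-8"
    ) as handle:
        handle.write(text)
        return Path(handle.name)


def is_inside(path: Path, root: Path) -> bool:
    """True when path resolves to a location under root."""
    return path.resolve().is_relative_to(root.resolve())
```

Calling `make_temp_doc(Path.home(), "...")` yields a file that a home-directory check such as `is_inside(path, Path.home())` accepts on every platform; the caller is responsible for deleting the file afterwards.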

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Ask about "my pet" explicitly (not "Max") so tiny models
connect the question to the established dog context. Broaden
assertion keywords to include terms small models commonly use
(canine, companion, puppy) when describing pet type.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@itomek itomek self-assigned this Mar 19, 2026
@kovtcharov kovtcharov self-assigned this Mar 19, 2026
…sers

- ConnectionBanner: add Case 4 for Lemonade running but no model loaded;
  Case 3 now shows `gaia init --profile chat` for first-time users and
  `lemonade-server serve` for users who are already initialized
- WelcomeScreen: surface first-run setup hints from system status —
  `gaia init --profile chat` when not initialized, model hint when no
  model is loaded
- SettingsModal: show inline fix hints on failing status rows (Lemonade
  not running, model not loaded, low disk space when model missing)
- cli.py: print prerequisites checklist on Agent UI startup; add specific
  OSError handler for port conflicts with `--ui-port` suggestion
- docs/guides/agent-ui.mdx: add "Before You Start" steps section with
  explicit `gaia init --profile chat` command, disk space note, and
  expanded troubleshooting for device bypass and model download

All setup guidance consistently uses `gaia init --profile chat` (the
profile that downloads LLM + embedding + VLM models needed by the UI).
kovtcharov and others added 4 commits March 19, 2026 14:51
- device.py: detect AMD Ryzen AI Max (Strix Halo) and AMD Radeon GPUs
  with >= 24 GB VRAM via Windows registry; fail-open on unknown hardware
- routers/system.py: populate processor_name and device_supported in
  /api/system/status; respect GAIA_SKIP_DEVICE_CHECK and remote base-url
- cli.mdx, agent-ui.mdx (sdk): document bypass flags and system status
  fields; restore ModelStatus/SettingsResponse API classes
- test_device_check.py: 27 unit tests for CPU/GPU detection and VRAM
  boundary conditions
- test_server.py: system status endpoint tests for device fields
- Patch sys.platform to "win32" in all TestCheckDeviceSupported hardware
  tests so they run correctly on Linux CI (OS check fires before mocks)
- Remove f-strings without interpolation in cli.py (Pylint W1309)
- Auto-fix Black formatting in cli.py and test_server.py
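
The `sys.platform` patching mentioned above can be illustrated as follows. Everything here is hypothetical scaffolding: the function names, the injected `read_vram` callable, and the assumption that the OS guard returns early (fail-open) on non-Windows platforms. The point is only that the OS guard runs before the hardware lookup, so a test on Linux must patch the platform for the mocked hardware values to be consulted at all.

```python
import sys
from typing import Callable
from unittest import mock


def supported_on_this_os(read_vram: Callable[[], float]) -> bool:
    """Hypothetical check whose OS guard runs before any hardware lookup."""
    if sys.platform != "win32":
        return True  # assumed fail-open: read_vram is never called
    return read_vram() >= 24.0


def run_with_fake_windows(vram_gb: float) -> bool:
    """Patch the platform so the hardware branch is exercised, as a unit test would."""
    with mock.patch.object(sys, "platform", "win32"):
        return supported_on_this_os(lambda: vram_gb)
```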

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Suppress local prerequisites hint when --base-url (remote server) is used
- Fix GAIA_SKIP_DEVICE_CHECK to require explicit truthy value ("1"/"true"/"yes")
  instead of any non-empty string (avoids GAIA_SKIP_DEVICE_CHECK=0 being truthy)
- Add 0.0.0.0 to local-address list so LEMONADE_BASE_URL=http://0.0.0.0:8000
  is not misclassified as a remote server
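
The stricter environment-variable parsing described above could be sketched like this. The function name is hypothetical, and the case-insensitive comparison and whitespace stripping are assumptions; the accepted values ("1", "true", "yes") come from the commit note.

```python
import os
from typing import Mapping, Optional

_TRUTHY = {"1", "true", "yes"}


def skip_device_check(env: Optional[Mapping[str, str]] = None) -> bool:
    """True only for an explicit truthy GAIA_SKIP_DEVICE_CHECK value."""
    source = os.environ if env is None else env
    value = source.get("GAIA_SKIP_DEVICE_CHECK", "")
    # "0", "false", and the empty string all fall through to False.
    return value.strip().lower() in _TRUTHY
```

Stripping whitespace also tolerates a trailing space in the value, which cmd.exe can introduce with a command of the form `set VAR=1 && ...`.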

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix is_remote URL check to use urlparse hostname comparison instead of
  fragile substring matching (e.g. 'not-localhost.example.com' no longer
  matches 'localhost')
- Reduce timeout for optional /stats and /system-info calls to 3s so
  system status endpoint stays fast when these endpoints are slow
- Fix copyright year in _chat_helpers.py (2025 -> 2026)
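
The hostname comparison in the first item can be sketched with `urllib.parse`. The function name and the exact contents of the local-host set are assumptions: "localhost" and "0.0.0.0" are mentioned in the commit notes, while "127.0.0.1" and "::1" are added here for illustration.

```python
from urllib.parse import urlparse

# Hosts treated as "this machine" (partly assumed, see the lead-in).
LOCAL_HOSTS = {"localhost", "127.0.0.1", "0.0.0.0", "::1"}


def is_remote(base_url: str) -> bool:
    """True when base_url points at a host other than the local machine."""
    # .hostname is lowercased and excludes the port and any IPv6 brackets.
    host = urlparse(base_url).hostname
    if host is None:
        return False  # empty or unparseable URL: treated as local (assumption)
    return host not in LOCAL_HOSTS
```

Comparing the parsed hostname for equality avoids the substring false positive: `not-localhost.example.com` contains "localhost" but is not equal to it.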

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@kovtcharov-amd kovtcharov-amd added this pull request to the merge queue Mar 19, 2026
Merged via the queue into main with commit 5dd71a2 Mar 19, 2026
19 checks passed
@kovtcharov-amd kovtcharov-amd deleted the feature/strix-halo-device-guard branch March 19, 2026 22:17
itomek added a commit that referenced this pull request Mar 23, 2026
PR #566 squash-merged a stale branch that had resolved merge conflicts by
keeping older file versions, reverting 3 previously-merged PRs from main:
- PR #564: TOCTOU upload locking security fix
- PR #565: Tool execution guardrails with confirmation popup
- PR #568: Agent UI overhaul (CSS design system, animations, UX polish)

Follow-up PRs #593/#604/#605 partially restored functionality. This PR
restores all remaining missing changes while preserving those follow-ups.

Changes:
- 24 files: clean restore from pre-revert commit (CSS, components, utils)
- Security: restore per-file asyncio.Lock upload guard (dependencies.py,
  documents.py, server.py)
- SSE handler: restore <think> block state machine, UUID-scoped confirms,
  timeout parameter, friendly error messages
- Frontend: restore AnimatedPresence, session hash badge, smooth streaming
  exit, custom model override UI, terminal typing animation, inference stats
- Backend: restore custom_model DB override, Lemonade stats fetching,
  friendlier user-facing error messages
- Tests: 497 passing, TypeScript build clean (1845 modules)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
github-merge-queue bot pushed a commit that referenced this pull request Mar 23, 2026
… (#608)

## Summary

PR #566 was accidentally merged with stale conflict resolutions that
reverted 3 previously-merged PRs. Follow-up PRs #593/#604/#605 partially
restored functionality. This PR restores all remaining missing changes.

**Root cause:** During a `git merge origin/main` into the branch (commit
`f07b932`), conflict resolution kept the branch's older file versions,
discarding work from 3 PRs. The squash merge then propagated this to
main.

**Reverted PRs restored by this PR:**
- **#564** — TOCTOU race condition fix: per-file `asyncio.Lock` for
document uploads (`dependencies.py`, `routers/documents.py`,
`server.py`)
- **#565** — Tool execution guardrails: `<think>` block state machine,
UUID-scoped confirms, inference stats, custom model override, friendly
error messages (`sse_handler.py`, `_chat_helpers.py`, `models.py`)
- **#568** — Agent UI overhaul: CSS design system (glassmorphism,
animations), AnimatedPresence, session hash badge, smooth streaming
exit, terminal typing animation, custom model override UI,
`appendThinkingContent`, `format.ts` utilities (`App.tsx`,
`ChatView.tsx`, `AgentActivity.tsx`, `SettingsModal.tsx/css`,
`WelcomeScreen.tsx/css`, `Sidebar.tsx/css`, `MessageBubble.tsx/css`,
`chatStore.ts`, 12 other CSS files, `shell_tools.py`, `database.py`)

**Preserved follow-up PR additions:**
- #593: Device support banners, processor name display, Lemonade hints
- #604: `permission_request` events, `confirmTool` API, `fileList`
pass-through, PermissionPrompt
- #605: RAG indexing guards

## Test plan

- [x] `python -m pytest tests/unit/chat/ui/ --tb=short` — 497 passed
- [x] `python util/lint.py --black --isort` — all checks pass
- [x] `npm run build` in `src/gaia/apps/webui/` — 1,845 modules, no
TypeScript errors
- [ ] Smoke test: `gaia chat --ui` — verify UI loads, settings modal
shows custom model override, welcome screen has typing animation, chat
streams correctly
- [ ] Verify concurrent document uploads use per-file locking

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>