
Fix Agent UI guardrails, rendering, LRU eviction, and Windows paths #604

Merged
itomek merged 7 commits into main from fix/agent-ui-guardrails-438
Mar 19, 2026

Conversation

Collaborator

@itomek itomek commented Mar 19, 2026

Summary

Fixes multiple bugs found during Windows UI testing with gaia chat --ui.

Closes #438

  • Tool confirmation flow: Expand TOOLS_REQUIRING_CONFIRMATION to cover all write/execute tools. Implement full SSE-based confirmation — SSEOutputHandler emits permission_request events and blocks until frontend responds via POST /api/chat/confirm-tool. Mount <PermissionPrompt /> in App.tsx.
  • Garbled ]]]]] output: Tighten TOOL_CALL_JSON_SAFETY_RE regex to respect JSON string boundaries. Fix linkifyPaths bracket trimming.
  • LRU eviction: Forward --max-indexed-files to UI server via env var. Add DB-level capacity check with LRU eviction. Add O_BINARY flag on Windows.
  • WSL path generation: Add platform context to ChatAgent system prompt so the LLM uses native Windows paths instead of /mnt/c/....
  • New conversation flash: Clear messages before switching session ID.
  • Write file errors: Return specific error messages for PermissionError, FileNotFoundError, and OSError, each with actionable guidance.

Test plan

  • Unit tests pass (1059 passed, 28 skipped)
  • Frontend builds cleanly
  • Manual: gaia chat --ui — ask agent to write a file, confirm popup appears
  • Manual: Click + for new conversation, verify old messages don't flash
  • Manual: Ask agent to create file on Desktop, verify native Windows path used
  • Manual: Upload 3 files with --max-indexed-files 2, verify LRU eviction works

itomek added 4 commits March 19, 2026 16:59
…SSE confirmation flow (#565)

Expand TOOLS_REQUIRING_CONFIRMATION from just run_shell_command to all
write/execute tools (write_file, edit_file, write_python_file, etc.).
Implement the full backend-to-frontend confirmation flow: SSEOutputHandler
now overrides confirm_tool_execution() to emit permission_request events
and block until the frontend responds via a new POST /api/chat/confirm-tool
endpoint. Frontend wires SSE events to the existing PermissionPrompt UI.
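
The blocking handshake described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the class name `ConfirmingHandler`, the queue used as an event transport, and the `resolve` method are assumptions made for the example. Only the `permission_request` event name and the `confirm_tool_execution` method name come from the commit message.

```python
import queue
import threading


class ConfirmingHandler:
    """Hypothetical stand-in for an SSE output handler (not the real SSEOutputHandler)."""

    def __init__(self) -> None:
        # Events that an SSE stream would drain and send to the browser.
        self.events: "queue.Queue[dict]" = queue.Queue()
        self._decided = threading.Event()
        self._allowed = False

    def confirm_tool_execution(self, tool_name: str, tool_args: dict) -> bool:
        """Emit a permission_request event, then block until resolve() is called."""
        self._decided.clear()
        self.events.put(
            {"type": "permission_request", "tool": tool_name, "args": tool_args}
        )
        self._decided.wait()  # the agent thread parks here until the user answers
        return self._allowed

    def resolve(self, allowed: bool) -> None:
        """What a confirm-tool HTTP endpoint would call with the user's decision."""
        self._allowed = allowed
        self._decided.set()


def run_demo(allowed: bool) -> tuple:
    """Drive one confirmation round trip with the main thread acting as the frontend."""
    handler = ConfirmingHandler()
    result = {}

    def agent() -> None:
        result["ok"] = handler.confirm_tool_execution(
            "write_file", {"path": "notes.txt"}
        )

    worker = threading.Thread(target=agent)
    worker.start()
    event = handler.events.get(timeout=5)  # the frontend would receive this over SSE
    handler.resolve(allowed)
    worker.join(timeout=5)
    return event["type"], result["ok"]
```

The agent thread and the HTTP request that carries the answer run independently, so some cross-thread signal is needed. A `threading.Event` is one simple choice for that.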
…racket trimming (#566)

Tighten TOOL_CALL_JSON_SAFETY_RE to respect JSON string boundaries instead
of crossing unescaped quotes with [^}]*. Fix linkifyPaths bracket trimming
to only strip ) and } (not ]) since the path regex already excludes ].
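
To illustrate what "respecting JSON string boundaries" means, the sketch below compares a naive `[^}]*` pattern with a string-aware one. Both patterns are written for this example and are assumptions, not the actual `TOOL_CALL_JSON_SAFETY_RE`. The string-aware version only handles flat (non-nested) objects.

```python
import re

# Naive: "[^}]*" stops at the first "}", even when it sits inside a quoted string.
NAIVE_RE = re.compile(r'\{[^}]*\}')

# String-aware: consume either a single character that is not a brace or quote,
# or a whole double-quoted string (honoring backslash escapes), so braces
# inside strings never end the match early.
STRING_AWARE_RE = re.compile(r'\{(?:[^{}"]|"(?:[^"\\]|\\.)*")*\}')

text = 'calling {"tool": "write_file", "path": "a}b.txt"} now'

naive = NAIVE_RE.search(text).group(0)
aware = STRING_AWARE_RE.search(text).group(0)

print(naive)  # truncated at the "}" inside the path string
print(aware)  # the complete JSON object
```

Each alternative in the string-aware pattern starts with a different character class, so the regex engine has only one way to consume any given character. That avoids the ambiguity behind catastrophic backtracking.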
Forward --max-indexed-files to UI server via GAIA_MAX_INDEXED_FILES env
var. Add DB-level capacity check with LRU eviction in upload_by_path()
since per-upload RAGSDK instances can't track cross-upload state. Add
O_BINARY flag on Windows for safe_open_document() and RAGSDK._safe_open()
to prevent binary/text mode issues with fd-based file reads.
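
A DB-level capacity check with LRU eviction could look roughly like the sketch below. The table layout, the function names, and the default capacity are hypothetical and chosen for illustration. Only the `GAIA_MAX_INDEXED_FILES` variable name comes from the commit message. An integer counter stands in for a last-access timestamp so the ordering is deterministic.

```python
import os
import sqlite3


def max_indexed_files(default: int = 100) -> int:
    """Read the capacity from the environment (the default value is an assumption)."""
    return int(os.environ.get("GAIA_MAX_INDEXED_FILES", default))


def open_index() -> sqlite3.Connection:
    """Create an in-memory index table; the schema is hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents (path TEXT PRIMARY KEY, last_used INTEGER NOT NULL)"
    )
    return conn


def register_upload(conn: sqlite3.Connection, path: str, capacity: int) -> list:
    """Mark path as most recently used and evict the oldest rows over capacity.

    Returns the evicted paths, least recently used first.
    """
    evicted = []
    with conn:  # one transaction for the insert and any evictions
        tick = conn.execute(
            "SELECT COALESCE(MAX(last_used), 0) + 1 FROM documents"
        ).fetchone()[0]
        conn.execute(
            "INSERT OR REPLACE INTO documents (path, last_used) VALUES (?, ?)",
            (path, tick),
        )
        count = conn.execute("SELECT COUNT(*) FROM documents").fetchone()[0]
        excess = count - capacity
        if excess > 0:
            rows = conn.execute(
                "SELECT path FROM documents ORDER BY last_used ASC LIMIT ?",
                (excess,),
            ).fetchall()
            for (old_path,) in rows:
                conn.execute("DELETE FROM documents WHERE path = ?", (old_path,))
                evicted.append(old_path)
    return evicted
```

Keeping the usage order in the database rather than in memory matches the reasoning in the commit message: state held by a per-upload object disappears between uploads, while a table row does not.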
…ion flash

- Add platform/environment context to ChatAgent system prompt so the LLM
  uses native Windows paths instead of WSL /mnt/c/... paths
- Improve write_file error handling with specific messages for
  PermissionError, FileNotFoundError, and OSError
- Fix new conversation showing old messages by clearing messages before
  switching session ID
- Mount PermissionPrompt component in App.tsx so tool confirmation popup
  actually renders
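
As a rough sketch of the write_file error handling described above, the example below maps each exception type to a distinct message. The function name, the return shape, and the message wording are assumptions for illustration, not the project's actual tool.

```python
import os


def write_file_with_guidance(path: str, content: str) -> dict:
    """Write content to path and translate common failures into actionable text."""
    try:
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(content)
    # PermissionError and FileNotFoundError are subclasses of OSError,
    # so they must be caught before the generic OSError branch.
    except PermissionError:
        return {
            "status": "error",
            "error": f"Permission denied writing to {path}. Choose a folder you "
            "own, or close any program that has the file open.",
        }
    except FileNotFoundError:
        parent = os.path.dirname(path) or "."
        return {
            "status": "error",
            "error": f"The directory {parent} does not exist. Create it first "
            "or pick an existing folder.",
        }
    except OSError as exc:
        return {
            "status": "error",
            "error": f"Could not write {path}: {exc.strerror or exc}. Check the "
            "file name and available disk space.",
        }
    return {"status": "ok", "path": path}
```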
@itomek itomek requested a review from kovtcharov-amd as a code owner March 19, 2026 21:41
@itomek itomek self-assigned this Mar 19, 2026
@github-actions github-actions bot added the agents (Agent system changes), rag (RAG system changes), cli (CLI changes), and performance (Performance-critical changes) labels Mar 19, 2026
…est rewrite

- Replace exponential-backtracking regex in ChatView.tsx with bounded [^}]* pattern
- Replace .* with [^}]* in sse_handler.py _TOOL_CALL_JSON_RE (removes re.DOTALL)
- Use home-directory temp files in upload tests to pass safe_open_document() check on Linux CI
- Rewrite test_sse_confirmation.py for current confirmation API (permission_request events,
  resolve_tool_confirmation, /api/chat/confirm-tool endpoint)
@github-actions github-actions bot added the tests (Test changes) label Mar 19, 2026
…r checkbox

- Add 120s safety-net timeout to confirm_tool_execution (time.monotonic)
- Clean up _active_sse_handlers on exception in _stream_chat_response
- Remove "Remember this choice" checkbox (not wired to backend)
- Add timeout unit test for confirm_tool_execution
- Remove dead app.state.active_sse_handlers from test fixture
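
A safety-net timeout built on `time.monotonic` might be sketched as follows. The helper name, the polling interval, and the choice to treat a timeout as a denial are assumptions for illustration. Only the 120-second figure and the use of `time.monotonic` come from the commit message.

```python
import threading
import time


def wait_for_decision(
    decided: threading.Event, timeout_s: float = 120.0, poll_s: float = 0.05
) -> bool:
    """Return True once `decided` is set, or False if the deadline passes first."""
    # time.monotonic() never jumps backwards, so the deadline is unaffected
    # by wall-clock adjustments (NTP sync, manual clock changes).
    deadline = time.monotonic() + timeout_s
    while not decided.is_set():
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False  # assumption: no answer in time is treated as a denial
        decided.wait(min(poll_s, remaining))
    return True
```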
@github-actions github-actions bot added the electron (Electron app changes) label Mar 19, 2026
…ls-438

# Conflicts:
#	tests/unit/chat/ui/test_server.py
@itomek itomek enabled auto-merge March 19, 2026 22:58
@itomek itomek added this pull request to the merge queue Mar 19, 2026
Merged via the queue into main with commit 95b304f Mar 19, 2026
34 checks passed
@itomek itomek deleted the fix/agent-ui-guardrails-438 branch March 19, 2026 23:37
itomek added a commit that referenced this pull request Mar 23, 2026
PR #566 squash-merged a stale branch that had resolved merge conflicts by
keeping older file versions, reverting 3 previously-merged PRs from main:
- PR #564: TOCTOU upload locking security fix
- PR #565: Tool execution guardrails with confirmation popup
- PR #568: Agent UI overhaul (CSS design system, animations, UX polish)

Follow-up PRs #593/#604/#605 partially restored functionality. This PR
restores all remaining missing changes while preserving those follow-ups.

Changes:
- 24 files: clean restore from pre-revert commit (CSS, components, utils)
- Security: restore per-file asyncio.Lock upload guard (dependencies.py,
  documents.py, server.py)
- SSE handler: restore <think> block state machine, UUID-scoped confirms,
  timeout parameter, friendly error messages
- Frontend: restore AnimatedPresence, session hash badge, smooth streaming
  exit, custom model override UI, terminal typing animation, inference stats
- Backend: restore custom_model DB override, Lemonade stats fetching,
  friendlier user-facing error messages
- Tests: 497 passing, TypeScript build clean (1845 modules)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
github-merge-queue bot pushed a commit that referenced this pull request Mar 23, 2026
… (#608)

## Summary

PR #566 was accidentally merged with stale conflict resolutions that
reverted 3 previously-merged PRs. Follow-up PRs #593/#604/#605 partially
restored functionality. This PR restores all remaining missing changes.

**Root cause:** During a `git merge origin/main` into the branch (commit
`f07b932`), conflict resolution kept the branch's older file versions,
discarding work from 3 PRs. The squash merge then propagated this to
main.

**Reverted PRs restored by this PR:**
- **#564** — TOCTOU race condition fix: per-file `asyncio.Lock` for
document uploads (`dependencies.py`, `routers/documents.py`,
`server.py`)
- **#565** — Tool execution guardrails: `<think>` block state machine,
UUID-scoped confirms, inference stats, custom model override, friendly
error messages (`sse_handler.py`, `_chat_helpers.py`, `models.py`)
- **#568** — Agent UI overhaul: CSS design system (glassmorphism,
animations), AnimatedPresence, session hash badge, smooth streaming
exit, terminal typing animation, custom model override UI,
`appendThinkingContent`, `format.ts` utilities (`App.tsx`,
`ChatView.tsx`, `AgentActivity.tsx`, `SettingsModal.tsx/css`,
`WelcomeScreen.tsx/css`, `Sidebar.tsx/css`, `MessageBubble.tsx/css`,
`chatStore.ts`, 12 other CSS files, `shell_tools.py`, `database.py`)
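
The per-file `asyncio.Lock` guard listed under #564 above can be sketched as follows. This is a minimal illustration under stated assumptions: `UploadLocks`, `guarded_upload`, and the file names are hypothetical, and the real upload work is replaced by a short sleep.

```python
import asyncio
from collections import defaultdict


class UploadLocks:
    """Hypothetical registry handing out one asyncio.Lock per file path."""

    def __init__(self) -> None:
        self._locks: dict = {}

    def for_path(self, path: str) -> asyncio.Lock:
        # There is no await between the lookup and the insert, so two tasks
        # on the same event loop cannot both create a lock for one path.
        lock = self._locks.get(path)
        if lock is None:
            lock = self._locks[path] = asyncio.Lock()
        return lock


async def guarded_upload(locks: UploadLocks, path: str, active, peak) -> None:
    """Run a simulated upload while holding the lock for its path."""
    async with locks.for_path(path):
        active[path] += 1
        peak[path] = max(peak[path], active[path])
        await asyncio.sleep(0.01)  # stands in for the check-then-write work
        active[path] -= 1


async def demo() -> dict:
    """Start three uploads of one file and one of another, all concurrently."""
    locks = UploadLocks()
    active = defaultdict(int)
    peak = defaultdict(int)
    await asyncio.gather(
        guarded_upload(locks, "a.pdf", active, peak),
        guarded_upload(locks, "a.pdf", active, peak),
        guarded_upload(locks, "a.pdf", active, peak),
        guarded_upload(locks, "b.pdf", active, peak),
    )
    return dict(peak)
```

Locking per path rather than globally serializes uploads of the same file, which closes the check-then-use window, while uploads of different files can still proceed in parallel.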

**Preserved follow-up PR additions:**
- #593: Device support banners, processor name display, Lemonade hints
- #604: `permission_request` events, `confirmTool` API, `fileList`
pass-through, PermissionPrompt
- #605: RAG indexing guards

## Test plan

- [x] `python -m pytest tests/unit/chat/ui/ --tb=short` — 497 passed
- [x] `python util/lint.py --black --isort` — all checks pass
- [x] `npm run build` in `src/gaia/apps/webui/` — 1,845 modules, no
TypeScript errors
- [ ] Smoke test: `gaia chat --ui` — verify UI loads, settings modal
shows custom model override, welcome screen has typing animation, chat
streams correctly
- [ ] Verify concurrent document uploads use per-file locking

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

Labels

agents (Agent system changes), cli (CLI changes), electron (Electron app changes), performance (Performance-critical changes), rag (RAG system changes), tests (Test changes)

Development

Successfully merging this pull request may close these issues.

Tool execution guardrails: confirmation popup before dangerous commands

2 participants