@claude Please do a full code review of this PR following the GAIA review checklist. Focus on: security vulnerabilities, code quality, architecture consistency, testing coverage, and documentation. Report all 🔴 Critical and 🟡 Important findings with suggested fixes.
Claude finished @itomek's task: Code Review Complete

## Summary

This PR implements a tool execution guardrail system for the Agent UI, requiring user confirmation before …

Overall Assessment: Well-designed implementation that follows GAIA patterns. The defense-in-depth approach (existing shell command whitelist + new confirmation popup) is excellent. A few issues need attention.

## 🔴 Critical Findings

1. Input Validation Gap on …
- Rewrite agent-ui.mdx: remove redundant Get Started section (lives in quickstart), add "Search and Browse Files" + "Analyze Documents" capabilities, replace ASCII architecture with Mermaid diagram
- Update quickstart.mdx: task-focused intro (not "chatting"), simplify npm tab to published package, add auto-bootstrap note, remove redundant CLI tab and Desktop Installer content
- Move guides/agent-ui from Getting Started to Python Framework in docs.json navigation
- Add auto-bootstrap to gaia-ui.mjs: when Python backend not found, automatically runs install scripts (install.ps1/install.sh) to set up uv + Python 3.12 + amd-gaia into ~/.gaia/venv
- Add "gaia" bin entry to package.json alongside "gaia-ui"
- Guard against infinite spawn loop when npm "gaia" shadows Python CLI
- Update sdk/sdks/agent-ui.mdx: fix naming (Agent UI not GAIA Chat, Agent SDK not Chat SDK), update npm section to show gaia command
- Remove "Generative AI Is Awesome" from sdk/index.mdx
… GAIA personality
CLI:
- Add --ui, --cli, --ui-port top-level flags to gaia command
- Bare `gaia` defaults to launching Agent UI
- Interactive menu with [1] Agent UI [2] CLI [3] Help
- Extract _launch_agent_ui(), _launch_interactive_cli() helpers
- gaia chat --ui preserved as backward-compatible alias
npm package (gaia-ui):
- Auto-install Python backend on first run (uv + Python 3.12 + amd-gaia[ui])
- Pin Python package version to match npm package version
- Auto-update backend on version mismatch
- Run gaia init --profile minimal after first install
- --gaia-version flag for installing specific versions
- Clear error messages with manual install instructions at every step
- Remove "gaia" bin alias to avoid conflict with Python CLI
Agent personality:
- Rewrite ChatAgent system prompt for natural, direct personality
- GAIA speaks like a smart friend, not a corporate assistant
- Pushes back on wrong claims, avoids sycophancy
- No filler phrases, no "As an AI assistant" self-references
Documentation:
- Promote Agent UI to top-level under Python Framework in docs.json
- Rewrite guides/agent-ui.mdx as concise getting started page
- Update quickstart with npm install, update, uninstall, Node.js install instructions
- Add Top-Level Flags section to CLI reference
- Update deployment/ui.mdx with coming-soon warning for desktop installer
- Fix GaiaAgent -> ChatAgent in SDK docs
- Update index.mdx cards for Agent UI
Tests:
- Add chat concurrency, chat helpers, and utils helpers unit tests
- Update database tests
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Before this change, run_shell_command executed silently whenever the LLM
decided to use it — users had no visibility or consent. This adds a
blocking confirmation popup in the Agent UI before any shell command runs.
Backend changes:
- console.py: Add confirm_tool_execution() to OutputHandler (default
auto-approves, preserving CLI behaviour)
- agent.py: Add TOOLS_REQUIRING_CONFIRMATION set and guardrail check in
_execute_tool(); returns {"status":"denied"} if user declines
- sse_handler.py: SSEOutputHandler.confirm_tool_execution() emits a
tool_confirm SSE event then blocks (0.5 s poll loop, 60 s timeout)
until resolve_confirmation() is called or cancellation fires
- server.py: Add app.state.active_sse_handlers for per-session handler
lookup by the confirm endpoint
- _chat_helpers.py: Register/unregister SSE handler around the stream;
pass http_request through to enable the registry
- routers/chat.py: Add POST /api/chat/confirm endpoint that routes user
Allow/Deny back to the blocked agent thread
- models.py: Add ToolConfirmRequest model
Frontend changes:
- types/index.ts: Add tool_confirm to StreamEventType; add confirm_id
and timeout_seconds fields to StreamEvent
- api.ts: Add tool_confirm to AGENT_EVENT_TYPES; add
confirmToolExecution() helper
- ChatView.tsx: Handle tool_confirm events — auto-approve if tool is in
localStorage always-allow list, otherwise push a GaiaNotification to
show the existing PermissionPrompt modal
- notificationStore.ts: HTTP fallback in respondToPermission() (routes
to /api/chat/confirm when not in Electron); persist Always Allow in
localStorage key gaia_always_allow_tools
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
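
The blocking confirm described for sse_handler.py above (emit an event, poll every 0.5 s, give up after 60 s, wake on `resolve_confirmation()` or cancellation) could look roughly like the sketch below. It is not the PR's code: the class name `ConfirmingHandler`, the `emit` callback, the `cancelled` flag, the method signatures, the UUID-based `confirm_id`, and the choice to treat a timeout as a denial are all assumptions made for illustration.

```python
import threading
import time
import uuid

TOOL_CONFIRM_TIMEOUT_SECONDS = 60  # the 60 s timeout named in the commit message
POLL_INTERVAL_SECONDS = 0.5        # the 0.5 s poll loop named in the commit message


class ConfirmingHandler:
    """Hypothetical stand-in for SSEOutputHandler; not the real class."""

    def __init__(self, emit):
        self._emit = emit                   # callable that pushes one event to the client
        self._lock = threading.Lock()
        self._pending = {}                  # confirm_id -> [threading.Event, allowed]
        self.cancelled = threading.Event()  # set when the stream is cancelled

    def confirm_tool_execution(self, tool, args, timeout=TOOL_CONFIRM_TIMEOUT_SECONDS):
        """Emit a tool_confirm event, then block the agent thread until answered."""
        confirm_id = uuid.uuid4().hex
        entry = [threading.Event(), False]
        with self._lock:
            self._pending[confirm_id] = entry
        self._emit({"type": "tool_confirm", "tool": tool, "args": args,
                    "confirm_id": confirm_id, "timeout_seconds": timeout})
        deadline = time.monotonic() + timeout
        try:
            # Wake at least every POLL_INTERVAL_SECONDS so cancellation is noticed.
            while time.monotonic() < deadline and not self.cancelled.is_set():
                remaining = max(deadline - time.monotonic(), 0)
                if entry[0].wait(min(POLL_INTERVAL_SECONDS, remaining)):
                    return entry[1]
            return False  # assumption: timeout or cancellation counts as a denial
        finally:
            with self._lock:
                self._pending.pop(confirm_id, None)

    def resolve_confirmation(self, confirm_id, allowed):
        """Called from the HTTP endpoint; returns False for unknown or expired ids."""
        with self._lock:
            entry = self._pending.get(confirm_id)
        if entry is None:
            return False
        entry[1] = allowed
        entry[0].set()
        return True
```

Keying pending confirmations by id, rather than holding a single flag per handler, means a late answer to an expired prompt cannot approve a different, newer prompt.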
…SSE confirmation flow (#565)
Expand TOOLS_REQUIRING_CONFIRMATION from just run_shell_command to all write/execute tools (write_file, edit_file, write_python_file, etc.). Implement the full backend-to-frontend confirmation flow: SSEOutputHandler now overrides confirm_tool_execution() to emit permission_request events and block until the frontend responds via a new POST /api/chat/confirm-tool endpoint. Frontend wires SSE events to the existing PermissionPrompt UI.
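
A minimal sketch of the guardrail itself, assuming the check is a simple set-membership test before dispatch. The set members are the tools named above. The `execute_tool` function, the `registry` dict, the `DenyingHandler` class, and the `confirm_tool_execution(tool_name, tool_args)` signature are hypothetical stand-ins, not the real `_execute_tool()` or `OutputHandler` API.

```python
# Tools named in the commit message; the real set may contain more ("etc.").
TOOLS_REQUIRING_CONFIRMATION = {
    "run_shell_command", "write_file", "edit_file", "write_python_file",
}


class OutputHandler:
    """Base handler: auto-approves, so the plain CLI keeps its old behaviour."""

    def confirm_tool_execution(self, tool_name, tool_args):
        return True


class DenyingHandler(OutputHandler):
    """Hypothetical handler that simulates a user clicking Deny."""

    def confirm_tool_execution(self, tool_name, tool_args):
        return False


def execute_tool(tool_name, tool_args, registry, handler):
    """Run a tool, asking the handler first if the tool is guarded."""
    if tool_name in TOOLS_REQUIRING_CONFIRMATION:
        if not handler.confirm_tool_execution(tool_name, tool_args):
            return {"status": "denied"}  # the tool body never runs
    return registry[tool_name](**tool_args)
```

Putting the default on the base class keeps the agent loop identical for CLI and UI: only the handler subclass decides whether a human is asked.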
…defensive guards
- models.py: action field uses Literal["allow","deny"]; rejects invalid values via Pydantic
- _chat_helpers.py: log warning when SSE handler overwrites existing session registration
- routers/chat.py: add security comment on /api/chat/confirm (local-only assumption)
- sse_handler.py: extract TOOL_CONFIRM_TIMEOUT_SECONDS = 60 constant, use everywhere
- ChatView.tsx: add guard for missing confirm_id before non-null assertion
- notificationStore.ts: export ALWAYS_ALLOW_TOOLS_KEY constant (ChatView imports it)
- test_sse_confirmation.py: 12 new unit tests for confirmation timeout/allow/deny/endpoint
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
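
To illustrate what constraining `action` to `Literal["allow","deny"]` buys, here is a standard-library stand-in. The PR relies on Pydantic for this check. The dataclass below only mimics the effect so the example has no third-party dependency, and the `session_id` and `confirm_id` fields are assumed, not taken from the real `ToolConfirmRequest`.

```python
from dataclasses import dataclass
from typing import Literal, get_args

Action = Literal["allow", "deny"]


@dataclass(frozen=True)
class ToolConfirmRequest:
    """Stdlib imitation of the request model; field names are assumptions."""

    session_id: str
    confirm_id: str
    action: Action

    def __post_init__(self):
        # Dataclasses do not enforce annotations, so check the value by hand.
        if self.action not in get_args(Action):
            raise ValueError(
                f"action must be one of {get_args(Action)}, got {self.action!r}"
            )
```

Rejecting anything other than the two literals at the model boundary means a malformed request can never be misread as an approval further down the call chain.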
PR #566 squash-merged a stale branch that had resolved merge conflicts by keeping older file versions, reverting 3 previously-merged PRs from main:
- PR #564: TOCTOU upload locking security fix
- PR #565: Tool execution guardrails with confirmation popup
- PR #568: Agent UI overhaul (CSS design system, animations, UX polish)
Follow-up PRs #593/#604/#605 partially restored functionality. This PR restores all remaining missing changes while preserving those follow-ups.
Changes:
- 24 files: clean restore from pre-revert commit (CSS, components, utils)
- Security: restore per-file asyncio.Lock upload guard (dependencies.py, documents.py, server.py)
- SSE handler: restore <think> block state machine, UUID-scoped confirms, timeout parameter, friendly error messages
- Frontend: restore AnimatedPresence, session hash badge, smooth streaming exit, custom model override UI, terminal typing animation, inference stats
- Backend: restore custom_model DB override, Lemonade stats fetching, friendlier user-facing error messages
- Tests: 497 passing, TypeScript build clean (1845 modules)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… (#608)

## Summary

PR #566 was accidentally merged with stale conflict resolutions that reverted 3 previously-merged PRs. Follow-up PRs #593/#604/#605 partially restored functionality. This PR restores all remaining missing changes.

**Root cause:** During a `git merge origin/main` into the branch (commit `f07b932`), conflict resolution kept the branch's older file versions, discarding work from 3 PRs. The squash merge then propagated this to main.

**Reverted PRs restored by this PR:**
- **#564** - TOCTOU race condition fix: per-file `asyncio.Lock` for document uploads (`dependencies.py`, `routers/documents.py`, `server.py`)
- **#565** - Tool execution guardrails: `<think>` block state machine, UUID-scoped confirms, inference stats, custom model override, friendly error messages (`sse_handler.py`, `_chat_helpers.py`, `models.py`)
- **#568** - Agent UI overhaul: CSS design system (glassmorphism, animations), AnimatedPresence, session hash badge, smooth streaming exit, terminal typing animation, custom model override UI, `appendThinkingContent`, `format.ts` utilities (`App.tsx`, `ChatView.tsx`, `AgentActivity.tsx`, `SettingsModal.tsx/css`, `WelcomeScreen.tsx/css`, `Sidebar.tsx/css`, `MessageBubble.tsx/css`, `chatStore.ts`, 12 other CSS files, `shell_tools.py`, `database.py`)

**Preserved follow-up PR additions:**
- #593: Device support banners, processor name display, Lemonade hints
- #604: `permission_request` events, `confirmTool` API, `fileList` pass-through, PermissionPrompt
- #605: RAG indexing guards

## Test plan

- [x] `python -m pytest tests/unit/chat/ui/ --tb=short`: 497 passed
- [x] `python util/lint.py --black --isort`: all checks pass
- [x] `npm run build` in `src/gaia/apps/webui/`: 1,845 modules, no TypeScript errors
- [ ] Smoke test: `gaia chat --ui`, then verify UI loads, settings modal shows custom model override, welcome screen has typing animation, chat streams correctly
- [ ] Verify concurrent document uploads use per-file locking

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
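
For readers unfamiliar with the per-file `asyncio.Lock` fix restored from #564, the following sketch shows the general pattern: the "is it already indexed?" check and the write that follows are done under one lock keyed by path, so concurrent uploads of the same file cannot both pass the check. The names `_upload_locks`, `upload_once`, `store`, and `index_file` are hypothetical and do not come from `dependencies.py` or `documents.py`.

```python
import asyncio
from collections import defaultdict

# Hypothetical registry: one lock per file path, created lazily on first use.
_upload_locks = defaultdict(asyncio.Lock)


async def upload_once(path, store, index_file):
    """Index `path` at most once, even when called concurrently."""
    async with _upload_locks[path]:
        # Time of check: has another request already handled this file?
        if path in store:
            return store[path]
        # Slow work happens between check and write; the lock covers both.
        doc_id = await index_file(path)
        # Time of use: record the result before releasing the lock.
        store[path] = doc_id
        return doc_id
```

Uploads of different paths take different locks and still run concurrently. In this sketch the registry is never pruned, which a long-running server would probably want to address.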
## Summary

- … `run_shell_command` executes, so users can Allow, Deny, or Always Allow each shell command the agent wants to run
- … (`gaia chat`) is unaffected: auto-approves as before
- … `PermissionPrompt.tsx` / `GaiaNotification` / `notificationStore` infrastructure already in the codebase

## Architecture

## Files Changed

| File | Change |
|---|---|
| `agents/base/console.py` | `confirm_tool_execution()` on `OutputHandler` (default `True`) |
| `agents/base/agent.py` | `TOOLS_REQUIRING_CONFIRMATION` + guardrail in `_execute_tool()` |
| `ui/sse_handler.py` | `confirm_tool_execution()` + `resolve_confirmation()` |
| `ui/server.py` | `app.state.active_sse_handlers` registry |
| `ui/_chat_helpers.py` | … `http_request` |
| `ui/routers/chat.py` | `POST /api/chat/confirm` endpoint |
| `ui/models.py` | `ToolConfirmRequest` model |
| `webui/src/types/index.ts` | `tool_confirm` event type + fields |
| `webui/src/services/api.ts` | `confirmToolExecution()` + event routing |
| `webui/src/components/ChatView.tsx` | … `tool_confirm`, localStorage auto-approve |
| `webui/src/stores/notificationStore.ts` | … |

## Files Reused (no changes)

- `PermissionPrompt.tsx`: full modal UI with countdown, Allow/Deny/Always Allow, keyboard shortcuts
- `GaiaNotification` type: already has `tool`, `toolArgs`, `timeoutSeconds` fields

## Test plan

Manual (Agent UI):
- `gaia chat --ui` → ask "run ls /tmp" → confirm popup appears with command shown

CLI regression:
- `gaia chat` → same shell command prompt → no popup, executes immediately

Fixes #438