C++ framework: docs, CI integration tests, health agent improvements #425
kovtcharov merged 21 commits into main
…roadmap
- Split cpp.mdx into landing page, quickstart.mdx, and expanded overview.mdx
- Add Error Handling, Thread Safety, Security Model, Production Deployment, and API Quick Reference sections to overview.mdx
- Add Step 7 (Embedding in Your Application) to custom-agent.mdx with headless, background thread, and custom OutputHandler patterns
- Hyperlink all Lemonade references to https://lemonade-server.ai
- Add C++ Framework Production Readiness to Q2 2026 roadmap

wifi-agent.mdx:
- Update runShell() snippet to match source (PipeCloser struct)
- Add isSafeShellArg() security validation to ping_host example
- Add "Input Validation" subsection documenting shell injection prevention
- Fix system prompt excerpt to use real section headers (AVAILABLE DIAGNOSTIC SEQUENCE, FIXING ISSUES, FINAL ANSWER)
- Replace non-existent mapMenuSelection() with actual inline menu logic

overview.mdx:
- Mark streaming config field as planned/not yet implemented

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…tion guide
- cpp.mdx: Replace inline FetchContent snippet with link to integration guide; add LLM Backend section noting Lemonade is tested/recommended
- overview.mdx: Update baseUrl description to recommend Lemonade and note other servers are untested

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Replace getenv with _dupenv_s in types.h and test_integration_llm.cpp to fix C4996 warnings-as-errors on MSVC. Split multi-command code blocks in quickstart, wifi-agent, and integration docs so each command can be copied individually. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Create unified tests_integration binary with interactive menu + CLI flags (--llm, --mcp, --wifi, --health, --all, --model, --url)
- Add MCP integration tests: connection, tool discovery, reconnect, prompt rebuild
- Add WiFi agent tests: real PowerShell diagnostics (netsh, ipconfig, DNS, connectivity)
- Add Health agent tests: full LLM + MCP + PowerShell stack (memory, CPU, disk)
- Rename test binaries: tests_mock (158 mock) + tests_integration (17 integration)
- Enable GAIA_BUILD_INTEGRATION_TESTS=ON by default
- Increase integration test timeout to 300s for agent tests

…n STX
- Disable integration tests on cloud runners (no Lemonade available)
- Use Qwen3-4B-Instruct-2507-GGUF model (matches wifi/health agent defaults)
- Add uvx verification step for MCP/Health tests
- Increase STX timeout to 20 minutes for broader test coverage
- Update job names and summary labels to reflect full scope

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

New files referenced by CMakeLists.txt but not previously committed:
- clean_console.h/.cpp: ANSI color TUI with shared gaia::color namespace
- health_agent.cpp: Windows system health agent using MCP
- test_clean_console.cpp: CleanConsole unit tests
- test_tool_integration.cpp: Tool registry integration tests
- Remove simple_agent.cpp (replaced by health_agent)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The loop detector only compared tool names, causing false positives when the same MCP tool (e.g. mcp_windows_Shell) was called with different arguments. Now requires both name AND args to match 3 consecutive times before triggering the loop break. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
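The name-and-arguments comparison described in this commit can be sketched roughly as follows. This is an illustration only, not the code in agent.cpp: the `ToolCall` alias, the `isLooping` helper, and the `kLoopThreshold` constant are hypothetical names, and the sketch assumes the history stores (tool name, serialized arguments) pairs.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical type for illustration: (tool name, serialized arguments).
using ToolCall = std::pair<std::string, std::string>;

// Number of identical consecutive calls that counts as a loop (3 per the commit message).
constexpr std::size_t kLoopThreshold = 3;

// Returns true only when the last kLoopThreshold calls share BOTH name and arguments.
bool isLooping(const std::vector<ToolCall>& history) {
    if (history.size() < kLoopThreshold) {
        return false;  // not enough calls yet to form a loop
    }
    const ToolCall& last = history.back();
    for (std::size_t i = history.size() - kLoopThreshold; i < history.size(); ++i) {
        if (history[i].first != last.first || history[i].second != last.second) {
            return false;  // a different tool or different args breaks the streak
        }
    }
    return true;
}
```

With this shape, three `mcp_windows_Shell` calls carrying different arguments do not trip the detector, while three byte-identical calls do.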
- test_main.cpp: Use _WIN32 guard instead of _MSC_VER for _putenv_s (MinGW on STX defines _WIN32 but not _MSC_VER)
- health_agent.cpp: Replace clipboard+paste Notepad approach with direct file write + open. Use array-of-lines pattern to avoid literal \n chars.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

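The `_WIN32` versus `_MSC_VER` distinction for `_putenv_s` can be illustrated with a small wrapper. This is a sketch under assumptions: `setEnvVar` is a hypothetical helper name, not a function from test_main.cpp, and the non-Windows branch assumes a POSIX `setenv`.

```cpp
#include <cstdlib>

// Hypothetical helper: sets an environment variable for the current process.
// _WIN32 is defined by both MSVC and MinGW, so this guard covers both toolchains.
// Guarding on _MSC_VER instead would send MinGW down the POSIX branch.
bool setEnvVar(const char* name, const char* value) {
#ifdef _WIN32
    return _putenv_s(name, value) == 0;  // returns 0 on success
#else
    return setenv(name, value, 1) == 0;  // 1 = overwrite an existing value
#endif
}
```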
- Add _MSC_VER guards to wifi/mcp/health integration tests for _dupenv_s (MinGW uses getenv instead)
- Increase health agent contextSize to 32768 for "Run ALL diagnostics"
- Tighten tool result truncation (4K chars) to prevent context overflow

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

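A guarded environment lookup along the lines of the first bullet might look like the sketch below. The helper name `getEnvOr` and its fallback parameter are hypothetical and do not come from the test sources; the sketch only shows how an `_MSC_VER` guard can select `_dupenv_s` on MSVC and plain `getenv` elsewhere.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical helper: returns the variable's value, or `fallback` when it is unset.
std::string getEnvOr(const char* name, const std::string& fallback) {
#ifdef _MSC_VER
    // MSVC reports C4996 for getenv; _dupenv_s allocates a copy that must be freed.
    char* buffer = nullptr;
    std::size_t length = 0;
    if (_dupenv_s(&buffer, &length, name) != 0 || buffer == nullptr) {
        return fallback;
    }
    std::string value(buffer);
    std::free(buffer);
    return value;
#else
    // MinGW, GCC, and Clang take this branch and use plain getenv.
    const char* value = std::getenv(name);
    return value != nullptr ? std::string(value) : fallback;
#endif
}
```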
- CtxSize 8192 -> 32768 for health agent multi-step tests
- Timeout 20 -> 30 minutes for model loading + 17 integration tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Integration tests run short queries (1-3 tool calls each), not the multi-step "Run ALL" flow. 32K context was slowing model loading. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…5 is full Notepad report
- Option 1: renamed to "Quick health check (console summary)" - gathers 4 core metrics (memory, disk, CPU, GPU) and gives a text summary in terminal only
- Option 15: unchanged "Run ALL diagnostics + generate report" - gathers all 12 metrics and writes a formatted plain-text report to Notepad
- Removed HTML report generation from system prompt
- Moved Notepad report pattern to comprehensive diagnostics protocol

…e diagram
- Rename simple_agent references to health_agent across all C++ guide pages
- Fix health_agent architecture diagram: replace clipboard/paste approach with file-write + Notepad open, add GPU check step
- Fix Get-WmiObject to Get-CimInstance (correct PowerShell API)
- Update project structure: add clean_console and test_tool_integration files
- Update test suite description: six modules to eight
- Update health_agent description: console summary or Notepad report

Health and WiFi integration tests had maxSteps=10, but each test only needs 1-2 tool calls. With 17 tests each creating a new agent (and MCP connection for health tests), the cumulative LLM inference time exceeded the 20-minute CI timeout on STX hardware. Also clean up C++ intro page (remove redundant LLM Backend and Integration sections already covered in subpages).
The STX integration tests were timing out because:
1. Each MCP test spawns uvx windows-mcp which downloads 117 Python packages (~30s per test, 8 MCP-dependent tests = ~4 min wasted)
2. Health/WiFi tests had maxSteps=10 but only need 1-2 tool calls

Fixes:
- Add pre-warm step in CI to install windows-mcp packages once before tests run, so subsequent uvx invocations use the cached environment
- Reduce maxSteps from 10 to 3 for health and WiFi integration tests

uvx creates a temporary venv and installs 117 packages on every invocation. Using uv tool install puts windows-mcp in a persistent location so uvx reuses it instead of reinstalling each time.
uv writes progress messages to stderr, which PowerShell treats as NativeCommandError and kills the step. Capture output and check $LASTEXITCODE instead.
PowerShell 5.1 treats any stderr output from native commands as a terminating NativeCommandError, even when captured with 2>&1. Use $ErrorActionPreference = SilentlyContinue and 2>$null instead.
@claude review this PR.
Claude finished @kovtcharov-amd's task

## PR Review: C++ Framework Updates

Overall Assessment: 🟢 Approved. This is a well-structured PR with comprehensive improvements to the C++ framework.

## Summary

This PR delivers significant enhancements across three areas:

## Code Review Findings

### 🟢 Well-Done Sections

1. Loop Detection Fix (…)

```cpp
if (toolCallHistory[i].first != toolName ||
    toolCallHistory[i].second != toolArgs) {
    allSame = false;
    break;
}
```

This prevents false positives when consecutive …

2. Tool Result Truncation (…)

3. Shell Injection Prevention (…)

```cpp
if (c == ';' || c == '|' || c == '&' || c == '`' || c == '$'
    || c == '(' || c == ')' || c == '{' || c == '}' || c == '<'
    || c == '>' || c == '"' || c == '\n' || c == '\r') {
    return false;
}
```

4. CleanConsole Implementation (…)

5. CI Workflow (…)

```powershell
$ErrorActionPreference = "SilentlyContinue"
uv tool install windows-mcp 2>$null
$installExit = $LASTEXITCODE
$ErrorActionPreference = "Stop"
```

6. MinGW Portability

### 🟡 Suggestions (Non-Blocking)

1. Context Size Hardcoded

```cpp
config.contextSize = 32768; // 32K needed for "Run ALL diagnostics"
```

Consider making this configurable or at least documenting why this value was chosen.

2. Test Integration Module Count

### 🟢 Test Coverage

The …

The integration tests in …

## Files Reviewed

## Conclusion

This PR is ready to merge. The code is well-structured, properly tested, and the documentation accurately reflects the implementation changes.

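The review quotes only the character test from the shell injection check. A complete validator built around that same character list could look like the sketch below. This is an assumption-laden reconstruction rather than the function in the wifi agent source: the real `isSafeShellArg` signature and any additional rules (length limits, empty-string handling) are not shown on this page.

```cpp
#include <string>

// Hypothetical reconstruction: rejects any argument containing one of the
// shell metacharacters listed in the review excerpt.
bool isSafeShellArg(const std::string& arg) {
    // Same characters as the quoted condition: ; | & ` $ ( ) { } < > " \n \r
    static const std::string kBlocked = ";|&`$(){}<>\"\n\r";
    return arg.find_first_of(kBlocked) == std::string::npos;
}
```

A caller such as a ping tool would check the host argument with this function before interpolating it into a command line, and refuse to run the command when it returns false.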
uv tool install creates a broken Python venv on the STX runner (pyvenv.cfg missing), causing all MCP and Health tests to fail. Revert to letting uvx manage its own temporary environments and uninstall any broken persistent installation from previous runs.
…425)

## Summary

### Documentation
- Rename `simple_agent` to `health_agent` across all C++ guide pages
- Fix architecture diagram: replace clipboard/paste approach with file-write + Notepad open, add GPU check
- Fix `Get-WmiObject` to `Get-CimInstance` (correct PowerShell API)
- Add CleanConsole and test_tool_integration to project structure listing
- Update test suite description: six modules to eight
- Fix doc code snippets to match actual source (`PipeCloser`, `isSafeShellArg`, `kDiagnosticMenu`)
- Add shell injection prevention section to wifi-agent guide
- Clarify Lemonade Server as recommended/tested LLM backend
- Split multi-command code blocks for easy copy-paste
- Clean up C++ intro page (remove redundant LLM Backend and Integration sections)

### CI/CD (build_cpp.yml)
- Add integration test suite on STX hardware: LLM + MCP + WiFi + Health tests
- Use `Qwen3-4B-Instruct-2507-GGUF` model (matches wifi and health agents)
- Add uvx verification step before test execution
- Fix MinGW portability: `_dupenv_s` guards, `_WIN32` vs `_MSC_VER`, `_putenv_s`

### Health Agent (health_agent.cpp)
- Simplify menu: option 1 is quick console-only summary (4 metrics, no Notepad), option 15 is full diagnostics + Notepad report
- Replace clipboard/paste with direct file-write + `Start-Process notepad` approach
- Use array-of-lines pattern with `[Environment]::NewLine` for proper newlines
- Increase context size to 32K for comprehensive diagnostics (12+ tool calls)
- Remove HTML report generation (unreliable with small LLMs)

### Agent Core (agent.cpp)
- Fix loop detection false positive: compare both tool name AND arguments (was name-only, triggering on consecutive `mcp_windows_Shell` calls with different args)
- Reduce tool result truncation from 20K to 4K chars to prevent context overflow

### New Files
- `cpp/include/gaia/clean_console.h` + `cpp/src/clean_console.cpp`: polished TUI with ANSI colors and word-wrap
- `cpp/examples/health_agent.cpp`: Windows system health agent using MCP
- `cpp/tests/test_clean_console.cpp`: CleanConsole unit tests
- `cpp/tests/test_tool_integration.cpp`: Tool registry integration tests
- `cpp/tests/integration/test_integration_mcp.cpp`: MCP connectivity tests
- `cpp/tests/integration/test_integration_wifi.cpp`: WiFi diagnostic tests
- `cpp/tests/integration/test_integration_health.cpp`: Health monitoring tests

## Test plan
- [ ] All 6 cloud CI jobs pass (ubuntu + windows: mock tests, install test, shared lib)
- [ ] STX integration tests build and run (LLM + MCP + WiFi + Health)
- [ ] `health_agent.exe` option 1: quick console summary (no Notepad)
- [ ] `health_agent.exe` option 15: full diagnostics + Notepad report
- [ ] `wifi_agent.exe` full network diagnostic works
- [ ] Mock tests pass locally: `gaia_tests.exe --gtest_color=yes`
- [ ] Documentation renders correctly on Mintlify

---------

Co-authored-by: Claude Code <claude-code@anthropic.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>