QA: add gemini-3.1-pro-preview BI results (claude/codex/pi Ubuntu PASS), update date to 2026-06-06
QA: pi/gemini-3.1-pro-preview Ubuntu NI PASS (14/14 vanilla NI complete)
QA: add gemini-3.1-pro-preview Vanilla NI results (13/14 PASS)
Add gemini-3.1-pro-preview (Google) sections to all test matrices
Add antigravity rows to QA Evidence matrix (all sections, not yet tested)
Update QA Evidence: date to 2026-06-04, tula→genty rename
BP/Resume: ALL 18/18 PASS! Tula 3/3 PASS — full matrix complete!
BP/Resume: hermes 3/3 PASS! (15/18 — only tula pending)
BP/Resume: gemini 3/3 PASS! (12/15 — hermes macOS+Windows pending)
BP/Resume: gemini PASS macOS! Build fix works. (11/15)
BP/Resume: hermes PASS Ubuntu! All 5 agents pass on Ubuntu now.
BP/Resume: gemini Ubuntu PASS, macOS install failures for gemini+hermes
Update BP/Resume: gemini PASS Ubuntu! Evidence ReferenceError was root cause.
Update BP/Resume: 9/15 PASS (codex+claude+pi 3/3; gemini+hermes 0/3)
Hermes Windows BP: SKIPPED — ConPTY >60 min, needs native stdin support. 17/18 PASS.
Hermes Windows: known ConPTY limitation (>45 min for BP tasks). 17/18 PASS.
Update: hermes Windows switched to predefined mode (create too slow for ConPTY)
Update BP/Resume: codex 3/3, claude 2/3, pi 1/3 (7/15 PASS)
Tula 3/3 PASS! (17/18 BP/Create — hermes Windows only remaining)
TULA PASS on Ubuntu! (17/18 BP/Create) — autonomous host loop works
Add tula to all Vanilla and BP matrix tables (alongside hermes, codex, pi, etc.)
Update: BP/Resume codex+claude PASS Ubuntu; BP/Create 16/18
Update: claude+gpt-5.5 PASS all 3 OS! (16/18 BP/Create)
Update: claude+gpt-5.5 PASS macOS (15/18 BP/Create)
Update: claude+gpt-5.5 PASS on Ubuntu (14/18 BP/Create); BP/Resume switched to gpt-5.5
Switch claude-code to gpt-5.5 in primary matrix (Anthropic credits depleted)
rename: omni → tula in wiki pages
Update: BP/Resume failing across board; hermes Windows CI token issue noted
Update: omni fails — agent-core+gpt-5.5 can't follow babysitter tool protocol
Update: omni still fails (agent-core can't drive SDK loop even with 5 stalls)