v1.7.0: Contract screening, viz game, llama.cpp, 164 commits
Highlights
- Contract screening system for separating equilibrium analysis with multi-seed sweep, collusion/sybil detection, and red-team blog posts (#234)
- Interactive isometric visualization game — browser-based SWARM simulation with Gemini Imagen 4 sprites, compare mode, sweep, leaderboard, and governance intervention controls (#182, #212)
- llama.cpp local inference provider with server setup, health checks, and SSRF hardening (#232)
- LangGraph governed handoff study — 4-agent Claude swarm, 32-config sweep
- Memori semantic memory middleware for LLM agents (#217)
- Loop detector governance lever with graduated enforcement (#198)
- Agent API Phase 1–3: scoped permissions, trace IDs, approval workflows
- SQLite persistence for simulations, governance, and scenarios (lazy-init to fix CI xdist contention)
- SciAgentGym bridge restored — tool substrate integration for scientific workflow agents (9 modules, 44 tests)
- Multiple SSRF/security fixes (#223, #225, #230, #236, #238, #239, #242)
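
As a rough illustration of the kind of check SSRF hardening for a local inference provider usually involves, here is a minimal sketch. It is not the project's actual code: the `is_safe_base_url` function and the `ALLOWED_HOSTS` allowlist are hypothetical names invented for this example, and it assumes the server is expected to live on the loopback interface.

```python
from urllib.parse import urlparse

# Hypothetical allowlist: hosts an operator has explicitly approved.
ALLOWED_HOSTS = frozenset({"localhost", "127.0.0.1"})


def is_safe_base_url(url: str, allowed_hosts: frozenset = ALLOWED_HOSTS) -> bool:
    """Return True only for http(s) URLs whose host is on the allowlist."""
    parsed = urlparse(url)
    # Reject non-HTTP schemes such as file:// or gopher://.
    if parsed.scheme not in ("http", "https"):
        return False
    # Reject URLs that embed credentials (user:pass@host).
    if parsed.username or parsed.password:
        return False
    # urlparse lowercases the hostname; None means no host was given.
    host = parsed.hostname
    return host is not None and host in allowed_hosts
```

An allowlist is used here rather than a blocklist of private ranges because it fails closed: a cloud metadata address such as `169.254.169.254` is rejected without being named explicitly.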
Added
- SciAgentGym bridge restored — tool substrate integration for scientific workflow agents with environment management, workspace isolation, toolkit, governance hooks, and provider abstraction (reverts removal from #209)
- Contract screening system for separating equilibrium analysis with lock-in semantics, welfare metric, multi-seed sweep (10 seeds), collusion detection, sybil detection, and plot script (#234)
- LangGraph governed handoff study with 4-agent Claude swarm, 32-config sweep (seed 42), and sweep overview plot
- Hodoscope trajectory analysis bridge for agent trace inspection
- SQLite persistence for simulations, governance state, and scenarios with lazy-init singletons
- SoftMetrics wired into Web API `/api/v1/metrics` endpoint
- llama.cpp local inference provider with server setup script, health checks, seed validation, and SSRF/path-traversal hardening (#232)
- Interactive isometric visualization game (`viz/`): Next.js browser-based SWARM simulation with client-side engine, Gemini Imagen 4 sprite assets, compare mode, parameter sweep, leaderboard, governance intervention controls, preset scenarios, narrative annotations, and data export (#182, #212)
- Memori semantic memory middleware for LLM agents with persistent fact recall, SQLite-backed storage, and OpenRouter scenario variant (#217)
- Loop detector governance lever with graduated enforcement (#198)
- Agent API Phase 1–3: scoped permissions, trace IDs, structured errors, PATCH endpoints, filtering, validation, agent approval workflow
- SciAgentBench harness with topology matrix support (#200)
- Evaluation metrics suite for success rate, efficiency, and detection (#201)
- SciForge-style trace-to-task synthesis with replay verification (#203)
- Parameter validation and clamping diagnostics for proxy computation (#176)
- MetricsAggregator wired into CLI and example export (#212)
- Reproducibility documentation with one-command run workflow (#204)
- Integration tests for runtime environment lifecycle with leak detection (#197)
- EPIC tracking infrastructure for bridge integrations (#194)
- Collaborative chemistry under budget and audits scenario (#202)
- E2E integration tests for Web API simulation lifecycle
- Blog posts: Qwen3-30B SWARM Economy v0.2, contract screening separating equilibrium, multi-seed results, red-team findings
- Slash commands: `/build_game`, `/obsidian`, `/sync_artifacts`, `/security-review`, `/audit_docs`, `/check_nav`, `/bump_version`
- Streamlit Cloud deployment and HF Spaces sandbox link
- Social preview image (1280x640)
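
To make "graduated enforcement" in the loop detector entry above concrete, here is a minimal sketch of one way such a lever could escalate. The `LoopDetector` class, its thresholds, and the `allow`/`warn`/`throttle`/`block` levels are all assumptions made for illustration; the release notes do not describe the real implementation in #198.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class LoopDetector:
    """Hypothetical lever: escalates as an agent repeats the same action."""

    warn_at: int = 3      # assumed thresholds, not taken from the project
    throttle_at: int = 5
    block_at: int = 8
    _counts: Counter = field(default_factory=Counter)

    def observe(self, agent_id: str, action: str) -> str:
        """Record one action and return the enforcement level for it."""
        key = (agent_id, action)
        self._counts[key] += 1
        n = self._counts[key]  # total repeats so far, not just consecutive ones
        if n >= self.block_at:
            return "block"
        if n >= self.throttle_at:
            return "throttle"
        if n >= self.warn_at:
            return "warn"
        return "allow"
```

With the default thresholds, eight identical actions from one agent yield `allow` twice, `warn` twice, `throttle` three times, and then `block`.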
Changed
- README audit: Updated all counts to match codebase (4603 tests, 78 scenarios, 29 agent modules, 27 governance modules, 95 bridge files)
- LLM provider list expanded to all 9 supported providers
- Consolidated slash commands: merged related commands into `/ship`, `/merge_session`, `/sync`, `/fix_pr`, `/analyze_experiment`
- Moved pytest from pre-commit to pre-push hook (#177)
- Removed `abs()` from `ProxyWeights.normalize()` (#178)
- Updated crewai to `>=0.80.0,<2.0` (#221), bumped action-download-artifact to 15 (#220)
- Pinned langgraph and langchain-core to exact versions
Fixed
- SQLite lock contention in CI: lazy-init store singletons to prevent `database is locked` under pytest-xdist
- SSRF hardening: 4 separate fixes (#223, #225, #230, #236, #238, #242)
- Information exposure in AWM adapter (#239)
- 7 security vulnerabilities in contract screening
- mypy `method-assign` error in simulations router
- SkillRL refinement governance bypass (#214)
- 77 Ruff linting errors (#218), mypy errors across multiple modules
- Flaky test stabilized with deterministic RNG seeds
- Static asset paths for viz game deployment
- 8 missing blog posts in mkdocs nav
Full Changelog: v1.6.0...v1.7.0
166 commits