v1.8.0: Soft-vs-binary detection, platform bridges, governance studies

@rsavitt released this 26 May 01:43
· 81 commits to main since this release
d60b9be

489 commits since v1.7.0. Highlights:

Soft-vs-binary detection framework (swarm/detection/)

Turns the self-optimizing-agent vignette into a real experiment: every soft metric is paired with its thresholded binary twin, and each is scored as a classifier. Evaluation covers AUROC / AUPRC / partial-AUROC, time-to-detection at fixed FPR, market-level adverse selection, calibration, and paired significance testing. Also adds a 2D sensitivity-grid runner (run_detection_sensitivity_2d.py --preset heterogeneous), a heterogeneous "informative" regime that avoids the AUROC=1.0 generator ceiling, and a companion blog post.
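To make the soft-vs-binary comparison concrete, here is a minimal sketch of scoring a soft metric and its thresholded twin as classifiers by AUROC. It does not use the swarm/detection API: the function names `auroc` and `binarize`, the threshold of 0.5, and the toy scores and labels are all assumptions made for illustration.

```python
from typing import List, Sequence


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC as the Mann-Whitney statistic: the fraction of
    (positive, negative) pairs where the positive scores higher,
    with tied scores counted as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def binarize(scores: Sequence[float], threshold: float) -> List[float]:
    """The 'binary twin' of a soft metric: 1.0 at or above the threshold, else 0.0."""
    return [1.0 if s >= threshold else 0.0 for s in scores]


# Hypothetical toy data: a soft risk score per agent, and whether the
# agent was actually adversarial (1) or benign (0).
soft_scores = [0.10, 0.35, 0.40, 0.55, 0.60, 0.80, 0.90, 0.45]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

soft_auroc = auroc(soft_scores, labels)
binary_auroc = auroc(binarize(soft_scores, threshold=0.5), labels)

print(f"soft AUROC:   {soft_auroc:.4f}")
print(f"binary AUROC: {binary_auroc:.4f}")
```

On this toy data the soft score reaches an AUROC of 0.9375 and its binary twin 0.75: thresholding collapses the ranking inside each side of the cutoff, so the two borderline cases (0.45 and 0.55) cost more once the score is binarized.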

External-platform bridges

MiroShark (social-cascade sim + SoftMetrics judging), LangChain, AutoGPT, CrewAI, Mesa ABM, RAG/LEANN, Hyperspace DAG domain, LabOS Toolmaker→Critic.

Governance & misalignment studies

Adaptive governance controller, governance parameter/sensitivity sweeps, misalignment module + sweeps, Tierra artificial-life scenario + hardening, evolutionary game handler, capability–safety Pareto frontiers, causal-credit propagation, and the triangle (misalignment × causal credit × toxicity) study. Plus escalation-sandbox LLM studies (temperature, prompt framing, model size, cooperation window).

New agent types & mechanisms

ThresholdDancer adversary, behavioral agent types, hyperagent self-modification, dynamic toxicity feedback, artifact registry + cascade-risk governance, PerformanceTracker, net-social-welfare metric.

On-chain

SwarmGym safety auditor CLI + SafetyAttestation contract (Base) + web3 client.

Other

Orchestrator pipeline/middleware refactor (god-object → middleware pipeline + handler factory + scheduler); numerous case-study blog posts.

See CHANGELOG.md for the full itemized list.

Quick Start

python -m pip install -e ".[dev,runtime]"
python -m swarm run scenarios/baseline.yaml --seed 42 --epochs 10 --steps 10