Releases: aaravanmay/faultline
Releases · aaravanmay/faultline
v0.4.2 — zero-config scan, 5 framework adapters, hardening
Zero-config testing, framework integrations, and a hardening pass. Everything additive.
Highlights
faultline scan my_agent.py:agent— zero-config: discovers your wrapped tools, breaks them, gates CI. Plusfaultline doctor(preflight) andfaultline init(scaffold a suite + CI workflow).- One-line framework integration:
fl.instrument(...)for LangGraph, LangChain, LlamaIndex, pydantic-ai, crewAI — each verified against the real installed library. fl.tools_really_called([...])— catches an agent answering from a tool it never actually called.replay(transform=...)— catches silent behavioral drift after a context change.fl.assert_resilient(agent, task)— drop into any pytest/unittest suite.scan --explain— shows exactly what was corrupted and why each FAIL.- Hardening from 3 adversarial passes (intermittent silent failures now gate;
$6,000-style figures now detected;Decimalmoney values now corruptible).
Full notes: CHANGELOG.md · honest scope: CAPABILITIES.md
pip install -U faultline
faultline v0.4.1
Runtime guard (shadow/enforce seatbelt) + tamper-evident attestation report (attest / verify). 11 test suites green. Benchmark: 97.5% recall, 2.2% false-alarm, 100% deterministic (see benchmark/MEASUREMENT.md). pip install faultline
faultline v0.4.0
Deterministic silent-failure testing for AI agents — no LLM judge.
Highlights
- Token-only hosted push (
FAULTLINE_TOKENis all you need) - Loud, false-green-proof results (
.ok/.failed/.assert_ok()) - Six CLI modes that all gate CI: run, probe, fuzz, scenarios, replay, mine
- Marketplace-grade GitHub Action (mode + fail-on-silent inputs, verdict outputs)
- Detector upgrades: display-arg FP fix, derived-value FN layer, pandas + dict support
Measured (85-case adversarial benchmark, independently audited, reproducible — see benchmark/MEASUREMENT.md): 97.5% recall · 2.2% false-alarm · 100% deterministic.
pip install faultline