v0.5.1 — Custom pass/fail thresholds
What's new in v0.5.1
⚙️ Custom pass/fail thresholds
Override the default PASS/FAIL thresholds for quadrant classification:
```bash
# Stricter standard: require 85% retention to be "deployment_ready"
deltatau-audit audit-sb3 --model m.zip --algo ppo --env CartPole-v1 \
  --deploy-threshold 0.85 \
  --stress-threshold 0.60
```

Available on all audit subcommands: audit-sb3, audit-cleanrl, audit-hf, audit.
| Flag | Default | Effect |
|---|---|---|
| --deploy-threshold | 0.80 | Minimum deployment return ratio for non-fragile quadrant |
| --stress-threshold | 0.50 | Minimum stress return ratio (stored in summary.json) |
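To make the two minimums concrete, here is a minimal sketch of how a return ratio compares against each threshold. It is not deltatau-audit's implementation: the helper `check_thresholds`, the inclusive `>=` comparison, and the example ratios are assumptions for illustration. Only the flag names and default values come from these notes.

```python
from dataclasses import dataclass

# Defaults taken from the flag table; everything else here is hypothetical.
DEFAULT_DEPLOY_THRESHOLD = 0.80
DEFAULT_STRESS_THRESHOLD = 0.50


@dataclass(frozen=True)
class ThresholdCheck:
    deploy_ok: bool
    stress_ok: bool


def check_thresholds(
    deploy_ratio: float,
    stress_ratio: float,
    deploy_threshold: float = DEFAULT_DEPLOY_THRESHOLD,
    stress_threshold: float = DEFAULT_STRESS_THRESHOLD,
) -> ThresholdCheck:
    """Compare return ratios against minimum thresholds (assumed inclusive)."""
    return ThresholdCheck(
        deploy_ok=deploy_ratio >= deploy_threshold,
        stress_ok=stress_ratio >= stress_threshold,
    )


# Made-up ratios: 82% deployment retention, 55% stress retention.
default_check = check_thresholds(0.82, 0.55)
strict_check = check_thresholds(
    0.82, 0.55, deploy_threshold=0.85, stress_threshold=0.60
)
```

With these made-up ratios, both checks clear the defaults, while the stricter 0.85 / 0.60 pair rejects both. How the tool maps such comparisons onto its quadrants is not covered by this sketch.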
Both thresholds are saved in summary.json as deploy_threshold / stress_threshold for full audit traceability.
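For traceability checks in CI, the recorded thresholds can be read back with the standard library. This is a sketch under assumptions: it treats `deploy_threshold` and `stress_threshold` as top-level keys of summary.json, and the helper `read_thresholds` and the stand-in file are hypothetical. A real summary.json contains additional fields not shown here.

```python
import json
import tempfile
from pathlib import Path


def read_thresholds(summary_path):
    """Return (deploy_threshold, stress_threshold) recorded in a summary.json.

    Assumes both keys sit at the top level of the JSON object.
    """
    summary = json.loads(Path(summary_path).read_text(encoding="utf-8"))
    return summary["deploy_threshold"], summary["stress_threshold"]


# Stand-in file for demonstration only; point this at a real audit output instead.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "summary.json"
    path.write_text(
        json.dumps({"deploy_threshold": 0.85, "stress_threshold": 0.60}),
        encoding="utf-8",
    )
    thresholds = read_thresholds(path)
```

Comparing the returned pair against the values a pipeline expects catches audits that were run with the wrong thresholds.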
Python API:
```python
result = run_full_audit(
    adapter, env_factory,
    deploy_threshold=0.85,
    stress_threshold=0.60,
)
```

🧪 Tests
9 new tests — 235 total.
See CHANGELOG.md for full history.
Full Changelog: v0.5.0...v0.5.1