v0.5.1 — Custom pass/fail thresholds

@maruyamakoju maruyamakoju released this 19 Feb 06:13

What's new in v0.5.1

⚙️ Custom pass/fail thresholds

Override the default PASS/FAIL thresholds for quadrant classification:

# Stricter standard: require 85% retention to be "deployment_ready"
deltatau-audit audit-sb3 --model m.zip --algo ppo --env CartPole-v1 \
    --deploy-threshold 0.85 \
    --stress-threshold 0.60

Available on all audit subcommands: audit-sb3, audit-cleanrl, audit-hf, audit.

Flag                 Default   Effect
--deploy-threshold   0.80      Minimum deployment return ratio for non-fragile quadrant
--stress-threshold   0.50      Minimum stress return ratio (stored in summary.json)

Both thresholds are saved in summary.json as deploy_threshold / stress_threshold for full audit traceability.
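To confirm which thresholds an audit was run with, the saved values can be read back from summary.json. The sketch below is a minimal illustration, not part of deltatau-audit: the helper name read_thresholds is hypothetical, and it assumes the deploy_threshold and stress_threshold keys sit at the top level of the JSON object (the notes above give the key names but not their nesting).

```python
import json
from pathlib import Path


def read_thresholds(summary_path):
    """Return (deploy_threshold, stress_threshold) recorded in a summary.json file.

    Hypothetical helper: assumes both keys are at the top level of the
    JSON object. Adjust the lookup if the real file nests them.
    """
    data = json.loads(Path(summary_path).read_text(encoding="utf-8"))
    return float(data["deploy_threshold"]), float(data["stress_threshold"])
```

Calling read_thresholds on the summary.json path written by an audit run returns the two values as a tuple, which can then be compared against the thresholds a team expects in CI.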

Python API:

result = run_full_audit(
    adapter, env_factory,
    deploy_threshold=0.85,
    stress_threshold=0.60,
)
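As an illustration of what raising the thresholds means, the sketch below compares a pair of return ratios against the two minimums. It is an assumption-laden sketch, not the library's actual classification code: the function name check_thresholds, the result keys, and the use of an inclusive >= comparison are all hypothetical. Only the default values 0.80 and 0.50 come from the table above.

```python
def check_thresholds(deploy_ratio, stress_ratio,
                     deploy_threshold=0.80, stress_threshold=0.50):
    """Compare return ratios against minimum thresholds.

    Hypothetical sketch: treats each threshold as an inclusive minimum.
    The real quadrant classification in deltatau-audit may differ.
    """
    return {
        "deploy_pass": deploy_ratio >= deploy_threshold,
        "stress_pass": stress_ratio >= stress_threshold,
    }


# The same ratios pass under the defaults but fail the stricter settings.
default_result = check_thresholds(0.82, 0.55)
strict_result = check_thresholds(0.82, 0.55,
                                 deploy_threshold=0.85, stress_threshold=0.60)
```

Under these assumptions, a model retaining 82% of its return at deployment passes the default 0.80 minimum but not a stricter 0.85 one, which is the kind of difference the new flags are meant to expose.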

🧪 Tests

9 new tests — 235 total.


See CHANGELOG.md for full history.

Full Changelog: v0.5.0...v0.5.1