Release v0.5.3: Adaptive episode sampling · maruyamakoju/deltatau-audit

What's new in v0.5.3

Adaptive episode sampling: CI-width-guaranteed convergence

Previously, `n_episodes` was a fixed number. With `--adaptive`, the audit runs in batches and keeps collecting episodes until every scenario's 95% bootstrap CI on the return ratio reaches the target width, or until the `--max-episodes` cap is hit. This gives you a controlled CI width without over- or under-sampling.

Usage:
```bash
deltatau-audit audit-sb3 --model m.zip --algo ppo --env CartPole-v1 \
    --adaptive --target-ci-width 0.05 --max-episodes 300
```

New flags (on `audit`, `audit-sb3`, `audit-cleanrl`, `audit-hf`):

| Flag | Default | Description |
|------|---------|-------------|
| `--adaptive` | off | Enable adaptive sampling |
| `--target-ci-width` | 0.10 | Target 95% CI width on return ratio |
| `--max-episodes` | 500 | Hard cap on episodes per scenario |

Python API:
```python
result = run_full_audit(
    adapter, env_factory,
    adaptive=True,
    target_ci_width=0.05,
    max_episodes=300,
)
n_used = result["robustness"]["n_episodes_used"]  # per-scenario count
```

When `adaptive=True`, `result["robustness"]["n_episodes_used"]` contains the actual number of episodes used per scenario. The non-adaptive default path is unchanged.
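
For intuition, the batching loop described above can be sketched in plain Python. This is a minimal illustration for a single scenario, not the code shipped in deltatau-audit: the helper names (`bootstrap_ci_width`, `adaptive_sample`, `toy_episode`), the batch size of 20, the percentile bootstrap, and the definition of the return ratio as mean perturbed return over mean nominal return are all assumptions made for the example.

```python
import random
import statistics


def bootstrap_ci_width(nominal, perturbed, n_boot=1000, alpha=0.05, rng=None):
    """Width of a percentile-bootstrap CI on mean(perturbed) / mean(nominal).

    Assumes the mean nominal return is non-zero.
    """
    rng = rng or random.Random(0)
    ratios = []
    for _ in range(n_boot):
        # Resample each set of episode returns with replacement.
        nom = rng.choices(nominal, k=len(nominal))
        per = rng.choices(perturbed, k=len(perturbed))
        ratios.append(statistics.fmean(per) / statistics.fmean(nom))
    ratios.sort()
    lo = ratios[int((alpha / 2) * n_boot)]
    hi = ratios[int((1 - alpha / 2) * n_boot) - 1]
    return hi - lo


def adaptive_sample(run_episode, target_ci_width=0.10, max_episodes=500,
                    batch_size=20, seed=0):
    """Collect episodes in batches until the CI is narrow enough or the cap is hit."""
    rng = random.Random(seed)
    nominal, perturbed = [], []
    width = float("inf")
    while len(perturbed) < max_episodes:
        # Never exceed the hard cap, even on the last batch.
        n = min(batch_size, max_episodes - len(perturbed))
        nominal.extend(run_episode("nominal", rng) for _ in range(n))
        perturbed.extend(run_episode("perturbed", rng) for _ in range(n))
        width = bootstrap_ci_width(nominal, perturbed, rng=rng)
        if width <= target_ci_width:
            break
    return {"n_episodes_used": len(perturbed), "ci_width": width}


def toy_episode(scenario, rng):
    # Hypothetical stand-in for a real rollout: noisy returns, lower when perturbed.
    mean = 100.0 if scenario == "nominal" else 80.0
    return rng.gauss(mean, 10.0)


result = adaptive_sample(toy_episode, target_ci_width=0.05, max_episodes=300)
print(result)
```

The loop stops for one of two reasons: the CI width has reached the target, or the episode cap has been hit. In the second case the reported width may still exceed the target, so it is worth checking both values.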

Flaky test fix

`test_run_full_audit_strict_threshold_changes_quadrant` now uses `seed=42` for deterministic results.

11 new tests (263 total)

```bash
pip install -U deltatau-audit
```

Full Changelog: v0.5.2...v0.5.3
