v0.5.2: Richer failure diagnostics
What's new in v0.5.2
Failure diagnostics: pattern → root cause → fix
Every audit now produces a structured failure analysis. When the agent fails or degrades under a timing perturbation, the report names the likely cause and a recommended fix instead of showing only a bare score.
5 named failure patterns:
| Pattern | Scenario | Fix summary |
|---|---|---|
| Speed Jitter Sensitivity | jitter | Retrain with JitterWrapper |
| Observation Recency Dependency | delay | Add ObservationDelayWrapper to training |
| Frequency Spike Fragility | spike | Train with PiecewiseSwitchWrapper |
| Observation Noise Sensitivity | obs_noise | Add ObsNoiseWrapper to training |
| Extreme Frequency Fragility | speed_5x | Wide speed randomization + Δτ module |
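
For scripts that post-process audit output, the table above can be mirrored as a plain lookup. This is a minimal sketch, not part of the deltatau-audit API: `FAILURE_PATTERNS` and `summarize_failures` are hypothetical names, and the strings are copied from the table.

```python
# Hypothetical lookup copied from the release-notes table.
# This is NOT an object exported by deltatau-audit.
FAILURE_PATTERNS = {
    "jitter": ("Speed Jitter Sensitivity", "Retrain with JitterWrapper"),
    "delay": ("Observation Recency Dependency", "Add ObservationDelayWrapper to training"),
    "spike": ("Frequency Spike Fragility", "Train with PiecewiseSwitchWrapper"),
    "obs_noise": ("Observation Noise Sensitivity", "Add ObsNoiseWrapper to training"),
    "speed_5x": ("Extreme Frequency Fragility", "Wide speed randomization + Δτ module"),
}


def summarize_failures(failing_scenarios):
    """Return (scenario, pattern, fix) tuples in input order.

    Scenarios that are not in the table get placeholder strings
    rather than raising, so unexpected names are still reported.
    """
    rows = []
    for scenario in failing_scenarios:
        pattern, fix = FAILURE_PATTERNS.get(
            scenario, ("Unknown pattern", "No fix summary available")
        )
        rows.append((scenario, pattern, fix))
    return rows


if __name__ == "__main__":
    for scenario, pattern, fix in summarize_failures(["jitter", "speed_5x"]):
        print(f"{scenario}: {pattern} -> {fix}")
```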
Terminal output (_print_summary) now shows:
Failure Analysis (1 FAIL scenario: jitter)
──────────────────────────────────────────────────────
Pattern: Speed Jitter Sensitivity [FAIL]
Cause: The agent is sensitive to step-by-step speed fluctuations...
Fix: Retrain with speed jitter: randomize the environment speed...
--format markdown outputs a blockquote section with pattern, cause, and fix.
HTML reports include a styled amber Failure Analysis card after the Prescription section.
run_full_audit() result gains a diagnosis key:
result["diagnosis"]["primary_pattern"] # "Speed Jitter Sensitivity"
result["diagnosis"]["root_cause"] # explanation string
result["diagnosis"]["fix_recommendation"] # fix string
result["diagnosis"]["failing_scenarios"] # ["jitter"]
result["diagnosis"]["issues"] # full list, sorted by severity
17 new tests (252 total)
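
As a rough illustration of consuming the `diagnosis` keys listed above, the sketch below formats them into a short text summary, for example for a CI log. It uses a hand-written stand-in dict rather than calling `run_full_audit()`, whose arguments are not described here. `format_diagnosis` and `example_result` are hypothetical names, the sample values are illustrative, and the structure of individual `issues` entries is not assumed. Whether `diagnosis` is present when nothing fails is also an assumption, so the code checks defensively.

```python
# Stand-in for the dict returned by run_full_audit(); values are illustrative only.
example_result = {
    "diagnosis": {
        "primary_pattern": "Speed Jitter Sensitivity",
        "root_cause": "The agent is sensitive to step-by-step speed fluctuations...",
        "fix_recommendation": "Retrain with speed jitter: randomize the environment speed...",
        "failing_scenarios": ["jitter"],
        "issues": [],  # entry structure is not documented here, so left empty
    }
}


def format_diagnosis(result):
    """Build a plain-text summary from the documented diagnosis keys."""
    diagnosis = result.get("diagnosis")
    # Assumption: the key may be missing or empty when no scenario fails.
    if not diagnosis or not diagnosis.get("failing_scenarios"):
        return "No failing scenarios reported."
    scenarios = ", ".join(diagnosis["failing_scenarios"])
    return "\n".join(
        [
            f"Failing scenarios: {scenarios}",
            f"Pattern: {diagnosis['primary_pattern']}",
            f"Cause: {diagnosis['root_cause']}",
            f"Fix: {diagnosis['fix_recommendation']}",
        ]
    )


if __name__ == "__main__":
    print(format_diagnosis(example_result))
```

In a real pipeline, the dict returned by `run_full_audit()` would take the place of `example_result`.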
pip install -U deltatau-audit
Full Changelog: v0.5.1...v0.5.2