Issue code reference
Iman edited this page Jun 21, 2026
Every issue tsauditor raises has a short code. Use these codes to filter the report programmatically or to look up what a specific finding means.

    # Filter to a specific code
    report.filter(code="LEK001")

PRF codes:

| Code | Severity | Trigger | What to do |
|---|---|---|---|
| PRF001 | WARNING | A gap between consecutive timestamps exceeds the domain threshold (5 calendar days for finance, 3× median gap otherwise) | Resample to a regular frequency, or document why irregular timestamps are expected |
| PRF002 | WARNING | 3+ consecutive NaN values in a column (sensor) or 5+ (finance) | Handle the affected span via interpolation, forward-fill with a limit, or explicit dropping — check whether the cluster corresponds to a known outage |
| PRF003 | INFO | ADF test p-value > 0.05 — column is non-stationary | Consider differencing or log-transforming before modeling; non-stationarity is expected for price series but can bias many ML methods |
| PRF004 | CRITICAL | Duplicate timestamps in the DataFrame index | Remove or aggregate duplicates before any further processing — duplicate timestamps silently corrupt rolling, lag, and resampling operations |
| PRF005 | WARNING | A run of 2+ consecutive large timestamp gaps | Review the cluster — likely a data feed outage or collection failure requiring explicit handling |
| PRF006 | WARNING | A column has more than 30% missing values overall | Consider dropping the column or imputing carefully; check whether the missingness is informative (i.e. whether absence itself carries signal) |
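The triggers for PRF001, PRF002, PRF004, and PRF006 can be reproduced by hand with plain pandas when you want to confirm a finding. The following is a minimal sketch, not tsauditor's implementation: the helper names `large_gaps` and `longest_nan_run` are hypothetical, and the gap rule assumes the non-finance threshold of 3× the median gap from the table above.

```python
import numpy as np
import pandas as pd

def large_gaps(index: pd.DatetimeIndex, factor: float = 3.0) -> pd.Series:
    """Gaps between consecutive timestamps larger than factor x the median gap."""
    gaps = index.to_series().diff().dropna()
    return gaps[gaps > factor * gaps.median()]

def longest_nan_run(col: pd.Series) -> int:
    """Length of the longest run of consecutive NaN values."""
    is_na = col.isna()
    run_id = (~is_na).cumsum()  # stays constant while a NaN run continues
    return int(is_na.groupby(run_id).sum().max())

idx = pd.DatetimeIndex([
    "2024-01-01", "2024-01-02", "2024-01-03",
    "2024-01-03", "2024-01-04", "2024-01-10",
])

# PRF004: duplicate timestamps in the index
dupes = idx[idx.duplicated()]

# PRF001: drop duplicates first, then look for unusually large gaps
flagged = large_gaps(idx[~idx.duplicated()])

# PRF002 / PRF006: NaN clusters and overall missing fraction
col = pd.Series([1.0, np.nan, np.nan, np.nan, 2.0, np.nan, 3.0])
nan_run = longest_nan_run(col)
missing_fraction = col.isna().mean()

print(dupes, flagged, nan_run, round(missing_fraction, 2), sep="\n")
```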

ANO codes:

| Code | Severity | Trigger | What to do |
|---|---|---|---|
| ANO001 | WARNING | The same value repeats consecutively beyond the stuck-value window (5 for finance, 3 for sensor) | Investigate whether this is a sensor failure, a forward-fill artefact, or genuinely flat data; the evidence dict includes max_stuck_duration |
| ANO002 | WARNING | A value exceeds the z-score threshold (5.0 for finance, 3.5 for sensor, 4.0 default) or falls outside 1.5×IQR | Decide whether to winsorize, transform, or treat as a data error; finance domain uses a wider threshold to avoid flagging legitimate fat-tail events |
| ANO003 | WARNING | A value deviates sharply from its immediate neighbors (rolling z-score > spike threshold) | Examine the flagged timestamps — these are contextually extreme even if globally plausible, often indicating data-entry or data-feed errors |
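To see why a value was flagged under ANO001 or ANO002, it can help to recompute the run length and the outlier bounds yourself. This is a rough sketch in plain pandas, not tsauditor's code: the function names are hypothetical, the z-score uses the default threshold of 4.0 from the table, and the sample readings are made up for illustration.

```python
import pandas as pd

def longest_repeat_run(col: pd.Series) -> int:
    """Length of the longest run of identical consecutive values."""
    run_id = col.ne(col.shift()).cumsum()  # increments whenever the value changes
    return int(run_id.value_counts().max())

def zscore_outliers(col: pd.Series, threshold: float = 4.0) -> pd.Series:
    """Values whose absolute z-score exceeds the threshold."""
    z = (col - col.mean()) / col.std()
    return col[z.abs() > threshold]

def iqr_outliers(col: pd.Series, k: float = 1.5) -> pd.Series:
    """Values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = col.quantile([0.25, 0.75])
    iqr = q3 - q1
    return col[(col < q1 - k * iqr) | (col > q3 + k * iqr)]

# Illustrative data: a flat stretch, and a steady signal with one extreme reading
stuck = pd.Series([1.0, 2.0, 2.0, 2.0, 2.0, 3.0])
readings = pd.Series([9.0, 10.0, 11.0] * 10 + [100.0])

print(longest_repeat_run(stuck))
print(zscore_outliers(readings))
print(iqr_outliers(readings))
```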

LEK codes:

| Code | Severity | Trigger | What to do |
|---|---|---|---|
| LEK001 | CRITICAL | Feature–target AUC >= 0.80 (binary target) or Spearman \|ρ\| … | … |
| LEK002 | WARNING | Feature's peak cross-correlation with the target falls at a positive lag | Inspect how the feature is constructed — a positive-lag peak means the feature aligns most strongly with the future target, suggesting it encodes information not yet available at prediction time |
| LEK003 | WARNING | Feature correlates with the future target beyond what the target's own autocorrelation explains | Verify the feature uses only past data; this pattern is the signature of a forward-looking or centered rolling window |
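The positive-lag pattern behind LEK002 can be illustrated with a small lagged cross-correlation. This is a sketch under stated assumptions, not the check tsauditor runs: `peak_lag` is a hypothetical helper, the target is synthetic white noise, and the sign convention (lag `k` compares the feature at time `t` with the target at time `t + k`) is assumed from the table's wording.

```python
import numpy as np
import pandas as pd

def peak_lag(feature: pd.Series, target: pd.Series, max_lag: int = 5):
    """Return (lag, correlation) where |corr(feature[t], target[t + lag])| is largest."""
    corrs = {
        lag: feature.corr(target.shift(-lag))  # shift(-lag) puts target[t + lag] at row t
        for lag in range(-max_lag, max_lag + 1)
    }
    best = max(corrs, key=lambda lag: abs(corrs[lag]))
    return best, corrs[best]

rng = np.random.default_rng(0)
target = pd.Series(rng.normal(size=500))

# A leaky feature: at time t it already contains the target from t + 2
leaky = target.shift(-2)
# A legitimate feature: at time t it only knows the target from t - 1
past_only = target.shift(1)

leaky_lag, leaky_corr = peak_lag(leaky, target)
past_lag, past_corr = peak_lag(past_only, target)

print(leaky_lag, past_lag)
```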

Severity levels:

| Severity | Meaning |
|---|---|
| CRITICAL | Must be resolved before modeling — the data quality issue will directly corrupt model training or evaluation |
| WARNING | Worth reviewing — may or may not require action depending on domain context |
| INFO | Informational — no immediate action required, but worth knowing before proceeding |
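One way to act on these levels is to rank findings by severity and block a pipeline run when any CRITICAL finding is present. The sketch below uses plain dicts as stand-ins for findings; the dict layout, the `Severity` enum, and the `blocking` helper are all hypothetical and do not describe tsauditor's report objects.

```python
from enum import IntEnum

class Severity(IntEnum):
    """Ordering assumed from the table: INFO < WARNING < CRITICAL."""
    INFO = 0
    WARNING = 1
    CRITICAL = 2

# Hypothetical findings; the codes and severities are taken from the tables above
findings = [
    {"code": "PRF003", "severity": "INFO"},
    {"code": "PRF004", "severity": "CRITICAL"},
    {"code": "ANO002", "severity": "WARNING"},
]

def blocking(items):
    """Codes of findings that must be resolved before modeling."""
    return [f["code"] for f in items if Severity[f["severity"]] is Severity.CRITICAL]

# Most severe findings first
ranked = sorted(findings, key=lambda f: Severity[f["severity"]], reverse=True)

print([f["code"] for f in ranked])
print(blocking(findings))
```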
If you're contributing a new check, add a corresponding entry to tsauditor/report/remediation.py following the existing pattern. See Contributing for details.