Issue code reference

Iman edited this page Jun 21, 2026 · 1 revision

Every issue tsauditor raises has a short code. Use these codes to filter the report programmatically or to look up what a specific finding means.

```python
# Filter to a specific code
report.filter(code="LEK001")
```

Profiler codes (PRF)

| Code | Severity | Trigger | What to do |
| --- | --- | --- | --- |
| PRF001 | WARNING | A gap between consecutive timestamps exceeds the domain threshold (5 calendar days for finance, 3× median gap otherwise) | Resample to a regular frequency, or document why irregular timestamps are expected |
| PRF002 | WARNING | 3+ consecutive NaN values in a column (sensor) or 5+ (finance) | Handle the affected span via interpolation, forward-fill with a limit, or explicit dropping; check whether the cluster corresponds to a known outage |
| PRF003 | INFO | ADF test p-value > 0.05 (the test fails to reject a unit root), so the column is treated as non-stationary | Consider differencing or log-transforming before modeling; non-stationarity is expected for price series but can bias many ML methods |
| PRF004 | CRITICAL | Duplicate timestamps in the DataFrame index | Remove or aggregate duplicates before any further processing; duplicate timestamps silently corrupt rolling, lag, and resampling operations |
| PRF005 | WARNING | A run of 2+ consecutive large timestamp gaps | Review the cluster; it likely reflects a data feed outage or collection failure that requires explicit handling |
| PRF006 | WARNING | A column has more than 30% missing values overall | Consider dropping the column or imputing carefully; check whether the missingness is informative (i.e. whether absence itself carries signal) |
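
To make the PRF001 and PRF004 triggers concrete, here is a minimal standard-library sketch of the two conditions as described in the table. It is not tsauditor's implementation: the function names, the `factor` parameter, and the sample series are all hypothetical, and only the non-finance rule (3× median gap) is shown.

```python
from collections import Counter
from datetime import datetime, timedelta
from statistics import median


def find_large_gaps(timestamps, factor=3.0):
    """Return (previous, next) pairs spaced more than factor x the median gap."""
    ordered = sorted(timestamps)
    pairs = list(zip(ordered, ordered[1:]))
    gaps = [(b - a).total_seconds() for a, b in pairs]
    if not gaps:
        return []
    threshold = factor * median(gaps)
    return [pair for pair, gap in zip(pairs, gaps) if gap > threshold]


def find_duplicate_timestamps(timestamps):
    """Return each timestamp that appears more than once, in sorted order."""
    counts = Counter(timestamps)
    return sorted(t for t, n in counts.items() if n > 1)


# Hypothetical hourly series with one 7-hour hole and one repeated timestamp.
start = datetime(2024, 1, 1)
series = [start + timedelta(hours=h) for h in (0, 1, 2, 3, 10, 11, 11)]

large_gaps = find_large_gaps(series)            # PRF001-style condition
duplicates = find_duplicate_timestamps(series)  # PRF004-style condition
```

On a pandas DataFrame the same checks can be written with `df.index.duplicated()` and `df.index.to_series().diff()`; the pure-Python version above only spells out the logic.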

Anomaly codes (ANO)

| Code | Severity | Trigger | What to do |
| --- | --- | --- | --- |
| ANO001 | WARNING | The same value repeats consecutively beyond the stuck-value window (5 for finance, 3 for sensor) | Investigate whether this is a sensor failure, a forward-fill artefact, or genuinely flat data; the evidence dict includes max_stuck_duration |
| ANO002 | WARNING | A value exceeds the z-score threshold (5.0 for finance, 3.5 for sensor, 4.0 default) or falls outside 1.5×IQR | Decide whether to winsorize, transform, or treat as a data error; finance domain uses a wider threshold to avoid flagging legitimate fat-tail events |
| ANO003 | WARNING | A value deviates sharply from its immediate neighbors (rolling z-score > spike threshold) | Examine the flagged timestamps; these are contextually extreme even if globally plausible, often indicating data-entry or data-feed errors |
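
The ANO001 and ANO002 triggers can be illustrated with a short standard-library sketch. This is an assumption-laden illustration, not tsauditor's code: the helper names are hypothetical, "beyond the window" is read here as a run strictly longer than the window, the z-score uses the population standard deviation of the whole column, and the IQR branch of ANO002 is left out.

```python
from itertools import groupby
from statistics import mean, pstdev


def longest_run(values):
    """Length of the longest run of identical consecutive values."""
    return max((len(list(group)) for _, group in groupby(values)), default=0)


def is_stuck(values, window):
    """Assumed reading of ANO001: a run strictly longer than the window."""
    return longest_run(values) > window


def zscore_outliers(values, threshold=4.0):
    """Indices whose absolute z-score exceeds the threshold (ANO002-style)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # A constant column has no z-score outliers.
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Hypothetical readings: a flat stretch of four identical values.
readings = [1.0, 2.0, 2.0, 2.0, 2.0, 3.0]
stuck_for_sensor = is_stuck(readings, window=3)   # run of 4 exceeds 3
stuck_for_finance = is_stuck(readings, window=5)  # run of 4 does not exceed 5

# Hypothetical column: 29 ordinary values followed by one extreme value.
column = [10.0, 10.2] * 14 + [10.0, 50.0]
outlier_indices = zscore_outliers(column, threshold=4.0)
```

Note that a global z-score is bounded by the sample size (roughly the square root of n − 1 for a single extreme point), so very short columns cannot reach the higher thresholds at all.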

Leakage codes (LEK)

| Code | Severity | Trigger | What to do |
| --- | --- | --- | --- |
| LEK001 | CRITICAL | Feature–target AUC >= 0.80 (binary target) or Spearman ρ | |
| LEK002 | WARNING | Feature's peak cross-correlation with the target falls at a positive lag | Inspect how the feature is constructed; a positive-lag peak means the feature aligns most strongly with the future target, suggesting it encodes information not yet available at prediction time |
| LEK003 | WARNING | Feature correlates with the future target beyond what the target's own autocorrelation explains | Verify the feature uses only past data; this pattern is the signature of a forward-looking or centered rolling window |
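
The positive-lag idea behind LEK002 can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not tsauditor's method: the function names and `max_lag` are hypothetical, Pearson correlation stands in for whatever statistic the library uses, and lag k is defined here as comparing the feature at time t with the target at time t + k, so a positive k means the feature lines up with the future target.

```python
import random
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def peak_lag(feature, target, max_lag=5):
    """Return (lag, |correlation|) where feature[t] best matches target[t + lag]."""
    best_lag, best_corr = 0, 0.0
    for k in range(-max_lag, max_lag + 1):
        if k >= 0:
            f, t = feature[:len(feature) - k], target[k:]
        else:
            f, t = feature[-k:], target[:len(target) + k]
        r = abs(pearson(f, t))
        if r > best_corr:
            best_lag, best_corr = k, r
    return best_lag, best_corr


# Hypothetical data: the feature is the target shifted two steps into the
# future, which is exactly the kind of construction that leaks.
rng = random.Random(0)
target = [rng.gauss(0.0, 1.0) for _ in range(200)]
leaky_feature = target[2:] + [0.0, 0.0]

lag, strength = peak_lag(leaky_feature, target)
```

In this toy case the peak lands at lag 2 because the feature was built from `target[t + 2]`. A feature built only from past values would be expected to peak at zero or a negative lag under this convention.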

Severity levels

| Severity | Meaning |
| --- | --- |
| CRITICAL | Must be resolved before modeling; the data quality issue will directly corrupt model training or evaluation |
| WARNING | Worth reviewing; may or may not require action depending on domain context |
| INFO | Informational; no immediate action required, but worth knowing before proceeding |
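
One way to act on these levels in a pipeline is to block on anything at or above a chosen severity. The sketch below assumes findings have already been exported as plain dictionaries with `code` and `severity` keys; that shape, the `SEVERITY_ORDER` mapping, and the function name are hypothetical and are not part of tsauditor's documented API.

```python
# Assumed ordering of the three severity levels, lowest to highest.
SEVERITY_ORDER = {"INFO": 0, "WARNING": 1, "CRITICAL": 2}


def blocking_findings(findings, minimum="CRITICAL"):
    """Return the findings whose severity is at or above the given minimum."""
    floor = SEVERITY_ORDER[minimum]
    return [f for f in findings if SEVERITY_ORDER[f["severity"]] >= floor]


# Hypothetical exported findings, using codes and severities from this page.
findings = [
    {"code": "PRF004", "severity": "CRITICAL"},
    {"code": "PRF003", "severity": "INFO"},
    {"code": "ANO002", "severity": "WARNING"},
]

must_fix = [f["code"] for f in blocking_findings(findings)]
to_review = [f["code"] for f in blocking_findings(findings, minimum="WARNING")]
```

A training script could refuse to proceed while `must_fix` is non-empty and merely log the entries in `to_review`.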

Adding a new code

If you're contributing a new check, add a corresponding entry to tsauditor/report/remediation.py following the existing pattern. See Contributing for details.