# Module 1.8: Timeseries Diagnostics

> **Goal:** Explore characteristics of the M5 dataset using tsfeatures + tsforge.

This module teaches you to:
1. Load the cleaned weekly M5 data.
2. Compute diagnostics at the most granular level, the "unique_id".
3. Understand why we focus on the "Lie Detector Six" metric set.
    * The aim is to get a feel for the forecastability, quality, and characteristics of the data BEFORE we start forecasting.


## 1. Setup

In [2]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from itertools import combinations
from pathlib import Path
from tsforge import load_m5
import tsforge as tsf
import seaborn as sns

# Configuration
import warnings
warnings.filterwarnings('ignore')
plt.style.use('seaborn-v0_8-darkgrid')
sns.set_palette("husl")

Notebook agenda 

* Reconfirm the unique_id definition
* Compute tsfeatures + tsforge diagnostics per unique_id
* Highlight the “Lie Detector Six”

In [3]:
# Read the cleaned weekly M5 data (adjust the path to your local copy)
weekly_df = pd.read_parquet(
    "/Users/jackrodenberg/Desktop/real-world-forecasting-foundations/modules/output/m5_weekly_clean.parquet",
)

In [4]:
weekly_df.head()

Unnamed: 0,unique_id,ds,y
0,FOODS_1_001_CA_1,2011-01-29,3.0
1,FOODS_1_001_CA_1,2011-02-05,9.0
2,FOODS_1_001_CA_1,2011-02-12,7.0
3,FOODS_1_001_CA_1,2011-02-19,8.0
4,FOODS_1_001_CA_1,2011-02-26,14.0


In [5]:
from tsforge.eda.ts_features_extension import permutation_entropy, MI_top_k_lags, ADI
from tsfeatures import tsfeatures, lumpiness, stl_features, statistics

# Compute per-series features with Nixtla's tsfeatures
id_lvl_feats = tsfeatures(
    ts=weekly_df,
    # The data is weekly, so the seasonal period is 52
    freq=52,
    # Compute the Lie Detector Six (plus basic summary statistics)
    features=[
        statistics,           # summary statistics (total_sum, min, ...)
        lumpiness,            # variance of variances
        permutation_entropy,  # permutation entropy
        MI_top_k_lags,        # sum of mutual information over the top 5 lags
        stl_features,         # STL decomposition features (trend and seasonal strength)
        ADI,                  # average demand interval
    ],
    # Keep scale=False: with scale=True each series is standardized before the
    # features are computed, which distorts level-dependent statistics.
    scale=False,
)

* Taking a closer look at the table, we focus on the "Lie Detector Six":

    - Lumpiness: variance of variances
    - Entropy (permutation entropy)
    - Seasonal strength
    - Trend strength
    - MI Top K Lags: mutual information over the top K lags (K = 5)
        - More precisely, this is the sum of the mutual information of the 5 highest-scoring lags among lags 1 through freq
    - ADI: average demand interval (average time between demands)
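
To make three of these definitions concrete, here is a minimal NumPy sketch. The function names and toy series are illustrative assumptions; the tsfeatures and tsforge implementations may differ in details such as window handling, pattern order, or tie-breaking.

```python
import math
import numpy as np

def adi(y):
    """Average demand interval: periods observed per period with non-zero demand."""
    y = np.asarray(y, dtype=float)
    n_nonzero = np.count_nonzero(y)
    return len(y) / n_nonzero if n_nonzero else np.inf

def lumpiness(y, width):
    """Variance of the variances of non-overlapping windows of length `width`."""
    y = np.asarray(y, dtype=float)
    n_windows = len(y) // width
    if n_windows < 2:
        return 0.0  # not enough windows to compare variances
    window_vars = [np.var(y[i * width:(i + 1) * width], ddof=1) for i in range(n_windows)]
    return float(np.var(window_vars, ddof=1))

def permutation_entropy(y, order=3):
    """Shannon entropy of ordinal patterns of length `order`, normalized to [0, 1]."""
    y = np.asarray(y, dtype=float)
    counts = {}
    for i in range(len(y) - order + 1):
        # The ordinal pattern is the ranking of values inside the window
        pattern = tuple(np.argsort(y[i:i + order], kind="stable"))
        counts[pattern] = counts.get(pattern, 0) + 1
    total = sum(counts.values())
    probs = np.array([c / total for c in counts.values()])
    entropy = -np.sum(probs * np.log(probs))
    return float(entropy / math.log(math.factorial(order)))

steady = [5, 6, 5, 6, 5, 6, 5, 6]  # demand every period, stable variance
sparse = [0, 0, 9, 0, 0, 0, 7, 0]  # demand in 2 of 8 periods

print(adi(steady), adi(sparse))
print(lumpiness(steady, width=4), lumpiness(sparse, width=4))
print(permutation_entropy(steady), permutation_entropy(sparse))
```

Higher ADI means longer gaps between sales, higher lumpiness means the variance itself changes from window to window, and permutation entropy near 1 means the ordering of consecutive values looks close to random.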

In [6]:
id_lvl_feats[["unique_id","lumpiness", "permutation_entropy", "seasonal_strength", "trend", "MI_top_k_lags", "adi"]].head()

Unnamed: 0,unique_id,lumpiness,permutation_entropy,seasonal_strength,trend,MI_top_k_lags,adi
0,FOODS_1_001_CA_1,87.235596,0.969347,0.376623,0.20445,0.270401,1.105469
1,FOODS_1_001_CA_2,230.382385,0.981118,0.439298,0.22328,0.153054,1.105469
2,FOODS_1_001_CA_3,116.775986,0.984305,0.384099,0.162804,0.150131,1.118577
3,FOODS_1_001_CA_4,0.956493,0.952965,0.479389,0.110839,0.28407,1.276018
4,FOODS_1_001_TX_1,20.594612,0.962183,0.376637,0.260977,0.168827,1.200855
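
The `trend` and `seasonal_strength` columns are strength measures derived from an STL decomposition. A minimal sketch of the usual definition, assuming the trend, seasonal, and remainder components are already available (the helper `stl_strengths` and the synthetic series are illustrative, not part of tsfeatures):

```python
import numpy as np

def stl_strengths(trend, seasonal, remainder):
    """Return (trend_strength, seasonal_strength), each floored at 0.

    Strength = 1 - Var(remainder) / Var(component + remainder): the closer to 1,
    the more of the variation is explained by that component.
    """
    trend, seasonal, remainder = (
        np.asarray(a, dtype=float) for a in (trend, seasonal, remainder)
    )
    var_r = np.var(remainder, ddof=1)
    trend_strength = max(0.0, 1.0 - var_r / np.var(trend + remainder, ddof=1))
    seasonal_strength = max(0.0, 1.0 - var_r / np.var(seasonal + remainder, ddof=1))
    return trend_strength, seasonal_strength

# Synthetic two-year weekly series: linear trend + yearly cycle + noise
rng = np.random.default_rng(42)
t = np.arange(104)
trend = 0.1 * t
seasonal = 2.0 * np.sin(2 * np.pi * t / 52)
remainder = rng.normal(scale=0.5, size=t.size)

trend_strength, seasonal_strength = stl_strengths(trend, seasonal, remainder)
print(trend_strength, seasonal_strength)
```

Read against this definition, the values around 0.1 to 0.5 in the table above suggest that the remainder dominates these series at the item-store level.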


* Add a few descriptors that make the data easier to interpret:
    - What share of total demand does each item account for?
    - Where does an item rank in terms of total sales?
    - Skewness and kurtosis (to understand the shape of the distribution)
        - Kurtosis: how heavy are the tails of the distribution?
        - Skewness: how asymmetric is the distribution?
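
As a rough illustration of these descriptors, the sketch below computes them on a two-series toy frame using pandas only. The data and column values are made up, and `pd.Series.kurt` stands in for a kurtosis function (like SciPy's default, it reports excess kurtosis, where a normal distribution scores about 0).

```python
import pandas as pd

toy = pd.DataFrame({
    "unique_id": ["A"] * 6 + ["B"] * 6,
    "y": [1, 1, 1, 1, 1, 13, 4, 5, 6, 6, 5, 4],  # A has one spike, B is symmetric
})

desc = toy.groupby("unique_id").agg(
    total_sum=("y", "sum"),
    skew=("y", "skew"),
    kurtosis=("y", pd.Series.kurt),
)
# Share of total demand and rank by total sales (1 = highest seller)
desc["pct_of_demand"] = desc["total_sum"] / desc["total_sum"].sum()
desc["sales_rank"] = desc["total_sum"].rank(ascending=False, method="min").astype(int)

print(desc)
```

Series A sells less overall but is strongly right-skewed and heavy-tailed because of its single spike, which is exactly the kind of contrast these descriptors are meant to surface.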

In [8]:
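import scipy.stats as st

# Add some additional useful descriptors
id_lvl_feats = id_lvl_feats.assign(
    # Share of total demand contributed by each series
    pct_of_demand=id_lvl_feats["total_sum"] / id_lvl_feats["total_sum"].sum(),
    # Rank by total sales (1 = highest); referenced later in `descriptors`
    sales_rank=id_lvl_feats["total_sum"].rank(ascending=False, method="min"),
)

# Merge in the skewness and kurtosis of demand
# (st.kurtosis returns excess kurtosis, so a normal distribution scores 0)
id_lvl_feats = id_lvl_feats.merge(
    weekly_df.groupby("unique_id").agg(
        skew=("y", "skew"),
        kurtosis=("y", st.kurtosis),
    ),
    on="unique_id",
)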


In [None]:
ld_six = [
        "lumpiness",
        "permutation_entropy",
        "seasonal_strength",
        "trend",
        "MI_top_k_lags",
        "adi",
    ]

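# Absolute-value variants used for the second set of detectors
id_lvl_feats["trend_two"] = id_lvl_feats["linearity"].abs()
# NOTE: `seasonal_pacf` comes from tsfeatures' `pacf_features`, which is not in the
# feature list above; add it there first or this line raises a KeyError.
id_lvl_feats["seasonal_two"] = id_lvl_feats["seasonal_pacf"].abs()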

descriptors = ['unique_id','sales_rank','skew','kurtosis','pct_of_demand']

cols = descriptors + ld_six

In [None]:
# Let's examine the top 25% of series for EACH of the Lie Detector Six and look for patterns
for col in ld_six:
    print(f'Inspecting top 25% of time series for {col}')
    res = id_lvl_feats.loc[
        id_lvl_feats[col] >= id_lvl_feats[col].quantile(0.75),
        ['unique_id', 'skew', 'kurtosis', 'pct_of_demand', 'lumpiness', 'adi', 'MI_top_k_lags'],
    ].set_index("unique_id")

    corr = res[["lumpiness", "adi", "MI_top_k_lags", "pct_of_demand"]].corr(method="spearman")

    threshold = 0.7  # adjust as needed
    high_corr = (
        corr.where(
            np.triu(np.ones(corr.shape), k=1).astype(bool)
        )  # keep upper triangle without diagonal
        .stack()
        .rename("corr")
        .reset_index()
    )

    high_corr = high_corr.loc[high_corr["corr"].abs() >= threshold]

    if not high_corr.empty:
        print(f"Pairs with |corr| >= {threshold}:")
        display(high_corr.sort_values("corr", key=np.abs, ascending=False))
        

    display(res.describe().loc[['mean','std']])
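
The upper-triangle trick in the loop above can be seen in isolation on a small synthetic frame. This is a sketch with made-up columns `a`, `b`, and `c`; the explicit `dropna()` is an added safeguard so the result does not depend on how a given pandas version treats missing values in `stack()`.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
x = rng.normal(size=200)
toy = pd.DataFrame({
    "a": x,
    "b": 2 * x + rng.normal(scale=0.1, size=200),  # strongly tied to a
    "c": rng.normal(size=200),                     # independent noise
})

corr = toy.corr(method="spearman")

# Keep only the upper triangle (k=1 drops the diagonal) so each pair appears once
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
pairs = upper.stack().dropna().rename("corr").reset_index()

threshold = 0.7
high = pairs[pairs["corr"].abs() >= threshold]
print(high)
```

Masking before stacking avoids reporting both (a, b) and (b, a), and it removes the trivial self-correlations of 1.0.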

In [None]:
# Create individual detector flags
trend_detectors = ld_six + ["trend_two", "seasonal_two"]

for detector in trend_detectors:
    id_lvl_feats[f"{detector}_flag"] = id_lvl_feats[detector] > id_lvl_feats[detector].quantile(
        0.75
    )

# Build labeled dataset with prominent flags + detector details
flag_cols = [f"{d}_flag" for d in trend_detectors]

id_lvl_feats_labeled = id_lvl_feats.assign(
    # Prominent characteristic flags
    intermittent=id_lvl_feats["adi"] >= 1.34,
    heavy_tailed=id_lvl_feats["kurtosis"].abs() > 3,
    non_zero_min=id_lvl_feats["min"] > 0,
    # Detector summary flags
    n_flags=id_lvl_feats[flag_cols].sum(axis=1),
    suspect=id_lvl_feats[flag_cols].any(axis=1),
    highly_suspect=lambda df: df["n_flags"] >= 2,
    # Which detectors are flagging
    flagged_detectors=id_lvl_feats[flag_cols].apply(
        lambda row: [trend_detectors[i] for i, val in enumerate(row) if val], axis=1
    ),
    # Compact string representation
    flag_pattern=lambda df: df["flagged_detectors"].apply(
        lambda x: "|".join([d[:4].upper() for d in x]) if x else "CLEAN"
    ),
)

# Quick summary
print("Characteristic flags:")
print(f"  Intermittent: {id_lvl_feats_labeled['intermittent'].sum()}")
print(f"  Heavy-tailed: {id_lvl_feats_labeled['heavy_tailed'].sum()}")
print(f"  Non-zero min: {id_lvl_feats_labeled['non_zero_min'].sum()}")

print("\nLie detector flags:")
print(f"  Suspect (1+ detectors): {id_lvl_feats_labeled['suspect'].sum()}")
print(f"  Highly suspect (2+ detectors): {id_lvl_feats_labeled['highly_suspect'].sum()}")

print("\nMost common flag patterns:")
print(id_lvl_feats_labeled["flag_pattern"].value_counts().head(10))

# View highly suspect series with all context
print("\nHighly suspect series with characteristics:")
display(
    id_lvl_feats_labeled[id_lvl_feats_labeled["highly_suspect"]][
        [
            "unique_id",
            "intermittent",
            "heavy_tailed",
            "non_zero_min",
            "n_flags",
            "flagged_detectors",
            "adi",
            "kurtosis",
        ]
    ].head(10)
)

# Crosstab: how do characteristics relate to detector flags?
print("\nIntermittent series that are also highly suspect:")
print(pd.crosstab(id_lvl_feats_labeled["intermittent"], id_lvl_feats_labeled["highly_suspect"]))

print("\nHeavy-tailed series that are also highly suspect:")
print(pd.crosstab(id_lvl_feats_labeled["heavy_tailed"], id_lvl_feats_labeled["highly_suspect"]))

Unnamed: 0,level_0,level_1,corr
5,lumpiness_scaled,pct_of_demand,0.844968
9,permutation_entropy,adi,-0.923062


## Let's look at items in the top 25% of each diagnostic
    - This helps us spot more nuanced patterns.
    - Are intermittent items lumpy? Are seasonal items intermittent? And so on.
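
Before running this on the full feature table, the flagging idea can be sketched on toy data: mark each series that sits above the 75th percentile of a diagnostic, count the flags, and build a compact pattern string. The values below are invented and only two detectors are used.

```python
import pandas as pd

feats = pd.DataFrame({
    "unique_id": list("ABCDEFGH"),
    "lumpiness": [1, 2, 3, 4, 5, 6, 70, 80],
    "adi": [3.0, 1.0, 1.1, 1.1, 1.2, 1.2, 1.0, 2.5],
})
detectors = ["lumpiness", "adi"]

# One boolean column per detector: True when the series is in the top quartile
flags = pd.DataFrame({d: feats[d] > feats[d].quantile(0.75) for d in detectors})

feats["n_flags"] = flags.sum(axis=1)
# First four letters of each flagged detector, joined with "|"; "CLEAN" if none fired
feats["flag_pattern"] = flags.apply(
    lambda row: "|".join(d[:4].upper() for d in detectors if row[d]) or "CLEAN",
    axis=1,
)

print(feats[["unique_id", "n_flags", "flag_pattern"]])
```

Because the pattern is built in detector order, a series flagged by both detectors always reads `LUMP|ADI`, never `ADI|LUMP`, which keeps `value_counts()` on the pattern column meaningful.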

In [48]:
# Create individual detector flags (top quartile of each diagnostic)
id_lvl_feats["linearity_abs"] = id_lvl_feats["linearity"].abs()

for detector in ld_six + ['linearity_abs']:
    id_lvl_feats[f"{detector}_flag"] = id_lvl_feats[detector] > id_lvl_feats[detector].quantile(0.75)

# Build labeled dataset with prominent flags + detector details.
# Only the Lie Detector Six feed the summary; linearity_abs_flag is kept as an extra column.
flag_cols = [f"{d}_flag" for d in ld_six]

id_lvl_feats_labeled = id_lvl_feats.assign(
    # Prominent characteristic flags
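    # ADI cutoff for intermittency (the Syntetos-Boylan scheme commonly cites 1.32)
    intermittent=id_lvl_feats["adi"] >= 1.34,
    # Heavy tails: st.kurtosis returns excess kurtosis (normal = 0), so this flags |excess kurtosis| > 3
    heavy_tailed=id_lvl_feats["kurtosis"].abs() > 3,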
    non_zero_min=id_lvl_feats["min"] > 0,
    # Detector summary flags
    n_flags=id_lvl_feats[flag_cols].sum(axis=1),
    suspect=id_lvl_feats[flag_cols].any(axis=1),
    highly_suspect=lambda df: df["n_flags"] >= 2,
    # Which detectors are flagging
    flagged_detectors=id_lvl_feats[flag_cols].apply(
        lambda row: [ld_six[i] for i, val in enumerate(row) if val], axis=1
    ),
    # Compact string representation
    flag_pattern=lambda df: df["flagged_detectors"].apply(
        lambda x: "|".join([d[:4].upper() for d in x]) if x else "CLEAN"
    ),
)

# Quick summary
print("Characteristic flags:")
print(f"  Intermittent: {id_lvl_feats_labeled['intermittent'].sum()}")
print(f"  Heavy-tailed: {id_lvl_feats_labeled['heavy_tailed'].sum()}")
print(f"  Non-zero min: {id_lvl_feats_labeled['non_zero_min'].sum()}")

print("\nLie detector flags:")
print(f"  Suspect (1+ detectors): {id_lvl_feats_labeled['suspect'].sum()}")
print(f"  Highly suspect (2+ detectors): {id_lvl_feats_labeled['highly_suspect'].sum()}")

print("\nMost common flag patterns:")
print(id_lvl_feats_labeled["flag_pattern"].value_counts().head(10))

# View highly suspect series with all context
print("\nHighly suspect series with characteristics:")
display(
    id_lvl_feats_labeled[id_lvl_feats_labeled["highly_suspect"]][
        [
            "unique_id",
            "intermittent",
            "heavy_tailed",
            "non_zero_min",
            "n_flags",
            "flagged_detectors",
            "adi",
            "kurtosis",
        ]
    ].head(10)
)

# Crosstab: how do characteristics relate to detector flags?
print("\nIntermittent series that are also highly suspect:")
print(pd.crosstab(id_lvl_feats_labeled["intermittent"], id_lvl_feats_labeled["highly_suspect"]))

print("\nHeavy-tailed series that are also highly suspect:")
print(pd.crosstab(id_lvl_feats_labeled["heavy_tailed"], id_lvl_feats_labeled["highly_suspect"]))


Characteristic flags:
  Intermittent: 11924
  Heavy-tailed: 6394
  Non-zero min: 752

Lie detector flags:
  Suspect (1+ detectors): 24698
  Highly suspect (2+ detectors): 14898

Most common flag patterns:
flag_pattern
CLEAN        5792
MI_T|ADI     3199
PERM         2793
LUMP|PERM    1955
LUMP         1870
MI_T         1735
ADI          1389
SEAS|TREN    1328
SEAS         1197
LUMP|TREN     975
Name: count, dtype: int64

Highly suspect series with characteristics:


Unnamed: 0,unique_id,intermittent,heavy_tailed,non_zero_min,n_flags,flagged_detectors,adi,kurtosis
14,FOODS_1_002_TX_1,True,True,False,2,"[MI_top_k_lags, adi]",1.612717,4.299198
23,FOODS_1_003_CA_4,False,True,False,2,"[permutation_entropy, MI_top_k_lags]",1.097656,15.750452
26,FOODS_1_003_TX_3,True,False,False,2,"[MI_top_k_lags, adi]",1.694611,1.766621
31,FOODS_1_004_CA_2,False,True,False,3,"[lumpiness, permutation_entropy, trend]",1.097561,3.194076
35,FOODS_1_004_TX_2,False,False,False,2,"[lumpiness, trend]",1.125,0.837187
38,FOODS_1_004_WI_2,False,False,False,3,"[lumpiness, permutation_entropy, MI_top_k_lags]",1.092233,2.144132
39,FOODS_1_004_WI_3,False,False,False,2,"[lumpiness, permutation_entropy]",1.125,1.918513
40,FOODS_1_005_CA_1,True,False,False,2,"[lumpiness, seasonal_strength]",1.387255,1.836866
42,FOODS_1_005_CA_3,False,False,False,2,"[lumpiness, permutation_entropy]",1.214592,1.092073
52,FOODS_1_006_CA_3,False,True,False,2,"[lumpiness, permutation_entropy]",1.064394,3.127723



Intermittent series that are also highly suspect:
highly_suspect  False  True 
intermittent                
False           10942   7624
True             4650   7274

Heavy-tailed series that are also highly suspect:
highly_suspect  False  True 
heavy_tailed                
False           13147  10949
True             2445   3949
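
Raw counts are hard to compare across groups of different size, so it helps to turn each crosstab into a within-row rate. The sketch below re-enters the counts printed above by hand; the string labels and the helper `suspect_rate` are illustrative.

```python
import pandas as pd

# Counts copied from the two crosstabs above
intermittent_ct = pd.DataFrame(
    {"not_suspect": [10942, 4650], "highly_suspect": [7624, 7274]},
    index=["not_intermittent", "intermittent"],
)
heavy_ct = pd.DataFrame(
    {"not_suspect": [13147, 2445], "highly_suspect": [10949, 3949]},
    index=["not_heavy_tailed", "heavy_tailed"],
)

def suspect_rate(ct):
    """Share of highly suspect series within each row group."""
    return ct["highly_suspect"] / ct.sum(axis=1)

print(suspect_rate(intermittent_ct).round(3))
print(suspect_rate(heavy_ct).round(3))
```

Roughly 61% of intermittent series are highly suspect versus about 41% of non-intermittent ones, and the split for heavy-tailed series is similar (about 62% versus 45%). On the live frame, `pd.crosstab(..., normalize="index")` gives the same rates directly.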


In [12]:
dmd_by_pattern = id_lvl_feats_labeled.groupby('flag_pattern')['pct_of_demand'].sum().sort_values(ascending=False)

dmd_by_pattern.filter(like="LUMP").sum()  # ~71% of total demand comes from series flagged for high lumpiness

0.71391785

* This is a big clue: series flagged for high lumpiness carry about 71% of total demand, so the variance is unstable in the series that matter most. Any ML or DL approach will likely need a robust loss function.

* It also points to stationarity issues: the variance, and likely the mean, is not stable over time in many series.
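
To illustrate what "robust" buys us, here is a minimal sketch comparing mean squared error with the Huber loss on a toy forecast that misses one demand spike. Huber is only one example of a robust loss; the numbers and the choice of `delta` are assumptions for illustration.

```python
import numpy as np

def huber_loss(y_true, y_pred, delta=1.0):
    """Mean Huber loss: quadratic for |error| <= delta, linear beyond it."""
    err = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    abs_err = np.abs(err)
    quadratic = 0.5 * err ** 2
    linear = delta * (abs_err - 0.5 * delta)
    return float(np.mean(np.where(abs_err <= delta, quadratic, linear)))

y_true = np.array([10.0, 12.0, 11.0, 60.0])  # last value is a demand spike
y_pred = np.array([10.0, 11.0, 11.0, 12.0])

mse = float(np.mean((y_true - y_pred) ** 2))
huber = huber_loss(y_true, y_pred, delta=5.0)
print(mse, huber)
```

The single spike dominates the squared error, while the Huber loss grows only linearly with it, so a model trained on the latter is pulled less toward chasing isolated bursts in lumpy series.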

In [45]:
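# Note: filter(like=...) matches a literal substring, so "LUMP|TREN" only counts patterns where
# the two flags sit next to each other (e.g. "LUMP|PERM|TREN" is not included).
print(f'Percentage of Demand with Lumpiness and Trend: {dmd_by_pattern.filter(like="LUMP|TREN").sum() * 100:.2f}%')
print(f'Percentage of Demand with Lumpiness and Seasonality: {dmd_by_pattern.filter(like="LUMP|SEAS").sum() * 100:.2f}%')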

Percentage of Demand with Lumpiness and Trend: 10.75%
Percentage of Demand with Lumpiness and Seasonality: 3.53%
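
Because `filter(like=...)` does a literal substring match on the pattern string, the figures above cover only patterns where the two flags are adjacent. A sketch of a stricter co-occurrence count on made-up shares follows; the helper `share_with_all` is hypothetical.

```python
import pandas as pd

# Toy demand shares by flag pattern (invented numbers)
dmd_by_pattern = pd.Series({
    "LUMP|TREN": 0.10,
    "LUMP|PERM|TREN": 0.05,
    "LUMP": 0.20,
    "TREN": 0.15,
    "CLEAN": 0.50,
})

def share_with_all(shares, *tokens):
    """Sum shares for patterns that contain every token, adjacent or not."""
    parts = shares.index.to_series().str.split("|")
    mask = parts.apply(lambda p: set(tokens) <= set(p))
    return float(shares[mask].sum())

adjacent_only = float(dmd_by_pattern.filter(like="LUMP|TREN").sum())
both_flags = share_with_all(dmd_by_pattern, "LUMP", "TREN")
print(adjacent_only, both_flags)
```

In this toy example the substring match misses the `LUMP|PERM|TREN` pattern, so the adjacent-only figure understates the share of demand that is both lumpy and trending.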


In [33]:
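# OUTPUT_DIR was never defined in this notebook; as an assumption, reuse the
# directory the weekly data was read from.
OUTPUT_DIR = Path("/Users/jackrodenberg/Desktop/real-world-forecasting-foundations/modules/output")

id_lvl_feats.to_parquet(
    OUTPUT_DIR / "uid_lvl_feats.parquet"
)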