# Module 1.09: Diagnostics — The Big Picture (v2)

> **Goal:** Compute time series diagnostics and understand what patterns exist in our portfolio.

**5Q Lens:** Q4 (Data & Drivers): measure structure & chaos across the portfolio


---

## 1. Setup

In [2]:
import warnings
import numpy as np
import pandas as pd
from tsfeatures import tsfeatures
from tsforge.eda.ts_features_extension import ADI
from tsforge.eda import evaluate_diagnostics
from tsforge.plots import plot_bar, plot_distribution

# Settings
warnings.filterwarnings('ignore')

---
## 2. Load Data

In [3]:
# Load the forecast-ready dataset from Module 1.08
weekly_df = pd.read_parquet('/Users/lindsaytruong/forecast-academy/real-world-forecasting-foundations/data/output/data/1.08_data_preparation_output.parquet')

In [4]:
# Quick sanity check
weekly_df.head(3)

| | unique_id | ds | y | is_gap | item_id | store_id | dept_id | cat_id | state_id | wm_yr_wk | month | year | snap_CA | snap_TX | snap_WI | event_name_1 | event_name_2 | event_name_3 | event_type_1 | event_type_2 | event_type_3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | FOODS_1_001_CA_1 | 2011-01-23 | 3.0 | 0 | FOODS_1_001 | CA_1 | FOODS_1 | FOODS | CA | 11101 | 1 | 2011 | 0 | 0 | 0 | | | | | | |
| 1 | FOODS_1_001_CA_1 | 2011-01-30 | 9.0 | 0 | FOODS_1_001 | CA_1 | FOODS_1 | FOODS | CA | 11101 | 1 | 2011 | 1 | 1 | 1 | | | | | | |
| 2 | FOODS_1_001_CA_1 | 2011-02-06 | 7.0 | 0 | FOODS_1_001 | CA_1 | FOODS_1 | FOODS | CA | 11102 | 2 | 2011 | 1 | 1 | 1 | SuperBowl | | | Sporting | | |
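
Beyond eyeballing `head()`, a few structural checks can catch problems before any diagnostics are computed. The sketch below runs on a small made-up frame with the same `unique_id` / `ds` / `y` columns. The helper name `check_weekly_panel` is hypothetical, and the 7-day spacing rule assumes the weekly grid produced in Module 1.08 has no missing weeks.

```python
import pandas as pd

def check_weekly_panel(df: pd.DataFrame) -> dict:
    """Return simple structural checks for a long-format weekly panel."""
    # Each (series, week) pair should appear only once.
    n_dupes = int(df.duplicated(subset=["unique_id", "ds"]).sum())
    # Within each series, consecutive timestamps should be 7 days apart.
    gaps = (
        df.sort_values(["unique_id", "ds"])
          .groupby("unique_id")["ds"]
          .diff()
          .dropna()
    )
    n_irregular = int((gaps != pd.Timedelta(days=7)).sum())
    return {
        "n_series": int(df["unique_id"].nunique()),
        "n_duplicates": n_dupes,
        "n_irregular_steps": n_irregular,
        "n_missing_y": int(df["y"].isna().sum()),
    }

# Toy data for illustration only (not the real portfolio).
toy = pd.DataFrame({
    "unique_id": ["A"] * 3 + ["B"] * 3,
    "ds": pd.to_datetime(["2011-01-23", "2011-01-30", "2011-02-06"] * 2),
    "y": [3.0, 9.0, 7.0, 0.0, 1.0, 0.0],
})
report = check_weekly_panel(toy)
print(report)
```

On the real data, the same call would be `check_weekly_panel(weekly_df)`. Any non-zero duplicate or irregular-step count is worth investigating before moving on.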


---

<div style="text-align: center;">

## 3. Compute Diagnostics

<div style="background: linear-gradient(135deg, #2E86AB 0%, #1a5276 100%); color: white; padding: 12px 20px; border-radius: 8px; margin: 10px auto; max-width: 600px;">
<strong>What patterns exist in the data?</strong><br>
<em>50+ metrics from tsfeatures + tsforge that describe trend, seasonality, noise, intermittency, and more.</em>
</div>

</div>

### 3.1 Calculate diagnostics from `tsfeatures`

`tsfeatures` extracts dozens of time series characteristics automatically — this is our first systematic look at the portfolio.

In [4]:
# The full computation is commented out; we load its saved output instead.
# To recompute (freq=52 for weekly data, 8 threads, default feature set):
# diagnostics = tsfeatures(weekly_df, freq=52, threads=8)
# adi_df = (weekly_df.groupby('unique_id')['y'].apply(lambda y: ADI(y.values, freq=52)["adi"]).reset_index(name='adi'))
# diagnostics = diagnostics.merge(adi_df, on='unique_id')
diagnostics = pd.read_parquet('/Users/lindsaytruong/forecast-academy/real-world-forecasting-foundations/data/output/data/1.09_diagnostics_output.parquet')
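
The `adi` column measures intermittency: how many periods pass, on average, between periods with demand. As a rough illustration, here is a minimal sketch of one common definition (number of periods divided by number of periods with non-zero demand). The function `adi_simple` is a hypothetical stand-in, not the `ADI` helper from `tsforge`, which may treat edge cases or the `freq` argument differently.

```python
import numpy as np

def adi_simple(y) -> float:
    """Average demand interval: periods per non-zero demand period."""
    y = np.asarray(y, dtype=float)
    n_nonzero = int(np.count_nonzero(y > 0))
    if n_nonzero == 0:
        return float("inf")  # no demand at all: the interval is unbounded
    return len(y) / n_nonzero

steady = [3, 9, 7, 4, 5, 6, 2, 8]   # demand every week
sparse = [0, 0, 5, 0, 0, 0, 2, 0]   # demand in 2 of 8 weeks
print(adi_simple(steady))  # 1.0
print(adi_simple(sparse))  # 4.0
```

Under this definition a value of 1.0 means demand in every period, and larger values mean longer gaps between sales.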


### 3.3 Merge Hierarchy Metadata

Attach business dimensions so we can slice diagnostics by department, category, and store.

In [5]:
# Get hierarchy from original data
hierarchy = (
    weekly_df[['unique_id', 'item_id', 'dept_id', 'cat_id', 'store_id', 'state_id']]
    .drop_duplicates(subset=['unique_id'])
)

# Merge
diagnostics = diagnostics.merge(hierarchy, on='unique_id', how='left')
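
A left merge silently keeps rows that find no match and silently duplicates rows if the key repeats. The sketch below shows one way to guard against both using the `validate` and `indicator` options of `DataFrame.merge`. The two small frames are made up for illustration and are not part of the notebook's data.

```python
import pandas as pd

# Toy stand-ins for the diagnostics table and the hierarchy lookup.
features = pd.DataFrame({
    "unique_id": ["A_CA_1", "B_CA_1", "C_TX_1"],
    "trend": [0.20, 0.45, 0.10],
})
lookup = pd.DataFrame({
    "unique_id": ["A_CA_1", "B_CA_1"],
    "cat_id": ["FOODS", "HOBBIES"],
})

merged = features.merge(
    lookup,
    on="unique_id",
    how="left",
    validate="one_to_one",  # raises MergeError if unique_id repeats on either side
    indicator=True,         # adds a _merge column: 'both' or 'left_only'
)

# Series that received no hierarchy metadata.
unmatched = merged.loc[merged["_merge"] == "left_only", "unique_id"].tolist()
print(unmatched)  # ['C_TX_1']
```

If `unmatched` is empty on the real data, every series has its department, category, and store attached.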

### 3.4 Preview Key Metrics

In [5]:
KEY_METRICS = ['trend', 'seasonal_strength', 'entropy', 'adi']

diagnostics[['unique_id', 'cat_id', 'dept_id'] + KEY_METRICS].head(10)

| | unique_id | cat_id | dept_id | trend | seasonal_strength | entropy | adi |
|---|---|---|---|---|---|---|---|
| 0 | FOODS_1_001_CA_1 | FOODS | FOODS_1 | 0.20445 | 0.376623 | 0.850729 | 1.105469 |
| 1 | FOODS_1_001_CA_2 | FOODS | FOODS_1 | 0.22328 | 0.439298 | 0.843529 | 1.105469 |
| 2 | FOODS_1_001_CA_3 | FOODS | FOODS_1 | 0.162804 | 0.384099 | 0.895521 | 1.118577 |
| 3 | FOODS_1_001_CA_4 | FOODS | FOODS_1 | 0.110839 | 0.479389 | 0.87317 | 1.276018 |
| 4 | FOODS_1_001_TX_1 | FOODS | FOODS_1 | 0.260977 | 0.376637 | 0.866836 | 1.200855 |
| 5 | FOODS_1_001_TX_2 | FOODS | FOODS_1 | 0.212788 | 0.397528 | 0.855971 | 1.156379 |
| 6 | FOODS_1_001_TX_3 | FOODS | FOODS_1 | 0.048173 | 0.417552 | 0.886295 | 1.181435 |
| 7 | FOODS_1_001_WI_1 | FOODS | FOODS_1 | 0.287522 | 0.423122 | 0.837675 | 1.119522 |
| 8 | FOODS_1_001_WI_2 | FOODS | FOODS_1 | 0.392052 | 0.434856 | 0.828196 | 1.288991 |
| 9 | FOODS_1_001_WI_3 | FOODS | FOODS_1 | 0.466792 | 0.466487 | 0.828913 | 1.524324 |
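
To help read the `trend` and `seasonal_strength` columns: assuming `tsfeatures` follows the usual STL-based strength measures (an assumption about the library, not something stated in this notebook), each series is decomposed into a trend $T_t$, a seasonal component $S_t$, and a remainder $R_t$, and the two strengths are defined as:

```latex
F_T = \max\!\left(0,\; 1 - \frac{\operatorname{Var}(R_t)}{\operatorname{Var}(T_t + R_t)}\right),
\qquad
F_S = \max\!\left(0,\; 1 - \frac{\operatorname{Var}(R_t)}{\operatorname{Var}(S_t + R_t)}\right)
```

Both values lie in $[0, 1]$. A value near 1 means the remainder is small relative to that component; a value near 0 means the component explains little beyond noise.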


---

<div style="text-align: center;">

## 4. Structure & Chaos Overview

<div style="background: linear-gradient(135deg, #4A90A4 0%, #2d5a6b 100%); color: white; padding: 12px 20px; border-radius: 8px; margin: 10px auto; max-width: 600px;">
<strong>One view of all key metrics</strong><br>
<em>Structure (trend, seasonality) vs Chaos (entropy, intermittency, variability)</em>
</div>

</div>

In [6]:
plot_distribution(
    diagnostics,
    columns=['trend', 'seasonal_strength', 'entropy', 'adi'],
    use_metric_defaults=True, 
    thresholds={
        'trend': 0.3,              # Custom: lower bar for "strong trend"
        'seasonal_strength': 0.6,  # Custom: higher bar for seasonality
        'entropy': 0.8,            # Keep default
        'adi': 1.5,                # Custom intermittency threshold
    },
    show_threshold_pct=True,
    show_threshold_legend=True,
    mode='facet',
)
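
The share of series on each side of a threshold can be cross-checked without the plotting helper. This is a minimal sketch that assumes "above threshold" means strictly greater than; the boundary rule used by `plot_distribution` is not documented here. It uses the first five rows of the preview table in section 3.4.

```python
import pandas as pd

# First five rows of the preview table in section 3.4.
sample = pd.DataFrame({
    "trend":             [0.204450, 0.223280, 0.162804, 0.110839, 0.260977],
    "seasonal_strength": [0.376623, 0.439298, 0.384099, 0.479389, 0.376637],
    "entropy":           [0.850729, 0.843529, 0.895521, 0.873170, 0.866836],
    "adi":               [1.105469, 1.105469, 1.118577, 1.276018, 1.200855],
})
thresholds = {"trend": 0.3, "seasonal_strength": 0.6, "entropy": 0.8, "adi": 1.5}

# Compare each column with its own threshold, then average the booleans.
above = sample.gt(pd.Series(thresholds), axis="columns")
pct_above = above.mean().mul(100).round(1)
print(pct_above.to_dict())
```

For these five series only `entropy` exceeds its threshold, so the sketch reports 100% for `entropy` and 0% for the other three metrics.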


---
<div style="text-align: center;">

## 6. Portfolio Summary by Segment

</div>

### 6.1 By Category

In [9]:
plot_distribution(
    diagnostics,
    columns=['trend', 'seasonal_strength', 'entropy', 'adi'],
    use_metric_defaults=True,  
    thresholds={
        'trend': 0.3,              # Custom: lower bar for "strong trend"
        'seasonal_strength': 0.6,  # Custom: higher bar for seasonality
        'entropy': 0.8,            # Keep default
        'adi': 1.5,                # Custom intermittency threshold
    },
    threshold_labels={
        'trend': 'Strong Trend',
        'seasonal_strength': 'Strong Seasonality', 
        'entropy': 'Predictable',
        'adi': 'Regular',
    },
    show_threshold_pct=True,
    mode='dropdown',
    group_col='cat_id',
)


In [9]:
plot_distribution(
    diagnostics,
    columns=['trend'],
    use_metric_defaults=True,  
    thresholds={
        'trend': 0.3,            
    },
    threshold_labels={
        'trend': 'Strong Trend',
    },
    show_threshold_pct=True,
    mode='facet',
    group_col='cat_id',
)

### 6.2 By Department

In [10]:
plot_distribution(
    diagnostics,
    columns=['trend'],
    use_metric_defaults=True,  
    thresholds={
        'trend': 0.3,            
    },
    threshold_labels={
        'trend': 'Strong Trend',
    },
    show_threshold_pct=True,
    mode='facet',
    group_col='dept_id',
)

### Portfolio-Level Summary

In [10]:
# Portfolio-level summary
evaluate_diagnostics(
    diagnostics,
    metrics=['trend', 'seasonal_strength', 'entropy', 'adi'],
    thresholds={'trend': 0.3, 'seasonal_strength': 0.6, 'entropy': 0.8, 'adi': 1.5},
)

| | N | trend (Below Threshold) | trend (Above Threshold) | seasonal_strength (Below Threshold) | seasonal_strength (Above Threshold) | entropy (Below Threshold) | entropy (Above Threshold) | adi (Below Threshold) | adi (Above Threshold) |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 30490 | 13418 | 17072 | 19781 | 10709 | 12050 | 18440 | 22038 | 8452 |
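
The raw counts are easier to compare as shares of the portfolio. The short sketch below uses only the numbers reported in the summary above: it confirms that the below and above counts add up to `N` for every metric, then converts the "above" counts to percentages.

```python
# Counts copied from the portfolio-level summary above.
n_series = 30490
above = {"trend": 17072, "seasonal_strength": 10709, "entropy": 18440, "adi": 8452}
below = {"trend": 13418, "seasonal_strength": 19781, "entropy": 12050, "adi": 22038}

# Every series should land on exactly one side of each threshold.
for metric in above:
    assert above[metric] + below[metric] == n_series

pct_above = {m: round(100 * c / n_series, 1) for m, c in above.items()}
print(pct_above)
# {'trend': 56.0, 'seasonal_strength': 35.1, 'entropy': 60.5, 'adi': 27.7}
```

So about 56% of series exceed the trend threshold, 35% the seasonality threshold, 60% the entropy threshold, and 28% the ADI threshold.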


### Summary by Category

In [11]:
# By category - with percentages
evaluate_diagnostics(
    diagnostics,
    metrics=['trend', 'seasonal_strength', 'entropy', 'adi'],
    thresholds={'trend': 0.3, 'seasonal_strength': 0.6, 'entropy': 0.8, 'adi': 1.5},
    group_by='cat_id',
    show_pct=True,
)

| | cat_id | N | trend (Below Threshold) | trend (Above Threshold) | trend (% Below Threshold) | trend (% Above Threshold) | seasonal_strength (Below Threshold) | seasonal_strength (Above Threshold) | seasonal_strength (% Below Threshold) | seasonal_strength (% Above Threshold) | entropy (Below Threshold) | entropy (Above Threshold) | entropy (% Below Threshold) | entropy (% Above Threshold) | adi (Below Threshold) | adi (Above Threshold) | adi (% Below Threshold) | adi (% Above Threshold) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | HOBBIES | 5650 | 3067 | 2583 | 54.3% | 45.7% | 3904 | 1746 | 69.1% | 30.9% | 1299 | 4351 | 23.0% | 77.0% | 3374 | 2276 | 59.7% | 40.3% |
| 1 | HOUSEHOLD | 10470 | 5548 | 4922 | 53.0% | 47.0% | 7031 | 3439 | 67.2% | 32.8% | 3266 | 7204 | 31.2% | 68.8% | 7061 | 3409 | 67.4% | 32.6% |
| 2 | FOODS | 14370 | 4803 | 9567 | 33.4% | 66.6% | 8846 | 5524 | 61.6% | 38.4% | 7485 | 6885 | 52.1% | 47.9% | 11603 | 2767 | 80.7% | 19.3% |
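
A by-group summary of this shape can be approximated in plain pandas, which is useful for spot-checking a single metric. The sketch below is not how `evaluate_diagnostics` is implemented; it is a hedged equivalent on made-up data, again assuming "above" means strictly greater than the threshold.

```python
import pandas as pd

# Toy data for illustration only.
toy = pd.DataFrame({
    "cat_id": ["FOODS", "FOODS", "FOODS", "HOBBIES", "HOBBIES"],
    "trend":  [0.45, 0.10, 0.35, 0.05, 0.40],
})
threshold = 0.3

summary = (
    toy.assign(above=toy["trend"] > threshold)   # boolean flag per series
       .groupby("cat_id")["above"]
       .agg(N="size", n_above="sum", pct_above="mean")
)
summary["pct_above"] = (summary["pct_above"] * 100).round(1)
print(summary)
```

On the real table, replacing `toy` with `diagnostics` and keeping `threshold = 0.3` should give counts comparable to the `trend` columns above, provided the boundary rule matches.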


### Summary by Department

In [12]:
# By department - with percentages
evaluate_diagnostics(
    diagnostics,
    metrics=['trend', 'entropy'],
    thresholds={'trend': 0.3, 'entropy': 0.8},
    group_by='dept_id',
    show_pct=True,
    show_cnt=False
)

| | dept_id | N | trend (% Below Threshold) | trend (% Above Threshold) | entropy (% Below Threshold) | entropy (% Above Threshold) |
|---|---|---|---|---|---|---|
| 0 | HOBBIES_1 | 4160 | 50.5% | 49.5% | 28.4% | 71.6% |
| 1 | HOBBIES_2 | 1490 | 65.0% | 35.0% | 7.8% | 92.2% |
| 2 | HOUSEHOLD_1 | 5320 | 38.3% | 61.7% | 43.5% | 56.5% |
| 3 | HOUSEHOLD_2 | 5150 | 68.2% | 31.8% | 18.4% | 81.6% |
| 4 | FOODS_1 | 2160 | 45.8% | 54.2% | 40.6% | 59.4% |
| 5 | FOODS_2 | 3980 | 32.6% | 67.4% | 51.9% | 48.1% |
| 6 | FOODS_3 | 8230 | 30.6% | 69.4% | 55.2% | 44.8% |


In [None]:
# Save diagnostics for downstream modules
diagnostics.to_parquet('./1.09_diagnostics.parquet', index=False)