# 00 Recession Classifier Baselines

Baselines and class imbalance for next-quarter recession prediction.


## Table of Contents
- [Load data](#load-data)
- [Define baselines](#define-baselines)
- [Evaluate metrics](#evaluate-metrics)
- [Checkpoint (Self-Check)](#checkpoint-self-check)
- [Solutions (Reference)](#solutions-reference)


## Why This Notebook Matters
The classification notebooks treat the recession label as the target of a **probability model** rather than a hard yes/no call.
You will learn how to evaluate rare-event predictions and how to choose decision thresholds deliberately.


## What You Will Produce
- (no file output; learning/analysis notebook)

## Success Criteria
- You can explain what you built and why each step exists.
- You can run your work end-to-end without undefined variables.

## Common Pitfalls
- Running cells top-to-bottom without reading the instructions.
- Leaving `...` placeholders in code cells.
- Reporting only accuracy on imbalanced data.
- Using threshold=0.5 by default without considering costs.
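
The threshold pitfall can be made concrete with a small sketch. The labels, scores, and the cost values `cost_fn` and `cost_fp` below are made up for illustration; they are assumptions, not values from this project.

```python
import numpy as np

# Toy labels and scores (hypothetical): 2 recession quarters out of 10.
y_toy = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])
p_toy = np.array([0.02, 0.08, 0.11, 0.13, 0.18, 0.22, 0.32, 0.44, 0.37, 0.62])

# Assumed costs: missing a recession is 5x worse than a false alarm.
cost_fn, cost_fp = 5.0, 1.0


def total_cost(y, p, thr):
    """Total misclassification cost when predicting 1 for p >= thr."""
    y_hat = (p >= thr).astype(int)
    fn = int(((y == 1) & (y_hat == 0)).sum())  # missed recessions
    fp = int(((y == 0) & (y_hat == 1)).sum())  # false alarms
    return cost_fn * fn + cost_fp * fp


# Scan a grid of thresholds and keep the cheapest one.
thresholds = np.linspace(0.05, 0.95, 19)
costs = [total_cost(y_toy, p_toy, t) for t in thresholds]
best_thr = float(thresholds[int(np.argmin(costs))])

print("cost at 0.5:", total_cost(y_toy, p_toy, 0.5))
print("lowest-cost threshold:", round(best_thr, 2))
```

With asymmetric costs like these, the cheapest threshold on the toy data sits below 0.5. On real data, choose the threshold on training or validation data, not on the test set.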

## Matching Guide
- `docs/guides/03_classification/00_recession_classifier_baselines.md`



## How To Use This Notebook
- This notebook is hands-on. Most code cells are incomplete on purpose.
- Complete each TODO, then run the cell.
- Use the matching guide (`docs/guides/03_classification/00_recession_classifier_baselines.md`) for deep explanations and alternative examples.
- Write short interpretation notes as you go (what changed, why it matters).



<a id="environment-bootstrap"></a>
## Environment Bootstrap
Run this cell first. It makes the repo importable and defines common directories.



In [None]:
from __future__ import annotations

from pathlib import Path
import sys


def find_repo_root(start: Path) -> Path:
    """Walk up from `start` until a directory containing both `src/` and `docs/` is found."""
    p = start
    for _ in range(8):
        if (p / 'src').exists() and (p / 'docs').exists():
            return p
        p = p.parent
    raise RuntimeError('Could not find repo root. Start Jupyter from the repo root.')


PROJECT_ROOT = find_repo_root(Path.cwd())
if str(PROJECT_ROOT) not in sys.path:
    sys.path.append(str(PROJECT_ROOT))

DATA_DIR = PROJECT_ROOT / 'data'
RAW_DIR = DATA_DIR / 'raw'
PROCESSED_DIR = DATA_DIR / 'processed'
SAMPLE_DIR = DATA_DIR / 'sample'

PROJECT_ROOT



## Goal
Establish baselines for predicting **next-quarter technical recession**.

Baselines matter because:
- recession is rare (class imbalance)
- a model that beats chance may still be no better than a trivial rule
- you need a reference point before tuning anything
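
To see why a reference point is needed, here is a minimal sketch on a made-up label vector (not project data) with a 10% positive rate. An "always predict 0" classifier scores high accuracy while catching no recessions at all.

```python
import numpy as np

# Hypothetical labels: 4 recession quarters out of 40.
y_toy = np.zeros(40, dtype=int)
y_toy[[10, 11, 30, 31]] = 1

# "Always predict 0" classifier.
y_hat = np.zeros_like(y_toy)

accuracy = float((y_hat == y_toy).mean())
recall = float(((y_hat == 1) & (y_toy == 1)).sum() / (y_toy == 1).sum())

print(f"base rate: {y_toy.mean():.2f}")
print(f"accuracy:  {accuracy:.2f}")
print(f"recall:    {recall:.2f}")
```

Any model you train later has to beat this kind of trivial result on metrics that reflect the rare class, not just on accuracy.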



## Primer: sklearn Pipelines (How To Avoid Preprocessing Leakage)

### Why pipelines exist
A common ML mistake is fitting preprocessing (scalers, imputers) on the full dataset.
That leaks information from the test set into training.

A `Pipeline` enforces the correct order:
- fit preprocessing on training only
- apply preprocessing to test
- fit model on training only

### Key API concepts
- `fit(X, y)`: learn parameters from data (e.g., scaler means/standard deviations, model weights).
- `transform(X)`: apply learned parameters to new data (e.g., scale).
- `fit_transform(X, y)`: convenience that does both on the same data, so call it on training data only.

If you do `scaler.fit(X_all)` before splitting, you leaked test-set information.
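
A minimal sketch of the `fit` / `transform` split, using a tiny made-up array: `fit` stores the training mean and standard deviation, and `transform` reuses them on new rows without relearning anything.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

X_train = np.array([[1.0], [2.0], [3.0]])  # toy training rows
X_new = np.array([[4.0]])                  # a later, unseen row

scaler = StandardScaler()
scaler.fit(X_train)                        # learns mean_ and scale_ from training rows only
print("learned mean:", scaler.mean_, "learned std:", scaler.scale_)

X_new_scaled = scaler.transform(X_new)     # applies the stored parameters; nothing is refit
print("scaled new row:", X_new_scaled)

# fit_transform is shorthand for fit followed by transform on the same data.
X_train_scaled = StandardScaler().fit_transform(X_train)
```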

### Example pattern
```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

clf = Pipeline([
    ('scaler', StandardScaler()),
    ('model', LogisticRegression(max_iter=5000)),
])

# clf.fit(X_train, y_train)
# y_prob = clf.predict_proba(X_test)[:, 1]
```

### Mini demo: the leakage you're avoiding (toy example)
```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Pretend the last 20% of data comes from a different era with a different mean
rng = np.random.default_rng(0)
X_train = rng.normal(loc=0.0, scale=1.0, size=(100, 1))
X_test  = rng.normal(loc=2.0, scale=1.0, size=(25, 1))

# WRONG: fit scaler on train+test (leaks the future)
sc_wrong = StandardScaler().fit(np.vstack([X_train, X_test]))
X_test_wrong = sc_wrong.transform(X_test)

# RIGHT: fit scaler on train only
sc_right = StandardScaler().fit(X_train)
X_test_right = sc_right.transform(X_test)

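# The train-only scaler shows the test era sitting well above the training mean.
# The train+test scaler has already absorbed part of that shift, so the test mean it reports is smaller.
print("test mean after wrong scaling:", float(X_test_wrong.mean()))
print("test mean after right scaling:", float(X_test_right.mean()))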
```

### What to remember
- Always split by time first.
- Then fit the pipeline on train.
- Then evaluate on test.
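
The three steps above can be sketched as follows. The data is synthetic, and the column names `x1` and `x2` and the 80/20 split ratio are assumptions for illustration only.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic quarterly data (hypothetical; stands in for the real modeling table).
rng = np.random.default_rng(0)
idx = pd.date_range("2000-01-01", periods=80, freq="QS")
X = pd.DataFrame({"x1": rng.normal(size=80), "x2": rng.normal(size=80)}, index=idx)
y = ((X["x1"] + rng.normal(scale=0.5, size=80)) > 0.5).astype(int)

# 1) Split by time: the first 80% of rows train, the remaining rows test.
n_train = int(len(X) * 0.8)
X_train, X_test = X.iloc[:n_train], X.iloc[n_train:]
y_train, y_test = y.iloc[:n_train], y.iloc[n_train:]

# 2) Fit the whole pipeline (scaler + model) on train only.
clf = Pipeline([
    ("scaler", StandardScaler()),
    ("model", LogisticRegression(max_iter=5000)),
])
clf.fit(X_train, y_train)

# 3) Evaluate on test: the scaler reuses the training mean and std here.
y_prob = clf.predict_proba(X_test)[:, 1]
print(y_prob[:5])
```

Because the rows are sorted by date, slicing with `iloc` keeps every training quarter strictly before every test quarter.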

If you need different preprocessing for different columns, look into:
- `sklearn.compose.ColumnTransformer`
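
A minimal `ColumnTransformer` sketch, assuming a made-up training frame with two continuous columns and one binary flag (the column names `spread`, `unrate`, and `flag` are hypothetical). The continuous columns are standardized and the flag is passed through unchanged.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical training frame.
X_train = pd.DataFrame({
    "spread": [1.2, 0.8, -0.3, -0.5, 0.9, 1.5],
    "unrate": [4.0, 4.2, 5.0, 6.1, 5.5, 4.8],
    "flag":   [0, 0, 1, 1, 0, 0],
})
y_train = np.array([0, 0, 1, 1, 0, 0])

pre = ColumnTransformer([
    ("scale", StandardScaler(), ["spread", "unrate"]),  # standardize continuous columns
    ("keep", "passthrough", ["flag"]),                  # leave the binary flag untouched
])

clf = Pipeline([
    ("pre", pre),
    ("model", LogisticRegression(max_iter=5000)),
])
clf.fit(X_train, y_train)

# Inspect what the model actually sees after preprocessing.
X_pre = clf.named_steps["pre"].transform(X_train)
print(X_pre.shape)
```

Placing the `ColumnTransformer` inside the `Pipeline` keeps the same leakage protection: its scalers are fit only on the data passed to `clf.fit`.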


<a id="load-data"></a>
## Load data

### Goal
Load the macro quarterly modeling table and select:
- `y = target_recession_next_q`
- a minimal set of features



### Your Turn (1): Load macro_quarterly.csv (or sample)


In [None]:
import pandas as pd

path = PROCESSED_DIR / 'macro_quarterly.csv'
if path.exists():
    df = pd.read_csv(path, index_col=0, parse_dates=True)
else:
    df = pd.read_csv(SAMPLE_DIR / 'macro_quarterly_sample.csv', index_col=0, parse_dates=True)

df.head()



### Your Turn (2): Define target and a starter feature set


In [None]:
# Target (0/1)
y_col = 'target_recession_next_q'

# TODO: Pick a small feature set.
# Tip: use lagged predictors, so each feature was already available when the forecast is made.
x_cols = [
    'T10Y2Y_lag1',
    'UNRATE_lag1',
    'FEDFUNDS_lag1',
]

df_m = df[[y_col] + x_cols + ['recession']].dropna().copy()
df_m[y_col].value_counts(dropna=False)



### Checkpoint (class imbalance awareness)


In [None]:
# TODO: Compute the base rate of recession in the target.
base_rate = df_m[y_col].mean()
base_rate



<a id="define-baselines"></a>
## Define baselines

You will implement 3 baselines:
1) **Majority class**: always predict 0
2) **Persistence**: predict next recession equals current recession label
3) **Simple rule**: yield spread negative => recession (choose threshold)
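
Before filling in the cell, it can help to see how the three scores line up with the target on a toy frame. This sketch assumes the target is simply the current `recession` label shifted one quarter ahead; how `target_recession_next_q` is actually built is not shown in this notebook, and all values below are made up.

```python
import pandas as pd

# Hypothetical quarterly labels and lagged yield spread.
idx = pd.date_range("2007-01-01", periods=8, freq="QS")
toy = pd.DataFrame({
    "recession":   [0, 0, 0, 1, 1, 1, 0, 0],
    "T10Y2Y_lag1": [0.4, -0.1, -0.3, -0.2, 0.5, 1.0, 1.5, 1.8],
}, index=idx)

# Assumed target construction: next quarter's recession label.
toy["target_next_q"] = toy["recession"].shift(-1)
toy = toy.dropna().astype({"target_next_q": int})  # the last quarter has no "next" label

toy["p_majority"] = 0.0                                   # always predict "no recession"
toy["p_persist"] = toy["recession"].astype(float)         # next quarter = this quarter
toy["p_rule"] = (toy["T10Y2Y_lag1"] < 0.0).astype(float)  # inverted curve => recession

print(toy)
```

In this toy frame, persistence is wrong exactly at the two turning points (entering and leaving the recession), which is the typical failure mode of that baseline.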



### Your Turn (1): Build baseline probability scores


In [None]:
import numpy as np

y_true = df_m[y_col].astype(int).to_numpy()

# Baseline 1: majority class (always predict probability 0)
p_majority = np.zeros_like(y_true, dtype=float)

# Baseline 2: persistence (use current recession label as probability)
p_persist = df_m['recession'].astype(float).to_numpy()

# Baseline 3: simple rule on yield spread
# TODO: Choose a threshold (e.g., 0.0 means inverted curve)
thr = 0.0
p_rule = (df_m['T10Y2Y_lag1'].to_numpy() < thr).astype(float)

p_majority[:5], p_persist[:5], p_rule[:5]



<a id="evaluate-metrics"></a>
## Evaluate metrics

### Goal
Evaluate baselines with metrics that make sense for imbalanced classification:
- ROC-AUC
- PR-AUC
- Brier score
- precision/recall at a chosen threshold
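
The implementation of `classification_metrics` in `src.evaluation` is not shown in this notebook. As a rough stand-in, the metrics listed above can be computed with scikit-learn directly. `toy_metrics` is a hypothetical helper, average precision is used here as the PR-AUC summary (an assumption about what the project reports), and the labels and scores are made up.

```python
import numpy as np
from sklearn.metrics import (
    average_precision_score,
    brier_score_loss,
    precision_score,
    recall_score,
    roc_auc_score,
)


def toy_metrics(y_true, y_prob, threshold=0.5):
    """Hypothetical stand-in, not the project's classification_metrics."""
    y_hat = (np.asarray(y_prob) >= threshold).astype(int)
    return {
        "roc_auc": roc_auc_score(y_true, y_prob),            # ranking quality
        "pr_auc": average_precision_score(y_true, y_prob),   # ranking quality on the rare class
        "brier": brier_score_loss(y_true, y_prob),           # mean squared error of probabilities
        "precision": precision_score(y_true, y_hat, zero_division=0),
        "recall": recall_score(y_true, y_hat, zero_division=0),
    }


# Toy data: 2 positives out of 10.
y_toy = np.array([0, 0, 0, 0, 0, 0, 1, 0, 1, 0])
p_toy = np.array([0.1, 0.1, 0.2, 0.2, 0.3, 0.3, 0.4, 0.6, 0.7, 0.1])

m = toy_metrics(y_toy, p_toy, threshold=0.5)
print(m)
```

ROC-AUC and PR-AUC depend only on how the scores rank the quarters, the Brier score depends on the probability values themselves, and precision and recall depend on the chosen threshold.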



### Your Turn (1): Evaluate metrics


In [None]:
from src.evaluation import classification_metrics

metrics = {
    'majority': classification_metrics(y_true, p_majority, threshold=0.5),
    'persistence': classification_metrics(y_true, p_persist, threshold=0.5),
    'rule': classification_metrics(y_true, p_rule, threshold=0.5),
}

metrics



### Your Turn (2): Time-aware evaluation preview (optional)


In [None]:
# Optional: do the same baseline evaluation on a time-based train/test split.
# Why: a rule or threshold tuned on the full sample can look better than it does on later, unseen quarters.
...



<a id="checkpoint-self-check"></a>
## Checkpoint (Self-Check)
Run a few asserts and write 2-3 sentences summarizing what you verified.



In [None]:
# TODO: After you build X/y and split by time, validate the split.
# Example (adjust variable names):
# assert X_train.index.max() < X_test.index.min()
# assert y_train.index.equals(X_train.index)
# assert y_test.index.equals(X_test.index)
# assert not X_train.isna().any().any()
# assert not X_test.isna().any().any()
...



## Extensions (Optional)
- Try one additional variant beyond the main path (different features, different split, different model).
- Write down what improved, what got worse, and your hypothesis for why.



## Reflection
- What did you assume implicitly (about timing, availability, stationarity, or costs)?
- If you had to ship this model, what would you monitor?



<a id="solutions-reference"></a>
## Solutions (Reference)

Try the TODOs first. Use these only to unblock yourself or to compare approaches.

<details><summary>Solution: Load data</summary>

_One possible approach. Your variable names may differ; align them with the notebook._

```python
# Reference solution for 00_recession_classifier_baselines — Load data
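import pandas as pd
# Reads the bundled sample file and drops rows with any missing value.
df = pd.read_csv(SAMPLE_DIR / 'macro_quarterly_sample.csv', index_col=0, parse_dates=True).dropna()
# Class counts for the target (shows the imbalance).
df[['target_recession_next_q']].value_counts(dropna=False)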
```

</details>

<details><summary>Solution: Define baselines</summary>

_One possible approach. Your variable names may differ; align them with the notebook._

```python
# Reference solution for 00_recession_classifier_baselines — Define baselines
import numpy as np
from src import evaluation

y = df['target_recession_next_q'].astype(int).to_numpy()

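# Baseline 1: always predict the base rate (a constant score, like the notebook's
# always-0 baseline, but with a lower Brier score when the base rate is above 0)
p_base = np.full_like(y, y.mean(), dtype=float)
m_base = evaluation.classification_metrics(y, p_base)

# Baseline 2: predict next recession = current recession (persistence)
p_persist = df['recession'].astype(float).to_numpy()
m_persist = evaluation.classification_metrics(y, p_persist)

# Baseline 3: simple rule, inverted yield curve (spread < 0) => recession
p_rule = (df['T10Y2Y_lag1'].to_numpy() < 0.0).astype(float)
m_rule = evaluation.classification_metrics(y, p_rule)

{'base_rate': m_base, 'persistence': m_persist, 'rule': m_rule}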
```

</details>

<details><summary>Solution: Evaluate metrics</summary>

_One possible approach. Your variable names may differ; align them with the notebook._

```python
# Reference solution for 00_recession_classifier_baselines — Evaluate metrics
# The metric dictionaries computed in the "Define baselines" solution above cover this step (ROC-AUC, PR-AUC, Brier).
```

</details>

