# Stage 10b — Time Series Baseline (Features, Pipeline, Evaluation)

**Chain statement:** In the lecture, we learned how to create lag/rolling features, build pipelines, and evaluate with appropriate metrics. Now we adapt those patterns to a simple time series to produce a validated baseline.

## 1) Load & Prepare Dataset (DateTime index)

In [None]:
import pandas as pd, numpy as np
from pathlib import Path

DATA = Path("../data/raw/time_series.csv")
df = pd.read_csv(DATA, parse_dates=["date"]).sort_values("date").set_index("date")
df.head()

## 2) Engineer Features (lag/rolling/momentum/z-score)
- Avoid leakage by **shifting** every lag and rolling feature, so the features at *t* use only information up to *t-1*.
- Forecasting target: the **next-step return**, `y = ret.shift(-1)`; the final row has no target and is dropped.
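
To make the two shift directions concrete, here is a minimal sketch on four invented prices (the numbers and dates are assumptions for illustration, not the course dataset). `shift(1)` moves a past value forward into row *t*, while `shift(-1)` pulls the next value back into row *t* as the target.

```python
import pandas as pd

# Toy prices, invented purely for illustration.
toy = pd.DataFrame(
    {"price": [100.0, 110.0, 99.0, 108.9]},
    index=pd.date_range("2024-01-01", periods=4, freq="D"),
)
toy["ret"] = toy["price"].pct_change()   # return realised at t
toy["lag1_ret"] = toy["ret"].shift(1)    # return realised at t-1 (feature)
toy["y"] = toy["ret"].shift(-1)          # return realised at t+1 (target)
print(toy.round(4))
```

The last row has no `y` and the first rows have no `lag1_ret`, which is why the notebook drops NaN rows after building features.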

In [None]:
import numpy as np
# 2.1 returns
df["ret"] = df["price"].pct_change()

# 2.2 engineered features (all based on *past* info)
# lags
df["lag1_ret"] = df["ret"].shift(1)
df["lag5_ret"] = df["ret"].shift(5)

# rolling stats on returns (shifted by 1 to avoid using t info)
df["roll_mean_5"] = df["ret"].rolling(window=5).mean().shift(1)
df["roll_std_5"]  = df["ret"].rolling(window=5).std().shift(1)

# rolling min/max on price (shifted)
df["roll_min_10"] = df["price"].rolling(window=10).min().shift(1)
df["roll_max_10"] = df["price"].rolling(window=10).max().shift(1)

# momentum (10-day), computed from prices up to t-1 to match the other features
df["mom_10"] = (df["price"].shift(1) / df["price"].shift(11)) - 1

# rolling z-score of price (20-day); price, mean and std all stop at t-1
mean20 = df["price"].rolling(20).mean().shift(1)
std20  = df["price"].rolling(20).std().shift(1)
df["zscore_20"] = (df["price"].shift(1) - mean20) / std20

# 2.3 target: next-step return
df["y"] = df["ret"].shift(-1)

# drop warmup NaNs
df_feat = df.dropna().copy()
df_feat.head()

### Notes on leakage & rationale
- **Leakage control**: `.shift(1)` on lags/rolling stats ensures only past info is used for features at time *t*.
- **Feature choices**:
  - `lag1_ret`, `lag5_ret`: short-term memory of returns.
  - `roll_mean_5`, `roll_std_5`: local trend & volatility.
  - `roll_min_10`, `roll_max_10`: local bounds (support/resistance proxy).
  - `mom_10`: momentum signal.
  - `zscore_20`: relative deviation from recent mean.
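
One way to check the leakage rule mechanically is a tampering test: overwrite every price from *t* onward and confirm the feature row at *t* does not move. The sketch below is an assumption-laden illustration, not part of the original notebook: `make_features` is a hypothetical helper that rebuilds only a subset of the shifted features, and the prices are a synthetic random walk.

```python
import numpy as np
import pandas as pd

def make_features(price: pd.Series) -> pd.DataFrame:
    """Hypothetical helper: rebuild a subset of the shifted features from prices."""
    ret = price.pct_change()
    out = pd.DataFrame(index=price.index)
    out["lag1_ret"] = ret.shift(1)
    out["roll_mean_5"] = ret.rolling(5).mean().shift(1)
    out["roll_max_10"] = price.rolling(10).max().shift(1)
    return out

# Synthetic random-walk prices (an assumption; not the course dataset).
rng = np.random.default_rng(0)
idx = pd.date_range("2023-01-02", periods=60, freq="B")
price = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, 60))), index=idx)

t = 40
full = make_features(price)

# Overwrite every price from t onward; a leak-free feature row at t must not change.
tampered_price = price.copy()
tampered_price.iloc[t:] = 1e6
tampered = make_features(tampered_price)

leak_free = np.allclose(full.iloc[t], tampered.iloc[t], equal_nan=True)
print("features at t unaffected by prices from t onward:", leak_free)
```

A feature that reads the price at *t* directly would fail this check, which makes the test a cheap guard when adding new columns.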

## 3) Time-aware Split (last 25% as test)

In [None]:
n = len(df_feat)
cut = int(n * 0.75)
train = df_feat.iloc[:cut].copy()
test  = df_feat.iloc[cut:].copy()

features = ["lag1_ret","lag5_ret","roll_mean_5","roll_std_5","roll_min_10","roll_max_10","mom_10","zscore_20"]
X_train, y_train = train[features], train["y"]
X_test,  y_test  = test[features],  test["y"]

len(train), len(test), X_train.shape, X_test.shape
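
A single 75/25 cut gives one estimate of out-of-sample error. If more than one estimate is wanted, scikit-learn's `TimeSeriesSplit` produces expanding-window folds that keep the same "train on the past, validate on the future" ordering. The sketch below uses a synthetic stand-in array (`X_demo` is an assumption, not a notebook variable) just to show the fold boundaries.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Stand-in for chronologically ordered feature rows (synthetic).
X_demo = np.arange(20).reshape(-1, 1)

tscv = TimeSeriesSplit(n_splits=4)
folds = list(tscv.split(X_demo))
for i, (tr_idx, va_idx) in enumerate(folds):
    # Each validation block starts strictly after its training block ends.
    print(f"fold {i}: train 0..{tr_idx[-1]}, validate {va_idx[0]}..{va_idx[-1]}")
```

In the notebook this would typically be applied to `X_train` only, leaving the final 25% untouched as the test set.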

## 4) Build sklearn Pipeline (preprocessing → model)

In [None]:
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Ridge

pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("model", Ridge(alpha=1.0))
])
pipe
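
The reason for wrapping the scaler in a `Pipeline` is that its mean and variance are learned during `fit` on the training rows only, then reused unchanged at prediction time. The sketch below illustrates this on synthetic data (all arrays here are assumptions for the demo); the test block is deliberately shifted so the effect is visible.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
X_tr = rng.normal(loc=0.0, scale=1.0, size=(200, 3))
X_te = rng.normal(loc=5.0, scale=1.0, size=(50, 3))   # deliberately shifted
y_tr = X_tr @ np.array([0.5, -0.2, 0.1]) + rng.normal(0, 0.1, 200)

demo_pipe = Pipeline([("scaler", StandardScaler()), ("model", Ridge(alpha=1.0))])
demo_pipe.fit(X_tr, y_tr)

scaler = demo_pipe.named_steps["scaler"]
uses_train_stats = np.allclose(scaler.mean_, X_tr.mean(axis=0))
print("scaler mean equals training mean:", uses_train_stats)

# Test rows are transformed with the training statistics, so their scaled mean stays far from 0.
scaled_test_mean = scaler.transform(X_te).mean(axis=0)
print("scaled test mean:", scaled_test_mean.round(2))
```

Fitting the scaler on the full dataset before splitting would instead let test-period statistics influence the training features.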

## 5) Fit → Predict → Evaluate (MAE/RMSE) + Plot prediction vs truth

In [None]:
from sklearn.metrics import mean_absolute_error, mean_squared_error
import numpy as np
import matplotlib.pyplot as plt

pipe.fit(X_train, y_train)
pred_train = pipe.predict(X_train)
pred_test  = pipe.predict(X_test)

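# RMSE via np.sqrt: the `squared=False` argument was deprecated in scikit-learn 1.4
# and removed in later releases, so this form runs on both old and new versions.
mae_train = mean_absolute_error(y_train, pred_train)
rmse_train = np.sqrt(mean_squared_error(y_train, pred_train))
mae_test = mean_absolute_error(y_test, pred_test)
rmse_test = np.sqrt(mean_squared_error(y_test, pred_test))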

print("MAE  (train, test):", round(mae_train,6), round(mae_test,6))
print("RMSE (train, test):", round(rmse_train,6), round(rmse_test,6))

# Plot prediction vs truth on test
plt.figure(figsize=(10,4))
plt.plot(y_test.index, y_test.values, label="truth")
plt.plot(y_test.index, pred_test, label="prediction")
plt.title("Next-step return: prediction vs truth (test)")
plt.legend()
plt.tight_layout()
plt.show()
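
MAE and RMSE on returns are hard to judge in isolation, so it can help to score a couple of no-model forecasts on the same test window. The sketch below is one possible way to do that; `naive_scores` is a hypothetical helper and the returns are synthetic, so the printed numbers say nothing about the course dataset.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

def naive_scores(y_tr, y_te):
    """Test-set MAE/RMSE for two no-model reference forecasts."""
    y_tr = np.asarray(y_tr, dtype=float)
    y_te = np.asarray(y_te, dtype=float)
    candidates = {
        "zero": np.zeros_like(y_te),                     # always predict a 0 return
        "train_mean": np.full_like(y_te, y_tr.mean()),   # always predict the training mean
    }
    return {
        name: {
            "mae": mean_absolute_error(y_te, pred),
            "rmse": float(np.sqrt(mean_squared_error(y_te, pred))),
        }
        for name, pred in candidates.items()
    }

# Synthetic returns for a stand-alone run; in the notebook, pass y_train and y_test.
rng = np.random.default_rng(2)
y_tr_demo, y_te_demo = rng.normal(0, 0.01, 300), rng.normal(0, 0.01, 100)
scores = naive_scores(y_tr_demo, y_te_demo)
print(scores)
```

If the Ridge pipeline does not beat these references on the test window, the engineered features are probably not adding usable signal.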

## 6) (Optional) Classification baseline
If you prefer classification: `y_up = (ret.shift(-1) > 0).astype(int)` then train a classifier.
Below is a compact example with `LogisticRegression` and metrics.

In [None]:
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, ConfusionMatrixDisplay

cls_train = (train["y"] > 0).astype(int)
cls_test  = (test["y"] > 0).astype(int)

clf = Pipeline([("scaler", StandardScaler()),
                ("clf", LogisticRegression(max_iter=1000))])
clf.fit(X_train, cls_train)
pred_cls = clf.predict(X_test)

acc = accuracy_score(cls_test, pred_cls)
prec = precision_score(cls_test, pred_cls, zero_division=0)
rec = recall_score(cls_test, pred_cls, zero_division=0)
f1 = f1_score(cls_test, pred_cls, zero_division=0)

print({"accuracy":acc, "precision":prec, "recall":rec, "f1":f1})

import matplotlib.pyplot as plt
disp = ConfusionMatrixDisplay.from_predictions(cls_test, pred_cls)
plt.title("Confusion Matrix (test)")
plt.tight_layout()
plt.show()
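
Accuracy on up/down labels depends heavily on class balance, so a majority-class reference is a useful companion to the metrics above. The sketch below uses scikit-learn's `DummyClassifier` on synthetic labels (the roughly 55% "up" rate is an assumption for the demo); in the notebook, `cls_train` and `cls_test` would take their place.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(3)
# Synthetic up/down labels with roughly 55% "up" (an assumption for the demo).
cls_tr_demo = (rng.random(300) < 0.55).astype(int)
cls_te_demo = (rng.random(100) < 0.55).astype(int)

# This strategy ignores the features, so a placeholder column is enough.
dummy = DummyClassifier(strategy="most_frequent")
dummy.fit(np.zeros((len(cls_tr_demo), 1)), cls_tr_demo)
dummy_pred = dummy.predict(np.zeros((len(cls_te_demo), 1)))

majority_acc = accuracy_score(cls_te_demo, dummy_pred)
print("majority-class accuracy:", majority_acc)
```

A logistic-regression accuracy close to this figure suggests the classifier is mostly reproducing the base rate.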

## 7) Interpretation — what works, what fails, assumptions
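- **Works**: Lag/rolling features provide a modest signal; scaling + Ridge yields a stable baseline.
- **Fails**: Short-horizon predictability is weak in high-noise series; abrupt volatility shifts degrade performance.
- **Assumptions risk**: Stationarity and weak-dependence assumptions may not hold; under concept drift, fixed-window features stop working; avoid leakage by keeping windows and shifts strict.
- **Next steps**: Tune hyperparameters (alpha, window lengths), add features (seasonality, holidays), and try more robust losses or non-linear models, while keeping the interpretable baseline as the reference.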
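
As a starting point for the alpha tuning listed under next steps, the sketch below combines `GridSearchCV` with `TimeSeriesSplit` so that every validation fold lies after its training fold. The data is synthetic and the alpha grid is a hypothetical choice; in the notebook, `X_train` and `y_train` would replace `X_demo` and `y_demo`.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(4)
X_demo = rng.normal(size=(240, 4))
y_demo = 0.3 * X_demo[:, 0] + rng.normal(0, 1.0, 240)

alpha_grid = [0.01, 0.1, 1.0, 10.0, 100.0]  # hypothetical grid
search = GridSearchCV(
    estimator=Pipeline([("scaler", StandardScaler()), ("model", Ridge())]),
    param_grid={"model__alpha": alpha_grid},
    cv=TimeSeriesSplit(n_splits=5),          # folds respect time order
    scoring="neg_mean_absolute_error",
)
search.fit(X_demo, y_demo)

print("best alpha:", search.best_params_["model__alpha"])
print("CV MAE at best alpha:", -search.best_score_)
```

The held-out test set should stay out of this search and be scored once, after the alpha is fixed.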