[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ababber/pyhou-02-17-2026/blob/main/part-1-classical-ml/research-notebook.ipynb)

> **What You'll Need**
> - **To run demos locally:** Python 3.9+, numpy, scikit-learn, plotly ([see requirements](../requirements.txt))
> - **To run demos in Colab:** Nothing — click the badge above, all deps pre-installed
> - **To run the trading strategy:** Free [QuantConnect](https://www.quantconnect.com/) account (no credit card required)
> - **Time:** ~30 min to read, ~5 min to run backtest on QC



# Part 1: Classical ML — Ridge Regression

**Quantitative Trading: A First Look**

This notebook explores whether a classical machine learning model — ridge regression from 1970 — can generate alpha in futures markets. Spoiler: it can't. But understanding *why* it fails is essential groundwork for everything that follows.

**What you'll learn:**
- Why ridge regression handles correlated features better than ordinary least squares
- How inverse volatility weighting allocates capital based on predicted risk
- How to interpret backtest metrics (Sharpe, alpha, beta, drawdown)
- Why linear models struggle with financial prediction

**Prerequisites:**
- Basic Python (numpy, pandas)
- Familiarity with linear regression
- No QuantConnect experience required (code is explained step-by-step)

---

**Navigation:** [The Strategy](#2-the-strategy) | [The Math](#3-the-math) | [Implementation](#4-implementation) | [Results](#5-results) | [Analysis](#6-analysis) | [References](#7-references)

## 1. Introduction

### The Question

Can machine learning predict financial markets?

This series tests three generations of ML on the same backtesting platform:

| Generation | Model | Year | Complexity |
|------------|-------|------|------------|
| Classical | Ridge Regression | 1970 | Linear, 3 features |
| Deep Learning | Temporal CNN | 1989 | Nonlinear, learned features |
| Foundation | Amazon Chronos | 2024 | Pre-trained, zero-shot |

Same platform. Same time period. Increasing complexity. Let's see if sophistication translates to performance.

### Why Start with Ridge Regression?

**Interpretability matters in finance.** You want to know *which* features drive predictions and by how much. Linear models give you that directly through coefficients.

**The problem:** When input features are correlated — and in finance they almost always are — ordinary least squares (OLS) becomes unstable. Small changes in data can swing coefficients wildly, even flip their signs.

**The solution:** Ridge regression adds an L2 penalty that shrinks all coefficients toward zero. This stabilizes the model without discarding any features.

> **Key Insight:** Ridge regression is OLS with a constraint: fit the data, but don't let any coefficient get too large.

This makes ridge a sensible baseline. If a simple, interpretable model can find signal, we've learned something valuable. If it can't, we know linear relationships aren't enough.

## 2. The Strategy

### Universe: 12 Futures Across 3 Sectors

| Sector | Contracts | Why? |
|--------|-----------|------|
| **Indices** | VIX, S&P 500 E-Mini, Nasdaq 100 E-Mini, DOW 30 E-Mini | Equity market exposure |
| **Energy** | Brent Crude, Gasoline, Heating Oil, Natural Gas | Commodity cycle exposure |
| **Grains** | Corn, Oats, Soybeans, Wheat | Agricultural/weather exposure |

**Why these three sectors?** Diversification. Stock indices, energy, and grains respond to different economic forces. They don't move in lockstep. That matters because we're betting on *relative* volatility — we want some contracts to be calm while others aren't.

### Features: Three Volatility Signals

For each contract, we compute three daily features:

| Feature | Definition | What it captures |
|---------|------------|------------------|
| **Closing Price Volatility** | 3-month standard deviation of daily returns | How much has the price been bouncing around? |
| **Average True Range (ATR)** | 3-month average of daily true range | Intraday swings, including gaps |
| **Open Interest** | Number of contracts actively held | New money flowing in (trend confirmation) |

**True Range** is the maximum of:
- High − Low (intraday range)
- |High − Previous Close| (gap up)
- |Low − Previous Close| (gap down)

Together, these three features give the model a picture of each contract's recent behavior from different angles.
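The true range definition above can be sketched in a few lines of numpy. This is a minimal illustration, not the platform's indicator: the function names `true_range` and `simple_atr` are hypothetical, and the plain moving average is an assumption (indicator libraries often use a smoothed average instead).

```python
import numpy as np

def true_range(high, low, close):
    """Per-bar true range. The first bar has no previous close,
    so it falls back to the intraday range (high - low)."""
    high, low, close = (np.asarray(a, dtype=float) for a in (high, low, close))
    tr = high - low
    prev_close = close[:-1]
    tr[1:] = np.maximum.reduce([
        high[1:] - low[1:],             # intraday range
        np.abs(high[1:] - prev_close),  # gap up
        np.abs(low[1:] - prev_close),   # gap down
    ])
    return tr

def simple_atr(high, low, close, period):
    """Simple moving average of true range over `period` bars."""
    tr = true_range(high, low, close)
    return np.convolve(tr, np.ones(period) / period, mode="valid")

# Toy bars: the second and third bars gap away from the previous close
high = [10.0, 12.0, 11.0]
low = [9.0, 11.5, 9.5]
close = [9.5, 11.8, 10.0]

tr = true_range(high, low, close)
atr = simple_atr(high, low, close, period=2)
print(tr, atr)
```

On the second bar the intraday range is only 0.5, but the gap from the previous close pushes the true range to 2.5, which is exactly the case the gap terms exist to capture.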

### Allocation: Inverse Volatility Weighting

The core idea: **contracts we expect to be LEAST volatile get the MOST capital.**

This is inverse volatility weighting, a risk-parity-adjacent technique. The intuition:
- If an asset is expected to be twice as volatile, it gets half the weight
- You're equalizing *risk exposure* across the portfolio, not just spreading dollars evenly
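The intuition above can be checked numerically. The sketch below uses a hypothetical helper, `inverse_vol_weights`, and made-up volatilities. It normalizes the weights to sum to 1 and ignores correlations between assets, which is a simplification of full risk parity.

```python
import numpy as np

def inverse_vol_weights(predicted_vol):
    """Weights proportional to 1/volatility, normalized to sum to 1."""
    vol = np.asarray(predicted_vol, dtype=float)
    if np.any(vol <= 0):
        raise ValueError("volatilities must be positive")
    inv = 1.0 / vol
    return inv / inv.sum()

# Hypothetical predicted volatilities: each asset is twice as volatile as the last
vols = np.array([0.01, 0.02, 0.04])
weights = inverse_vol_weights(vols)
risk_contribution = weights * vols  # weight times volatility, per asset

print(weights)
print(risk_contribution)
```

Each asset ends up with the same weight-times-volatility product, which is what "equalizing risk exposure" means in this simplified setting.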

**Schedule:**
- **Train:** Rolling 365-day window (markets change; stale data dilutes signal)
- **Predict:** Next-week opening price volatility
- **Rebalance:** Every Monday at market open
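
The schedule above amounts to a walk-forward loop: refit on a trailing window, predict, step forward a week. The sketch below runs that loop locally on synthetic data. Everything in it is an assumption made for illustration (the random features, the coefficients, the 20-row minimum), and unlike the real strategy it treats each row's target as already known by the next day.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
dates = pd.date_range("2022-01-03", periods=500, freq="D")  # starts on a Monday

# Synthetic stand-ins for the three features and the volatility target
features = pd.DataFrame(
    rng.normal(size=(500, 3)), index=dates,
    columns=["std", "atr", "open_interest"],
)
target = pd.Series(
    features.to_numpy() @ np.array([0.5, 0.3, 0.1])
    + rng.normal(scale=0.1, size=500),
    index=dates,
)

window = pd.Timedelta(days=365)
predictions = {}
for today in dates[dates.dayofweek == 0]:  # every Monday
    # Train only on rows strictly before today, within the trailing window
    train_mask = (features.index >= today - window) & (features.index < today)
    if train_mask.sum() < 20:              # not enough history yet
        continue
    model = Ridge(alpha=1.0).fit(features.loc[train_mask], target.loc[train_mask])
    predictions[today] = float(model.predict(features.loc[[today]])[0])

print(len(predictions), "weekly predictions")
```

The strict `< today` bound is the important design choice: it keeps the row being predicted out of its own training set.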

> **Key Insight:** This is risk management by prediction. Instead of equal-weighting contracts, we weight by inverse predicted risk.

## 3. The Math

Two equations drive this strategy: the **cost function** (what the model minimizes) and the **allocation formula** (how we size positions).

### Ridge Regression Cost Function

$$J(\theta) = \text{MSE}(\theta) + \frac{\alpha}{m} \sum_i \theta_i^2$$

| Term | Meaning |
|------|---------|
| MSE(θ) | Mean squared error — how far off are predictions? |
| (α/m) Σᵢ θᵢ² | L2 penalty — penalize large coefficients |
| α | Regularization strength (dial between fit and shrinkage) |
| m | Number of training samples |

**Behavior at extremes:**
- α = 0 → plain OLS (ordinary least squares)
- α ↑ → coefficients shrink toward zero
- α → ∞ → all coefficients approach 0 (with an intercept, the model predicts the mean of y)
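
These limits are easy to verify with scikit-learn. Multiplying the cost function by m gives the sum of squared errors plus α Σᵢ θᵢ², which is the objective scikit-learn's `Ridge` documents, so the same α applies. The data and the α grid below are made up for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + 4.0 + rng.normal(scale=0.1, size=200)

ols = LinearRegression().fit(X, y)
alphas = [1e-8, 1.0, 100.0, 1e8]
fits = {alpha: Ridge(alpha=alpha).fit(X, y) for alpha in alphas}

# Coefficient norm shrinks as alpha grows
norms = [np.linalg.norm(fits[alpha].coef_) for alpha in alphas]
for alpha, norm in zip(alphas, norms):
    print(f"alpha={alpha:g}  ||coef||={norm:.6f}")

print("OLS coefficients:       ", ols.coef_)
print("Ridge, alpha near zero: ", fits[1e-8].coef_)
print("Intercept at huge alpha:", fits[1e8].intercept_, "vs mean(y):", y.mean())
```

Note that the intercept is not penalized, which is why the large-α model falls back to the mean of y rather than to zero.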

### Demo: Why Ridge Beats OLS on Correlated Features

Run this cell to see how OLS coefficients become unstable when features are correlated, and how ridge stabilizes them.

In [None]:
# RUNNABLE DEMO: Ridge vs OLS on correlated features
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

np.random.seed(42)

# Create correlated features (like volatility and ATR)
n_samples = 100
x1 = np.random.randn(n_samples)
x2 = x1 + np.random.randn(n_samples) * 0.1  # x2 ≈ x1 (highly correlated)
X = np.column_stack([x1, x2])
y = 2 * x1 + 3 * x2 + np.random.randn(n_samples) * 0.5  # True: β1=2, β2=3

# Fit OLS and Ridge
ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=1.0).fit(X, y)

print("True coefficients:      β1=2.0, β2=3.0")
print(f"OLS coefficients:       β1={ols.coef_[0]:.2f}, β2={ols.coef_[1]:.2f}")
print(f"Ridge coefficients:     β1={ridge.coef_[0]:.2f}, β2={ridge.coef_[1]:.2f}")
print(f"\nCorrelation(x1, x2):    {np.corrcoef(x1, x2)[0,1]:.3f}")
print("\n→ OLS coefficients are high-variance under collinearity (they can land far from the true values)")
print("→ Ridge coefficients are biased but stable (pulled closer to each other)")
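
A single fit cannot show instability; that takes repeated draws. The sketch below reruns the same data-generating setup many times and compares how much each estimator's coefficients move between draws. The trial count and seed are arbitrary choices.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)
n_samples, n_trials = 100, 200
ols_coefs, ridge_coefs = [], []

for _ in range(n_trials):
    # Same setup as the demo: x2 is x1 plus a little noise
    x1 = rng.normal(size=n_samples)
    x2 = x1 + 0.1 * rng.normal(size=n_samples)
    X = np.column_stack([x1, x2])
    y = 2 * x1 + 3 * x2 + 0.5 * rng.normal(size=n_samples)
    ols_coefs.append(LinearRegression().fit(X, y).coef_)
    ridge_coefs.append(Ridge(alpha=1.0).fit(X, y).coef_)

# Spread of each coefficient across the repeated draws
ols_std = np.std(ols_coefs, axis=0)
ridge_std = np.std(ridge_coefs, axis=0)
ols_sum_mean = np.mean(np.sum(ols_coefs, axis=1))

print("OLS   coefficient std across trials:", ols_std)
print("Ridge coefficient std across trials:", ridge_std)
print("Mean of OLS β1 + β2:", ols_sum_mean)
```

OLS pins down the sum β1 + β2 well but not the split between the two, because the data barely distinguish x1 from x2. Ridge trades some bias for a much tighter spread.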

### Allocation Formula

$$w_i = \frac{C}{\sigma_i \times \sum_j(1/\sigma_j) \times \text{multiplier}_i}$$

| Term | Meaning |
|------|---------|
| σᵢ | Predicted volatility for contract i (inverse relationship) |
| Σⱼ(1/σⱼ) | Normalization: turns each 1/σᵢ into a share of total inverse volatility |
| multiplierᵢ | Contract size adjustment (S&P E-Mini ≠ corn) |
| C = 3 | Position scaling constant (tuned to avoid margin calls) |

**Why the multiplier matters:** A single S&P E-Mini contract controls ~$250K. A corn contract controls ~$25K. Without the multiplier, equal *weights* would mean wildly different *dollar exposures*.
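
A worked instance of the formula, as a sketch. The predicted volatilities and the corn contract size used as a multiplier are illustrative assumptions; on QuantConnect the multiplier comes from each security's symbol properties.

```python
def allocation_weights(predicted_vol, multipliers, c=3.0):
    """w_i = C / (sigma_i * sum_j(1/sigma_j) * multiplier_i)."""
    inv_vol_sum = sum(1.0 / vol for vol in predicted_vol.values())
    return {
        name: c / (vol * inv_vol_sum * multipliers[name])
        for name, vol in predicted_vol.items()
    }

# Hypothetical inputs: corn is predicted to be twice as volatile as the E-Mini
predicted_vol = {"ES": 0.01, "CORN": 0.02}
multipliers = {"ES": 50.0, "CORN": 5000.0}

weights = allocation_weights(predicted_vol, multipliers)
for name, w in weights.items():
    # weight * multiplier * volatility is the same for every contract
    print(name, w, w * multipliers[name] * predicted_vol[name])
```

The product of weight, multiplier, and predicted volatility comes out identical for both contracts, which is the sense in which the formula equalizes risk after adjusting for contract size.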

## 4. Implementation

This section walks through the QuantConnect algorithm piece by piece. The code runs on the QuantConnect platform — you can't execute it locally, but you can copy it to QC to run backtests.

> **Note:** Code cells marked `# QUANTCONNECT` are view-only. Copy them to [QuantConnect](https://www.quantconnect.com/) to execute.
>
> **QC Free Tier:** This Part 1 strategy runs entirely on the free tier. No credit card or subscription required. Futures backtesting is free; only live trading requires a paid plan.

### Step 1: Universe Setup

Define the 12 futures contracts and configure the algorithm.

In [None]:
# QUANTCONNECT — Universe Setup (view-only)

from AlgorithmImports import *
from sklearn.linear_model import Ridge

class InverseVolatilityRankAlgorithm(QCAlgorithm):
    
    def initialize(self):
        # Backtest period
        self.set_start_date(2018, 12, 31)
        self.set_end_date(2024, 4, 1)
        self.set_cash(100_000_000)  # $100M starting capital
        
        # Model parameters
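        self._std_period = 3 * 26          # 3 months, counted as 26 daily bars per month
        self._atr_period = 3 * 26          # Same 3-month lookback for ATR
        self._training_set_duration = timedelta(365)  # Rolling 1-year window
        self._future_std_period = 6        # Label horizon: 6 daily bars, treated as 1 week
        
        # Contracts currently tracked; filled in on_securities_changed()
        self._contracts = []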
        
        # Define the 12 futures universe
        tickers = [
            # Indices
            Futures.Indices.VIX, 
            Futures.Indices.SP_500_E_MINI,
            Futures.Indices.NASDAQ_100_E_MINI,
            Futures.Indices.DOW_30_E_MINI,
            # Energy
            Futures.Energy.BRENT_CRUDE,
            Futures.Energy.GASOLINE,
            Futures.Energy.HEATING_OIL,
            Futures.Energy.NATURAL_GAS,
            # Grains
            Futures.Grains.CORN,
            Futures.Grains.OATS,
            Futures.Grains.SOYBEANS,
            Futures.Grains.WHEAT
        ]
        
        # Add each future, trading front-month contracts only
        for ticker in tickers:
            future = self.add_future(ticker, extended_market_hours=True)
            future.set_filter(lambda universe: universe.front_month())

### Step 2: Feature Engineering

For each contract, we track three features: standard deviation (volatility), ATR, and open interest. QuantConnect provides built-in indicator classes for the first two; open interest is pulled with a history request in Step 3.

In [None]:
# QUANTCONNECT — Feature Engineering (view-only)

def on_securities_changed(self, changes):
    """Called when futures contracts roll or are added/removed."""
    
    for security in changes.added_securities:
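        # Skip the canonical symbol; only tradable contracts are tracked
        # (same check as the full algorithm in Step 5)
        if security.symbol.is_canonical():
            continue
        
        # Create indicators for this contract and keep a reference on the security.
        # Note: this simplified view tracks the std of closing prices; the full
        # algorithm in Step 5 uses the std of daily close returns instead.
        # 1. Standard deviation (closing price volatility)
        security.std = self.std(
            security.symbol, 
            self._std_period, 
            Resolution.DAILY
        )
        
        # 2. Average True Range (intraday volatility)
        # Pass resolution by keyword: the third positional argument of atr()
        # is the moving-average type, not the resolution.
        security.atr = self.atr(
            security.symbol, 
            self._atr_period, 
            resolution=Resolution.DAILY
        )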
        
        # Store history for training
        security.indicator_history = pd.DataFrame()
        security.label_history = pd.Series(dtype=float)
        
        # Track this contract
        self._contracts.append(security)

### Step 3: Model Training & Prediction

Every Monday, we train a ridge model for each contract and predict next-week volatility.
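
The training label is forward-looking: the features on a given day are paired with the volatility of opening-price returns over the bars that follow. A pandas sketch of that alignment is below. The helper name is hypothetical, and the sample standard deviation (pandas' default) is an assumption that may not match the platform indicator's normalization.

```python
import pandas as pd

def forward_volatility_labels(open_prices: pd.Series, horizon: int = 6) -> pd.Series:
    """Label for row t = std of open-to-open returns over rows t+1 .. t+horizon."""
    open_returns = open_prices.pct_change()
    trailing_std = open_returns.rolling(horizon).std()
    # Shift back so each row holds the volatility that came AFTER it
    return trailing_std.shift(-horizon)

# Made-up daily opening prices
opens = pd.Series([100, 101, 99, 102, 103, 101, 104, 105, 103, 106], dtype=float)
labels = forward_volatility_labels(opens, horizon=6)
print(labels)
```

The last `horizon` rows have no label yet because their future has not happened. Those rows can be used for prediction but not for training.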

In [None]:
# QUANTCONNECT — Model Training (view-only)

def _trade(self):
    """Called every Monday — train models and rebalance."""
    
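    # Get open interest for all contracts
    open_interest = self.history(
        OpenInterest, 
        [c.symbol for c in self._contracts], 
        self._training_set_duration, 
        fill_forward=False
    )
    # Drop the outermost index level (as the full algorithm in Step 5 does)
    # so rows can be selected by contract symbol
    open_interest.index = open_interest.index.droplevel(0)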
    
    expected_volatility = {}
    
    for security in self._contracts:
        symbol = security.symbol
        
        # Combine features: [volatility, ATR, open_interest]
        factors = pd.concat([
            security.indicator_history, 
            open_interest.loc[symbol]
        ], axis=1).ffill().dropna()
        
        # Labels: actual future volatility (what we're predicting)
        labels = security.label_history
        
        # Align features and labels
        idx = sorted(set(factors.index) & set(labels.index))
        if len(idx) < 20:  # Need minimum training samples
            continue
        
        # Train ridge regression
        model = Ridge()  # α=1.0 default
        model.fit(factors.loc[idx].values, labels.loc[idx].values)
        
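        # Predict next-week volatility; keep only positive predictions,
        # since a non-positive value would break the 1/volatility weighting
        latest_factors = factors.iloc[-1:].values
        prediction = model.predict(latest_factors)[0]
        if prediction > 0:
            expected_volatility[symbol] = prediction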

### Step 4: Position Sizing (Inverse Volatility)

Convert predictions to portfolio weights. Less volatile → higher weight.

In [None]:
# QUANTCONNECT — Position Sizing (view-only)

def _calculate_weights(self, expected_volatility):
    """Convert volatility predictions to inverse-weighted positions."""
    
    C = 3  # Position scaling constant
    
    # Sum of inverse volatilities (for normalization)
    inv_vol_sum = sum(1/vol for vol in expected_volatility.values())
    
    weights = {}
    for symbol, vol in expected_volatility.items():
        # Get contract multiplier (notional value per contract)
        multiplier = self.securities[symbol].symbol_properties.contract_multiplier
        
        # Inverse volatility weight
        # Less volatile → higher weight (inverse relationship)
        weights[symbol] = C / (vol * inv_vol_sum * multiplier)
    
    return weights

def _rebalance(self, weights):
    """Execute trades to match target weights."""
    
    # Liquidate contracts no longer in universe
    for holding in self.portfolio.values():
        if holding.symbol not in weights and holding.invested:
            self.liquidate(holding.symbol)
    
    # Set new positions
    for symbol, weight in weights.items():
        self.set_holdings(symbol, weight)

### Step 5: Full Algorithm

Copy this complete algorithm to QuantConnect to run a backtest. Create a new algorithm, paste this code into `main.py`, and click "Backtest".

<details>
<summary><strong>Click to expand full algorithm</strong></summary>

In [None]:
# QUANTCONNECT — Full Algorithm (copy this to QuantConnect)
# File: main.py

from AlgorithmImports import *
from sklearn.linear_model import Ridge


class InverseVolatilityRankAlgorithm(QCAlgorithm):
    """
    Inverse volatility weighting on 12 futures using ridge regression.
    Predicts next-week volatility, allocates more capital to less volatile contracts.
    """

    def initialize(self):
        self.set_start_date(2018, 12, 31)
        self.set_end_date(2024, 4, 1)
        self.set_cash(100_000_000)

        self._std_period = self.get_parameter('std_months', 3) * 26
        self._atr_period = self.get_parameter('atr_months', 3) * 26
        self._training_set_duration = timedelta(
            self.get_parameter('training_set_duration', 365)
        )
        self._future_std_period = 6

        self._contracts = []
        tickers = [
            Futures.Indices.VIX, 
            Futures.Indices.SP_500_E_MINI,
            Futures.Indices.NASDAQ_100_E_MINI,
            Futures.Indices.DOW_30_E_MINI,
            Futures.Energy.BRENT_CRUDE,
            Futures.Energy.GASOLINE,
            Futures.Energy.HEATING_OIL,
            Futures.Energy.NATURAL_GAS,
            Futures.Grains.CORN,
            Futures.Grains.OATS,
            Futures.Grains.SOYBEANS,
            Futures.Grains.WHEAT
        ]
        for ticker in tickers:
            future = self.add_future(ticker, extended_market_hours=True)
            future.set_filter(lambda universe: universe.front_month())
        
        schedule_symbol = Symbol.create("SPY", SecurityType.EQUITY, Market.USA)
        self.schedule.on(
            self.date_rules.week_start(schedule_symbol),
            self.time_rules.after_market_open(schedule_symbol, 1), 
            self._trade
        )

    def _trade(self):
        open_interest = self.history(
            OpenInterest, [c.symbol for c in self._contracts], 
            self._training_set_duration, fill_forward=False
        )
        open_interest.index = open_interest.index.droplevel(0)

        expected_volatility_by_security = {}
        for security in self._contracts:
            symbol = security.symbol
            if symbol not in open_interest.index:
                continue
            factors = pd.concat(
                [security.indicator_history, open_interest.loc[symbol]], 
                axis=1
            ).ffill().loc[security.indicator_history.index].dropna()
            if factors.empty:
                continue 
            label = security.label_history
            idx = sorted(list(set(factors.index).intersection(set(label.index))))
            if len(idx) < 20:
                continue

            model = Ridge()
            model.fit(factors.loc[idx].values, label.loc[idx].values)
            prediction = model.predict([factors.iloc[-1].values])[0] 
            if prediction > 0:
                expected_volatility_by_security[security] = prediction

        portfolio_targets = []
        std_sum = sum([1/v for v in expected_volatility_by_security.values()])
        for security, expected_vol in expected_volatility_by_security.items():
            weight = 3 / expected_vol / std_sum / security.symbol_properties.contract_multiplier
            portfolio_targets.append(PortfolioTarget(security.symbol, weight))
        self.set_holdings(portfolio_targets, True)

    def on_securities_changed(self, changes):
        for security in changes.added_securities:
            if security.symbol.is_canonical(): 
                continue
            security.close_roc = RateOfChange(1)
            security.std_of_close_returns = IndicatorExtensions.of(
                StandardDeviation(self._std_period), security.close_roc
            )
            security.atr = AverageTrueRange(self._atr_period)
            security.open_roc = RateOfChange(1)
            security.std_of_open_returns = IndicatorExtensions.of(
                StandardDeviation(self._future_std_period), security.open_roc
            )
            security.indicator_history = pd.DataFrame()
            security.label_history = pd.Series(dtype=float)
            security.consolidator = self.consolidate(
                security.symbol, Resolution.DAILY, self._consolidation_handler
            )
            warm_up_length = (
                max(self._std_period + 1, self._atr_period) 
                + self._training_set_duration.days
            )
            bars = self.history[TradeBar](security.symbol, warm_up_length, Resolution.DAILY)
            for bar in bars:
                security.consolidator.update(bar)
            self._contracts.append(security)

        for security in changes.removed_securities:
            self.subscription_manager.remove_consolidator(security.symbol, security.consolidator)
            security.close_roc.reset()
            security.std_of_close_returns.reset()
            security.atr.reset()
            security.open_roc.reset()
            security.std_of_open_returns.reset()
            if security in self._contracts:
                self._contracts.remove(security)

    def _consolidation_handler(self, consolidated_bar):
        security = self.securities[consolidated_bar.symbol]
        t = consolidated_bar.end_time
        if security.atr.update(consolidated_bar):
            security.indicator_history.loc[t, 'atr'] = security.atr.current.value
        security.close_roc.update(t, consolidated_bar.close)
        if security.std_of_close_returns.is_ready:
            security.indicator_history.loc[t, 'std_of_close_returns'] = \
                security.std_of_close_returns.current.value
        security.open_roc.update(t, consolidated_bar.open)
        if (security.std_of_open_returns.is_ready and 
            len(security.indicator_history.index) > self._future_std_period):
            security.label_history.loc[
                security.indicator_history.index[-self._future_std_period - 1]
            ] = security.std_of_open_returns.current.value
        security.indicator_history = security.indicator_history[
            security.indicator_history.index >= self.time - self._training_set_duration
        ]
        security.label_history = security.label_history[
            security.label_history.index >= self.time - self._training_set_duration
        ]

</details>

## 5. Results

**Backtest Period:** 2018-12-31 to 2024-04-01  
**Starting Capital:** $100,000,000

### Key Metrics

| Metric | Value | Interpretation |
|--------|-------|----------------|
| **Sharpe Ratio** | 0.212 | Poor — only 0.2 units of return per unit of risk |
| **CAGR** | 5.85% | Below market — S&P returned ~16%/year |
| **Net Profit** | +34.8% | Looks OK until you compare to benchmark |
| **Alpha** | -0.062 | **Negative** — model destroys value |
| **Beta** | 1.146 | Amplifies market moves by ~15% |
| **Max Drawdown** | 54.7% | Lost more than half at worst point |

> **Key Takeaway:** Alpha = -0.062 means that after removing market exposure, the strategy *loses* money. The model isn't finding any signal the market doesn't already price in.
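
Two of the metrics in the table can be recomputed by hand. The sketch below uses hypothetical helper names and a year length of 365.25 days, which is an assumption about the convention. It recovers a CAGR of roughly 5.85% from the reported net profit and backtest dates, and shows how maximum drawdown is measured on a toy equity series.

```python
from datetime import date

import numpy as np

def cagr(total_return, start, end):
    """Compound annual growth rate from a total return over [start, end]."""
    years = (end - start).days / 365.25
    return (1.0 + total_return) ** (1.0 / years) - 1.0

def max_drawdown(equity):
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    equity = np.asarray(equity, dtype=float)
    running_peak = np.maximum.accumulate(equity)
    return float(np.max(1.0 - equity / running_peak))

# Reported net profit (+34.8%) over the backtest period
annual_growth = cagr(0.348, date(2018, 12, 31), date(2024, 4, 1))
print(f"CAGR: {annual_growth:.2%}")

# Toy equity series: peak at 120, trough at 60
toy_drawdown = max_drawdown([100, 120, 60, 90])
print(f"Max drawdown: {toy_drawdown:.0%}")
```

Drawdown is measured from the running peak, not from the starting value, so a strategy can show a positive total return and a drawdown above 50% at the same time.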

### Equity Curve

Run this cell to see the cumulative return over the backtest period.

In [None]:
# RUNNABLE: Equity Curve Visualization
import plotly.graph_objects as go
from datetime import datetime

# Actual backtest results from QuantConnect
timestamps = [1546232400, 1546675197, 1547117994, 1547560791, 1548003588, 1548446385, 1548889182, 1549331979, 1549774776, 1550217573, 1550660370, 1551103167, 1551545964, 1551988761, 1552431558, 1552874355, 1553317152, 1553759950, 1554202747, 1554645544, 1555088341, 1555531138, 1555973935, 1556416732, 1556859529, 1557302326, 1557745123, 1558187920, 1558630717, 1559073514, 1559516311, 1559959108, 1560401905, 1560844702, 1561287500, 1561730297, 1562173094, 1562615891, 1563058688, 1563501485, 1563944282, 1564387079, 1564829876, 1565272673, 1565715470, 1566158267, 1566601064, 1567043861, 1567486658, 1567929455, 1568372252, 1568815050, 1569257847, 1569700644, 1570143441, 1570586238, 1571029035, 1571471832, 1571914629, 1572357426, 1572800223, 1573243020, 1573685817, 1574128614, 1574571411, 1575014208, 1575457005, 1575899802, 1576342600, 1576785397, 1577228194, 1577670991, 1578113788, 1578556585, 1578999382, 1579442179, 1579884976, 1580327773, 1580770570, 1581213367, 1581656164, 1582098961, 1582541758, 1582984555, 1583427352, 1583870150, 1584312947, 1584755744, 1585198541, 1585641338, 1586084135, 1586526932, 1586969729, 1587412526, 1587855323, 1588298120, 1588740917, 1589183714, 1589626511, 1590069308, 1590512105, 1590954902, 1591397700, 1591840497, 1592283294, 1592726091, 1593168888, 1593611685, 1594054482, 1594497279, 1594940076, 1595382873, 1595825670, 1596268467, 1596711264, 1597154061, 1597596858, 1598039655, 1598482452, 1598925250, 1599368047, 1599810844, 1600253641, 1600696438, 1601139235, 1601582032, 1602024829, 1602467626, 1602910423, 1603353220, 1603796017, 1604238814, 1604681611, 1605124408, 1605567205, 1606010003, 1606452800, 1606895597, 1607338394, 1607781191, 1608223988, 1608666785, 1609109582, 1609552379, 1609995176, 1610437973, 1610880770, 1611323567, 1611766364, 1612209161, 1612651958, 1613094755, 1613537553, 1613980350, 1614423147, 1614865944, 1615308741, 1615751538, 1616194335, 1616637132, 1617079929, 1617522726, 1617965523, 1618408320, 1618851117, 
1619293914, 1619736711, 1620179508, 1620622305, 1621065103, 1621507900, 1621950697, 1622393494, 1622836291, 1623279088, 1623721885, 1624164682, 1624607479, 1625050276, 1625493073, 1625935870, 1626378667, 1626821464, 1627264261, 1627707058, 1628149855, 1628592653, 1629035450, 1629478247, 1629921044, 1630363841, 1630806638, 1631249435, 1631692232, 1632135029, 1632577826, 1633020623, 1633463420, 1633906217, 1634349014, 1634791811, 1635234608, 1635677405, 1636120203, 1636563000, 1637005797, 1637448594, 1637891391, 1638334188, 1638776985, 1639219782, 1639662579, 1640105376, 1640548173, 1640990970, 1641433767, 1641876564, 1642319361, 1642762158, 1643204955, 1643647753, 1644090550, 1644533347, 1644976144, 1645418941, 1645861738, 1646304535, 1646747332, 1647190129, 1647632926, 1648075723, 1648518520, 1648961317, 1649404114, 1649846911, 1650289708, 1650732505, 1651175303, 1651618100, 1652060897, 1652503694, 1652946491, 1653389288, 1653832085, 1654274882, 1654717679, 1655160476, 1655603273, 1656046070, 1656488867, 1656931664, 1657374461, 1657817258, 1658260056, 1658702853, 1659145650, 1659588447, 1660031244, 1660474041, 1660916838, 1661359635, 1661802432, 1662245229, 1662688026, 1663130823, 1663573620, 1664016417, 1664459214, 1664902011, 1665344808, 1665787606, 1666230403, 1666673200, 1667115997, 1667558794, 1668001591, 1668444388, 1668887185, 1669329982, 1669772779, 1670215576, 1670658373, 1671101170, 1671543967, 1671986764, 1672429561, 1672872358, 1673315156, 1673757953, 1674200750, 1674643547, 1675086344, 1675529141, 1675971938, 1676414735, 1676857532, 1677300329, 1677743126, 1678185923, 1678628720, 1679071517, 1679514314, 1679957111, 1680399908, 1680842706, 1681285503, 1681728300, 1682171097, 1682613894, 1683056691, 1683499488, 1683942285, 1684385082, 1684827879, 1685270676, 1685713473, 1686156270, 1686599067, 1687041864, 1687484661, 1687927458, 1688370256, 1688813053, 1689255850, 1689698647, 1690141444, 1690584241, 1691027038, 1691469835, 1691912632, 1692355429, 
1692798226, 1693241023, 1693683820, 1694126617, 1694569414, 1695012211, 1695455008, 1695897806, 1696340603, 1696783400, 1697226197, 1697668994, 1698111791, 1698554588, 1698997385, 1699440182, 1699882979, 1700325776, 1700768573, 1701211370, 1701654167, 1702096964, 1702539761, 1702982558, 1703425356, 1703868153, 1704310950, 1704753747, 1705196544, 1705639341, 1706082138, 1706524935, 1706967732, 1707410529, 1707853326, 1708296123, 1708738920, 1709181717, 1709624514, 1710067311, 1710510109, 1710952906, 1711395703, 1711838500]

returns_pct = [0.0, 0.53, 2.03, 3.05, 6.59, 6.65, 7.89, 9.17, 8.73, 11.59, 14.28, 16.11, 15.56, 7.65, 8.6, 11.73, 11.78, 12.36, 18.05, 20.47, 21.02, 20.83, 21.93, 22.68, 21.8, 18.04, 12.04, 16.74, 13.97, 10.63, 3.21, 20.13, 22.13, 26.32, 29.37, 29.31, 32.32, 27.26, 35.23, 35.5, 34.9, 34.34, 22.87, 18.33, 17.67, 14.98, 9.87, 13.09, 15.56, 20.38, 23.74, 22.21, 23.21, 23.3, 14.61, 15.35, 22.97, 22.64, 23.24, 26.47, 28.91, 33.13, 34.43, 38.67, 36.12, 39.88, 34.41, 37.99, 41.05, 45.04, 46.24, 49.36, 48.84, 52.14, 54.17, 59.45, 54.98, 49.87, 49.62, 57.6, 65.6, 63.78, 43.68, 15.12, 13.35, -4.53, -14.27, -17.63, -17.68, -17.05, -18.28, -15.49, -15.1, -15.11, -14.63, -14.24, -13.98, -13.11, -14.09, -12.67, -11.01, -10.28, -4.44, -4.44, -7.1, -6.36, -6.27, -4.56, -3.18, -3.05, -2.28, -0.89, -2.73, -2.06, 0.11, 3.13, 2.66, 3.55, 6.08, 7.43, 5.21, 3.0, 4.79, 3.55, 3.47, 4.21, 4.27, 7.69, 7.45, 6.4, 3.78, 0.03, 7.28, 10.46, 13.4, 10.3, 13.62, 13.75, 15.07, 14.72, 15.98, 16.71, 16.73, 17.94, 19.52, 20.67, 18.61, 21.47, 16.64, 16.77, 19.24, 20.75, 21.49, 20.45, 16.37, 18.38, 20.65, 27.19, 31.15, 31.29, 31.94, 33.38, 37.42, 40.55, 42.72, 41.81, 42.01, 42.43, 47.89, 43.85, 38.51, 43.06, 45.53, 47.42, 45.78, 46.27, 39.71, 39.02, 39.75, 41.96, 43.28, 43.21, 40.42, 45.39, 45.58, 46.12, 47.98, 50.35, 47.93, 50.72, 52.64, 52.46, 50.45, 46.89, 46.97, 50.74, 42.39, 44.01, 48.93, 55.4, 57.82, 61.68, 62.76, 67.9, 65.53, 66.1, 64.21, 65.47, 54.48, 56.73, 65.47, 67.3, 65.08, 68.63, 70.96, 69.14, 65.27, 64.44, 50.32, 46.13, 48.93, 49.67, 50.02, 49.0, 43.5, 42.64, 42.71, 32.54, 34.61, 43.37, 43.92, 47.1, 44.89, 42.37, 35.56, 35.0, 27.11, 27.44, 21.02, 15.99, 13.32, 7.85, 7.41, 18.73, 17.19, 16.46, 0.92, -4.75, -4.53, -7.66, -8.81, -6.22, -9.68, -2.35, -2.37, 4.55, 4.81, 4.61, 11.85, 11.3, 5.62, -2.16, -8.16, -5.42, -9.59, -12.75, -13.96, -16.0, -10.17, -16.34, -15.19, -10.75, -2.84, 2.77, -1.62, 1.29, 6.39, 5.77, 8.61, 6.09, 9.8, 4.42, 3.78, -0.17, -0.3, -0.31, -0.56, 0.05, 1.8, 1.05, 1.63, 
2.89, 4.42, 3.34, 5.08, 3.86, -3.05, -4.58, -2.2, -14.95, -14.98, -14.9, -14.99, -11.98, -11.51, -10.63, -10.35, -10.76, -10.89, -11.71, -12.14, -14.25, -13.13, -12.86, -12.71, -8.92, -9.74, -6.41, -2.05, -2.26, -1.88, 1.17, -2.64, 0.86, 5.93, 6.47, 8.94, 6.5, 3.84, 4.2, -5.05, -2.8, -1.5, 1.78, -0.2, 0.21, 2.19, 1.62, -2.88, -8.46, -3.67, -2.86, -2.9, -8.53, -13.74, -3.37, -1.6, 0.06, 3.69, 6.58, 8.11, 13.69, 14.04, 22.04, 23.21, 23.26, 24.47, 20.78, 22.81, 22.3, 21.7, 28.32, 29.68, 33.59, 35.27, 30.62, 34.37, 39.23, 37.79, 35.77, 36.12, 34.55, 35.33, 34.8, 37.0]

dates = [datetime.fromtimestamp(t) for t in timestamps]

fig = go.Figure()
fig.add_trace(go.Scatter(
    x=dates, y=returns_pct,
    mode='lines', name='Ridge Regression Strategy',
    line=dict(color='#ef553b', width=2),
))

# Add zero line for reference
fig.add_hline(y=0, line_dash="dash", line_color="gray", opacity=0.5)

# Mark key events
fig.add_annotation(x=datetime(2020, 4, 5), y=-18, text="COVID Crash<br>-54.7% drawdown",
                   showarrow=True, arrowhead=2, ax=0, ay=-40)
fig.add_annotation(x=datetime(2022, 10, 1), y=-16, text="2022 Bear Market",
                   showarrow=True, arrowhead=2, ax=0, ay=-40)

fig.update_layout(
    title="Ridge Regression Strategy — Equity Curve (2019-2024)",
    xaxis_title="Date",
    yaxis_title="Cumulative Return %",
    template="plotly_white",
    hovermode="x unified",
    height=500,
)
fig.show()

## 6. Analysis

### Why the Model Failed

**1. Linear models can't capture nonlinear patterns**

Financial markets are complex adaptive systems. The relationship between today's volatility features and next week's volatility isn't a straight line — it depends on regime, sentiment, and interactions the model can't see.

**2. The model found the strongest available pattern: beta**

With beta = 1.146, the strategy amplifies market movements by roughly 15%. Once that market exposure is stripped out, what remains is alpha, and here it is *negative* (-0.062). The ridge regression didn't learn anything about relative volatility that the market wasn't already pricing in.
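
Separating beta from alpha is a one-variable regression of strategy returns on benchmark returns: the slope is beta, the intercept is alpha. The sketch below runs it on synthetic daily returns. The function name, the 252-day annualization, and the omission of a risk-free rate are simplifying assumptions, so this is not necessarily how the platform computes its figures.

```python
import numpy as np

def alpha_beta(strategy_returns, benchmark_returns, periods_per_year=252):
    """Regress strategy on benchmark returns; returns (annualized alpha, beta)."""
    s = np.asarray(strategy_returns, dtype=float)
    b = np.asarray(benchmark_returns, dtype=float)
    beta, intercept = np.polyfit(b, s, deg=1)  # slope first, then intercept
    return intercept * periods_per_year, beta

# Synthetic data: a strategy that is 1.15x the market minus a small daily drag
rng = np.random.default_rng(1)
benchmark = rng.normal(0.0005, 0.01, size=2000)
strategy = -0.0002 + 1.15 * benchmark + rng.normal(0.0, 0.001, size=2000)

alpha, beta = alpha_beta(strategy, benchmark)
print(f"alpha (annualized): {alpha:.3f}, beta: {beta:.3f}")
```

A strategy like this one can post positive returns whenever the market rises while still having negative alpha, which is the pattern the results table shows.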

**3. Features are lagging indicators**

All three features (volatility, ATR, open interest) describe what *already happened*. By the time they signal high volatility, the market has often already moved. Predicting the future from the past requires patterns that persist — and in efficient markets, they often don't.

### What the Equity Curve Tells Us

The curve tracks the broad market. Peaks and troughs align with market cycles, not with any independent signal. This is what happens when a linear model tries to predict a nonlinear system: it finds the strongest pattern available — *the market going up over time* — and rides it.

> **Verdict:** A linear model on three correlated volatility features doesn't generate alpha. It just tracks the market with extra drawdown.

### Lessons for Part 2

The model's failure isn't surprising — it's informative:

1. **Linear relationships aren't enough** — we need a model that can learn nonlinear patterns
2. **Static features aren't enough** — we need a model that can learn temporal dependencies across multiple time scales
3. **Interpretability has limits** — sometimes you need to trade it for capacity

In Part 2, we'll try a Temporal Convolutional Neural Network (CNN) — a deep learning model that can learn nonlinear patterns from sequences of data. Will added complexity translate to better results?

## Try This: Exercises

Deepen your understanding by modifying the strategy. All exercises work on QC's free tier.

### Exercise 1: Change the Regularization Strength

The algorithm uses `Ridge()` with default α=1.0. What happens if you increase it?

```python
# In _trade(), change:
model = Ridge()  # α=1.0 default
# To:
model = Ridge(alpha=10.0)  # Stronger regularization
```

**Predict:** Will Sharpe improve or decline? Why?

---

### Exercise 2: Add a Fourth Feature

The strategy uses 3 features (volatility, ATR, open interest). Add momentum:

```python
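# In on_securities_changed(), alongside the other indicators, add:
security.momentum = MomentumPercent(20)

# In _consolidation_handler(), update it manually (like close_roc) so the
# warm-up bars fed through the consolidator also reach it, then record it:
security.momentum.update(t, consolidated_bar.close)
if security.momentum.is_ready:
    security.indicator_history.loc[t, 'momentum'] = security.momentum.current.value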
```

**Predict:** Does adding momentum help or hurt? What does this tell you about feature selection?

---

### Exercise 3: Reduce the Universe

Try trading only the 4 index futures (VIX, ES, NQ, YM). Remove energy and grains.

**Predict:** Will concentration help or hurt diversification benefits?

---

### Exercise 4: Compare to Equal Weighting

Replace inverse volatility weighting with equal weights:

```python
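# In _trade(), replace the weighting logic with:
n = len(expected_volatility_by_security)
for security in expected_volatility_by_security:
    # Keep the scaling constant (3) and the contract multiplier so that
    # only the volatility term changes relative to the original weights
    weight = 3 / n / security.symbol_properties.contract_multiplier
    portfolio_targets.append(PortfolioTarget(security.symbol, weight))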
```

**Predict:** Is the model's prediction adding value, or would equal weights perform the same?

---

*Each exercise takes ~5 minutes to implement and ~5 minutes to backtest. Record your results!*


## 7. References

### Primary Source

- Pik, J., Chan, E. P., Broad, J., Sun, P., & Singh, V. (2025). *Hands-On AI Trading with Python, QuantConnect, and AWS*. Wiley. ISBN 978-1394268436. — Strategy design and implementation.

### Ridge Regression

- Hoerl, A. E. & Kennard, R. W. (1970). "Ridge Regression: Biased Estimation for Nonorthogonal Problems." *Technometrics*, 12(1), 55–67. [doi:10.1080/00401706.1970.10488634](https://doi.org/10.1080/00401706.1970.10488634) — Seminal paper introducing ridge regularization.
- Hastie, T., Tibshirani, R., & Friedman, J. (2009). *The Elements of Statistical Learning*. 2nd ed., Chapter 3. Springer. — L2 penalty and shrinkage methods.

### Technical Indicators

- Wilder, J. W. (1978). *New Concepts in Technical Trading Systems*. Trend Research. ISBN 978-0894590276. — Original ATR definition.
- Murphy, J. J. (1999). *Technical Analysis of the Financial Markets*. New York Institute of Finance. Chapter 7. — Open interest interpretation.

### Performance Metrics

- Sharpe, W. F. (1966). "Mutual Fund Performance." *The Journal of Business*, 39(1), 119–138. — Sharpe ratio.
- Jensen, M. C. (1968). "The Performance of Mutual Funds in the Period 1945–1964." *The Journal of Finance*, 23(2), 389–416. — Jensen's alpha.

### Contract Specifications

- Schwab. [S&P 500 E-Mini Futures](https://www.schwab.com/futures/sp-500-emini). — E-Mini multiplier ($50 per index point).
- CME Group. [Corn Futures Contract Specs](https://www.cmegroup.com/markets/agriculture/grains/corn.contractSpecs.html). — 5,000 bushels per contract.

---

## License

This notebook is released under the MIT License. See [LICENSE](../LICENSE) for details.

---

*Part 1 of 3. Next: [Part 2 — Deep Learning (Temporal CNN)](../part-2-deep-learning/)*