<a href="https://colab.research.google.com/github/jeanmhuang/Daily-Quant-Notes/blob/main/Daily_Quant_Notes_Momentum_Costs_2025_09_12.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Daily Quant Notes — Transaction Cost Sensitivity for Momentum
**Date:** 2025-09-12

This notebook builds a simple cross-sectional momentum strategy and studies how transaction costs (slippage/commissions) erode returns across a grid of round-trip cost assumptions (0 to 100 bps by default). It is designed to be short, reproducible, and portfolio-relevant for interviews with hedge funds and banks.

**What you'll do:**
1. Download liquid U.S. equities data (S&P 100 by default).
2. Construct a monthly-rebalanced momentum signal (12–1 months lookback; configurable).
3. Form long-only and long–short decile portfolios.
4. Apply a simple transaction cost model that charges a fixed cost per unit of turnover.
5. Compare gross vs. net performance across cost assumptions.
6. Report key statistics (CAGR, Sharpe, max drawdown, turnover).




## 0. Setup
Run this cell to import dependencies. (Colab users: uncomment the install line if `yfinance` is missing.)

In [None]:

# If you're on Colab, uncomment the following line to install yfinance if needed:
# !pip install yfinance --quiet

import warnings
warnings.filterwarnings("ignore")

import math
import numpy as np
import pandas as pd
import yfinance as yf
import matplotlib.pyplot as plt

plt.rcParams["figure.figsize"] = (10, 5)
pd.options.display.float_format = "{:,.6f}".format

print("Versions -> pandas:", pd.__version__, "| numpy:", np.__version__)


## 1. Parameters
Adjust your universe, dates, and strategy hyperparameters here.

In [None]:

# --- Universe: default to S&P 100 tickers (static list for reproducibility) ---
SP100 = [
    "AAPL","ABBV","ABT","ACN","ADBE","AIG","AMD","AMGN","AMT","AMZN","AVGO","AXP",
    "BA","BAC","BMY","BK","BKNG","BLK","C","CAT","CHTR","CL","CMCSA","COF","COP",
    "COST","CRM","CSCO","CVS","CVX","DHR","DIS","DOW","DUK","EMR","EXC","F","FDX",
    "GD","GE","GILD","GM","GOOG","GOOGL","GS","HD","HON","IBM","INTC","JNJ","JPM",
    "KO","LIN","LLY","LMT","LOW","MA","MCD","MDLZ","MDT","META","MET","MMM","MO",
    "MRK","MS","MSFT","NEE","NFLX","NKE","NVDA","ORCL","PEP","PFE","PG","PM","QCOM",
    "RTX","SBUX","SO","SPG","T","TGT","TMO","TMUS","TSLA","TXN","UNH","UNP","UPS",
    "USB","V","VZ","WBA","WFC","WMT","XOM"
]

params = {
    "tickers": SP100,              # universe
    "start": "2005-01-01",         # backtest start
    "end": None,                   # None = today
    "rebalance_freq": "M",         # monthly
    "lookback_months": 12,         # momentum lookback length (months)
    "skip_recent_months": 1,       # skip most-recent month (12-1 momentum)
    "top_decile": 0.10,            # long decile threshold
    "bottom_decile": 0.10,         # short decile threshold
    "cost_grid_bps": [0, 5, 10, 25, 50, 100],  # round-trip cost in basis points
    "min_price": 5.0,              # filter penny/illiquid names (optional heuristic)
    "dropna_thresh": 0.80          # require >=80% data coverage per ticker
}
params


## 2. Data
We use **adjusted close** from Yahoo Finance via `yfinance`. For liquidity sanity, we can drop names with too many missing values and impose a minimum price filter.

In [None]:

def download_prices(tickers, start, end=None):
    """Download Adjusted Close prices with yfinance."""
    data = yf.download(tickers, start=start, end=end, auto_adjust=False, progress=False)["Adj Close"]
    # If a single ticker is passed, yfinance returns a Series -> convert to DataFrame
    if isinstance(data, pd.Series):
        data = data.to_frame()
    return data.sort_index()

prices = download_prices(params["tickers"], params["start"], params["end"])

# Liquidity/quality filters
coverage = prices.notna().mean()
keep = coverage[coverage >= params["dropna_thresh"]].index.tolist()
prices = prices[keep]

# Filter by minimum price (uses the last available price, so it looks ahead; a rough heuristic only)
last_px = prices.ffill().iloc[-1]
keep2 = last_px[last_px >= params["min_price"]].index.tolist()
prices = prices[keep2]

print(f"Final universe size: {prices.shape[1]} tickers")
prices.tail()


## 3. Momentum Signal (12–1 months by default)
Classical cross-sectional momentum: rank stocks by trailing returns over `lookback_months`, skipping the most recent `skip_recent_months` to avoid short-term mean reversion.
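
To make the window concrete, here is a minimal sketch on synthetic data. The tickers `UP` and `FLAT` and their prices are made up for illustration, and the signal is written as a direct price ratio, which should match the cumulative-return form used in this notebook apart from edge cases at the start of a price history.

```python
import numpy as np
import pandas as pd

# Synthetic month-end prices (assumption): 15 months, "UP" grows 1% per month, "FLAT" never moves
idx = pd.period_range("2020-01", periods=15, freq="M")
toy_prices = pd.DataFrame(
    {"UP": 100.0 * 1.01 ** np.arange(15), "FLAT": np.full(15, 100.0)},
    index=idx,
)

lookback, skip = 12, 1

# Return from month t-(lookback+skip) to month t-skip, so the latest month is excluded
toy_mom = toy_prices.shift(skip) / toy_prices.shift(skip + lookback) - 1.0

last = toy_mom.iloc[-1]
print(last)
```

With `lookback=12` and `skip=1`, the first 13 rows are NaN because the signal needs 13 months of history. `UP` scores `1.01**12 - 1` (about 12.7%) and `FLAT` scores zero.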

In [None]:

def resample_month_end(df):
    return df.resample("M").last()

def compute_mom_signal(prices_m, lookback, skip):
    # compute trailing returns from t-(lookback+skip) to t-skip
    ret = prices_m.pct_change().add(1).cumprod()
    # shift by 'skip' months so the window ends skip months ago
    ret_shifted = ret.shift(skip)
    # momentum = trailing return over lookback window
    mom = ret_shifted / ret_shifted.shift(lookback) - 1.0
    return mom

prices_m = resample_month_end(prices.ffill())
momentum = compute_mom_signal(prices_m, params["lookback_months"], params["skip_recent_months"])
momentum = momentum.dropna(how="all")
momentum.tail()


## 4. Portfolio Construction
- **Long-only Top Decile**: equal-weight the top `p` fraction by momentum each month.
- **Long–Short (Top–Bottom)**: long the top decile, short the bottom decile, with equal notional on each side (dollar-neutral, not necessarily beta-neutral).

We compute **monthly portfolio returns** and track **turnover** (for costs).
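
As a quick sanity check on the turnover convention, the sketch below computes one-way turnover (L1 weight change divided by 2) for two hypothetical rebalances. The tickers and weights are invented for illustration only.

```python
import pandas as pd

# Hypothetical long-only target weights at two consecutive rebalances (top 2 of 4 names)
w_prev = pd.Series({"A": 0.5, "B": 0.5, "C": 0.0, "D": 0.0})
w_next = pd.Series({"A": 0.0, "B": 0.5, "C": 0.5, "D": 0.0})

l1_change = (w_next - w_prev).abs().sum()  # buys plus sells, as a fraction of capital
one_way = l1_change / 2.0                  # fraction of the book that was replaced

# Hypothetical long-short book: +1 long and -1 short notional, one name swapped per leg
ls_prev = pd.Series({"A": 0.5, "B": 0.5, "C": -0.5, "D": -0.5, "E": 0.0, "F": 0.0})
ls_next = pd.Series({"A": 0.5, "B": 0.0, "C": -0.5, "D": 0.0, "E": 0.5, "F": -0.5})
ls_one_way = (ls_next - ls_prev).abs().sum() / 2.0

print(one_way, ls_one_way)
```

Swapping half of the long-only book gives a one-way turnover of 0.5. Swapping half of each leg of the long-short book gives 1.0 under the same convention, because its gross exposure is twice as large.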

In [None]:

def make_weights_from_ranks(scores, top_frac=0.10, bottom_frac=0.10, long_short=False):
    """Equal-weight the top fraction by score each date; optionally short the bottom fraction."""
    weights = pd.DataFrame(index=scores.index, columns=scores.columns, data=0.0)
    for date, row in scores.iterrows():
        valid = row.dropna()
        if valid.empty:
            continue
        n = len(valid)
        w = pd.Series(0.0, index=row.index)

        # Long leg: at least one name, even when n * top_frac < 1
        k_top = max(1, int(math.floor(n * top_frac)))
        winners = valid.sort_values(ascending=False).head(k_top).index
        w.loc[winners] = 1.0 / k_top

        if long_short:
            # Short leg: assigned after the long leg, so it wins if the legs overlap (tiny n)
            k_bot = max(1, int(math.floor(n * bottom_frac)))
            losers = valid.sort_values(ascending=True).head(k_bot).index
            w.loc[losers] = -1.0 / k_bot

        weights.loc[date] = w

    return weights

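def compute_portfolio_returns(prices_m, weights):
    """Next-month returns of current weights (assumes rebalancing at month-end close)."""
    # Forward return earned over the month after each rebalance date,
    # restricted to dates that actually have weights
    rets = prices_m.pct_change().shift(-1).reindex(weights.index)
    # min_count=1 leaves rows with no forward returns (e.g., the final month) as NaN
    # instead of summing them to a spurious 0.0 return
    port_ret = (weights * rets).sum(axis=1, min_count=1)
    return port_ret.dropna()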

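def compute_turnover(weights):
    """One-way turnover per rebalance: sum of absolute target-weight changes (L1), divided by 2."""
    dW = (weights - weights.shift(1)).abs()
    # For long-only, L1/2 is conventional one-way turnover; for long-short, L1 may be used.
    # The first rebalance has no prior weights, so the initial portfolio build shows zero turnover.
    # Changes are measured between target weights, so price drift within the month is ignored.
    turnover = dW.sum(axis=1) / 2.0
    return turnover.fillna(0.0)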

# Align scores to month-end dates present in prices_m
scores = momentum.loc[prices_m.index.intersection(momentum.index)]
w_long = make_weights_from_ranks(scores, top_frac=params["top_decile"], long_short=False)
w_ls   = make_weights_from_ranks(scores, top_frac=params["top_decile"], bottom_frac=params["bottom_decile"], long_short=True)

r_long = compute_portfolio_returns(prices_m, w_long)
r_ls   = compute_portfolio_returns(prices_m, w_ls)

to_long = compute_turnover(w_long)
to_ls   = compute_turnover(w_ls)

print("Sample:")
print(pd.DataFrame({
    "r_long": r_long.tail(3),
    "r_ls"  : r_ls.tail(3),
}))


## 5. Transaction Cost Model
We apply a **round-trip cost** in basis points per unit turnover. A simple monthly approximation:

$$ r^{net}_t = r^{gross}_t - c \times \text{turnover}_t $$

Here `c` is the round-trip cost expressed as a decimal (e.g., 25 bps = 0.0025) and $\text{turnover}_t$ is the one-way turnover at rebalance $t$.
This is a stylized model; in reality execution costs depend on spread, volatility, participation rate, and slippage vs. benchmarks.
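
A worked instance of the formula, using hypothetical inputs (a 1% gross month, 50% one-way turnover, 25 bps round-trip cost):

```python
# Hypothetical inputs for illustration only
cost_bps = 25      # round-trip cost in basis points
turnover = 0.5     # one-way turnover this month (half the book replaced)
gross = 0.01       # gross monthly return

c = cost_bps / 1e4              # 25 bps -> 0.0025
net = gross - c * turnover      # 0.01 - 0.00125
annual_drag = 12 * c * turnover # simple (non-compounded) annual cost at this turnover

print(f"net monthly return: {net:.5f}, annual drag: {annual_drag:.4f}")
```

Under these assumptions the monthly drag is 12.5 bps, or roughly 1.5% per year before compounding.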

In [None]:

def apply_costs(gross_returns, turnover, cost_bps):
    c = cost_bps / 1e4  # convert bps to decimal
    # Monthly cost ~ c * turnover
    net = gross_returns - c * turnover
    return net

def perf_stats(returns, rf_annual=0.0):
    rd = returns.dropna()
    if rd.empty:
        return {"CAGR": np.nan, "Sharpe": np.nan, "MaxDD": np.nan, "Vol": np.nan}
    # CAGR
    cum = (1.0 + rd).prod()
    years = len(rd) / 12.0
    cagr = cum ** (1/years) - 1 if years > 0 else np.nan
    # Vol (annualized)
    vol = rd.std() * np.sqrt(12)
    # Sharpe: annualized excess return over rf_annual (default 0.0) divided by annualized vol
    sharpe = (rd.mean() * 12 - rf_annual) / vol if vol and vol > 0 else np.nan
    # Max drawdown
    equity = (1.0 + rd).cumprod()
    peak = equity.cummax()
    dd = (equity / peak - 1.0).min()
    return {"CAGR": cagr, "Sharpe": sharpe, "MaxDD": dd, "Vol": vol}

# Sweep costs
cost_grid = params["cost_grid_bps"]
results = []

for strategy_name, gross, to in [
    ("Long-Only Top Decile", r_long, to_long),
    ("Long–Short Top–Bottom", r_ls, to_ls)
]:
    for bps in cost_grid:
        net = apply_costs(gross, to, bps)
        stats = perf_stats(net)
        stats.update({"Strategy": strategy_name, "Cost_bps": bps})
        results.append(stats)

res_df = pd.DataFrame(results)[["Strategy","Cost_bps","CAGR","Sharpe","MaxDD","Vol"]]
res_df.sort_values(["Strategy","Cost_bps"], inplace=True)
res_df.head(12)


## 6. Results & Visuals
We plot gross vs net equity curves and a sensitivity table for Sharpe/CAGR across cost assumptions.
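
One possible way to lay out the sensitivity table is to pivot the long-format results so that each strategy is a column and each cost level a row. The sketch below uses a toy frame with placeholder Sharpe values (not backtest output) in the same column layout as the results table.

```python
import pandas as pd

# Placeholder rows (values are made up, purely to show the reshaping)
toy_res = pd.DataFrame({
    "Strategy": ["Long-Only", "Long-Only", "Long-Short", "Long-Short"],
    "Cost_bps": [0, 25, 0, 25],
    "Sharpe":   [1.00, 0.90, 0.50, 0.30],
})

# Rows = cost level, columns = strategy
sharpe_grid = toy_res.pivot(index="Cost_bps", columns="Strategy", values="Sharpe")

# Change in Sharpe relative to the zero-cost row of each strategy
sharpe_change = sharpe_grid.sub(sharpe_grid.loc[0], axis="columns")

print(sharpe_grid)
print(sharpe_change)
```

The same pivot should work on the real results frame with `values="CAGR"` for a CAGR grid.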

In [None]:

def equity_curve(returns):
    return (1.0 + returns.dropna()).cumprod()

# Choose a representative cost to visualize
viz_bps = 25  # change to taste

# Long-only
net_long = apply_costs(r_long, to_long, viz_bps)
ec_long_g = equity_curve(r_long)
ec_long_n = equity_curve(net_long)

plt.figure()
ec_long_g.plot(label="Gross")
ec_long_n.plot(label=f"Net ({viz_bps} bps)")
plt.title("Long-Only Top Decile — Equity Curve")
plt.legend()
plt.xlabel("Date")
plt.ylabel("Growth of $1")
plt.show()

# Long–short
net_ls = apply_costs(r_ls, to_ls, viz_bps)
ec_ls_g = equity_curve(r_ls)
ec_ls_n = equity_curve(net_ls)

plt.figure()
ec_ls_g.plot(label="Gross")
ec_ls_n.plot(label=f"Net ({viz_bps} bps)")
plt.title("Long–Short Top–Bottom — Equity Curve")
plt.legend()
plt.xlabel("Date")
plt.ylabel("Growth of $1")
plt.show()

# Table: performance by cost
res_df


### Turnover Diagnostics
High-turnover strategies are more sensitive to costs. We show the average turnover and how it evolves over time.
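
Under the linear cost model used in this notebook, one rough summary of cost sensitivity is the break-even cost: the value of `c` at which the average net return falls to zero, i.e. mean gross return divided by mean turnover. The sketch below uses hypothetical return and turnover series, not the backtest output.

```python
import pandas as pd

# Hypothetical monthly series for illustration only
gross = pd.Series([0.02, -0.01, 0.015, 0.005])  # gross returns
turnover = pd.Series([0.4, 0.5, 0.6, 0.5])      # one-way turnover

# mean(gross - c * turnover) = 0  =>  c = mean(gross) / mean(turnover)
breakeven_c = gross.mean() / turnover.mean()
breakeven_bps = breakeven_c * 1e4

net_at_breakeven = (gross - breakeven_c * turnover).mean()
print(f"break-even round-trip cost: {breakeven_bps:.1f} bps")
```

This ignores compounding and any link between turnover and returns, so it is best read as a rule of thumb, not a precise threshold.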

In [None]:

print("Average one-way turnover (monthly):")
print(pd.DataFrame({
    "Long-Only": [to_long.mean()],
    "Long–Short": [to_ls.mean()]
}))

plt.figure()
to_long.plot()
plt.title("Long-Only Turnover (Monthly)")
plt.xlabel("Date")
plt.ylabel("One-way turnover")
plt.show()

plt.figure()
to_ls.plot()
plt.title("Long–Short Turnover (Monthly)")
plt.xlabel("Date")
plt.ylabel("One-way turnover")
plt.show()


## 7. Discussion & Next Steps
**What to write in your GitHub README today (2–4 sentences):**
- _Example_: “I tested a standard 12–1 cross-sectional momentum strategy on a liquid US universe with monthly rebalancing. Net performance is highly sensitive to transaction costs; at ~25 bps round-trip, Sharpe drops by X% and CAGR by Y%. Long–short is more cost‑fragile due to higher turnover. Next, I’ll explore slower signals (e.g., 6–12M) and turnover-aware weighting to improve net-of-cost results.”

**Extensions you can add tomorrow:**
- **Signal robustness**: try different lookbacks (3, 6, 9, 12 months) and skip windows (0–2 months).
- **Turnover control**: add a “do-nothing band” (only trade if rank change exceeds a threshold).
- **Risk control**: equal risk contribution (vol targeting) vs. equal weight.
- **Universe sanity**: filter by ADV/volume; drop illiquid names, or cap position sizes.
- **Lead–lag**: relate high momentum deciles’ returns to sector ETFs to see diffusion effects.
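
As a starting point for the turnover-control idea, here is a minimal sketch of one possible "do-nothing band": a name already held stays in the portfolio as long as its rank is within the top `k + buffer`, and new names are bought only to fill vacated slots. The function `select_with_buffer` and its parameters are hypothetical and not part of this notebook.

```python
import pandas as pd

def select_with_buffer(scores, held, k, buffer):
    """Keep held names ranked within top k+buffer; fill empty slots with the best-ranked others."""
    # Rank 1 = highest score; names with missing scores are excluded
    ranks = scores.dropna().rank(ascending=False, method="first")
    keep = [t for t in held if t in ranks.index and ranks[t] <= k + buffer]
    candidates = [t for t in ranks.sort_values().index if t not in keep]
    return keep + candidates[: max(0, k - len(keep))]

# Hypothetical momentum scores on two consecutive rebalance dates
month1 = pd.Series({"A": 0.9, "B": 0.8, "C": 0.7, "D": 0.6, "E": 0.1})
month2 = pd.Series({"A": 0.5, "B": 0.9, "C": 0.8, "D": 0.1, "E": 0.0})

held = select_with_buffer(month1, [], k=2, buffer=1)         # initial build: top 2
no_band = select_with_buffer(month2, held, k=2, buffer=0)    # plain top-2 rule
with_band = select_with_buffer(month2, held, k=2, buffer=1)  # tolerate a one-rank slip

print(held, no_band, with_band)
```

In this toy case `A` slips from rank 1 to rank 3. The plain rule replaces it with `C`, while the banded rule keeps it and trades nothing. Whether the saved costs outweigh the weaker signal is an empirical question for the backtest.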