# Lab 4: Momentum II

In the last lab we explored how to backtest decile portfolio style trading strategies. In this lab we will explore how to backtest portfolios that are optimized each period to maximize alpha while minimizing variance. 

## Imports

In [1]:
import sf_quant.data as sfd
import sf_quant.optimizer as sfo
import sf_quant.backtester as sfb
import sf_quant.performance as sfp
import polars as pl
import datetime as dt
import matplotlib.pyplot as plt
import seaborn as sns
import tqdm

  from .autonotebook import tqdm as notebook_tqdm
2025-09-17 15:07:05,443	INFO util.py:154 -- Missing packages: ['ipywidgets']. Run `pip install -U ipywidgets`, then restart the notebook server for rich notebook output.


## Data

We import the necessary data for you here. We will be doing a backtest from 2023-01-01 to 2024-01-31. However since our signal takes 1 year to compute, we will really only be backtesting 1 month of data.

In [2]:
start = dt.date(2023, 1, 1)
end = dt.date(2024, 1, 31)

columns = [
    'date',
    'barrid',
    'ticker',
    'price',
    'return',
    'specific_risk',
    'predicted_beta'
]

data = sfd.load_assets(
    start=start,
    end=end,
    in_universe=True,
    columns=columns
)

data

date,barrid,ticker,price,return,specific_risk,predicted_beta
date,str,str,f64,f64,f64,f64
2023-01-03,"""USA06Z1""","""MDXG""",2.97,6.8345,47.65898,1.43155
2023-01-04,"""USA06Z1""","""MDXG""",3.0,1.0101,47.539847,1.428854
2023-01-05,"""USA06Z1""","""MDXG""",3.08,2.6667,47.755957,1.37346
2023-01-06,"""USA06Z1""","""MDXG""",3.21,4.2208,48.110135,1.389274
2023-01-09,"""USA06Z1""","""MDXG""",3.28,2.1807,48.612414,1.348687
…,…,…,…,…,…,…
2024-01-25,"""USBPM41""","""WS""",29.38,-0.3054,50.968819,1.408389
2024-01-26,"""USBPM41""","""WS""",29.58,0.6807,50.922137,1.40225
2024-01-29,"""USBPM41""","""WS""",30.28,2.3665,50.93332,1.395143
2024-01-30,"""USBPM41""","""WS""",30.64,1.1889,50.438011,1.380131


## Compute the Momentum Signal

## Instructions

- Compute momentum for each security and date as the rolling 230 day return (you can just use log returns here).
- Shift the momentum signal 22 days. This will results in the 11 month return from t-12 to t-2.

In [3]:
def task_compute_momentum(data: pl.DataFrame) -> pl.DataFrame:
    """
    Compute the t_12 to t_2 momentum signal for each security and date combination.
    
    Args:
        data (pl.DataFrame): Data frame containing date, barrid, price, and return columns.
    
    Returns:
        pl.DataFrame: Data frame with columns date, barrid, price, return, and momentum columns.
    """
    data = data.clone().with_columns((pl.col('return')/100 + 1).log().rolling_sum(230).shift(22).over("barrid").alias("momentum"))
    return data.drop_nulls()
pl.Config.set_tbl_rows(20)
momentum = task_compute_momentum(data)

momentum

date,barrid,ticker,price,return,specific_risk,predicted_beta,momentum
date,str,str,f64,f64,f64,f64,f64
2024-01-03,"""USA06Z1""","""MDXG""",7.775,-1.2071,49.827621,1.092049,1.027828
2024-01-04,"""USA06Z1""","""MDXG""",7.76,-0.1929,49.746256,1.098982,0.989638
2024-01-05,"""USA06Z1""","""MDXG""",7.8,0.5155,49.535456,1.063097,0.979587
2024-01-08,"""USA06Z1""","""MDXG""",8.22,5.3846,49.435238,1.085788,0.936866
2024-01-09,"""USA06Z1""","""MDXG""",8.03,-2.3114,49.468538,1.182189,0.896796
2024-01-10,"""USA06Z1""","""MDXG""",8.1,0.8717,49.402579,1.208223,0.882818
2024-01-11,"""USA06Z1""","""MDXG""",8.04,-0.7407,49.324236,1.232689,0.853987
2024-01-12,"""USA06Z1""","""MDXG""",8.04,0.0,49.150153,1.230763,0.847307
2024-01-16,"""USA06Z1""","""MDXG""",8.08,0.4975,49.037617,1.137885,0.826331
2024-01-17,"""USA06Z1""","""MDXG""",8.35,3.3416,48.946023,1.164485,0.896639


## Compute the Alphas

In order to make our momentum signal usable in our optimizer we will use a predetermined Information Coefficient of 0.05 and the forecasted idiosyncratic risk provided by Barra to convert our signal into alpha forecasts.

### Instructions
- For each date z-score the momentum signal across all assets cross sectionally and call this `score`.
- Using the `specific_risk` column compute the alphas as `0.05` * `score` * `specific_risk`.
- Note: Make sure to divide `specific_risk` by 100 to put it in decimal space. 

In [4]:
def task_compute_alphas(momentum: pl.DataFrame) -> pl.DataFrame:
    """ 
    Compute the alphas for each security and date combo.

    Args:
        momentum (pl.DataFrame): Data frame containing barrid, date, specific_risk, and momentum columns.
    
    Returns:
        pl.DataFrame: Data frame containing barrid, date, specific_risk, momentum, score, and alpha columns.
    """
    df = momentum.clone()
    df = df.with_columns(((pl.col('momentum') - pl.col('momentum').mean())/pl.col('momentum').std()).over('date').alias('score'))
    df = df.with_columns((0.05 * pl.col('specific_risk') / 100 * pl.col('score')).alias('alpha'))
    return df

alphas = task_compute_alphas(momentum)

alphas

date,barrid,ticker,price,return,specific_risk,predicted_beta,momentum,score,alpha
date,str,str,f64,f64,f64,f64,f64,f64,f64
2024-01-03,"""USA06Z1""","""MDXG""",7.775,-1.2071,49.827621,1.092049,1.027828,2.291715,0.057095
2024-01-04,"""USA06Z1""","""MDXG""",7.76,-0.1929,49.746256,1.098982,0.989638,2.186411,0.054383
2024-01-05,"""USA06Z1""","""MDXG""",7.8,0.5155,49.535456,1.063097,0.979587,2.186923,0.054165
2024-01-08,"""USA06Z1""","""MDXG""",8.22,5.3846,49.435238,1.085788,0.936866,2.069079,0.051143
2024-01-09,"""USA06Z1""","""MDXG""",8.03,-2.3114,49.468538,1.182189,0.896796,2.054735,0.050822
2024-01-10,"""USA06Z1""","""MDXG""",8.1,0.8717,49.402579,1.208223,0.882818,2.027264,0.050076
2024-01-11,"""USA06Z1""","""MDXG""",8.04,-0.7407,49.324236,1.232689,0.853987,1.96722,0.048516
2024-01-12,"""USA06Z1""","""MDXG""",8.04,0.0,49.150153,1.230763,0.847307,1.958182,0.048122
2024-01-16,"""USA06Z1""","""MDXG""",8.08,0.4975,49.037617,1.137885,0.826331,1.926794,0.047243
2024-01-17,"""USA06Z1""","""MDXG""",8.35,3.3416,48.946023,1.164485,0.896639,2.052958,0.050242


In [5]:
def task_price_filter(alphas: pl.DataFrame) -> pl.DataFrame:
    """
    Filter the universe to lagged price greater than 5 and non-null alpha.
    
    Args:
        alphas (pl.DataFrame): Data frame containing barrid, date, specific_risk, momentum, score, and alpha columns.
    Returns:
        pl.DataFrame: Data frame containing barrid, date, specific_risk, momentum, score, and alpha columns.
    """
    df = alphas.clone()
    df = df.filter(pl.col('price').shift() > 5).drop_nulls(pl.col('alpha'))
    return df

price_filter = task_price_filter(alphas)

price_filter

date,barrid,ticker,price,return,specific_risk,predicted_beta,momentum,score,alpha
date,str,str,f64,f64,f64,f64,f64,f64,f64
2024-01-04,"""USA06Z1""","""MDXG""",7.76,-0.1929,49.746256,1.098982,0.989638,2.186411,0.054383
2024-01-05,"""USA06Z1""","""MDXG""",7.8,0.5155,49.535456,1.063097,0.979587,2.186923,0.054165
2024-01-08,"""USA06Z1""","""MDXG""",8.22,5.3846,49.435238,1.085788,0.936866,2.069079,0.051143
2024-01-09,"""USA06Z1""","""MDXG""",8.03,-2.3114,49.468538,1.182189,0.896796,2.054735,0.050822
2024-01-10,"""USA06Z1""","""MDXG""",8.1,0.8717,49.402579,1.208223,0.882818,2.027264,0.050076
2024-01-11,"""USA06Z1""","""MDXG""",8.04,-0.7407,49.324236,1.232689,0.853987,1.96722,0.048516
2024-01-12,"""USA06Z1""","""MDXG""",8.04,0.0,49.150153,1.230763,0.847307,1.958182,0.048122
2024-01-16,"""USA06Z1""","""MDXG""",8.08,0.4975,49.037617,1.137885,0.826331,1.926794,0.047243
2024-01-17,"""USA06Z1""","""MDXG""",8.35,3.3416,48.946023,1.164485,0.896639,2.052958,0.050242
2024-01-18,"""USA06Z1""","""MDXG""",8.25,-1.1976,48.923825,1.098937,0.935543,2.111361,0.051648


## Backtest

Now that we have our alphas we will compute the MVO portfolios for each date in our sample.

### Instructions
- Use the `FullInvestment`, `LongOnly`, `NoBuyingOnMargin`, and `UnitBeta` constraints.
- For each unique date in the `price_filter` data frame find the optimal weights using `sf_quant.optimizer.mve_optimizer`.
- Note: for the `UnitBeta` constraint to work you will need to provide the predicted betas to the optimizer in each iteration.
- Hint: the optimizer assumes that your alpha vector and covariance matrix are both sorted the same way.
- Hint: use a gamma of 10.

In [16]:
def task_backtest(price_filter: pl.DataFrame) -> pl.DataFrame:
    """
    Compute the optimal portfolio weights for each day in our sample.
    
    Args:
        price_filter (pl.DataFrame): Data frame containing barrid, date, specific_risk, momentum, score, and alpha columns.
    Returns:
        pl.DataFrame: Data frame containing barrid, date, and weight columns.
    """
    df = price_filter.clone()
    # weights = df.select(['date', 'barrid'])
    # weights = weights.with_columns((pl.col('barrid') + pl.Null()).alias('weight'))
    for day in df.get_column('date').sort().unique(maintain_order=True):
        dayDf = df.filter(pl.col('date') == day).sort('ticker')
        alphas = dayDf.get_column('alpha').to_numpy()
        barrids = dayDf.get_column('barrid').unique(maintain_order=True).to_list()
        betas = dayDf.get_column('predicted_beta').to_numpy()
        cov_mat = sfd.construct_covariance_matrix(day, barrids).to_numpy()[:,1:] / 252
        tempdf = sfo.mve_optimizer(barrids, alphas, cov_mat, [sfo.FullInvestment(), sfo.LongOnly(), sfo.NoBuyingOnMargin(), sfo.UnitBeta()], gamma=10, betas=betas).rename({"barrid":"ticker"})
        return tempdf
        for barrid in barrids:
            weights[(pl.col('barrid') == day) & (pl.col('barrid') == barrid), 'weight'] = tempdf[barrid]

weights = task_backtest(price_filter)

weights

ticker,weight
str,f64
"""USA3871""",2.3147e-20
"""USBDIJ1""",-3.5159e-19
"""USBDPM1""",-2.1599e-19
"""USA91R1""",-2.2867e-19
"""USBFCZ1""",-2.5956e-19
"""USAA181""",-1.6836e-19
"""USAB1X1""",-3.0775e-20
"""USAC121""",-2.3106e-19
"""USBALJ1""",2.4309e-19
"""USAV4K1""",-2.2693e-19


## Performance Analysis

Now that we have our optimal weights we will join the returns from our initial dataset. 

### Instructions
- Join the returns from `data` and compute the return and cumulative return of the portfolio using the optimal weights.
- Note: since our covariance matrix isn't lagged we will need to shift our returns forward. To do this use `.shift(-1)` by `barrid` and call it `fwd_return`.
- Chart the cumulative returns of the portfolio.

In [None]:
def task_compute_returns(weights: pl.DataFrame, data: pl.DataFrame) -> pl.DataFrame:
    """ 
    Compute the optimal portfolio returns.

    Args:
        weights (pl.DataFrame): Data frame containing barrid, date, and weight columns.
        data (pl.DataFrame): Data frame containing barrid, date, and return columns

    Returns:
        pl.DataFrame: Data frame containing date, fwd_return, and cumulative_fwd_return_columns
    """
    # TODO: Finish this function.
    pass           

returns = task_compute_returns(weights, data)

returns

In [None]:
# TODO: Chart the cumulative returns of the portfolio.

## Benchmark Decomposition

You should find that our portfolio is up and to the right. But the question is how much of that is due to the market being up versus our signal being good. We will find out by joining the benchmark weights to our `weights` data frame and computing the active weights.

### Instructions

- Pull in the benchmark weights using `sf_quant.data.load_benchmark`.
- Join the benchmark weights to the optimal weights.
- Compute the active weights as `weight` - `weight_bmk` = `weight_act`
- Unpivot the weight columns and compute the forward return for each portfolio (total, benchmark, and active). 

In [None]:
def task_return_decomposition(weights: pl.DataFrame, data: pl.DataFrame) -> pl.DataFrame:
    """ 
    Compute the forward returns for the total, benchmark, and active portfolios.

    Args:
        weights (pl.DataFrame): Data frame containing barrid, date, and weight columns.
        data (pl.DataFrame): Data frame containing barrid, date, and return columns

    Returns:
        pl.DataFrame: Data frame containing date, portfolio, fwd_return, and cumulative_fwd_return columns        
    """
    # TODO: Finish this function.
    pass

returns_decomp = task_return_decomposition(weights, data)

returns_decomp

In [None]:
# TODO: Chart the cumulative returns of each portfolio
# HINT: Use seaborn.lineplot() with the attribute hue='portfolio'

In [None]:
# TODO: Compute the annual average return, annual volatility, and annualized sharpe ratio for each portfolio.

## `sf_quant` Backtester Module

That was a lot of fun right? Just kidding. All of that code takes a lot of work. That's why we've implemented a backtester in the `sf_quant` package. Let's practice using it really quick and compare our results.

### Instructions
- Declare your constraints the same way you did previously.
- Use a gamma of 10.
- Find the optimal weights using the `sf_quant.backtester` module.
- Hint: use the `backtest_parallel()` module to run your backtest in parallel across all the cores on your machine.

In [None]:
def task_backtest_sf(price_filter: pl.DataFrame) -> pl.DataFrame:
    """ 
    Compute the optimal portfolio weights using the `sf_quant` package.

    Args:
        price_filter (pl.DataFrame): Data frame containing barrid, date, specific_risk, momentum, score, and alpha columns.

    Returns:
        pl.DataFrame: Data frame containing barrid, date, and weight columns.
    """
    # TODO: Finish this function.
    pass

weights_sf = task_backtest_sf(price_filter)

weights_sf

## `sf_quant` Performance Package

It's also not a lot of fun to merge the returns dataset and do a full decomposition manually. You can do that with `sf_quant.performance` too.

### Instructions

- Compute the portfolio forward returns decomposition using the `generate_returns_from_weights` function.
- Chart the cumulative returns of the portfolios using the `generate_returns_chart` function.
- Generate the summary table using the `generate_summary_table` function.

In [None]:
def task_return_decomposition_sf(weights_sf: pl.DataFrame) -> pl.DataFrame:
    """ 
    Compute the returns decomposition using the `sf_quant` package.

    Args:
        weights_sf (pl.DataFrame): Data frame containing date, barrid, and weight columns.

    Returns:
        pl.DataFrame: Data frame containing date, portfolio, and return (fwd_return) columns
    """
    # TODO: Finish this function.
    pass

returns_sf = task_return_decomposition_sf(weights_sf)

returns_sf

In [None]:
# TODO: Generate the returns chart

In [None]:
# TODO: Generate the summary table