# Quantitative Momentum Strategy

By Teddy Tenetcha





Momentum investing is an investment strategy that involves buying stocks that have outperformed their peers over a recent look-back period and selling those that have underperformed. In this project, we apply a momentum strategy to the 30 stocks in the Dow Jones Industrial Average (DJIA), a widely followed U.S. index of large, publicly traded companies across diverse sectors, which makes it a useful barometer of overall market trends and performance.

## Methodology
In this section, we employ a momentum trading strategy with the DJIA constituents as our reference universe. This involves analyzing their price trends and performance relative to each other over a specified period in order to identify potential investment opportunities. Our strategy seeks to capitalize on the continuing momentum of outperforming stocks while shorting those with poor performance, with the expectation that these trends will persist over the near to medium term. In other words, we go long the top performers and short the bottom performers among the 30 constituent stocks.

In [2]:
import pandas as pd
import requests
from bs4 import BeautifulSoup
import os
import numpy as np
import yfinance as yf

First, we fetch the list of DJIA constituents from the Wikipedia page:

In [3]:
def fetch_info():
    try:
        url = "https://en.wikipedia.org/wiki/Dow_Jones_Industrial_Average"

        headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:101.0) Gecko/20100101 Firefox/101.0',
            'Accept': 'text/html',
            'Accept-Language': 'en-US,en;q=0.5',
        }

        # Send GET request
        response = requests.get(url, headers=headers)
        soup = BeautifulSoup(response.content, "html.parser")

        # Get all tables on the page
        tables = soup.find_all('table')

        # Convert the tables to dataframes; the components table is assumed
        # to be the second one, which breaks if the page layout changes
        df = pd.read_html(str(tables))[1]

        # Cleanup
        df.drop(columns=['Notes'], inplace=True)
        return df

    except Exception as e:
        print(f'Error loading data: {e}')
        return None




Now let us call the function to store the result in dji_df and output the first five rows, as shown in the following:

In [4]:
import requests
from bs4 import BeautifulSoup
import pandas as pd
from io import StringIO

def fetch_info():
    try:
        url = "https://en.wikipedia.org/wiki/Dow_Jones_Industrial_Average"
        headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:101.0) Gecko/20100101 Firefox/101.0',
            'Accept': 'text/html',
            'Accept-Language': 'en-US,en;q=0.5',
        }

        response = requests.get(url, headers=headers)
        response.raise_for_status()

        soup = BeautifulSoup(response.content, "html.parser")

        # Find all tables with class "wikitable sortable"
        tables = soup.find_all('table', {'class': 'wikitable sortable'})

        for table in tables:
            df = pd.read_html(StringIO(str(table)))[0]
            # Look for expected column to verify it's the right table
            if 'Company' in df.columns:
                # Clean up if necessary
                if 'Notes' in df.columns:
                    df = df.drop(columns=['Notes'])
                return df

        print("Could not find the DJI components table.")
        return None

    except Exception as e:
        print(f"Error loading data: {e}")
        return None

# Now use it
dji_df = fetch_info()
dji_df.head()


| | Company | Exchange | Symbol | Industry | Date added | Index weighting |
|---|---|---|---|---|---|---|
| 0 | 3M | NYSE | MMM | Conglomerate | 1976-08-09 | 2.11% |
| 1 | American Express | NYSE | AXP | Financial services | 1982-08-30 | 4.37% |
| 2 | Amgen | NASDAQ | AMGN | Biopharmaceutical | 2020-08-31 | 3.89% |
| 3 | Amazon | NASDAQ | AMZN | Retailing | 2024-02-26 | 3.21% |
| 4 | Apple | NASDAQ | AAPL | Information technology | 2015-03-19 | 3.25% |


We can then take the Symbol column, extract its values, and convert them to a list:

In [5]:
tickers = dji_df.Symbol.values.tolist()

## Downloading Stock Prices
We pass three input arguments to the download() function: the ticker symbols, the start date, and the end date. In this case, we set the start date to 2024-01-01 and the end date to 2024-09-01 (the end date is exclusive).

In [6]:
start_date = "2024-01-01"
end_date = "2024-09-01"
df = yf.download(tickers, start=start_date, end=end_date)


YF.download() has changed argument auto_adjust default to True


[*********************100%***********************]  30 of 30 completed


In [9]:
# use the adjusted closing prices for follow-up analysis
df = df['Close']


## Calculating Monthly Returns
To move from raw daily stock prices to monthly returns, we need a few steps. The first step is to convert the prices to daily percentage returns using the pct_change() method, which calculates the simple percentage return for every trading day:

$$ R_{t,t+1}=\frac{S_{t+1}-S_t}{S_t} $$

Since this is a daily return, we roll it up to a monthly return by compounding all daily returns within the same month and taking the terminal (compounded) return as the monthly return. In practice, we group all trading days by month and then compute the terminal return for each month.

In [10]:
mth_return_df = df.pct_change().resample("M").agg(lambda x: (x+1).prod()-1)
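
As a quick sanity check on the compounding step, the following sketch uses synthetic prices (not the DJIA data) and groups by calendar month with to_period("M") rather than resample; both choices are assumptions made so that the example runs offline. It illustrates that compounding the daily returns within a month should give the same number as the month-end to month-end price change.

```python
import numpy as np
import pandas as pd

# Synthetic daily prices over three months (illustrative only, not market data)
dates = pd.bdate_range("2024-01-01", "2024-03-29")
rng = np.random.default_rng(0)
prices = pd.Series(100 * np.cumprod(1 + rng.normal(0, 0.01, len(dates))), index=dates)

# Daily simple returns, then compound them within each calendar month
daily = prices.pct_change()
monthly = daily.groupby(daily.index.to_period("M")).agg(lambda x: (x + 1).prod() - 1)

# Cross-check: month-end price divided by the previous month-end price, minus one
month_end = prices.groupby(prices.index.to_period("M")).last()
check = month_end.pct_change()

print(pd.DataFrame({"compounded": monthly, "month_end_change": check}))
```

The first month differs between the two columns only because there is no prior month-end price to compare against.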



To see how resample() works, consider a toy example. We create a pandas Series with nine integers ranging from zero to eight, indexed by nine one-minute timestamps:

In [11]:
# creating a series with 9 one minute timestamps

index = pd.date_range('1/1/2022', periods=9, freq='T')
series = pd.Series(range(9), index=index)
series



| | 0 |
|---|---|
| 2022-01-01 00:00:00 | 0 |
| 2022-01-01 00:01:00 | 1 |
| 2022-01-01 00:02:00 | 2 |
| 2022-01-01 00:03:00 | 3 |
| 2022-01-01 00:04:00 | 4 |
| 2022-01-01 00:05:00 | 5 |
| 2022-01-01 00:06:00 | 6 |
| 2022-01-01 00:07:00 | 7 |
| 2022-01-01 00:08:00 | 8 |


We then aggregate the series into three-minute bins and sum the values of the timestamps falling into a bin, as shown in the following code snippet:

In [None]:
series.resample('3T').sum()


'T' is deprecated and will be removed in a future version, please use 'min' instead.



| | 0 |
|---|---|
| 2022-01-01 00:00:00 | 3 |
| 2022-01-01 00:03:00 | 12 |
| 2022-01-01 00:06:00 | 21 |


## Calculating the Six-Month Terminal Return
We know that making a trading decision based on the current month’s return would be flawed in two ways. First, we rely too much on the current month and ignore historical performances. Second, we run into the risk of data snooping. That is, to calculate the monthly return on a given day of the month, if it does not fall on the last day of the month, we would snoop all future daily returns within the same month in order to calculate the terminal return.

We focus on the first point for now and come back to the second in a moment. We need a way to incorporate historical monthly returns when generating trading signals in the current month. Unlike the moving averages used for stock prices, however, an arithmetic mean of historical monthly returns ignores the sequential compounding process. Therefore, we treat historical monthly returns as a sequence and compound them (up to a specific lookback window) to obtain the terminal return.



In [12]:
# obtain the historical cumulative returns of past 6 months as the terminal return of current month
past_cum_return_df = (mth_return_df+1).rolling(6).apply(np.prod) - 1
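
To make the difference between compounding and simple aggregation concrete, here is a minimal sketch on a handful of made-up monthly returns (the numbers are hypothetical, chosen only for illustration). It applies the same rolling product as above and compares it with a rolling sum over the same window.

```python
import numpy as np
import pandas as pd

# Hypothetical monthly returns for a single asset
monthly = pd.Series([0.10, -0.10, 0.05, 0.02, -0.03, 0.04, 0.01])

# Six-month terminal return: compound (1 + r) over the window, then subtract 1
compounded = (monthly + 1).rolling(6).apply(np.prod, raw=True) - 1

# Naive alternative that ignores compounding: add the returns up
summed = monthly.rolling(6).sum()

print(pd.DataFrame({"compounded": compounded, "summed": summed}))
```

The first five rows are NaN because a full six-month window is not yet available, which is also why the earliest months of past_cum_return_df carry no signal.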

## Generating Trading Signals

We have fixed the lookback window to be six months into the past. The momentum trading strategy involves another lookahead window used to fix the trading horizon in the future. Specifically, suppose we form our trading strategy and make the trading decision in the current month. These new positions will last for a full month in the next month if the lookahead horizon is one. We can then measure the performance of these positions at the end of the next month. In this case, the size of the lookahead window is set to be one.

Since our data runs until 2024-08-31, we use 2024-08-31 as the evaluation period and 2024-07-31 as the trade formation period. To generate the trading signal, we use the six-month terminal return indexed at 2024-06-30, which marks the end of the measurement period.

In [27]:
import datetime as dt

end_of_measurement_period = dt.datetime(2024,6,30)
formation_period = dt.datetime(2024,7,31)
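
The dates above are hard-coded. As a possible alternative, the sketch below derives the formation and evaluation dates from the end of the measurement period with the MonthEnd offset from pandas; this is an assumption about how one might generalize the setup, not part of the original notebook. It also shows why a plain one-month relativedelta shift is not always a safe substitute for month-end dates.

```python
import datetime as dt

import pandas as pd
from dateutil.relativedelta import relativedelta

end_of_measurement_period = pd.Timestamp("2024-06-30")

# MonthEnd(1) moves a month-end date to the next month-end
formation_period = end_of_measurement_period + pd.offsets.MonthEnd(1)
evaluation_period = formation_period + pd.offsets.MonthEnd(1)

# relativedelta keeps the day of month when it exists in the target month,
# so shifting June 30 by one month does not land on the last day of July
naive_shift = dt.datetime(2024, 6, 30) + relativedelta(months=1)

print(formation_period.date(), evaluation_period.date(), naive_shift.date())
# 2024-07-31 2024-08-31 2024-07-30
```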

In [28]:
end_of_measurement_period_return_df = past_cum_return_df.loc[end_of_measurement_period]
end_of_measurement_period_return_df = end_of_measurement_period_return_df.reset_index()
end_of_measurement_period_return_df.head()

| | Ticker | 2024-06-30 00:00:00 |
|---|---|---|
| 0 | AAPL | 0.13755 |
| 1 | AMGN | 0.066374 |
| 2 | AMZN | 0.288935 |
| 3 | AXP | 0.237416 |
| 4 | BA | -0.27705 |


The six-month terminal monthly returns of the 30 DJI constituents represent the relative momentum of each stock. We can observe the stock symbols and returns with the highest momentum in the positive and negative directions.

In [29]:
# highest momentum in the positive direction

end_of_measurement_period_return_df.loc[end_of_measurement_period_return_df.iloc[:,1].idxmax()]

| | 22 |
|---|---|
| Ticker | NVDA |
| 2024-06-30 00:00:00 | 1.565104 |


In [30]:
# highest momentum in the negative direction

end_of_measurement_period_return_df.loc[end_of_measurement_period_return_df.iloc[:,1].idxmin()]

| | 21 |
|---|---|
| Ticker | NKE |
| 2024-06-30 00:00:00 | -0.287331 |


These two stocks would be the natural choices if we were to long or short a single asset. Instead of focusing on only one stock in each direction (long and short), we can widen the selection and use a quantile approach. For example, we can classify all stocks into five groups (quintiles) based on their returns and form a trading strategy that longs the stocks in the top quintile and shorts those in the bottom quintile.

To obtain the quantile group of each return, we can use the qcut() function from pandas, which takes a Series and cuts it into a prespecified number of groups based on their quantiles, thus discretizing a continuous variable into a categorical (more specifically, ordinal) one. Applied to the toy series from the resampling example:

In [31]:
pd.qcut(series, 5, labels=False)

| | 0 |
|---|---|
| 2022-01-01 00:00:00 | 0 |
| 2022-01-01 00:01:00 | 0 |
| 2022-01-01 00:02:00 | 1 |
| 2022-01-01 00:03:00 | 1 |
| 2022-01-01 00:04:00 | 2 |
| 2022-01-01 00:05:00 | 3 |
| 2022-01-01 00:06:00 | 3 |
| 2022-01-01 00:07:00 | 4 |
| 2022-01-01 00:08:00 | 4 |


In [32]:
end_of_measurement_period_return_df['rank'] = pd.qcut(end_of_measurement_period_return_df.iloc[:,1], 5, labels=False)
end_of_measurement_period_return_df

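| | Ticker | 2024-06-30 00:00:00 | rank |
|---|---|---|---|
| 0 | AAPL | 0.13755 | 3 |
| 1 | AMGN | 0.066374 | 1 |
| 2 | AMZN | 0.288935 | 4 |
| 3 | AXP | 0.237416 | 4 |
| 4 | BA | -0.27705 | 0 |
| 5 | CAT | 0.147426 | 3 |
| 6 | CRM | 0.005098 | 1 |
| 7 | CSCO | -0.04433 | 0 |
| 8 | CVX | 0.068528 | 2 |
| 9 | DIS | 0.094587 | 2 |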


In [33]:
long_stocks = end_of_measurement_period_return_df.loc[end_of_measurement_period_return_df["rank"]==4,"Ticker"].values
long_stocks

array(['AMZN', 'AXP', 'JPM', 'MSFT', 'NVDA', 'WMT'], dtype=object)

In [34]:
short_stocks = end_of_measurement_period_return_df.loc[end_of_measurement_period_return_df["rank"]==0,"Ticker"].values
short_stocks

array(['BA', 'CSCO', 'JNJ', 'MCD', 'NKE', 'UNH'], dtype=object)
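
The ranking and selection steps can also be wrapped into one small helper. The sketch below is one way to do it; the function name select_long_short and the ten made-up tickers and returns are hypothetical and exist only for illustration.

```python
import pandas as pd


def select_long_short(returns, n_groups=5):
    """Return (long, short) ticker lists from the top and bottom quantile groups."""
    ranks = pd.qcut(returns, n_groups, labels=False)
    longs = ranks.index[ranks == n_groups - 1].tolist()
    shorts = ranks.index[ranks == 0].tolist()
    return longs, shorts


# Hypothetical six-month returns for ten made-up tickers
returns = pd.Series(
    [0.12, -0.05, 0.30, 0.02, -0.20, 0.08, 0.15, -0.01, 0.22, 0.05],
    index=list("ABCDEFGHIJ"),
)

longs, shorts = select_long_short(returns)
print("long:", longs)
print("short:", shorts)
```

With ten stocks and five groups, each quintile holds two names; with the 30 DJIA constituents it holds six, which matches the six tickers in each of the arrays above.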

## Evaluating Out-of-Sample Performance
Let us first grab the monthly return indexed at 2024-08-31 from mth_return_df for the long and short positions, respectively. We use relativedelta from the dateutil package to shift formation_period one month into the future, arriving at the evaluation period; this is the row-level condition passed to .loc[]. For the column-level condition, we subset the columns to the stock symbols in the long positions using the isin() method.

In [35]:
from dateutil.relativedelta import relativedelta

long_return_df = mth_return_df.loc[formation_period +  relativedelta(months=1), mth_return_df.columns.isin(long_stocks)]
long_return_df

| Ticker | 2024-08-31 |
|---|---|
| AMZN | -0.045352 |
| AXP | 0.02217 |
| JPM | 0.056391 |
| MSFT | -0.001095 |
| NVDA | 0.020082 |
| WMT | 0.128353 |


The result shows that two of the six top performers (AMZN and MSFT) lost value in the evaluation month, so past momentum did not carry over uniformly. We can similarly obtain the evaluation-period performance of the bottom performers in the short position.

In [37]:
short_return_df = mth_return_df.loc[formation_period +  relativedelta(months=1), mth_return_df.columns.isin(short_stocks)]
short_return_df

| Ticker | 2024-08-31 |
|---|---|
| BA | -0.088458 |
| CSCO | 0.043137 |
| JNJ | 0.05872 |
| MCD | 0.087641 |
| NKE | 0.113011 |
| UNH | 0.024368 |


Now we calculate the return of the evaluation period based on these two positions. We assume an equally weighted portfolio in both positions. Thus, the final return is the average of all member stocks in the respective position. Also, since we hold a short position for the bottom performers, we subtract the average return from the short position in these stocks while adding the average return from the long position.

In [38]:
momentum_profit = long_return_df.mean() - short_return_df.mean()
momentum_profit

np.float64(-0.009645176034173236)
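
The figure above can be reproduced by hand from the two output tables. The sketch below simply retypes the rounded evaluation-month returns shown earlier, so the result agrees with the notebook value only up to rounding.

```python
import numpy as np

# Evaluation-month returns copied from the long and short output tables above
long_returns = np.array([-0.045352, 0.02217, 0.056391, -0.001095, 0.020082, 0.128353])
short_returns = np.array([-0.088458, 0.043137, 0.05872, 0.087641, 0.113011, 0.024368])

long_leg = long_returns.mean()    # equally weighted long position
short_leg = short_returns.mean()  # equally weighted short position

# Long leg earns its average return; short leg loses what the shorted stocks gained
momentum_profit = long_leg - short_leg
print(f"long: {long_leg:.4%}, short: {short_leg:.4%}, profit: {momentum_profit:.4%}")
```

The long leg gained about 3.0%, but the shorted stocks rose by about 4.0% on average, which is what pushes the combined result slightly negative.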

## Comparing with the Buy-and-Hold Strategy
We use a buy-and-hold strategy on the DJI index as the benchmark. This means entering a long position in the index at the beginning of the trading period on 2024-01-01 and holding it all the way until 2024-09-01.

In [39]:
df_dji = yf.download("^DJI", start=start_date, end=end_date)

[*********************100%***********************]  1 of 1 completed


In [40]:
buy_n_hold_df = df_dji['Close'].pct_change().resample("M").agg(lambda x: (x+1).prod()-1)
buy_n_hold_df.head()



| Date | ^DJI |
|---|---|
| 2024-01-31 | 0.011541 |
| 2024-02-29 | 0.022178 |
| 2024-03-31 | 0.020796 |
| 2024-04-30 | -0.050027 |
| 2024-05-31 | 0.023017 |


In [None]:
buy_n_hold_df.loc[formation_period + relativedelta(months=1),]

| Ticker | 2024-08-31 |
|---|---|
| ^DJI | 0.017636 |


The buy-and-hold strategy thus reports a monthly return of 1.76% in the same evaluation period, whereas the momentum strategy lost 0.96%. The momentum strategy therefore underperformed the benchmark in this single month. More robust backtesting of the out-of-sample performance across multiple periods would be needed before drawing firm conclusions.
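
A multi-period backtest could follow the same measurement, formation, and evaluation sequence, repeated month by month. The sketch below is one possible shape for such a loop, run on randomly generated monthly returns; the function name backtest_momentum, the synthetic data, and the 24-month horizon are all assumptions for illustration, so the printed statistics say nothing about real momentum performance.

```python
import numpy as np
import pandas as pd


def backtest_momentum(monthly_returns, lookback=6, n_groups=5):
    """Long the top group, short the bottom group, hold for one month.

    Row i ends the measurement window, row i + 1 is the formation month,
    and row i + 2 is the evaluation month.
    """
    cum = (monthly_returns + 1).rolling(lookback).apply(np.prod, raw=True) - 1
    profits = []
    for i in range(lookback - 1, len(monthly_returns) - 2):
        ranks = pd.qcut(cum.iloc[i], n_groups, labels=False)
        longs = ranks.index[ranks == n_groups - 1]
        shorts = ranks.index[ranks == 0]
        evaluation = monthly_returns.iloc[i + 2]
        profits.append(evaluation[longs].mean() - evaluation[shorts].mean())
    return pd.Series(profits, index=monthly_returns.index[lookback + 1:])


# Synthetic monthly returns for 30 made-up tickers over 24 months
rng = np.random.default_rng(0)
months = pd.period_range("2022-01", periods=24, freq="M")
tickers = [f"S{i:02d}" for i in range(30)]
monthly_returns = pd.DataFrame(
    rng.normal(0.01, 0.05, size=(24, 30)), index=months, columns=tickers
)

profits = backtest_momentum(monthly_returns)
print(profits.describe())
```

Replacing the synthetic frame with mth_return_df computed over a longer download window would give a distribution of monthly long-short profits instead of the single data point above.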

## Optimization

In [41]:
# Fetch DJI Tickers
def fetch_dji_tickers():
    url = "https://en.wikipedia.org/wiki/Dow_Jones_Industrial_Average"
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:101.0) Gecko/20100101 Firefox/101.0',
        'Accept': 'text/html',
        'Accept-Language': 'en-US,en;q=0.5',
    }
    response = requests.get(url, headers=headers)
    soup = BeautifulSoup(response.content, "html.parser")
    tables = soup.find_all('table', {'class': 'wikitable sortable'})
    for table in tables:
        df = pd.read_html(StringIO(str(table)))[0]
        if 'Symbol' in df.columns:
            return df['Symbol'].str.replace('.', '-', regex=False).tolist()
    raise Exception("Could not find DJI tickers.")



In [42]:
# Download Stock Prices
def download_prices(tickers, start_date, end_date):
    df = yf.download(tickers, start=start_date, end=end_date)
    if isinstance(df.columns, pd.MultiIndex):
        if 'Adj Close' in df.columns.levels[0]:
            return df['Adj Close']
        else:
            return df['Close']
    else:
        return df

In [43]:
# Calculate Momentum Scores
def calculate_momentum(prices, lookback_days=126):
    returns = prices.pct_change(periods=lookback_days)
    momentum_scores = returns.iloc[-1]
    return momentum_scores

In [44]:
# Select Top Momentum Stocks
def select_top_momentum(momentum_scores, top_n=10):
    return momentum_scores.nlargest(top_n).index.tolist()


In [45]:
# Random Portfolio Simulation
def simulate_portfolios(prices, tickers, num_portfolios=5000):
    returns = prices[tickers].pct_change().dropna()
    mean_returns = returns.mean() * 252
    cov_matrix = returns.cov() * 252

    results = []
    weights_record = []

    for _ in range(num_portfolios):
        weights = np.random.random(len(tickers))
        weights /= np.sum(weights)
        portfolio_return = np.sum(weights * mean_returns)
        portfolio_volatility = np.sqrt(np.dot(weights.T, np.dot(cov_matrix, weights)))
        sharpe_ratio = portfolio_return / portfolio_volatility
        results.append([portfolio_return, portfolio_volatility, sharpe_ratio])
        weights_record.append(weights)

    results = np.array(results)
    return results, weights_record, mean_returns, cov_matrix

In [46]:
# Find Optimal Portfolio
def find_optimal_portfolio(results, weights_record, tickers):
    max_sharpe_idx = np.argmax(results[:, 2])
    best_weights = weights_record[max_sharpe_idx]
    best_return = results[max_sharpe_idx, 0]
    best_volatility = results[max_sharpe_idx, 1]
    best_sharpe = results[max_sharpe_idx, 2]
    return {
        'weights': dict(zip(tickers, best_weights)),
        'return': best_return,
        'volatility': best_volatility,
        'sharpe': best_sharpe
    }
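
The cells above define the helpers, but the notebook does not show the call that produces results and optimal_portfolio. The following is a sketch of how the steps might fit together, run on synthetic prices so that it works offline; the tickers, the seed, and the vectorized simulation (instead of the loop in simulate_portfolios) are assumptions, not the original run.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Synthetic daily prices for 15 made-up tickers (stand-in for download_prices)
dates = pd.bdate_range("2023-01-02", periods=300)
tickers = [f"T{i:02d}" for i in range(15)]
daily = rng.normal(0.0005, 0.01, size=(len(dates), len(tickers)))
prices = pd.DataFrame(100 * np.cumprod(1 + daily, axis=0), index=dates, columns=tickers)

# Momentum score: return over the last 126 trading days; keep the top 10
scores = prices.pct_change(periods=126).iloc[-1]
top = scores.nlargest(10).index.tolist()

# Annualized mean returns and covariance of the selected stocks
rets = prices[top].pct_change().dropna()
mean_returns = rets.mean().to_numpy() * 252
cov = rets.cov().to_numpy() * 252

# 5000 random long-only weight vectors, each normalized to sum to one
weights = rng.random((5000, len(top)))
weights /= weights.sum(axis=1, keepdims=True)

port_ret = weights @ mean_returns
port_vol = np.sqrt(np.einsum("ij,jk,ik->i", weights, cov, weights))
sharpe = port_ret / port_vol  # risk-free rate taken as zero, as in simulate_portfolios

results = np.column_stack([port_ret, port_vol, sharpe])
best = int(np.argmax(sharpe))
optimal_portfolio = {
    "weights": dict(zip(top, weights[best])),
    "return": port_ret[best],
    "volatility": port_vol[best],
    "sharpe": sharpe[best],
}
print(optimal_portfolio)
```

The resulting results array and optimal_portfolio dictionary have the same layout as the ones consumed by the plotting and printing cells that follow.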




In [68]:
import plotly.express as px
import plotly.io as pio

pio.renderers.default = 'notebook'

def plot_efficient_frontier(results):
    df = pd.DataFrame(results, columns=['Return', 'Volatility', 'Sharpe'])
    fig = px.scatter(df, x='Volatility', y='Return', color='Sharpe', title='Efficient Frontier')
    fig.show()

In [65]:
def print_optimal_weights(optimal_portfolio):
    print("\nOptimal Portfolio Allocation:\n")
    for stock, weight in optimal_portfolio['weights'].items():
        print(f"{stock}: {weight:.2%}")
    print(f"\nExpected Return: {optimal_portfolio['return']:.2%}")
    print(f"Expected Volatility: {optimal_portfolio['volatility']:.2%}")
    print(f"Sharpe Ratio: {optimal_portfolio['sharpe']:.2f}")

# optimal_portfolio is assumed to come from find_optimal_portfolio() run in an earlier cell
print_optimal_weights(optimal_portfolio)


Optimal Portfolio Allocation:

MMM: 4.52%
NVDA: 9.43%
GS: 7.35%
WMT: 19.31%
AAPL: 4.69%
KO: 27.19%
JPM: 6.51%
UNH: 5.84%
AMGN: 6.13%
AXP: 9.01%

Expected Return: 52.98%
Expected Volatility: 10.57%
Sharpe Ratio: 5.01


In [66]:
import numpy as np
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
import plotly.io as pio

pio.renderers.default = 'colab'

def plot_efficient_frontier_with_best_px(results):
    # Prepare the dataframe
    df = pd.DataFrame(results, columns=['Return', 'Volatility', 'Sharpe'])

    # Find the best Sharpe ratio portfolio
    max_sharpe_idx = np.argmax(df['Sharpe'])
    best_portfolio = df.iloc[max_sharpe_idx]

    # Create the scatter plot
    fig = px.scatter(
        df,
        x='Volatility',
        y='Return',
        color='Sharpe',
        color_continuous_scale='viridis',
        title='Efficient Frontier (with Best Portfolio ⭐)'
    )

    # Add red star for best Sharpe portfolio
    fig.add_trace(go.Scatter(
        x=[best_portfolio['Volatility']],
        y=[best_portfolio['Return']],
        mode='markers',
        marker=dict(color='red', size=15, symbol='star'),
        name='Best Sharpe Portfolio'
    ))

    fig.update_layout(coloraxis_colorbar=dict(title="Sharpe Ratio"))
    fig.show()

plot_efficient_frontier_with_best_px(results)


## Summary
In this project, we first built a long-short momentum strategy on the Dow Jones Industrial Average stocks based on their past six-month returns, and then selected the top 10 momentum stocks and optimized the portfolio allocation to maximize risk-adjusted returns. The raw momentum portfolio underperformed in the evaluation month, returning -0.96% against a buy-and-hold return of +1.76%, which shows that momentum signals alone can falter depending on market conditions.

After applying portfolio optimization, specifically maximizing the Sharpe ratio over randomly simulated weights, the best portfolio showed an expected annual return of 52.98% with a volatility of 10.57%, for a Sharpe ratio of 5.01. These figures are annualized in-sample estimates, so they are not directly comparable to the one-month out-of-sample result. The optimized portfolio put its largest weights on defensive stocks such as Coca-Cola (27.19%) and Walmart (19.31%), alongside growth names such as NVIDIA (9.43%). Overall, this project suggests that while momentum provides a useful starting signal, combining it with optimization techniques can help in building a more robust trading strategy.