# Momentum trading strategy
Ernest Chan's Quantitative Trading contrasts two paradigms: mean reversion and momentum. I believe that crypto assets are much more momentum-driven, given their speculative nature and the sheer number of retail traders.

I will test that hypothesis in this notebook.

In [1]:
from dotenv import load_dotenv
import os
from pathlib import Path

import numpy as np
import pandas as pd

import matplotlib.pyplot as plt
import mplfinance as mpf

In [2]:
load_dotenv()

True

# Data loading

In [3]:
BTCUSDT_FOLDER_PATH = os.getenv("BTCUSDT_FOLDER_PATH")

In [4]:
def retrieve_csv_files(directory):
    csv_files = []
    for file in os.listdir(directory):
        if file.endswith('.csv'):
            csv_files.append(os.path.join(directory, file))
    csv_files.sort()

    return csv_files

In [5]:
btcusdt_csv_files = retrieve_csv_files(BTCUSDT_FOLDER_PATH)

In [6]:
columns = [
    'open time',
    'open',
    'high',
    'low',
    'close',
    'volume',
    'close time',
    'quote asset volume',
    'number of trades',
    'taker buy base asset volume',
    'taker buy quote asset volume',
    'ignore'
]

In [7]:
# The two periods have to be processed separately because the timestamp unit
# changed from milliseconds to microseconds at the start of 2025
df_before_2025 = pd.DataFrame(columns=columns)
df_2025 = pd.DataFrame(columns=columns)

In [8]:
for file in btcusdt_csv_files:
    filename = os.path.basename(file)
    temp_df = pd.read_csv(file, names=columns)
    if '2024' in filename:
        df_before_2025 = pd.concat([df_before_2025, temp_df])
    else:
        df_2025 = pd.concat([df_2025, temp_df])



In [9]:
df_before_2025.shape, df_2025.shape

((3474, 12), (936, 12))

In [10]:
df_before_2025

Unnamed: 0,open time,open,high,low,close,volume,close time,quote asset volume,number of trades,taker buy base asset volume,taker buy quote asset volume,ignore
0,1723161600000,61686.00,61744.37,61086.95,61214.00,2033.42760,1723165199999,1.249798e+08,104192,955.82560,5.874844e+07,0
1,1723165200000,61214.01,61550.08,60720.00,61400.00,1766.56142,1723168799999,1.080571e+08,82677,802.34639,4.908559e+07,0
2,1723168800000,61400.00,61461.03,61147.88,61235.51,829.95324,1723172399999,5.088503e+07,48800,417.87153,2.562022e+07,0
3,1723172400000,61235.51,61573.27,61171.86,61358.39,743.62787,1723175999999,4.561472e+07,42618,381.79973,2.341730e+07,0
4,1723176000000,61358.39,61390.70,60720.00,60720.01,1063.85453,1723179599999,6.493274e+07,63522,412.04926,2.514318e+07,0
...,...,...,...,...,...,...,...,...,...,...,...,...
19,1735671600000,93875.69,94290.91,93712.45,94166.88,462.79342,1735675199999,4.350635e+07,104273,253.35809,2.381812e+07,0
20,1735675200000,94166.88,94222.50,93450.17,93564.04,733.04147,1735678799999,6.875354e+07,117956,344.26486,3.228285e+07,0
21,1735678800000,93564.01,93964.15,93504.67,93899.68,337.52715,1735682399999,3.163012e+07,68192,170.68198,1.599449e+07,0
22,1735682400000,93899.67,93899.67,93375.58,93488.84,315.53272,1735685999999,2.955364e+07,53117,162.17454,1.519085e+07,0


In [11]:
df_2025

Unnamed: 0,open time,open,high,low,close,volume,close time,quote asset volume,number of trades,taker buy base asset volume,taker buy quote asset volume,ignore
0,1735689600000000,93576.00,94509.42,93489.03,94401.14,755.99010,1735693199999999,7.106881e+07,93525,421.08319,3.959678e+07,0
1,1735693200000000,94401.13,94408.72,93578.77,93607.74,586.53456,1735696799999999,5.509661e+07,79943,257.42023,2.418794e+07,0
2,1735696800000000,93607.74,94105.12,93594.56,94098.91,276.78045,1735700399999999,2.597409e+07,55078,185.35204,1.739377e+07,0
3,1735700400000000,94098.90,94098.91,93728.22,93838.04,220.99302,1735703999999999,2.074804e+07,35001,119.92140,1.125867e+07,0
4,1735704000000000,93838.04,93838.04,93500.00,93553.91,279.46909,1735707599999999,2.617906e+07,38597,132.98547,1.246062e+07,0
...,...,...,...,...,...,...,...,...,...,...,...,...
19,1739041200000000,96512.00,96750.00,96399.27,96579.99,279.95367,1739044799999999,2.703996e+07,58626,133.27011,1.287272e+07,0
20,1739044800000000,96579.99,96638.89,96358.50,96455.92,264.21148,1739048399999999,2.548104e+07,52758,97.23387,9.377567e+06,0
21,1739048400000000,96455.92,96600.00,96360.00,96527.36,224.76729,1739051999999999,2.168460e+07,46206,78.51361,7.574618e+06,0
22,1739052000000000,96527.37,96712.00,96388.88,96473.90,286.06330,1739055599999999,2.761928e+07,45007,111.90036,1.080221e+07,0


In [12]:
df_before_2025['open_timestamp'] = pd.to_datetime(df_before_2025['open time'], unit='ms')
df_before_2025.set_index('open_timestamp', inplace=True)

df_2025['open_timestamp'] = pd.to_datetime(df_2025['open time'], unit='us')
df_2025.set_index('open_timestamp', inplace=True)
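The split into two frames works because the file names reveal which unit applies. As a hedged alternative, the unit could be inferred from the magnitude of each value: millisecond epochs for these dates are around 1.7e12, while microsecond epochs are around 1.7e15. The helper name `parse_open_time` and the `10**14` threshold are assumptions made for illustration, not part of the original notebook.

```python
import pandas as pd

def parse_open_time(raw: pd.Series) -> pd.Series:
    """Convert epoch timestamps that may be in ms or us to datetimes.

    Assumption: any value above 10**14 is in microseconds. This holds for
    dates in the 2020s but is a heuristic, not a general rule.
    """
    raw = raw.astype("int64")
    is_microseconds = raw > 10**14
    # Keep millisecond values as they are, scale microsecond values down
    as_milliseconds = raw.where(~is_microseconds, raw // 1000)
    return pd.to_datetime(as_milliseconds, unit="ms")

# First 'open time' of the 2024 data and of the 2025 data shown above
sample = pd.Series([1723161600000, 1735689600000000])
print(parse_open_time(sample))
```

With a helper like this, all files could be read into a single frame before the conversion, at the cost of relying on the magnitude heuristic.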

In [13]:
df = pd.concat([df_before_2025, df_2025])
df

open_timestamp,open time,open,high,low,close,volume,close time,quote asset volume,number of trades,taker buy base asset volume,taker buy quote asset volume,ignore
2024-08-09 00:00:00,1723161600000,61686.00,61744.37,61086.95,61214.00,2033.42760,1723165199999,1.249798e+08,104192,955.82560,5.874844e+07,0
2024-08-09 01:00:00,1723165200000,61214.01,61550.08,60720.00,61400.00,1766.56142,1723168799999,1.080571e+08,82677,802.34639,4.908559e+07,0
2024-08-09 02:00:00,1723168800000,61400.00,61461.03,61147.88,61235.51,829.95324,1723172399999,5.088503e+07,48800,417.87153,2.562022e+07,0
2024-08-09 03:00:00,1723172400000,61235.51,61573.27,61171.86,61358.39,743.62787,1723175999999,4.561472e+07,42618,381.79973,2.341730e+07,0
2024-08-09 04:00:00,1723176000000,61358.39,61390.70,60720.00,60720.01,1063.85453,1723179599999,6.493274e+07,63522,412.04926,2.514318e+07,0
...,...,...,...,...,...,...,...,...,...,...,...,...
2025-02-08 19:00:00,1739041200000000,96512.00,96750.00,96399.27,96579.99,279.95367,1739044799999999,2.703996e+07,58626,133.27011,1.287272e+07,0
2025-02-08 20:00:00,1739044800000000,96579.99,96638.89,96358.50,96455.92,264.21148,1739048399999999,2.548104e+07,52758,97.23387,9.377567e+06,0
2025-02-08 21:00:00,1739048400000000,96455.92,96600.00,96360.00,96527.36,224.76729,1739051999999999,2.168460e+07,46206,78.51361,7.574618e+06,0
2025-02-08 22:00:00,1739052000000000,96527.37,96712.00,96388.88,96473.90,286.06330,1739055599999999,2.761928e+07,45007,111.90036,1.080221e+07,0


## Data quality check
### Check for missing data

In [14]:
time_diff = df.reset_index()['open_timestamp'].diff()

expected_interval = pd.Timedelta('1 hour')
missing_data = time_diff != expected_interval

if missing_data.any():
    print('Missing data')
    print(time_diff[missing_data])

Missing data
0                  NaT
2466   0 days 07:00:00
Name: open_timestamp, dtype: timedelta64[ns]


In [15]:
print(df.iloc[2448:2467]) 

                         open time      open      high       low     close  \
open_timestamp                                                               
2024-11-19 00:00:00  1731974400000  90464.07  91228.58  90357.00  91042.94   
2024-11-19 01:00:00  1731978000000  91042.95  91241.46  90616.00  90714.26   
2024-11-19 02:00:00  1731981600000  90714.27  91271.65  90702.73  91236.00   
2024-11-19 03:00:00  1731985200000  91236.00  91529.00  91163.90  91428.00   
2024-11-19 04:00:00  1731988800000  91428.00  91816.28  91248.01  91528.00   
2024-11-19 05:00:00  1731992400000  91528.00  91891.80  91238.09  91877.00   
2024-11-19 06:00:00  1731996000000  91877.00  91980.00  91598.50  91959.11   
2024-11-19 07:00:00  1731999600000  91959.11  92000.00  91600.00  91620.01   
2024-11-19 08:00:00  1732003200000  91620.00  91781.00  91400.03  91780.99   
2024-11-19 09:00:00  1732006800000  91780.99  91950.00  91200.00  91694.38   
2024-11-19 10:00:00  1732010400000  91694.37  91916.24  91485.04

There is a 7-hour jump between consecutive candles on 2024-11-19, which means six hourly candles are missing. We will create a new column called `data_quality` that flags each row as `pass` or `fail`, mark the remaining candles of that day as `fail`, and run our backtest only on the rows where the data quality is `pass`.
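To list exactly which candles are absent, rather than only the size of the jump, one option is to compare the index against a complete hourly range. This is a minimal sketch; `find_missing_candles` is a hypothetical helper, and the toy index below only mimics a 7-hour jump rather than reproducing the notebook's data.

```python
import pandas as pd

ONE_HOUR = pd.Timedelta(hours=1)

def find_missing_candles(index, step=ONE_HOUR):
    """Return the timestamps that a gap-free index would contain but `index` lacks."""
    expected = pd.date_range(index.min(), index.max(), freq=step)
    return expected.difference(index)

# Toy index: hourly candles from 00:00 to 17:00, then the next candle at 00:00
present = pd.date_range("2024-11-19 00:00", "2024-11-19 17:00", freq=ONE_HOUR)
toy_index = present.append(pd.DatetimeIndex(["2024-11-20 00:00"]))

missing = find_missing_candles(toy_index)
print(len(missing), missing[0], missing[-1])
```

On the notebook's data the call would be `find_missing_candles(df.index)`.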

In [16]:
df['data_quality'] = 'pass'

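# Positional indices: 2448 is the first candle of 2024-11-19,
# 2466 is the first candle after the gap (excluded by the slice below)
gap_start_index = 2448
gap_end_index = 2466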

df.loc[df.index[gap_start_index:gap_end_index], 'data_quality'] = 'fail'

In [17]:
df.iloc[2448:2467]

open_timestamp,open time,open,high,low,close,volume,close time,quote asset volume,number of trades,taker buy base asset volume,taker buy quote asset volume,ignore,data_quality
2024-11-19 00:00:00,1731974400000,90464.07,91228.58,90357.0,91042.94,1152.06439,1731977999999,104509800.0,176262,695.51421,63099920.0,0,fail
2024-11-19 01:00:00,1731978000000,91042.95,91241.46,90616.0,90714.26,692.68474,1731981599999,62993600.0,136859,317.1187,28846900.0,0,fail
2024-11-19 02:00:00,1731981600000,90714.27,91271.65,90702.73,91236.0,895.46765,1731985199999,81576750.0,115961,470.05079,42822140.0,0,fail
2024-11-19 03:00:00,1731985200000,91236.0,91529.0,91163.9,91428.0,835.04934,1731988799999,76287680.0,110512,467.43654,42709270.0,0,fail
2024-11-19 04:00:00,1731988800000,91428.0,91816.28,91248.01,91528.0,1005.5469,1731992399999,92025640.0,141302,508.53787,46552500.0,0,fail
2024-11-19 05:00:00,1731992400000,91528.0,91891.8,91238.09,91877.0,968.00577,1731995999999,88637130.0,108267,501.17451,45896690.0,0,fail
2024-11-19 06:00:00,1731996000000,91877.0,91980.0,91598.5,91959.11,951.70937,1731999599999,87382890.0,120901,473.04061,43437020.0,0,fail
2024-11-19 07:00:00,1731999600000,91959.11,92000.0,91600.0,91620.01,968.54265,1732003199999,88935260.0,105126,387.56373,35593230.0,0,fail
2024-11-19 08:00:00,1732003200000,91620.0,91781.0,91400.03,91780.99,766.0414,1732006799999,70179630.0,109748,368.19716,33735830.0,0,fail
2024-11-19 09:00:00,1732006800000,91780.99,91950.0,91200.0,91694.38,1510.11782,1732010399999,138147700.0,191570,806.75546,73818160.0,0,fail


# Momentum trading strategy
We will implement a momentum trading strategy using a simple moving average (SMA) over a lookback window and the relative strength index (RSI). I chose momentum over mean reversion because crypto prices in general follow a very strong 'herd mentality'. Let's test this hypothesis.

In [18]:
def calculate_returns(prices):
    return prices.pct_change()

def calculate_sma(prices, lookback_period):
    return prices.rolling(window=lookback_period).mean()

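def calculate_rsi(prices, periods=14):
    # RSI variant that uses simple rolling means of gains and losses
    delta = prices.diff()
    gain = (delta.where(delta > 0, 0)).rolling(window=periods).mean()
    loss = (-delta.where(delta < 0, 0)).rolling(window=periods).mean()
    # Relative strength; a window with gains but no losses gives RSI = 100
    rs = gain / loss
    return 100 - (100 / (1 + rs))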
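`calculate_rsi` above averages gains and losses with a simple rolling mean. Wilder's RSI smooths them exponentially instead, which is commonly approximated by an EMA with `alpha = 1 / periods`. The sketch below shows that variant for comparison; `calculate_rsi_wilder` is a hypothetical name, and the seeding differs slightly from Wilder's original (which starts from a simple average), so early values will not match charting tools exactly.

```python
import pandas as pd

def calculate_rsi_wilder(prices, periods=14):
    """RSI with Wilder-style exponential smoothing (alpha = 1 / periods)."""
    delta = prices.diff()
    gain = delta.clip(lower=0)        # positive moves, zero otherwise
    loss = (-delta).clip(lower=0)     # magnitude of negative moves, zero otherwise
    avg_gain = gain.ewm(alpha=1 / periods, min_periods=periods, adjust=False).mean()
    avg_loss = loss.ewm(alpha=1 / periods, min_periods=periods, adjust=False).mean()
    rs = avg_gain / avg_loss          # inf when there are no losses, giving RSI = 100
    return 100 - (100 / (1 + rs))

# Sanity check on a steadily rising toy series
rising = pd.Series(range(1, 40), dtype=float)
print(calculate_rsi_wilder(rising).iloc[-1])
```

Swapping this in for `calculate_rsi` would change the signals, so any comparison should rerun the full backtest.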

In [19]:
def generate_mean_reversion_signals(df, sma, rsi):
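    """
    Generate trading signals based on SMA and RSI
    
    Despite the function name, the rule is trend-following (momentum):
    long when the close is above the SMA, short when it is below.
    
    Returns:
    Series with values: 1 (buy), -1 (sell), 0 (no position)
    """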
    signals = pd.Series(0, index=df.index)
    
    # Basic momentum signals
    signals[df['close'] > sma] = 1  # Buy signal
    signals[df['close'] < sma] = -1  # Sell signal
    
    # Filter signals using RSI
    signals[(signals == 1) & (rsi < 30)] = 0  # Drop buy signals while RSI is oversold (< 30)
    signals[(signals == -1) & (rsi > 70)] = 0  # Drop sell signals while RSI is overbought (> 70)
    
    return signals

In [20]:
def generate_bollinger_signals(df, lookback_period, num_std):
    """
    Generate signals for Bollinger Bands strategy
    
    Parameters:
    df: DataFrame with OHLC data and a precomputed 'SMA' column
        (the 'Upper_Band' and 'Lower_Band' columns are added in place)
    lookback_period: Period for SMA and standard deviation calculation
    num_std: Number of standard deviations for band calculation
    """
    # Calculate Bollinger Bands
    std = df['close'].rolling(window=lookback_period).std()
    upper_band = df['SMA'] + (std * num_std)
    lower_band = df['SMA'] - (std * num_std)
    
    signals = pd.Series(index=df.index, data=0)
    
    # Bollinger Bands logic: the signal is non-zero only while the close is outside the bands
    signals[df['close'] < lower_band] = 1    # Buy while price is below the lower band
    signals[df['close'] > upper_band] = -1   # Sell while price is above the upper band
    
    # Store the bands on the DataFrame so they can be inspected or plotted later
    df['Upper_Band'] = upper_band
    df['Lower_Band'] = lower_band
    
    return signals
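`generate_bollinger_signals` relies on a precomputed `df['SMA']` column. As a small worked example of the band arithmetic on its own, the sketch below computes the bands directly from a close series; `bollinger_bands` is a hypothetical helper and the prices are toy values.

```python
import pandas as pd

def bollinger_bands(close, lookback_period=20, num_std=2.0):
    """Return SMA, upper band and lower band for a close-price series."""
    rolling = close.rolling(window=lookback_period)
    sma = rolling.mean()
    std = rolling.std()  # sample standard deviation (ddof=1), as in the notebook
    return pd.DataFrame({
        "SMA": sma,
        "Upper_Band": sma + num_std * std,
        "Lower_Band": sma - num_std * std,
    })

# Toy prices: flat at 10, then a jump to 16
close = pd.Series([10.0, 10.0, 10.0, 10.0, 16.0])
bands = bollinger_bands(close, lookback_period=4, num_std=1.0)
print(bands)
```

In the last window (10, 10, 10, 16) the mean is 11.5 and the sample standard deviation is 3, so the upper band is 14.5 and the close of 16 sits above it, which the notebook's rule would treat as a sell signal.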

In [21]:
def calculate_strategy_returns(price_returns, signals):
    # Shift signals by one bar so the signal from bar t earns the return of bar t+1 (no look-ahead)
    return price_returns * signals.shift(1)

def calculate_cumulative_returns(returns):
    return (1 + returns).cumprod()
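The effect of `signals.shift(1)` is easiest to see on a few bars. The toy series below is illustrative only: it compares the shifted calculation with a look-ahead version that multiplies each return by the signal from the same bar.

```python
import pandas as pd

prices = pd.Series([100.0, 110.0, 99.0, 108.9])
signals = pd.Series([1, 1, -1, 0])

returns = prices.pct_change()                  # NaN, +10%, -10%, +10%

# Signal from the previous bar decides the position held during this bar
strategy_returns = returns * signals.shift(1)
equity = (1 + strategy_returns).cumprod()

# Look-ahead version: uses a signal that is only known once the bar has closed
lookahead_returns = returns * signals
lookahead_equity = (1 + lookahead_returns).cumprod()

print(equity.iloc[-1], lookahead_equity.iloc[-1])
```

The shifted version ends below its starting value, while the look-ahead version ends above it, because it is allowed to be short on exactly the bar that falls.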

In [22]:
def calculate_metrics(price_returns, strategy_returns, signals):
    # Count actual trades when signal changes
    signal_changes = signals[signals != signals.shift(1)]
    total_trades = len(signal_changes) - 1  # Subtract 1 to exclude the first signal
    
    # Only count returns when we actually have trades
    trade_returns = strategy_returns[signals != 0]  # Only consider returns when we have a position
    
    winning_trades = len(trade_returns[trade_returns > 0])
    losing_trades = len(trade_returns[trade_returns < 0])
    
    # Win rate should be winning_trades / (winning_trades + losing_trades)
    win_rate = winning_trades / (winning_trades + losing_trades) if (winning_trades + losing_trades) > 0 else 0
    
    returns_std = strategy_returns.std()
    # Annualize using 365 * 24 hourly bars per year (crypto trades around the clock)
    sharpe_ratio = (np.sqrt(365 * 24) * strategy_returns.mean() / 
                   returns_std if returns_std != 0 else 0)
    
    return {
        'Total Trades': total_trades,
        'Win Rate': win_rate,
        'Sharpe Ratio': sharpe_ratio,
        'Final Return': calculate_cumulative_returns(strategy_returns).iloc[-1] - 1,
        'Market Cumulative Return': calculate_cumulative_returns(price_returns).iloc[-1] - 1
    }
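One detail worth checking in `calculate_metrics`: `strategy_returns` is driven by `signals.shift(1)`, but the win-rate mask uses the unshifted `signals`, so the two are offset by one bar. The sketch below shows a win rate aligned with the position actually held; `aligned_win_rate` is a hypothetical helper, it counts winning bars rather than round-trip trades, and the numbers are toy values.

```python
import pandas as pd

def aligned_win_rate(strategy_returns, signals):
    """Share of in-market bars with a positive return, using the held position."""
    position = signals.shift(1).fillna(0)   # position held during each bar
    active = strategy_returns[position != 0]
    wins = (active > 0).sum()
    losses = (active < 0).sum()
    return wins / (wins + losses) if (wins + losses) > 0 else 0.0

returns = pd.Series([0.00, 0.02, -0.01, 0.03, 0.01])
signals = pd.Series([1, 0, -1, 0, 0])        # isolated one-bar signals
strategy_returns = returns * signals.shift(1)

# Mask on the unshifted signal: only bars with a zero or missing strategy return
unshifted = strategy_returns[signals != 0]
print((unshifted > 0).sum(), (unshifted < 0).sum())

print(aligned_win_rate(strategy_returns, signals))
```

With isolated one-bar signals, the unshifted mask selects only bars where the strategy return is zero or missing, so it reports no wins and no losses even though the strategy did trade.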

In [23]:

def implement_momentum_strategy(df, strategy_type, lookback_period=24, rsi_period=14):
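    """
    Implement the selected trading strategy and compute its metrics
    
    Parameters:
    df: DataFrame with OHLC data and a `data_quality` column
    strategy_type: 'mean_reversion' (SMA + RSI signals) or 'bollinger_bands'
    lookback_period: Period for SMA calculation (default: 24 hours)
    rsi_period: Period for RSI calculation (default: 14 hours)
    """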
    # Filter for valid data
    df = df[df['data_quality'] == 'pass'].copy()
    
    # Calculate indicators
    df['returns'] = calculate_returns(df['close'])
    df['SMA'] = calculate_sma(df['close'], lookback_period)
    df['RSI'] = calculate_rsi(df['close'], rsi_period)
    
    # Generate signals
    if strategy_type == 'mean_reversion':
        df['signal'] = generate_mean_reversion_signals(df, df['SMA'], df['RSI'])
    elif strategy_type == 'bollinger_bands':
        df['signal'] = generate_bollinger_signals(df, lookback_period, num_std=2)
    else:
        raise ValueError(f"Unknown strategy_type: {strategy_type!r}")
    
    # Calculate returns
    # `returns` is a fractional price change (0.01 = 1%), so strategy returns are in the same units
    df['strategy_returns'] = calculate_strategy_returns(df['returns'], df['signal'])
    
    # Calculate cumulative returns
    df['market_cumulative_returns'] = calculate_cumulative_returns(df['returns'])
    df['strategy_cumulative_returns'] = calculate_cumulative_returns(df['strategy_returns'])
    
    # Calculate metrics
    metrics = calculate_metrics(df['returns'], df['strategy_returns'], df['signal'])
    
    return df, metrics

In [24]:
def run_strategy(df, strategy_type='mean_reversion', lookback_period=24, rsi_period=14):
    """
    Run the momentum strategy and print results
    """
    df, metrics = implement_momentum_strategy(df, strategy_type, lookback_period, rsi_period)
    
    print("\nStrategy Metrics:")
    for key, value in metrics.items():
        print(f"{key}: {value:.4f}")
    
    return df, metrics

In [25]:
run_strategy(df)


Strategy Metrics:
Total Trades: 495.0000
Win Rate: 0.4794
Sharpe Ratio: -1.1615
Final Return: -0.3030
Market Cumulative Return: 0.5755


(                            open time      open      high       low     close  \
 open_timestamp                                                                  
 2024-08-09 00:00:00     1723161600000  61686.00  61744.37  61086.95  61214.00   
 2024-08-09 01:00:00     1723165200000  61214.01  61550.08  60720.00  61400.00   
 2024-08-09 02:00:00     1723168800000  61400.00  61461.03  61147.88  61235.51   
 2024-08-09 03:00:00     1723172400000  61235.51  61573.27  61171.86  61358.39   
 2024-08-09 04:00:00     1723176000000  61358.39  61390.70  60720.00  60720.01   
 ...                               ...       ...       ...       ...       ...   
 2025-02-08 19:00:00  1739041200000000  96512.00  96750.00  96399.27  96579.99   
 2025-02-08 20:00:00  1739044800000000  96579.99  96638.89  96358.50  96455.92   
 2025-02-08 21:00:00  1739048400000000  96455.92  96600.00  96360.00  96527.36   
 2025-02-08 22:00:00  1739052000000000  96527.37  96712.00  96388.88  96473.90   
 2025-02-08 23:0

## Test various lookback and rsi periods
As we can see, the default lookback and RSI periods fared very badly: the strategy lost about 30% while the market gained about 58%. Let's try to improve on this by experimenting with different lookback and RSI periods.

In [26]:
def analyze_different_periods(df, strategy_type='mean_reversion', lookback_periods=(6, 12, 24), rsi_periods=(14, 21, 28)):
    """
    Analyze strategy performance with different combinations of periods
    """
    results = []
    
    for lookback in lookback_periods:
        for rsi in rsi_periods:
            _, metrics = implement_momentum_strategy(df, strategy_type, lookback, rsi)
            results.append({
                'Lookback Period': lookback,
                'RSI Period': rsi,
                'Final Return': metrics['Final Return'],
                'Market Cumulative Return': metrics['Market Cumulative Return'],
                'Sharpe Ratio': metrics['Sharpe Ratio'],
                'Win Rate': metrics['Win Rate']
            })
    
    return pd.DataFrame(results)

In [31]:
def find_optimal_periods(df, strategy_type, lookback_periods, rsi_periods):
    """
    Test different period combinations and find the best performing ones
    """ 
    results_df = analyze_different_periods(df, strategy_type, lookback_periods, rsi_periods)
    
    # Sort by Sharpe Ratio (or Final Return, depending on your preference)
    results_df = results_df.sort_values('Sharpe Ratio', ascending=False)
    
    print("\nTop 5 Period Combinations:")
    print(results_df.head())
    
    return results_df

In [None]:
lookback_periods = list(range(2, 25, 2))
rsi_periods = list(range(2, 25, 2))

In [34]:
strategy_type = 'mean_reversion'

In [35]:
experiment_df = find_optimal_periods(df, strategy_type, lookback_periods, rsi_periods)


Top 5 Period Combinations:
     Lookback Period  RSI Period  Final Return  Market Cumulative Return  \
135               24           8     -0.279785                  0.575534   
138               24          14     -0.302963                  0.575534   
143               24          24     -0.303752                  0.575534   
142               24          22     -0.304893                  0.575534   
141               24          20     -0.304893                  0.575534   

     Sharpe Ratio  Win Rate  
135     -1.043018  0.482207  
138     -1.161521  0.479377  
143     -1.165886  0.479615  
142     -1.172303  0.479396  
141     -1.172303  0.479396  


Ok, we have managed to slightly improve the final return from -0.303 to -0.280, with all of the top five combinations at the largest lookback tested (24). This is obviously still bad, so we shall now look at the AlgoVibes momentum trading strategy implementation and note what we can take away.
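Before moving on, it can help to look at the whole parameter grid rather than only the top five rows. A hedged sketch: pivot the results so that each lookback period is a row and each RSI period is a column. The numbers below are placeholders for illustration, not results from this notebook.

```python
import pandas as pd

# Placeholder values standing in for the frame returned by find_optimal_periods
toy_results = pd.DataFrame({
    "Lookback Period": [12, 12, 24, 24],
    "RSI Period": [8, 14, 8, 14],
    "Sharpe Ratio": [-2.0, -2.5, -1.0, -1.2],
})

sharpe_grid = toy_results.pivot(
    index="Lookback Period", columns="RSI Period", values="Sharpe Ratio"
)
print(sharpe_grid)
```

Applied to `experiment_df`, the same pivot would show whether the Sharpe ratio keeps improving toward the edge of the grid, which would suggest testing lookbacks longer than 24.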

## Bollinger Band Mean Reversion Strategy
To keep the code organized, the Bollinger Band mean reversion strategy is implemented in the functions above (`generate_bollinger_signals` and the `bollinger_bands` branch of `implement_momentum_strategy`). This can be a little confusing when reading from top to bottom; the order of the experiments is the simple SMA and RSI strategy first, then the Bollinger Band mean reversion strategy.

In [36]:
strategy_type = 'bollinger_bands'

In [37]:
results_df = find_optimal_periods(df, strategy_type, lookback_periods, rsi_periods)


Top 5 Period Combinations:
    Lookback Period  RSI Period  Final Return  Market Cumulative Return  \
29                6          12      0.029542                  0.575534   
24                6           2      0.029542                  0.575534   
34                6          22      0.029542                  0.575534   
33                6          20      0.029542                  0.575534   
32                6          18      0.029542                  0.575534   

    Sharpe Ratio  Win Rate  
29      1.015062       0.0  
24      1.015062       0.0  
34      1.015062       0.0  
33      1.015062       0.0  
32      1.015062       0.0  


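Surprise! With Bollinger Bands we get a positive return of about 3%, assuming no slippage or transaction costs. The top rows are identical because the Bollinger signals do not use RSI, so the RSI period has no effect. The win rate of 0.0 alongside a positive return is suspicious: `calculate_metrics` masks returns with the unshifted signal, while `strategy_returns` is driven by the signal shifted by one bar, so that metric should not be trusted here. Unfortunately, the strategy still performs far worse than the market's cumulative return of about 58%. Let's look at how to factor in slippage and transaction costs.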
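One simple way to account for costs is to charge a proportional fee every time the position changes. The sketch below assumes a flat 0.1% per unit of position change, with slippage lumped into the same rate; both the rate and the helper name `apply_transaction_costs` are assumptions, not figures from any exchange.

```python
import pandas as pd

def apply_transaction_costs(strategy_returns, signals, cost_per_unit=0.001):
    """Subtract a proportional cost whenever the held position changes.

    Flipping from long (+1) to short (-1) counts as two units of turnover.
    """
    position = signals.shift(1).fillna(0)        # position held during each bar
    turnover = position.diff().abs().fillna(0)   # size of each position change
    return strategy_returns - turnover * cost_per_unit

returns = pd.Series([0.0, 0.01, 0.02, -0.01, 0.03])
signals = pd.Series([1, 1, -1, 0, 0])
gross = returns * signals.shift(1)
net = apply_transaction_costs(gross, signals)
print(pd.DataFrame({"gross": gross, "net": net}))
```

Feeding `net` into `calculate_cumulative_returns` would give a cost-adjusted equity curve that can be compared against the gross one.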