# Strategies and Panel Data

When working with panel data, we have two options.

1. We can apply a strategy to all stocks in the panel data.

1. We can subset a particular stock, effectively obtaining time series data for that single asset. From there we can use the same methods as in the previous notebook. Working with a single series also makes parameter optimisation straightforward, as we'll see at the end of this notebook.

Let's start with the *one strategy for all stocks* approach, applying the SMA crossover strategy.

First, imports and preparing the dataframe.

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

plt.style.use("ggplot")

In [2]:
df = pd.read_csv("data/top_six_2020_2025.csv")
df.DlyCalDt = pd.to_datetime(df.DlyCalDt, dayfirst=True)

### Exercise: Recycling Code

Whenever we're going to repeat an operation, it is usually a good idea (and sometimes necessary) to write our own function. The lines of code below are correct, but out of order. Rearrange them into a function called `calculate_sma`, which takes three inputs:
* a data frame that contains a `DlyClose` column
* a fast window size
* a slow window size

The function will add as features:
* slow and fast moving averages based on the `DlyClose` price
* buy/sell signals
* a position

In [None]:
df

# dfi["FastMA"] = dfi.DlyClose.rolling(window=fast_window).mean() ###

# dfi["Position"] = dfi.Signal.shift() ###

# dfi.Signal = np.where(dfi.SlowMA.isna(), 0, dfi.Signal) ###

# return dfi ##

# dfi["Signal"] = np.where(dfi.FastMA > dfi.SlowMA, 1, -1) ###

# dfi["SlowMA"] = dfi.DlyClose.rolling(window=slow_window).mean() ###

# def calculate_sma(dfi, fast_window, slow_window): ###

Unnamed: 0,DlyCalDt,Ticker,DlyClose,DlyHigh,DlyLow,DlyOpen,DlyVolume
0,2020-01-02,AAPL,72.620834,72.681281,71.373211,71.627084,135480400
1,2020-01-03,AAPL,71.914818,72.676447,71.689957,71.847118,146322800
2,2020-01-06,AAPL,72.487839,72.526526,70.783241,71.034702,118387200
3,2020-01-07,AAPL,72.146927,72.753808,71.926900,72.497514,108872000
4,2020-01-08,AAPL,73.307503,73.609737,71.849525,71.849525,132079200
...,...,...,...,...,...,...,...
7537,2024-12-23,NVDA,139.657150,139.777134,135.107566,136.267463,176053500
7538,2024-12-24,NVDA,140.207108,141.886946,138.637245,139.987127,105157000
7539,2024-12-26,NVDA,139.917130,140.837058,137.717335,139.687155,116205600
7540,2024-12-27,NVDA,136.997391,139.007216,134.697615,138.537258,170582600


In [4]:
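def calculate_sma(dfi, fast_window, slow_window):
    # Fast and slow simple moving averages of the closing price
    dfi["FastMA"] = dfi.DlyClose.rolling(window=fast_window).mean()
    dfi["SlowMA"] = dfi.DlyClose.rolling(window=slow_window).mean()

    # Signal: +1 when the fast MA is above the slow MA, otherwise -1
    dfi["Signal"] = np.where(dfi.FastMA > dfi.SlowMA, 1, -1)
    # No signal until the slow MA has enough data to be defined
    dfi.Signal = np.where(dfi.SlowMA.isna(), 0, dfi.Signal)

    # Position lags the signal by one row, so we act on the day after the signal
    dfi["Position"] = dfi.Signal.shift()

    # Note: dfi is modified in place and also returned
    return dfi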

calculate_sma(df, 5, 10)

Unnamed: 0,DlyCalDt,Ticker,DlyClose,DlyHigh,DlyLow,DlyOpen,DlyVolume,FastMA,SlowMA,Signal,Position
0,2020-01-02,AAPL,72.620834,72.681281,71.373211,71.627084,135480400,,,0,
1,2020-01-03,AAPL,71.914818,72.676447,71.689957,71.847118,146322800,,,0,0.0
2,2020-01-06,AAPL,72.487839,72.526526,70.783241,71.034702,118387200,,,0,0.0
3,2020-01-07,AAPL,72.146927,72.753808,71.926900,72.497514,108872000,,,0,0.0
4,2020-01-08,AAPL,73.307503,73.609737,71.849525,71.849525,132079200,72.495584,,0,0.0
...,...,...,...,...,...,...,...,...,...,...,...
7537,2024-12-23,NVDA,139.657150,139.777134,135.107566,136.267463,176053500,132.857776,134.219652,-1,-1.0
7538,2024-12-24,NVDA,140.207108,141.886946,138.637245,139.987127,105157000,134.823596,134.734604,1,-1.0
7539,2024-12-26,NVDA,139.917130,140.837058,137.717335,139.687155,116205600,137.027393,134.796599,1,1.0
7540,2024-12-27,NVDA,136.997391,139.007216,134.697615,138.537258,170582600,138.293277,134.763602,1,1.0


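With the function done, we can apply it to our panel data. Calling it on the whole data frame, as in the cell above, lets the rolling windows run across ticker boundaries, so the first rows of each stock's moving averages mix in prices from the previous stock. Instead, we apply it groupwise by iterating over the groups.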

In [None]:
FAST, SLOW = 50, 200

grouped = df.groupby("Ticker")  # group the panel by ticker

groups = []  # will hold one data frame of signals per ticker

# Iterating over a groupby yields (group name, group data frame) pairs
for name, group in grouped:
    signals = calculate_sma(group.copy(), FAST, SLOW)
    groups.append(signals)

df = pd.concat(groups)  # stack the per-ticker frames back into one panel
df

Unnamed: 0,DlyCalDt,Ticker,DlyClose,DlyHigh,DlyLow,DlyOpen,DlyVolume,FastMA,SlowMA,Signal,Position
0,2020-01-02,AAPL,72.620834,72.681281,71.373211,71.627084,135480400,,,0,
1,2020-01-03,AAPL,71.914818,72.676447,71.689957,71.847118,146322800,,,0,0.0
2,2020-01-06,AAPL,72.487839,72.526526,70.783241,71.034702,118387200,,,0,0.0
3,2020-01-07,AAPL,72.146927,72.753808,71.926900,72.497514,108872000,,,0,0.0
4,2020-01-08,AAPL,73.307503,73.609737,71.849525,71.849525,132079200,,,0,0.0
...,...,...,...,...,...,...,...,...,...,...,...
7537,2024-12-23,NVDA,139.657150,139.777134,135.107566,136.267463,176053500,139.697961,116.614786,1,1.0
7538,2024-12-24,NVDA,140.207108,141.886946,138.637245,139.987127,105157000,139.741147,116.887092,1,1.0
7539,2024-12-26,NVDA,139.917130,140.837058,137.717335,139.687155,116205600,139.907913,117.127264,1,1.0
7540,2024-12-27,NVDA,136.997391,139.007216,134.697615,138.537258,170582600,139.933897,117.357960,1,1.0
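
As an alternative to an explicit loop, a groupwise rolling mean can also be written with `groupby(...).transform`. The sketch below is only an illustration on a made-up two-ticker panel (the tickers `AAA` and `BBB` and their prices are hypothetical); it also shows how a rolling mean over the ungrouped column would leak prices from one ticker into the next.

```python
import numpy as np
import pandas as pd

# Hypothetical toy panel: two tickers, four prices each
panel = pd.DataFrame({
    "Ticker": ["AAA"] * 4 + ["BBB"] * 4,
    "DlyClose": [1.0, 2.0, 3.0, 4.0, 10.0, 20.0, 30.0, 40.0],
})

# Naive rolling mean over the whole column: the window spans both tickers
naive = panel.DlyClose.rolling(window=2).mean()

# Groupwise rolling mean: the window restarts at each ticker
panel["FastMA"] = (
    panel.groupby("Ticker").DlyClose
    .transform(lambda s: s.rolling(window=2).mean())
)

print(naive.iloc[4])         # 7.0, the average of AAA's last and BBB's first price
print(panel.FastMA.iloc[4])  # nan, BBB does not yet have two prices of its own
```

The loop in the cell above has the advantage of reusing `calculate_sma` unchanged; `transform` is more compact when only one new column is needed.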


Then it's a matter of calculating the cumulative returns as we've done before... nearly. Remember that we have grouped data here, so both the daily returns and the cumulative products must be calculated per ticker!

In [9]:
df["Returns"] = df.groupby("Ticker").DlyClose.pct_change()
df["Strategy"] = df.Returns * df.Signal

df["BuyHold"] = (1 + df.Returns).groupby(df.Ticker).cumprod() - 1 
df["MACS"] = (1 + df.Strategy).groupby(df.Ticker).cumprod() - 1 

df.tail()

Unnamed: 0,DlyCalDt,Ticker,DlyClose,DlyHigh,DlyLow,DlyOpen,DlyVolume,FastMA,SlowMA,Signal,Position,Returns,Strategy,BuyHold,MACS
7537,2024-12-23,NVDA,139.65715,139.777134,135.107566,136.267463,176053500,139.697961,116.614786,1,1.0,0.036897,0.036897,22.384692,8.834289
7538,2024-12-24,NVDA,140.207108,141.886946,138.637245,139.987127,105157000,139.741147,116.887092,1,1.0,0.003938,0.003938,22.476778,8.873015
7539,2024-12-26,NVDA,139.91713,140.837058,137.717335,139.687155,116205600,139.907913,117.127264,1,1.0,-0.002068,-0.002068,22.428223,8.852596
7540,2024-12-27,NVDA,136.997391,139.007216,134.697615,138.537258,170582600,139.933897,117.35796,1,1.0,-0.020868,-0.020868,21.939332,8.646995
7541,2024-12-30,NVDA,137.477356,140.257099,134.007674,134.817597,167734700,139.945285,117.605771,1,1.0,0.003503,0.003503,22.019699,8.680793
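
To see what the grouping does, here is a minimal sketch on a made-up panel (the tickers, prices and positions are hypothetical). It multiplies returns by the lagged `Position` column rather than the same-day `Signal`, which is one common way to make sure a return is only earned after the signal that triggered the trade; treat that choice as an assumption of the sketch.

```python
import pandas as pd

# Hypothetical toy panel with an already lagged Position column
toy = pd.DataFrame({
    "Ticker": ["AAA"] * 3 + ["BBB"] * 3,
    "DlyClose": [100.0, 110.0, 99.0, 50.0, 55.0, 66.0],
    "Position": [0.0, 1.0, -1.0, 0.0, 1.0, 1.0],
})

# Daily returns per ticker: the first row of each ticker is NaN,
# not a return measured against the previous ticker's last price
toy["Returns"] = toy.groupby("Ticker").DlyClose.pct_change()
toy["Strategy"] = toy.Returns * toy.Position

# Cumulative returns per ticker: the product restarts for each group
toy["BuyHold"] = (1 + toy.Returns).groupby(toy.Ticker).cumprod() - 1
toy["MACS"] = (1 + toy.Strategy).groupby(toy.Ticker).cumprod() - 1

print(toy)
```

In this toy data `AAA` falls on its last day while the position is short, so its strategy return ends above its buy-and-hold return.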


### Exercise: Data Display

Can you produce a data frame with the end-of-period market and strategy cumulative return for each stock?

In [152]:
## YOUR CODE GOES HERE

## Single Stock

Let's subset META (Facebook) and use that for the rest of the notebook.
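A minimal sketch of the subsetting step, shown on a made-up panel so it runs on its own (the prices are hypothetical). The names `meta` and `MarketDaily` match the ones used in the EMA code later in this notebook; the `Market` column name is an assumption made for this sketch.

```python
import pandas as pd

# Hypothetical toy panel standing in for df
panel = pd.DataFrame({
    "DlyCalDt": pd.to_datetime(["2020-01-02", "2020-01-03", "2020-01-06"] * 2),
    "Ticker": ["AAPL"] * 3 + ["META"] * 3,
    "DlyClose": [72.0, 71.0, 72.5, 200.0, 210.0, 189.0],
})

# Keep only the META rows; .copy() gives an independent frame we can add columns to
meta = panel[panel.Ticker == "META"].copy()

# With a single stock there is no need to group before computing returns
meta["MarketDaily"] = meta.DlyClose.pct_change()
meta["Market"] = (1 + meta.MarketDaily).cumprod() - 1

print(meta)
```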

### Exercise: BB Breakout

Implement a Bollinger Bands Breakout strategy and backtest it on META.

- First create high and low Bollinger Bands as usual
- Then generate signals as follows:
  - Buy (+1) when the close price crosses above the upper band
    - Hold the position until a signal change
  - Sell (-1) when the close price crosses under the lower band
    - Hold the position until a signal change
- Generate positions
- Calculate the strategy returns and cumulative strategy returns
- Report and plot the cumulative strategy and market returns

**HINT:** You can create an empty column by assigning it `np.nan`.
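
The "hold until a signal change" idea can be sketched on made-up numbers as follows. This is not the full exercise: the bands are fixed, hypothetical values rather than real Bollinger Bands, and the sketch marks every row where the close is beyond a band, which in this toy data gives the same held signal as marking only the crossing days.

```python
import numpy as np
import pandas as pd

# Hypothetical closes with constant bands, for illustration only
toy = pd.DataFrame({
    "Close": [10, 12, 11, 10, 7, 9, 13],
    "Upper": [11] * 7,
    "Lower": [8] * 7,
})

# Start with an empty column, then fill in only the rows that carry a signal
toy["Signal"] = np.nan
toy.loc[toy.Close > toy.Upper, "Signal"] = 1
toy.loc[toy.Close < toy.Lower, "Signal"] = -1

# Carry the last signal forward; rows before the first signal stay flat (0)
toy["Signal"] = toy.Signal.ffill().fillna(0)

# Lag by one row so a position is only taken after the signal is observed
toy["Position"] = toy.Signal.shift()

print(toy)
```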

In [154]:
## YOUR CODE GOES HERE

## Parameter Optimisation

Parameter optimisation means searching for the parameter values that give a strategy its best backtested performance. The *parameters* in our strategy are the window sizes of the fast SMA and the slow SMA. We used 50 and 200 days, but could try different combinations to see if they give better results.

Common **fast** window sizes include 7, 20, 50 and common **slow** window sizes include 50, 100, 200.

However, beware of overfitting! When a model fits our historical data too closely, it can perform poorly in future real-world scenarios.
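One way to organise such a search is a small grid over the candidate windows. The sketch below is an illustration only: it uses a simulated price series rather than our data, and the helper name `sma_cumulative_return` is made up for this example. Because every combination is scored on the same history, the top row is an in-sample result and says little about future performance.

```python
from itertools import product

import numpy as np
import pandas as pd

# Simulated prices (a seeded random walk), standing in for a real close series
rng = np.random.default_rng(0)
prices = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, 600))))
returns = prices.pct_change()


def sma_cumulative_return(fast, slow):
    """Cumulative return of an SMA crossover with the given window sizes."""
    fast_ma = prices.rolling(window=fast).mean()
    slow_ma = prices.rolling(window=slow).mean()
    signal = pd.Series(np.where(fast_ma > slow_ma, 1, -1), index=prices.index)
    signal = signal.where(slow_ma.notna(), 0)  # flat until the slow MA exists
    position = signal.shift()                  # act on the following day
    return ((1 + position * returns).cumprod() - 1).iloc[-1]


results = []
for fast, slow in product([7, 20, 50], [50, 100, 200]):
    if fast >= slow:  # skip combinations where "fast" is not actually faster
        continue
    results.append({"fast": fast, "slow": slow,
                    "cum_return": sma_cumulative_return(fast, slow)})

results = pd.DataFrame(results).sort_values("cum_return", ascending=False)
print(results)
```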

### Exercise: Which windows?

The code below implements an Exponential Moving Average (EMA) Crossover Strategy. It is similar to our SMA strategy, but calculates the moving average in a way that gives recent prices greater weight.

Unfortunately, the implementation with the window sizes below does not perform well. Modify the code below to perform a parameter optimisation on the strategy. Your code should be able to test 3 fast window sizes and 3 slow window sizes in combination.

Can you find a combination of window sizes that sees this EMA strategy outperform the simple Bollinger Band strategy above?

In [155]:
ema = meta.copy()

## MODIFY THIS CODE

fast_window, slow_window = 50, 200

ema["FastEMA"] = ema.DlyClose.ewm(span=fast_window, adjust=False).mean()
ema["SlowEMA"] = ema.DlyClose.ewm(span=slow_window, adjust=False).mean()

ema["Signal"] = np.where(ema.FastEMA > ema.SlowEMA, 1, -1)
ema["Signal"] = np.where(ema.SlowEMA.isna(), 0, ema.Signal)

ema["Position"] = ema.Signal.shift()
ema["StratRet"] = ema.Position * ema.MarketDaily
ema["Strategy"] = (1 + ema.StratRet).cumprod() - 1

print("Short window:", fast_window, "Long window:", slow_window)
print("Cumulative Strategy Return is", ema.Strategy.iloc[-1])