In [1]:
import pandas as pd
import numpy as np

## Open Gap Signal Trading

### Introduction

**This strategy chases momentum in a financial market where new information is priced in gradually rather than all at once.**
For example, after a strong earnings report, a stock can open well above the previous trading day's closing price. We observe that this upward move tends to continue through the rest of that trading day, which gives traders an opportunity to profit by buying at the open.

In this project, the trading universe is 100 stocks from the S&P 100 index (excluding META due to data quality issues), from 2010-01-04 to 2022-12-30.

Further research is performed on the **optimal lookback window, optimal holding period, optimal portfolio construction, max profit/loss threshold, inverse signals**, etc.

### Methodology
**Formally, the strategy generates a buy signal for a stock when the gap between its open price on trading day $t$ and its close price on trading day $t-1$ is significant relative to the gaps observed over a fixed-length lookback window.** A gap is "significant" in the following sense.<br><br>

*Consider a fixed-length lookback window of $W = 50$ trading days. The open gap $gap_{t,i}$ for stock $i$ is defined as<br><br>
$$
gap_{t,i} = S_{open,t,i} - S_{close,t-1,i},
$$<br>
where $S_{open,t,i}$ is the open price of stock $i$ on trading day $t$, and $S_{close,t-1,i}$ is the close price of stock $i$ on trading day $t-1$.<br><br>
For stock $i$, we obtain a time series $G_{i} = \{gap_{s-49,i}, gap_{s-48,i}, gap_{s-47,i}, \dots, gap_{s,i}\}$, where $s$ is the last trading day in the lookback window $W$.
For a significance level of 90%, $gap_{s,i}$ is considered significant if and only if it is at or above the 90th percentile of $G_{i}$. In that case, the strategy generates a buy signal for stock $i$ at the open of trading day $s$. Since $gap_{s,i}$ is known as soon as the market opens on trading day $s$, the signal has no look-ahead bias.*<br>
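The rule above can be sketched in pandas as follows. This is a minimal illustration, not necessarily how the signal files used in this notebook were produced: the function name `open_gap_signals`, the ticker `XYZ`, and the toy prices are hypothetical, and the percentile is computed with pandas' default linear interpolation, which is an assumption.

```python
import pandas as pd

def open_gap_signals(open_px, close_px, window=50, level=0.90):
    """Return 1.0 where the open gap on day t is at or above the `level`
    quantile of the last `window` gaps (day t included), else 0.0.
    Rows without a full window of gaps are left as NaN."""
    # gap_t = open_t - close_{t-1}
    gap = open_px - close_px.shift(1)
    # Rolling quantile over the window ending at day t; only data known
    # at the open of day t is used
    threshold = gap.rolling(window).quantile(level)
    signal = (gap >= threshold).astype(float)
    return signal.where(threshold.notna())

# Toy example with hypothetical prices and a short 5-day window
dates = pd.bdate_range("2010-01-04", periods=7)
close_px = pd.DataFrame({"XYZ": [100.0] * 7}, index=dates)
open_px = pd.DataFrame(
    {"XYZ": [100.0, 101.0, 102.0, 103.0, 104.0, 105.0, 100.0]}, index=dates
)
demo_signal = open_gap_signals(open_px, close_px, window=5, level=0.90)
print(demo_signal)
```

Because the first gap is undefined (there is no prior close), the first signal appears one row later than the window length alone would suggest.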

### Signal Generation

In [5]:
# Load precomputed signals; the file name suggests a 100-day rolling window and a 90% level
signals = pd.read_csv('./signals/Open Gap Signals/SP 100/sp100_signal_100rolling_90lvl.csv')

In [6]:
# Drop META, which is excluded from the universe due to data quality issues
selected = [col for col in signals.columns if col != 'META']
signals = signals[selected]

signals.head()

Unnamed: 0,Date,AAPL,ABBV,ABT,ACN,ADBE,AIG,AMD,AMGN,AMT,...,UNH,UNP,UPS,USB,V,VZ,WBA,WFC,WMT,XOM
0,2010-05-27,1.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,...,1.0,0.0,1.0,0.0,1.0,1.0,0.0,1.0,1.0,1.0
1,2010-05-28,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,2010-06-01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
3,2010-06-02,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0
4,2010-06-03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0
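A signal matrix of this shape can be summarized by how often signals fire and how far apart they are. The sketch below is one possible way to do that, assuming a `Date` column plus one 0/1 column per ticker; the helper names `signal_counts` and `mean_rows_between_signals` are hypothetical. The demo reuses the AAPL and WMT values from the five rows shown above.

```python
import numpy as np
import pandas as pd

def signal_counts(signals, date_col="Date"):
    """Count signals per ticker by calendar month and by year."""
    dates = pd.to_datetime(signals[date_col])
    sig = signals.drop(columns=[date_col])
    # Ignore leftover index columns such as 'Unnamed: 0'
    sig = sig.loc[:, ~sig.columns.str.startswith("Unnamed")]
    sig.index = pd.DatetimeIndex(dates)
    monthly = sig.groupby(sig.index.to_period("M")).sum()
    yearly = sig.groupby(sig.index.year).sum()
    return monthly, yearly

def mean_rows_between_signals(signals, date_col="Date"):
    """Average number of rows (trading days) between consecutive signals."""
    sig = signals.drop(columns=[date_col])
    sig = sig.loc[:, ~sig.columns.str.startswith("Unnamed")]
    out = {}
    for ticker in sig.columns:
        pos = np.flatnonzero(sig[ticker].to_numpy() == 1)
        out[ticker] = np.diff(pos).mean() if len(pos) > 1 else np.nan
    return pd.Series(out)

# First five rows of the signal file, AAPL and WMT columns only
demo = pd.DataFrame({
    "Date": ["2010-05-27", "2010-05-28", "2010-06-01", "2010-06-02", "2010-06-03"],
    "AAPL": [1.0, 1.0, 0.0, 1.0, 0.0],
    "WMT": [1.0, 0.0, 1.0, 1.0, 0.0],
})
monthly, yearly = signal_counts(demo)
gaps = mean_rows_between_signals(demo)
print(monthly)
print(gaps)
```

Spacing is measured in rows rather than calendar days, so weekends and holidays do not inflate the gap between signals.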


In [None]:
# WIP 
# number of trades every month/year
# avg time of break between trades

In [None]:
strat_pnl_summary = strat_pnl.groupby(['year','Time']).describe().droplevel(0,axis=1)[['mean','std','min','max']].unstack()

In [None]:
strat_pnl_summary['mean'].T.plot().legend(bbox_to_anchor=(1, 1))

In [None]:
strat_pnl_summary['std'].T.plot().legend(bbox_to_anchor=(1, 1))

In [None]:
(strat_pnl_summary['mean']/strat_pnl_summary['std']).T.plot().legend(bbox_to_anchor=(1, 1))

In [None]:
strat_pnl.groupby(['year','Time'])['Open_to_Date_Return'].count().groupby('year').mean().round()

In [None]:
strat_pnl.groupby(['Day','Time'])['Open_to_Date_Return'].mean().unstack()['15:45'].plot()

In [None]:
# Share of days with a positive average open-to-time return, per intraday timestamp
times = ['09:30', '09:45', '10:00', '10:15', '10:30', '10:45', '11:00', '11:15', '11:30', '11:45', '12:00', '12:15', '12:30', '12:45', '13:00', '13:15',
         '13:30', '13:45', '14:00', '14:15', '14:30', '14:45', '15:00', '15:15', '15:30', '15:45']
# Compute the Day x Time table of mean returns once instead of on every iteration
avg_return = strat_pnl.groupby(['Day', 'Time'])['Open_to_Date_Return'].mean().unstack()
for t in times:
    print(t, (avg_return[t] > 0).mean())