# Seasonal Trading Plan Project
### This project seeks to figure out which stocks consistently go up over 30, 60 or 90 day periods, year after year, at least 80% of the time. It turns out there are many of them.

In this notebook, I created a process to download full historical EOD price data on each of the S&P 500 index components and analyze the historical patterns to find situations where seasonal trends can be taken advantage of over the course of a year.

For each stock, I calculated the following statistics:

Holding Period:  The number of days held in each rolling period - 30, 60 and 90 days each.

% Up Rows:  The percentage of rolling periods where, each year, the stock went up at least 80% of the time.

Max Up Return: The highest return for each rolling period where the return was positive.

Min Up Return: The lowest return for each rolling period where the return was positive.

Avg Up Return:  The average return for each rolling period where the return was positive.

Avg Down Return:  The average return for each rolling period where the return was negative.

Expected Return: The average return for all rolling periods.

StDev Returns: The average standard deviation of returns for all rolling periods. This means that the expected return should be within this +/- range from the average 67% of the time.

% Downside:  The expected average return minus the standard deviation. This means that you should at least exceed this return 67% of the time. This could provide guidance for setting stop-loss levels.

Worst Return:  The worst return of all rolling periods.

Least Pain Pt:  The time interval that offers the best case 67% downside within the sweet spot period. This is used to pinpoint the lowest risk time frame.

Best Begin Date:  The recommended entry date for best risk/reward scenario.

Best End Date:  The recommended exit date for best risk/reward scenario.

Max Consec 80%+:  The longest consecutive number of rolling periods above 80% up years. You want to see more that a few consecutive up years over the threshold to provide assurance that the trend is robust.

Total Years:  The number of years of historical data available for each stock. I filtered for stocks with 10+ years of history.

In [1]:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
import yfinance as yf
from pandas_datareader import data as pdr

In [2]:
# This is the path to where I store this notebook and downloaded files. You need to change this to a convenient
# spot on your own hard drive.

my_path = '/Users/bnsheehy/Documents/Investments/Seasonal_Analyses/Python'
threshold = 0.80

In [3]:
# You need to go to Yahoo and download a list of the S&P 500 components. Make sure to save it to
# a CSV file with column headers that include "Symbol", "Date" and "Close" 
file_sp500_tickers = my_path + "/SP500_TickerList.csv"

In [4]:
# Upload a list of the S&P 500 components downloaded from Yahoo.

df_sp500_tickers = pd.read_csv (file_sp500_tickers)
df_sp500_tickers.head()

Unnamed: 0,Symbol,Company Name,GICS Sector,GICS Sub Industry
0,A,Agilent Technologies Inc,Health Care,Health Care Equipment
1,AAL,American Airlines Group,Industrials,Airlines
2,AAP,Advance Auto Parts,Consumer Discretionary,Automotive Retail
3,AAPL,Apple Inc.,Information Technology,"Technology Hardware, Storage & Peripherals"
4,ABBV,Rickers,Health Care,Pharmaceuticals


In [24]:
# This module loops through the S&P 500 tickers, downloads the data from Yahoo and creates a separate CSV 
# file of historical data for each ticker (e.g. AAPL.csv).
# Skip this routine if you already have the CSV files available.

for index, ticker in df_sp500_tickers.iterrows():
    global df
    
    my_ticker = ticker['Symbol']
    
    yf_ticker = yf.Ticker(my_ticker)
    data = yf_ticker.history(period="max")
    df = pd.DataFrame(data)
    df.reset_index(level=0, inplace=True)
    df['Symbol'] = my_ticker
    df = df[['Symbol','Date','Close']]
    df.drop_duplicates(subset ="Date", keep = 'first', inplace = True) #Yahoo has a tendency to duplicate the last row.
    df.to_csv(path_or_buf = my_path + "/data/" + my_ticker +".csv", index=False)
    

In [25]:
# Creates the dataframe container for the stats data.

df_tradelist = pd.DataFrame(index=[], columns=['my_ticker', 'hold_per', 'pct_uprows', 'max_up_return', 'min_up_return', 'avg_up_return', 'avg_down_return', 'exp_return', 'stdev_returns', 'worst_return', 'pct_downside', 'least_pain_pt', 'total_years', 'max_consec_beat', 'best_buy_date', 'best_sell_date'])

df_tradelist.head()

Unnamed: 0,my_ticker,hold_per,pct_uprows,max_up_return,min_up_return,avg_up_return,avg_down_return,exp_return,stdev_returns,worst_return,pct_downside,least_pain_pt,total_years,max_consec_beat,best_buy_date,best_sell_date


In [26]:
# Convert prices to 1 month returns based on 20 trading days per month.

def convert_prices_to_periods():
    
    global dperiods
    global dfr
        
    dfr = df.pct_change(periods = dperiods)
    dfr.reset_index(level=0, inplace=True)
    dfr.rename(columns={'Close':'Returns'}, inplace=True)
    dfr = dfr.round(4)

In [27]:
# Separate out the date column into separate month, year and day values.

def separate_date_column():
    
    global dfr
    
    dfr['Month'] = pd.DatetimeIndex(dfr['Date']).month
    dfr['Day'] = pd.DatetimeIndex(dfr['Date']).day
    dfr['Year'] = pd.DatetimeIndex(dfr['Date']).year
    dfr['M-D'] = dfr['Month'].astype(str)+'-'+dfr['Day'].astype(str)
    pd.set_option('display.max_rows', len(dfr))

In [28]:
# Pivot the table to show years across the top and Month-Day values in the first column on the left.

def pivot_the_table():
    
    global dfr_pivot
    
    dfr_pivot = dfr.pivot(index='M-D', columns='Year', values='Returns')
    dfr_pivot.reset_index(level=0, inplace=True)
    dfr_pivot = pd.DataFrame(dfr_pivot)
    dfr_pivot.columns.name="Index"

    # The pivot operation created empty cells for weekends and holiday, so I filled them with EOD values from
    # the previous trading day.
    dfr_pivot.fillna(method='ffill', inplace=True)
    
    # As of this date, 1/22/2020, we are only evaluating results through 12/31/2019, so we will drop the
    # 2020 year column.
    if 2020 in dfr_pivot.columns:
        dfr_pivot.drop(2020, axis=1, inplace=True)
    

In [29]:
# Add additional calculated columns to facilitate statistic calculations for each stock.

def add_calculated_items():
    
    global dfr_pivot
    
    dfr_pivot['YearCount'] = dfr_pivot.count(axis=1, numeric_only=True)
    dfr_pivot['UpCount'] = dfr_pivot[dfr_pivot.iloc[:,1:len(dfr_pivot.columns)] > 0].count(axis=1)-1
    dfr_pivot['DownCount'] = dfr_pivot[dfr_pivot.iloc[:,1:len(dfr_pivot.columns)] < 0].count(axis=1)
    dfr_pivot['PctUp'] = dfr_pivot['UpCount']/dfr_pivot['YearCount']
    dfr_pivot['PctDown'] = dfr_pivot['DownCount']/dfr_pivot['YearCount']
    dfr_pivot['AvgReturn'] = dfr_pivot.iloc[:,1:len(dfr_pivot.columns)-5].mean(axis=1)
    dfr_pivot['StDevReturns'] = dfr_pivot.iloc[:,1:len(dfr_pivot.columns)-6].std(axis=1)
    dfr_pivot['67PctDownside'] = dfr_pivot['AvgReturn']-dfr_pivot['StDevReturns']
    dfr_pivot['MaxReturn'] = dfr_pivot.iloc[:,1:len(dfr_pivot.columns)-8].max(axis=1)
    dfr_pivot['MinReturn'] = dfr_pivot.iloc[:,1:len(dfr_pivot.columns)-9].min(axis=1)

In [30]:
# Add a fictional date column in Python date/time format so the table can be sorted by date. Then sort by Date.
# Reset the index and round the float values to 4 decimals.

def sortbydate_resetindex_export():
    
    global dfr_pivot
    
    dfr_pivot['Date'] = '2000-' + dfr_pivot['M-D'].astype(str)
    dfr_pivot['Date'] = pd.to_datetime(dfr_pivot['Date'], infer_datetime_format=True)
    dfr_pivot.sort_values(by='Date',ascending=True, inplace=True)
        
    dfr_pivot.reset_index(inplace=True)
    dfr_pivot = dfr_pivot.round(4)


In [31]:
# Calculate the trading statistics for the rolling holding periods for the stock.

def calc_trading_stats():
    
    global lookback
    global dfr_pivot
    global pct_uprows
    global max_up_return
    global min_up_return
    global avg_up_return
    global avg_down_return
    global exp_return
    global stdev_returns
    global pct_downside
    global worst_return
    global least_pain_pt
    global total_years
    global n_consec
    global max_n_consec
    global max_consec_beat
    global best_sell_date
    global best_buy_date
    
    pct_uprows = (dfr_pivot.loc[dfr_pivot['PctUp'] > threshold, 'PctUp'].count() / dfr_pivot.loc[:, 'PctUp'].count()).astype(float).round(4)
    max_up_return = dfr_pivot.loc[dfr_pivot['PctUp'] > threshold, 'MaxReturn'].max()
    min_up_return = dfr_pivot.loc[dfr_pivot['PctUp'] > threshold, 'MaxReturn'].min()
    avg_up_return = dfr_pivot.loc[dfr_pivot['PctUp'] > 0.5, 'AvgReturn'].mean()
    avg_up_return = np.float64(avg_up_return).round(4)
    avg_down_return = dfr_pivot.loc[dfr_pivot['PctDown'] > 0.5, 'AvgReturn'].mean()
    avg_down_return = np.float64(avg_down_return).round(4)
    exp_return = dfr_pivot['AvgReturn'].mean().round(4)
    stdev_returns = dfr_pivot['StDevReturns'].mean()
    stdev_returns = np.float64(stdev_returns).round(4)
    worst_return = dfr_pivot['MinReturn'].min()
    pct_downside = (avg_up_return - stdev_returns)
    pct_downside = np.float64(pct_downside).round(4)
    least_pain_pt = dfr_pivot.loc[dfr_pivot['PctUp'] > threshold, '67PctDownside'].max()
    total_years = dfr_pivot['YearCount'].max()
    
    n_consec = 0
    max_n_consec = 0

    for x in dfr_pivot['PctUp']:
        if (x > threshold):
            n_consec += 1
        else: # check for new max, then start again from 1
            max_n_consec = max(n_consec, max_n_consec)
            n_consec = 1

    max_consec_beat = max_n_consec

    try:
        best_sell_date = dfr_pivot.loc[dfr_pivot['67PctDownside'] == least_pain_pt, 'M-D'].iloc[0]
    except:
        best_sell_date = "nan"

    try:
        row = dfr_pivot.loc[dfr_pivot['M-D'] == best_sell_date, 'M-D'].index[0] - lookback
        col = dfr_pivot.columns.get_loc('M-D')
        best_buy_date = dfr_pivot.iloc[row,col]
    except:
        best_buy_date = "nan"

In [32]:
# If the pct_uprows and history conditions are met, then create the array of stat values and append 
# it to the recommended trade list.

def filter_and_append_stats():
    
    global statsdata
    global df_tradelist
    
    if pct_uprows > 0.1:
        if total_years > 9:
            statsdata = np.array([my_ticker, hold_per, pct_uprows, max_up_return, min_up_return, avg_up_return, avg_down_return, exp_return, stdev_returns, pct_downside, worst_return, least_pain_pt, total_years, max_consec_beat, best_buy_date, best_sell_date])
            df_tradelist = df_tradelist.append(dict(zip(df_tradelist.columns, statsdata)), ignore_index=True)
            

In [33]:
# This module grabs each ticker file, transforms it and calculates the statistics needed for a 90 day holding period.

def calc_3month_returns():
    
    global dfr
    global dfr_pivot
    global df_tradelist
    global threshold
    global hold_per
    global dperiods
    global lookback
    
    dperiods = 60
    hold_per = "3 Mos"
    lookback = 90
    
    convert_prices_to_periods()
    
    separate_date_column()

    pivot_the_table()

    add_calculated_items()

    sortbydate_resetindex_export()
    # Export the pivot table to CSV for further research if desired.
    dfr_pivot.to_csv(path_or_buf = my_path + "/data/" + my_ticker + "_dfr_pivot_3mo.csv", index=False)

    calc_trading_stats()
    
    filter_and_append_stats()



In [34]:
# This module grabs each ticker file, transforms it and calculates the statistics needed for a 60 day holding period.

def calc_2month_returns():
    
    global dfr
    global dfr_pivot
    global df_tradelist
    global threshold
    global hold_per
    global dperiods
    global lookback
    
    dperiods = 40
    hold_per = "2 Mos"
    lookback = 60

    convert_prices_to_periods()
    
    separate_date_column()

    pivot_the_table()

    add_calculated_items()

    sortbydate_resetindex_export()
    # Export the pivot table to CSV for further research if desired.
    dfr_pivot.to_csv(path_or_buf = my_path + "/data/" + my_ticker + "_dfr_pivot_2mo.csv", index=False)

    calc_trading_stats()
    
    filter_and_append_stats()
            

In [35]:
# This module grabs each ticker file, transforms it and calculates the statistics needed for a 30 day holding period.

def calc_1month_returns():
    
    global dfr
    global dfr_pivot
    global df_tradelist
    global threshold
    global hold_per
    global dperiods
    global lookback
    
    dperiods = 20
    hold_per = "1 Mo"
    lookback = 30

    convert_prices_to_periods()
    
    separate_date_column()

    pivot_the_table()

    add_calculated_items()

    sortbydate_resetindex_export()
    # Export the pivot table to CSV for further research if desired.
    dfr_pivot.to_csv(path_or_buf = my_path + "/data/" + my_ticker + "_dfr_pivot_1mo.csv", index=False)

    calc_trading_stats()
    
    filter_and_append_stats()
            

In [36]:

# Read CSV files by ticker, transform and extract stats from each one.

for index, ticker in df_sp500_tickers.iterrows():
    
    global df
    global dfr
    
    my_ticker = ticker['Symbol']

    df = pd.read_csv (my_path + "/data/" + my_ticker + ".csv")
    df.set_index('Date', inplace=True)
    df = df['Close']
    df = pd.DataFrame(df, columns=['Close'])
    
    calc_1month_returns()
    
    calc_2month_returns()
    
    calc_3month_returns()


In [37]:
# Make a copy and convert the trade list to a Pandas dataframe.
df_tradelist_copy = df_tradelist.copy()
df_tradelist = pd.DataFrame(df_tradelist)

In [38]:
#df_tradelist.to_csv(path_or_buf = my_path + "/df_tradelist.csv", index=False)
#df_tradelist_copy.to_csv(path_or_buf = my_path + "/df_tradelist_copy.csv", index=False)

In [39]:
# Clean it up by removing rows with NaN's and infinity values and dropping duplicates.
df_tradelist.replace("inf", np.nan, inplace=True)
df_tradelist.dropna(inplace=True)
df_tradelist = df_tradelist[~df_tradelist.max_up_return.str.contains("nan")]
df_tradelist = df_tradelist[~df_tradelist.avg_down_return.str.contains("nan")]
df_tradelist.sort_values(by=['pct_uprows'], ascending=False)
df_tradelist.drop_duplicates(subset ="my_ticker", keep = 'first', inplace = True) 

In [40]:
# Preview the end of the trade list to see how many rows and make sure the data is showing up where it is supposed to.

df_tradelist.tail(10)

Unnamed: 0,my_ticker,hold_per,pct_uprows,max_up_return,min_up_return,avg_up_return,avg_down_return,exp_return,stdev_returns,worst_return,pct_downside,least_pain_pt,total_years,max_consec_beat,best_buy_date,best_sell_date
48,MSCI,1 Mo,0.1074,0.4515,0.088,0.0476,-0.0323,0.0382,0.1157,-0.0681,-0.3936,0.0026,13,9,12-11,1-12
51,MTD,2 Mos,0.1433,0.5714,0.1808,0.0357,0.0034,0.0335,0.1042,-0.0685,-0.3982,0.0194,22,18,9-30,11-29
55,RSG,2 Mos,0.2424,0.5872,0.1315,0.0356,-0.0493,0.0234,0.099,-0.0634,-0.5492,-0.0069,22,15,3-1,4-30
57,SYK,3 Mos,0.1102,0.7455,0.3282,0.0647,0.0007,0.0616,0.146,-0.0813,-0.5,0.0011,40,18,11-7,2-7
60,TEL,3 Mos,0.1295,1.3839,0.1426,0.0453,0.0018,0.0424,0.1548,-0.1095,-0.6157,0.0247,13,14,11-30,3-1
61,ULTA,3 Mos,0.1928,1.4654,0.2382,0.078,-0.0211,0.0701,0.2041,-0.1261,-0.5625,0.0276,12,22,2-16,5-16
62,V,1 Mo,0.1983,0.339,0.0503,0.0244,-0.0049,0.0213,0.0625,-0.0381,-0.2975,0.0136,12,9,2-4,3-5
65,VRSK,1 Mo,0.1791,0.1499,0.0353,0.0184,-0.0024,0.0151,0.0438,-0.0254,-0.1552,0.0069,11,12,6-6,7-7
68,WCG,1 Mo,0.1433,1.0482,0.1163,0.0408,-0.0129,0.0326,0.136,-0.0952,-0.8006,0.0111,16,7,5-3,6-2
69,YUM,2 Mos,0.1543,0.5174,0.175,0.0368,-0.0075,0.0302,0.0957,-0.0589,-0.3573,0.0246,23,25,3-2,5-1


In [41]:
df_tradelist.head()
#df_tradelist.shape

Unnamed: 0,my_ticker,hold_per,pct_uprows,max_up_return,min_up_return,avg_up_return,avg_down_return,exp_return,stdev_returns,worst_return,pct_downside,least_pain_pt,total_years,max_consec_beat,best_buy_date,best_sell_date
0,AIZ,3 Mos,0.1047,0.4205,0.1288,0.0413,0.0278,0.0406,0.1271,-0.0858,-0.7749,-0.0167,16,19,3-21,6-19
1,AVGO,1 Mo,0.2507,0.2795,0.0881,0.0395,-0.0022,0.0312,0.0801,-0.0406,-0.2726,0.0427,11,10,5-3,6-2
4,AWK,1 Mo,0.1708,0.2013,0.0493,0.0196,-0.0046,0.0167,0.0451,-0.0255,-0.2233,0.0115,12,12,3-7,4-6
7,BKNG,2 Mos,0.1129,1.6303,0.3431,0.0546,-0.0009,0.042,0.2529,-0.1983,-0.8837,0.0067,21,10,2-5,4-5
9,BLK,3 Mos,0.1653,0.4916,0.2297,0.0608,0.0369,0.06,0.1411,-0.0803,-0.53,0.0226,21,13,10-10,1-10


In [42]:
# Export the trade list to CSV files for execution and/or further research if desired.
df_tradelist.to_csv(path_or_buf = my_path + "/df_tradelist.csv", index=False)