# Weekly Backtest Template

This notebook walks through the steps of a weekly backtest run of our trading strategy.

Inputs:
    - backtest_start, backtest_end = First and last date to backtest
    - universe object

1. Function initialization:
    - sim_date = backtest_start advanced to Saturday. Tracks current simulated day.
    - Initialize status_df, indexed on stocks.keys(). Columns:
        - "lagged_return_coefficient" column
        - "prior_return" column
        - "predicted_return" column
    - Initialize results_df. Columns:
        - "week_starting"
        - "long_stock"
        - "short_stock"
        - "long_return"
        - "short_return"
1. Do while sim_date <= backtest_end
    1. Saturday:
        - Update "prior_return" column
        - Update "lagged_return_coefficient" by running beta model
        - Compute "predicted_return" as "lagged_return_coefficient" times "prior_return"
        - Add 2 to sim_date
    1. Monday:
        - Update results_df
            - New "week_starting" row equal to sim_date
            - long_stock, short_stock = stocks with highest and lowest predicted_return
        - Add 4 to sim_date
    1. Friday:
        - Update results_df
            - long_return, short_return
        - Add 1 to sim_date
1. return results_df
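
The Saturday, Monday, Friday cycle in the outline above can be sketched with plain date arithmetic. This is a minimal illustration only: `weekly_schedule` is a hypothetical helper that does not exist in the notebook, and it yields the three dates of each simulated week without doing any of the per-day work.

```python
import datetime as dt


def weekly_schedule(backtest_start, backtest_end):
    """Yield (saturday, monday, friday) for each simulated week.

    Hypothetical helper, not part of the notebook or of wmcm.
    """
    start = dt.datetime.strptime(backtest_start, '%Y-%m-%d')
    end = dt.datetime.strptime(backtest_end, '%Y-%m-%d')

    # Advance to the first Saturday on or after the start date (weekday 5).
    sim_date = start + dt.timedelta(days=(5 - start.weekday()) % 7)

    while sim_date <= end:
        saturday = sim_date
        monday = saturday + dt.timedelta(days=2)   # "Add 2 to sim_date"
        friday = monday + dt.timedelta(days=4)     # "Add 4 to sim_date"
        yield saturday, monday, friday
        sim_date = friday + dt.timedelta(days=1)   # "Add 1 to sim_date"


weeks = list(weekly_schedule('2012-01-01', '2012-01-31'))
for saturday, monday, friday in weeks:
    print(saturday.date(), monday.date(), friday.date())
```

Because the end date is only checked at the top of the loop, the last week's Monday and Friday can fall after backtest_end, which matches the outline as written.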

## Import Packages

In [1]:
import datetime as dt
import pandas as pd
import numpy as np
from pandas_datareader import data
import statsmodels.formula.api as sm
import time
import wmcm

## Load Universe
I created a CSV listing all stocks included in SPDR's sector funds as of January 24th, 2016. It is loaded as a pandas dataframe. In order to make calls to Yahoo, all periods are replaced with hyphens.

We also pull all sector tickers and SPY as a proxy for the market as a whole.
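
As a small illustration of the period-to-hyphen replacement, the sketch below normalizes a few symbols in plain Python. The helper name `to_yahoo_symbol` is made up for this example, and the symbols are only samples of class-share tickers that contain a period.

```python
def to_yahoo_symbol(symbol):
    """Replace periods with hyphens (hypothetical helper for illustration)."""
    return symbol.replace('.', '-')


symbols = ['BRK.B', 'BF.B', 'GOOG']
yahoo_symbols = [to_yahoo_symbol(s) for s in symbols]
print(yahoo_symbols)
```

Symbols without a period pass through unchanged, so the replacement is safe to apply to the whole index.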

In [2]:
universe_stocks = pd.read_csv('inputs/stocks.csv', index_col='symbol')
universe_stocks.index = universe_stocks.index.str.replace('.', '-', regex=False)

universe_sectors = pd.read_csv('inputs/sectors.csv', index_col='symbol')
universe_sectors.index = universe_sectors.index.str.replace('.', '-', regex=False)

# limited to a few stocks
universe_stocks = universe_stocks.loc[['GOOG', 'PM', 'XOM']]
# universe_stocks.head()

## Pull Price History

Price history is pulled from Yahoo into separate Stock instances. Stocks were pulled from January 1st 2010 through December 31st, 2015.

In [4]:
stocks = wmcm.Universe(universe_stocks.index, 'SPY', interval='w', verbose=False)
# stocks.save('sp500_uni.p')
#stocks = wmcm.Universe.load('sp500_uni.p')

Stock : GOOG
        Starting Date : 2011-01-01 00:00:00
        Ending Date : 2015-12-31 00:00:00
        Frequency : w

## Advance Day Formula
New function to advance a date to the next occurrence of a given weekday. If the date already falls on that weekday, it is returned unchanged. Values:

0 = Monday
1 = Tuesday
2 = Wednesday
3 = Thursday
4 = Friday
5 = Saturday
6 = Sunday

In [15]:
# New function to advance to a given weekday.

def next_weekday(d, weekday):
    days_ahead = weekday - d.weekday()
    if days_ahead < 0: # Target day already happened this week
        days_ahead += 7
    return d + dt.timedelta(days_ahead)

next_weekday(dt.datetime.strptime('2016-5-9', '%Y-%m-%d'), 0)

datetime.datetime(2016, 5, 9, 0, 0)
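
To make the boundary behaviour explicit, here is a short sketch that repeats the function and steps through one week. The variable names are chosen for illustration only.

```python
import datetime as dt


def next_weekday(d, weekday):
    """Return d advanced to the given weekday (0 = Monday ... 6 = Sunday)."""
    days_ahead = weekday - d.weekday()
    if days_ahead < 0:  # target weekday already passed this week
        days_ahead += 7
    return d + dt.timedelta(days_ahead)


monday = dt.datetime(2016, 5, 9)
same_day = next_weekday(monday, 0)            # already a Monday: unchanged
to_friday = next_weekday(monday, 4)           # Monday -> Friday of the same week
to_saturday = next_weekday(to_friday, 5)      # Friday -> Saturday
to_next_monday = next_weekday(to_saturday, 0) # Saturday -> Monday of the next week
```

Since a date that already falls on the target weekday is returned as is, every call in the backtest loop has to target a different weekday than the current one, otherwise sim_date would never advance.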

## Backtest

In [None]:
def weekly_backtest(universe=stocks, backtest_start = '2012-01-01', backtest_end='2015-12-31'):
    
    # need assertion that universe is a Universe class with weekly interval
    
    
    # set sim_date equal to first Saturday in backtest range
    sim_date = next_weekday(dt.datetime.strptime(backtest_start, '%Y-%m-%d'), 5)
    
    # create empty status_df
    status_df = pd.DataFrame(index=universe.keys(),
                             columns=['lagged_return_coefficient',
                                      'prior_return',
                                      'predicted_return'])
    status_df.drop('market', inplace=True)
    
    # create empty results_df
    results_df = pd.DataFrame(columns=['week_starting',
                                      'long_stock',
                                      'short_stock',
                                      'long_return',
                                      'short_return'])
    
    while sim_date <= dt.datetime.strptime(backtest_end, '%Y-%m-%d'):
        
        if sim_date.weekday() == 5: # Saturday
            
            # Update status_df. UNTESTED: assumes factor_model returns the
            # coefficient on lag(return) as a scalar, which has not been verified.
            for tic in status_df.index:
                status_df.loc[tic, 'lagged_return_coefficient'] = universe.factor_model(tic, 'return ~ return_market + lag(return)')
                status_df.loc[tic, 'prior_return'] = 0  # placeholder until the prior week's return is wired in
                status_df.loc[tic, 'predicted_return'] = status_df.loc[tic, 'lagged_return_coefficient'] * status_df.loc[tic, 'prior_return']
            
            # To Monday
            sim_date = next_weekday(sim_date, 0)
        
        if sim_date.weekday() == 0: # Monday
            
            # new week_starting entry in results_df
                # long_stock = highest predicted_return
                # short_stock = lowest predicted_return
                # others = np.nan
            
            # To Friday
            sim_date = next_weekday(sim_date, 4)

        if sim_date.weekday() == 4: # Friday
            
            # add long_return and short_return to results_df
            
            # To Saturday
            sim_date = next_weekday(sim_date, 5)
    
    return results_df
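
The Monday step is still only comments in the function above. One way it might be filled in, shown here as a sketch with made-up predicted returns, is to take the index labels of the largest and smallest predicted_return. The cast to float is an assumption: a DataFrame created empty with only column names holds object dtype, which idxmax and idxmin may not handle.

```python
import pandas as pd

# Made-up numbers purely for illustration; not backtest output.
status_df = pd.DataFrame(
    {'predicted_return': [0.012, -0.004, 0.007]},
    index=['GOOG', 'PM', 'XOM'],
)

# Cast defensively in case the column was created with object dtype.
predicted = status_df['predicted_return'].astype(float)

long_stock = predicted.idxmax()   # label of the highest predicted return
short_stock = predicted.idxmin()  # label of the lowest predicted return
print(long_stock, short_stock)
```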

## Changes needed to .factor_model():

1. I need to be able to specify the range used when performing beta calculations.
1. I need to be able to specify the return type when performing beta calculations. Our holding strategy will be from the week's open to the week's close. Therefore, research should be based on ret_oc, not ret_cc.
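
To illustrate the two requested changes, the sketch below computes both return types from weekly open and close prices and restricts the result to a date range. It does not use wmcm: `weekly_returns`, the 'open'/'close' column layout, and the prices are all assumptions made for this example.

```python
import pandas as pd


def weekly_returns(prices, start=None, end=None):
    """Return ret_oc and ret_cc for rows between start and end (inclusive).

    Hypothetical helper; assumes a DataFrame indexed by week date
    with 'open' and 'close' columns.
    """
    out = pd.DataFrame(index=prices.index)
    out['ret_oc'] = prices['close'] / prices['open'] - 1  # week's open to week's close
    out['ret_cc'] = prices['close'].pct_change()          # prior close to this close
    # Slice after computing so the first row in range keeps a valid ret_cc.
    return out.loc[start:end]


# Made-up prices purely for illustration.
prices = pd.DataFrame(
    {'open': [100.0, 103.0, 101.0], 'close': [102.0, 100.0, 104.0]},
    index=pd.to_datetime(['2015-01-05', '2015-01-12', '2015-01-19']),
)
rets = weekly_returns(prices, start='2015-01-12')
print(rets)
```

The two columns differ whenever a week opens away from the prior week's close, which is why the choice of return type matters for a strategy that holds from open to close.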

## Misc Notes
This code is more complex than it needs to be. However, I wanted to perform actions on the days when they would be performed to make it easier to adapt this code to Quantopian in the future.

We will have success if the distribution of results_df['long_return'] is (statistically) significantly higher than the distribution of results_df['short_return'].
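
One possible way to check this, offered as a sketch rather than the chosen method, is a paired t-statistic on the weekly difference between long and short returns, since both legs are observed over the same weeks. The function name and the sample returns are made up. A library routine such as scipy.stats.ttest_rel could supply a p-value.

```python
import math
import statistics


def paired_t_statistic(long_returns, short_returns):
    """t-statistic of the mean weekly (long - short) difference.

    Hypothetical helper for illustration only.
    """
    diffs = [l - s for l, s in zip(long_returns, short_returns)]
    n = len(diffs)
    if n < 2:
        raise ValueError('need at least two weeks of returns')
    sd = statistics.stdev(diffs)  # sample standard deviation
    if sd == 0:
        raise ValueError('differences have zero variance')
    return statistics.mean(diffs) / (sd / math.sqrt(n))


# Made-up weekly returns purely for illustration.
long_r = [0.02, 0.01, 0.03, -0.01]
short_r = [0.00, -0.01, 0.01, -0.02]
t_stat = paired_t_statistic(long_r, short_r)
print(t_stat)
```

A large positive value would suggest the long leg outperforms the short leg, with n - 1 degrees of freedom for the reference t distribution.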

We should record both predicted_return and actual return, but this data structure doesn't allow for that. Doing so would allow for more model calibration, but less "omg, we made $X dollars."
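
As one possible extension, results_df could carry the predictions alongside the realized returns. The sketch below adds two hypothetical columns, long_predicted_return and short_predicted_return, and fills one row with made-up values to show how a prediction error could then be computed.

```python
import pandas as pd

columns = ['week_starting', 'long_stock', 'short_stock',
           'long_predicted_return', 'short_predicted_return',  # hypothetical additions
           'long_return', 'short_return']

# One made-up row purely for illustration; not backtest output.
rows = [{
    'week_starting': pd.Timestamp('2012-01-09'),
    'long_stock': 'GOOG',
    'short_stock': 'PM',
    'long_predicted_return': 0.010,
    'short_predicted_return': -0.005,
    'long_return': 0.004,
    'short_return': -0.002,
}]
results_df = pd.DataFrame(rows, columns=columns)

# Prediction error on the long leg: actual minus predicted.
results_df['long_error'] = results_df['long_return'] - results_df['long_predicted_return']
print(results_df)
```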