In this notebook I'll show the results of maintaining a portfolio over ten years while rebalancing it at regular intervals of 1 (daily), 5 (weekly), 20 (monthly), and 60 (quarterly) days.

First, the usual setup: library imports.

In [1]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import cvxopt as opt
from cvxopt import blas, solvers
import pandas as pd

np.random.seed(123)

#Turn off progress printing
solvers.options['show_progress']=False

import plotly.plotly as py
import plotly.tools as tls
from plotly.graph_objs import *

import plotly
py.sign_in('linobi', 'xlupudvz62')
import cufflinks
plotly.__version__

from datetime import datetime
from zipline.utils.factory import load_bars_from_yahoo

Now we define the quadratic optimizer:

In [2]:
def optimal_portfolio(returns):
    n = len(returns)
    returns = np.asmatrix(returns)
    
    N = 100
    mus = [10**(5.0*t/N - 1.0) for t in range(N)]
    
    #Convert to cvxopt matrices
    S = opt.matrix(np.cov(returns))
    pbar = opt.matrix(np.mean(returns, axis=1))
    
    #Create constraint matrices
    G = -opt.matrix(np.eye(n))
    h = opt.matrix(0.0, (n, 1))
    A = opt.matrix(1.0, (1, n))
    b = opt.matrix(1.0)
    
    # Calculate efficient frontier weights using quadratic programming
    portfolios = [solvers.qp(mu*S, -pbar, G, h, A, b)['x'] 
                  for mu in mus]
    
    ## Calculate risks and returns for frontier
    returns = [blas.dot(pbar, x) for x in portfolios]
    risks = [np.sqrt(blas.dot(x, S*x)) for x in portfolios]
    
    ## Calculate the 2nd degree polynomial of the frontier curve
    m1 = np.polyfit(returns, risks, 2)
    x1 = np.sqrt(m1[2] / m1[0])
    
    ## Calculate the optimal portfolio
    wt = solvers.qp(opt.matrix(x1 * S), -pbar, G, h, A, b)['x']
    return np.asarray(wt), returns, risks
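For reference, here is the problem that each `solvers.qp` call above solves, written out. cvxopt's `qp(P, q, G, h, A, b)` minimizes $\tfrac{1}{2} x^{\top} P x + q^{\top} x$ subject to $Gx \le h$ and $Ax = b$. Substituting the matrices built in the function gives the program below, where $w$ is my notation for the weight vector (`x` in the code), $S$ is the covariance matrix, $\bar{p}$ the vector of mean returns, and $\mu$ the risk-aversion parameter taken from `mus`:

```latex
\begin{aligned}
\min_{w}\quad & \tfrac{\mu}{2}\, w^{\top} S\, w \;-\; \bar{p}^{\top} w \\
\text{s.t.}\quad & w \ge 0 \quad (G = -I,\; h = 0), \\
& \mathbf{1}^{\top} w = 1 \quad (A = \mathbf{1}^{\top},\; b = 1).
\end{aligned}
```

A small $\mu$ puts the emphasis on expected return and a large $\mu$ on low variance. The list `mus` sweeps $\mu$ over a log-spaced grid from $10^{-1}$ to just under $10^{4}$, so the frontier is traced from one extreme to the other.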

We request the data we want to use from Yahoo:

In [3]:
end = pd.Timestamp(datetime(2016, 1, 1))
start = end - 2500* pd.tseries.offsets.BDay() #Ten years approx.

data = load_bars_from_yahoo(stocks=['IBM', 'GLD', 'XOM', 'AAPL', 
                                    'MSFT', 'TLT'],
                            start=start, end=end)

# In case you want to take a look at it:
data.loc[:, :, 'price'].iplot(filename='prices', yTitle='price in $', world_readable=True, asDates=True)


load_bars_from_yahoo is deprecated, please register a yahoo_equities data bundle instead
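
As a sanity check on the "Ten years approx." comment, the sketch below reproduces the date arithmetic with the standard library only. It assumes that `BDay` counts Monday to Friday and ignores market holidays, so 2500 business days are exactly 500 calendar weeks. The names `n_bdays` and `span_years` are mine, not part of the notebook.

```python
from datetime import date, timedelta

end = date(2016, 1, 1)   # a Friday, same end date as in the cell above
n_bdays = 2500           # business days requested in the cell above

# Assumption: a business day is any Monday-Friday (no holiday calendar),
# so five business days always span exactly one calendar week.
start = end - timedelta(weeks=n_bdays // 5)
span_years = (end - start).days / 365.25

print(start, round(span_years, 2))
```

Under that assumption the window starts in June 2006 and spans a bit less than ten calendar years, so the final capital figures below are for roughly nine and a half years of trading.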



Now we import the backtester (zipline). These imports could go up with the rest of the library imports.

In [4]:
import zipline
from zipline.api import (history,
                         set_slippage,
                         slippage,
                         set_commission,
                         order_target_percent)

from zipline import TradingAlgorithm

To make use of the backtester we need to define two functions: initialize and handle_data. initialize serves to declare global variables and the like; handle_data is called on each tick, which, given the resolution of the Yahoo data, means once per day.

Note: This is the old-fashioned way of using zipline. The newer version includes embedded data and strongly advises against using the raw handle_data function. Instead, it suggests scheduling portfolio rebalancing at set intervals (i.e. not paying attention to each tick). For now we will use this code, but it should probably be updated soon to reflect the new structure of zipline.
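
To make the calling convention concrete, here is a toy driver loop. It is a hypothetical stand-in for what a backtester does with the two functions, not zipline's implementation: `run_toy_backtest`, `toy_initialize`, `toy_handle_data`, and the dictionary-shaped bars are all made up for illustration.

```python
from types import SimpleNamespace

def run_toy_backtest(initialize, handle_data, bars):
    """Hypothetical stand-in for a backtest loop (not zipline's code)."""
    context = SimpleNamespace()      # attribute-style state container
    initialize(context)              # called once, before any bar
    for bar in bars:                 # one call per tick (here: per daily bar)
        handle_data(context, bar)
    return context

def toy_initialize(context):
    context.tick = 0
    context.seen = []

def toy_handle_data(context, data):
    context.tick += 1                # same counter idea as in the notebook
    context.seen.append(data["price"])

bars = [{"price": p} for p in (10.0, 10.5, 10.2)]
final_context = run_toy_backtest(toy_initialize, toy_handle_data, bars)
print(final_context.tick, final_context.seen)
```

The point is only that state set in initialize survives between calls, which is what lets handle_data count ticks.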

In [5]:
def initialize(context):
    '''
    Called once at the very beginning of a backtest (and live trading). 
    Use this method to set up any bookkeeping variables.
    
    The context object is passed to all the other methods in your algorithm.

    Parameters

    context: An initialized and empty Python dictionary that has been 
             augmented so that properties can be accessed using dot 
             notation as well as the traditional bracket notation.
    
    Returns None
    '''
    context.tick = 0

In [21]:
def handle_data(context, data):
    '''
    Called when a market event occurs for any of the algorithm's 
    securities. 

    Parameters

    data: A dictionary keyed by security id containing the current 
          state of the securities in the algo's universe.

    context: The same context object from the initialize function.
             Stores the up to date portfolio as well as any state 
             variables defined.

    Returns None
    '''
    # Allow history to accumulate 100 days of prices before trading.
    context.tick += 1
    if context.tick < 100:
        return

    # Rebalance only every i days
    i = 60
    if (context.tick % i) != 0:
        return

    # Get rolling window of past prices and compute returns
    prices = history(100, '1d', 'price').dropna()
    returns = prices.pct_change().dropna()
    try:
        # Perform Markowitz-style portfolio optimization
        weights, _, _ = optimal_portfolio(returns.T)
        # Round each weight to the nearest integer (0 or 1 for weights in [0, 1])
        weights = np.around(weights)
        # Rebalance portfolio accordingly
        for stock, weight in zip(prices.columns, weights):
            order_target_percent(stock, weight)
    except ValueError as e:
        # Sometimes this error is thrown
        # ValueError: Rank(A) < p or Rank([P; A; G]) < n
        pass
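
One detail of handle_data is worth spelling out: `np.around` with its default `decimals=0` rounds to the nearest integer, so every optimized weight becomes either 0 or 1 before it reaches `order_target_percent`. The sketch below uses two made-up weight vectors (not optimizer output) to show the effect; `rounded_targets` is a hypothetical helper.

```python
import numpy as np

def rounded_targets(weights):
    """Mimic the np.around(weights) step on a plain list of weights."""
    return np.around(np.asarray(weights, dtype=float))

concentrated = rounded_targets([0.62, 0.30, 0.08])  # one weight above 0.5
diversified = rounded_targets([0.40, 0.35, 0.25])   # no weight above 0.5

print(concentrated, concentrated.sum())
print(diversified, diversified.sum())
```

With this rounding, a solution with one dominant asset puts the whole portfolio into that asset, while a diversified solution with no weight above 0.5 targets 0% in every asset and so would leave the account in cash until the next rebalance. That is worth keeping in mind when comparing the curves below.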

We instantiate and run the algorithm, passing it the data previously fetched from Yahoo:

In [22]:
# Instantiate algorithm
algo = TradingAlgorithm(initialize=initialize, 
                        handle_data=handle_data)

In [23]:
# Run algorithm
results = algo.run(data)


The `history` method is deprecated.  Use `data.history` instead.


pd.rolling_count is deprecated for Series and will be removed in a future version, replace with 
	Series.rolling(window=21).count()



Now let's plot the performance of our portfolio when rebalanced daily.

In [12]:
results.portfolio_value.iplot(filename='portfolio_daily', yTitle='Cumulative capital in $', world_readable=True, asDates=True)

Not bad at all. But let's see if we can do better by rebalancing less often, say weekly, that is, every five days. By changing the value of the variable i in the handle_data function, we can set our algorithm to rebalance every i days. So, when i = 5:

In [16]:
results.portfolio_value.iplot(filename='portfolio_weekly', yTitle='Cumulative capital in $', world_readable=True, asDates=True)

That was it for rebalancing every 5 days: some 50K extra by the end.
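
Editing i by hand between runs is easy to get wrong, so here is a minimal sketch of an alternative: a factory that builds a handle_data for a given interval, using the same gating logic as the function above. `make_handle_data`, `count_rebalances`, and `fake_rebalance` are hypothetical names, the optimizer and order calls are replaced by a counter, and 2500 ticks is an assumed run length (the real number of trading days is smaller because of market holidays).

```python
from types import SimpleNamespace

def make_handle_data(interval, rebalance, warmup=100):
    """Return a handle_data that calls `rebalance` every `interval` ticks
    once `warmup` ticks have passed (same gate as in the notebook)."""
    def handle_data(context, data):
        context.tick += 1
        if context.tick < warmup:
            return
        if context.tick % interval != 0:
            return
        rebalance(context, data)
    return handle_data

def count_rebalances(interval, n_ticks=2500):
    """Drive the gate with dummy ticks and count how often it fires."""
    context = SimpleNamespace(tick=0, fired=0)

    def fake_rebalance(context, data):
        # Placeholder for the optimizer and the order_target_percent calls.
        context.fired += 1

    handle_data = make_handle_data(interval, fake_rebalance)
    for _ in range(n_ticks):
        handle_data(context, None)
    return context.fired

counts = {i: count_rebalances(i) for i in (1, 5, 20, 60)}
print(counts)
```

Besides removing the manual edit, the counts make the difference in trading activity explicit: the daily version rebalances thousands of times over the run, the quarterly one only a few dozen.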

Now, for i = 20:

In [20]:
results.portfolio_value.iplot(filename='portfolio_monthly', yTitle='Cumulative capital in $', world_readable=True, asDates=True)

Rebalancing every month (approximately) sank the portfolio's performance. The interval between rebalances is way too big. But just for the sake of it, let's see what happens if we rebalance four times a year:

In [24]:
results.portfolio_value.iplot(filename='portfolio_quarterly', yTitle='Cumulative capital in $', world_readable=True, asDates=True)

There isn't much of a difference between rebalancing monthly and rebalancing every three months in terms of final return; but notice how, after the 2008 crisis, this portfolio looks less volatile than the monthly one, which points to a higher Sharpe ratio.
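
The Sharpe comparison above is read off the charts, so it would be worth checking numerically. Below is a minimal sketch of how that could be done from a series of portfolio values such as `results.portfolio_value`. It assumes 252 trading periods per year and a zero risk-free rate, and the two value paths are made-up numbers chosen only to show that a smoother path with the same end value scores higher; `daily_returns` and `sharpe_ratio` are hypothetical helpers.

```python
import math
import statistics

def daily_returns(values):
    """Simple period-over-period returns from a list of portfolio values."""
    return [b / a - 1.0 for a, b in zip(values, values[1:])]

def sharpe_ratio(values, periods_per_year=252, risk_free_rate=0.0):
    """Annualized Sharpe ratio (assumed conventions: 252 periods, rf = 0)."""
    excess = [r - risk_free_rate / periods_per_year
              for r in daily_returns(values)]
    volatility = statistics.stdev(excess)
    if volatility == 0:
        return float("nan")          # undefined for a perfectly flat series
    return statistics.mean(excess) / volatility * math.sqrt(periods_per_year)

# Made-up paths with the same start and end value.
smooth = [100.0, 101.0, 102.0, 103.0, 104.0]
bumpy = [100.0, 106.0, 99.0, 108.0, 104.0]

print(sharpe_ratio(smooth) > sharpe_ratio(bumpy))
```

To compare the monthly and quarterly runs, one would keep each run's `results.portfolio_value`, slice both to the post-2008 period, and pass them through the same function.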