## Tips for accessing historical data
This tutorial focuses on some of the nitty-gritty details of reading in historical data. In many cases you'll want to keep configuration, such as the list of symbols to work on, in an input file, then read it in and operate on it in some way.
Note: This example computes the performance of a portfolio under daily rebalancing, because that makes the back test a little easier. Daily rebalancing may not be possible or appropriate for all applications. (Homework 1 for the Coursera course requires "buy and hold", not daily rebalancing.)

In [1]:
import QSTK.qstkutil.qsdateutil as du
import QSTK.qstkutil.tsutil as tsu
import QSTK.qstkutil.DataAccess as da
import datetime as dt
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

## The data describing a portfolio
The general idea is that we have a portfolio described in a CSV file (QSTK/Examples/tutorial3portfolio.csv), and we'd like to see how that portfolio would have performed if we had held it in the past. Here's the file describing the portfolio:

    symbol, allocation
    SPY,0.3
    GABBABOOM,0.2
    GLD,0.3
    7ABBA, 0.2

## Reading in the portfolio description

In [5]:
na_portfolio = np.loadtxt('tutorial3portfolio.csv', dtype='S5,f4',
                        delimiter=',', comments="#", skiprows=1)
print na_portfolio

[('SPY', 0.30000001192092896) ('GABBA', 0.20000000298023224)
 ('GLD', 0.30000001192092896) ('7ABBA', 0.20000000298023224)]


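The dtype= argument defines the format of each column: S5 is a string of at most 5 characters and f4 is a 4-byte (single-precision) float. That is why GABBABOOM is truncated to GABBA in the output, and why 0.3 prints as 0.30000001192092896. Of the other arguments, delimiter sets the column separator, comments sets the comment character, and skiprows=1 skips the header line.
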
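To see the effect of the dtype string in isolation, here is a minimal sketch that feeds np.loadtxt a short list of lines instead of the file. The two sample rows and the wider 'S10,f8' format are assumptions chosen for illustration, not part of the tutorial's code. The print calls are written so they also run under Python 3, where the symbols display as byte strings.

```python
import numpy as np

# Illustrative rows mirroring the portfolio file (header line included).
lines = ['symbol,allocation', 'SPY,0.3', 'GABBABOOM,0.2']

# 'S5,f4': 5-byte string and 4-byte float, as in the tutorial's call.
narrow = np.loadtxt(lines, dtype='S5,f4', delimiter=',', skiprows=1)

# 'S10,f8': a hypothetical wider format, 10-byte string and 8-byte float.
wide = np.loadtxt(lines, dtype='S10,f8', delimiter=',', skiprows=1)

# With an unnamed dtype string, the fields are called 'f0' and 'f1'.
print(narrow['f0'][1])          # symbol cut to 5 characters
print(float(narrow['f1'][0]))   # 0.3 stored in single precision
print(wide['f0'][1])            # full symbol kept
print(float(wide['f1'][0]))     # 0.3 stored in double precision
```

Whether the truncation matters depends on the symbols you expect; a wider string field is one way to avoid cutting longer names short.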
Later on it will be helpful if our data is sorted by symbol name, so we'll do that next:

In [6]:
na_portfolio = sorted(na_portfolio, key=lambda x: x[0])
print na_portfolio

[('7ABBA', 0.20000000298023224), ('GABBA', 0.20000000298023224), ('GLD', 0.30000001192092896), ('SPY', 0.30000001192092896)]
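Note that sorted() returns a plain Python list of records rather than a NumPy array. If you would rather keep an array, one possible alternative is np.sort with the order argument. The sketch below assumes the records are built by hand with hypothetical field names 'symbol' and 'allocation'; the tutorial itself uses sorted().

```python
import numpy as np

# Hand-built records standing in for the loaded portfolio (illustrative).
# The field names 'symbol' and 'allocation' are assumptions for this sketch.
portfolio = np.array(
    [(b'SPY', 0.3), (b'GABBA', 0.2), (b'GLD', 0.3), (b'7ABBA', 0.2)],
    dtype=[('symbol', 'S5'), ('allocation', 'f4')])

# Sort by the symbol field; the result is still a structured array.
by_symbol = np.sort(portfolio, order='symbol')

print(by_symbol['symbol'].tolist())
print(by_symbol['allocation'].tolist())
```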


Now we build two lists, one that contains the symbols and one that contains the allocations: