**Portfolio**: an allocation of funds to a set of stocks.  
We follow a buy-and-hold strategy: invest in a set of stocks with a fixed allocation, then observe how the portfolio performs going forward.    
We assume the allocations sum to 1.0.  

### Daily portfolio values
We want to calculate the total value of the portfolio day by day.  
Once we have that information, we can compute statistics on the overall portfolio.  
Example:  
+ start_val=1,000,000
+ start_date=2009-1-1
+ end_date=2011-12-31
+ symbols=['SPY','XOM','GOOG','GLD']
+ allocs=[0.4,0.4,0.1,0.1]

+ Normalize the prices -- normed=prices/prices[0]   
[1.0,1.0,1.0,1.0]  
The first row is all 1.0. Every later row is the price relative to the start date, i.e. 1 plus the cumulative return since then.
+ Multiply the normed values by the allocation to each equity -- alloced=normed\*allocs  
[0.4,0.4,0.1,0.1]  
Every later row is scaled by the same weights.
+ Multiply the alloced dataframe by start_val -- pos_vals=alloced\*start_val   
[400000,400000,100000,100000]  
Position values: how much each position is worth on each day.
The first row is the cash allocated to each asset; later rows show how the value of that holding changes over time.
+ Now we have the value of each individual asset on each day. Summing across the columns gives the total portfolio value for each day -- port_val=pos_vals.sum(axis=1)  
First day: 1,000,000; from the second day on, the total moves with prices.
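
The steps above rely on pandas broadcasting a plain list across the columns of a DataFrame. Here is a minimal sketch with made-up prices for two hypothetical tickers (`AAA` and `BBB`, not data from this notebook), showing each intermediate frame:

```python
import pandas as pd

# Made-up prices for two hypothetical tickers, for illustration only
prices = pd.DataFrame(
    {"AAA": [10.0, 11.0, 12.0], "BBB": [50.0, 45.0, 55.0]},
    index=pd.date_range("2020-01-01", periods=3),
)
allocs = [0.6, 0.4]      # one weight per column, summing to 1.0
start_val = 1000.0

normed = prices / prices.iloc[0]     # first row becomes all 1.0
alloced = normed * allocs            # the list is matched to columns by position
pos_vals = alloced * start_val       # value of each position on each day
port_val = pos_vals.sum(axis=1)      # axis=1 sums across columns: one total per day

print(port_val)
```

With these toy numbers the totals come out to roughly 1000, 1020 and 1160, since AAA gains 10% then 20% and BBB loses 10% then gains 10% relative to the first day.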

In [16]:
def compute_daily_portfolio_values(dates,symbols,allocs,initial_capital):
    df=get_data(symbols,dates)
    normed=normalize_data(df)
    alloced=normed*allocs
    pos_vals=alloced*initial_capital
    port_val=pos_vals.sum(axis=1)
    return port_val
#test run
dates=pd.date_range('2009-01-01','2012-12-31')
symbols=['SPY','XOM','GOOG','GLD']
allocs=[0.4,0.4,0.1,0.1]
initial_capital=1000000
port_val=compute_daily_portfolio_values(dates,symbols,allocs,initial_capital)
print(port_val.head(3))

2010-01-04    1.000000e+06
2010-01-05    1.002089e+06
2010-01-06    1.004981e+06
dtype: float64


In [9]:
from IPython.display import display
import pandas as pd
#prices
dates=pd.date_range('2009-01-01','2012-12-31')
symbols=['SPY','XOM','GOOG','GLD']
df=get_data(symbols,dates)
display(df.head(2))
#normed
normed=normalize_data(df)
display(normed.head(2))   #cumulative returns starting from the start date
#alloced
allocs=[0.4,0.4,0.1,0.1]
alloced=normed*allocs
display(alloced.head(2))
#pos_vals-values at each day for the individual assets
pos_vals=alloced*1000000
display(pos_vals.head(2))
#total value for the portfolio each day
port_val=pos_vals.sum(axis=1)
display(port_val.head(5))

| | SPY | XOM | GOOG | GLD |
|---|---|---|---|---|
| 2010-01-04 | 97.788948 | 69.150002 | 313.062468 | 109.800003 |
| 2010-01-05 | 98.047805 | 69.419998 | 311.683844 | 109.699997 |


| | SPY | XOM | GOOG | GLD |
|---|---|---|---|---|
| 2010-01-04 | 1.0 | 1.0 | 1.0 | 1.0 |
| 2010-01-05 | 1.002647 | 1.003904 | 0.995596 | 0.999089 |


| | SPY | XOM | GOOG | GLD |
|---|---|---|---|---|
| 2010-01-04 | 0.4 | 0.4 | 0.1 | 0.1 |
| 2010-01-05 | 0.401059 | 0.401562 | 0.09956 | 0.099909 |


| | SPY | XOM | GOOG | GLD |
|---|---|---|---|---|
| 2010-01-04 | 400000.0 | 400000.0 | 100000.0 | 100000.0 |
| 2010-01-05 | 401058.839492 | 401561.798943 | 99559.632936 | 99908.919857 |


2010-01-04    1.000000e+06
2010-01-05    1.002089e+06
2010-01-06    1.004981e+06
2010-01-07    1.002515e+06
2010-01-08    1.004001e+06
dtype: float64
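
As a sanity check on the output above, the four position values for 2010-01-05 can be summed by hand. The numbers below are copied from the displayed pos_vals output:

```python
# Position values on 2010-01-05 (SPY, XOM, GOOG, GLD), copied from the output above
pos_vals_day2 = [401058.839492, 401561.798943, 99559.632936, 99908.919857]

total = sum(pos_vals_day2)
print(round(total, 2))   # 1002089.19, displayed as 1.002089e+06 in port_val
```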

## Daily Portfolio Statistics
1. **cumulative return**
2. **average daily return**
3. **risk: standard deviation of daily return**
4. **Sharpe ratio**

In [10]:
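def compute_portfolio_statistics(port_val,frequency,annual_daily_rf_bank=0.1):
    """Return cumulative return, mean and std of periodic returns, and annualized Sharpe ratio."""
    port_rets=compute_daily_returns(port_val)[1:]   # drop the first value, which is always 0
    
    cum_ret=(port_val.iloc[-1]/port_val.iloc[0]-1)  # positional access via .iloc
    avg_daily_ret=port_rets.mean()
    std_daily_ret=port_rets.std()
    
    # number of samples per year for each supported sampling frequency
    samples_per_year={'daily':252,'weekly':52,'monthly':12}
    if frequency not in samples_per_year:
        raise ValueError("frequency must be 'daily', 'weekly' or 'monthly'")
    n=samples_per_year[frequency]
    K=n**(1.0/2)                                    # annualization factor
        
    # risk-free rate per sampling period (per day when frequency=='daily')
    daily_rf=(1+annual_daily_rf_bank)**(1.0/n)-1  
    sharp_ratio=K*(avg_daily_ret-daily_rf)/std_daily_ret
    
    return cum_ret,avg_daily_ret,std_daily_ret,sharp_ratio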
compute_portfolio_statistics(port_val,'daily',0.1)

(0.29593536344951765,
 0.00039744149035923554,
 0.010305580925831418,
 0.029505910121487757)

In [48]:
#the first value is 0 because there is no change on the first day
#exclude it from all calculations
port_rets=compute_daily_returns(port_val)[1:]
display(port_rets.head(5))

2010-01-05    0.002089
2010-01-06    0.002886
2010-01-07   -0.002454
2010-01-08    0.001482
2010-01-11    0.006254
dtype: float64
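
For comparison, pandas has a built-in `pct_change()` that computes the same period-over-period change. The sketch below applies it to the first five portfolio values shown earlier, rounded to whole dollars, so the results are approximate. The only difference from `compute_daily_returns` is that the first element comes back as NaN rather than 0, so it is dropped instead of sliced off.

```python
import pandas as pd

# First five portfolio values from the output above, rounded to whole dollars
port_val_head = pd.Series(
    [1000000.0, 1002089.0, 1004981.0, 1002515.0, 1004001.0],
    index=pd.to_datetime(["2010-01-04", "2010-01-05", "2010-01-06",
                          "2010-01-07", "2010-01-08"]),
)

rets = port_val_head.pct_change().dropna()   # first element is NaN, so drop it
print(rets.round(6))
```

The four values should agree with the first four daily returns displayed above to about six decimal places.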

In [49]:
cum_ret=(port_val.iloc[-1]/port_val.iloc[0]-1)
print(cum_ret)

avg_daily_ret=port_rets.mean()
print(avg_daily_ret)

std_daily_ret=port_rets.std()
print(std_daily_ret)

0.29593536345
0.000397441490359
0.0103055809258


## Sharpe Ratio: risk-adjusted return   
The Sharpe ratio gives us a quantitative way to compare portfolios by adjusting return for risk.  


All else being equal:
+ lower risk is better
+ higher return is better   

The Sharpe ratio also considers the risk-free rate of return:
+ the interest rate you would earn by putting your money in a risk-free asset such as a bank account
+ often approximated as 0%
+ it matters because the asset we hold may be performing worse than simply leaving the money in the bank

Formula:
+ (Rp-Rf)/std_daily_ret
+ (portfolio return - risk-free return) / standard deviation of portfolio return
+ numerator, Rp: as the portfolio return goes up, the metric goes up.
+ numerator, Rf: as the risk-free return increases, the metric decreases.
+ denominator: most finance folks treat risk as standard deviation, or volatility. As returns become more volatile, the Sharpe ratio goes down.
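
To make those three effects concrete, here is a small sketch with hypothetical numbers (the returns, rates and volatilities are invented for illustration, and `sharpe` is a helper defined only for this example):

```python
def sharpe(rp, rf, std):
    """(portfolio return - risk-free return) / std of portfolio return."""
    return (rp - rf) / std

# Hypothetical portfolios
base = sharpe(rp=0.10, rf=0.02, std=0.10)        # reference case
more_vol = sharpe(rp=0.10, rf=0.02, std=0.20)    # same return, twice the volatility
higher_rf = sharpe(rp=0.10, rf=0.05, std=0.10)   # same portfolio, higher risk-free rate
higher_rp = sharpe(rp=0.15, rf=0.02, std=0.10)   # higher return, same volatility

print(base, more_vol, higher_rf, higher_rp)
```

Doubling the volatility halves the ratio, a higher risk-free rate lowers it, and a higher portfolio return raises it.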


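## Computing Sharpe Ratio
S=E(Rp-Rf)/std(Rp-Rf)
+ **S=mean(daily_rets-daily_rf)/std(daily_rets-daily_rf)**

Traditional shortcut:  
+ risk_free_rate: LIBOR, the 3-month Treasury rate, or 0  
+ the rate is treated as a constant, and subtracting a constant does not change the standard deviation, so  
S=mean(daily_rets-daily_rf)/std(daily_rets)   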
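
The shortcut works because a constant daily_rf shifts every return by the same amount and leaves the spread unchanged. A quick check on synthetic returns (randomly generated, not market data) compares the two forms:

```python
import numpy as np

rng = np.random.default_rng(0)
daily_rets = rng.normal(0.0005, 0.01, size=252)   # synthetic daily returns
daily_rf = 1.1 ** (1.0 / 252) - 1                 # constant daily risk-free rate

excess = daily_rets - daily_rf
full = excess.mean() / excess.std(ddof=1)         # std of the excess returns
shortcut = excess.mean() / daily_rets.std(ddof=1) # std of the raw returns

print(np.isclose(full, shortcut))   # True
```

`ddof=1` is passed so that NumPy uses the sample standard deviation, which is the pandas default for `Series.std()`.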


Start with an annual Rf=0.1: 1.0 deposited at the start of the year grows to 1.1 by the end.  
What daily interest rate, compounded over 252 trading days, gives this value?  
**daily_rf=$\sqrt[252]{1.0+0.1}-1$**
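
A short check of that formula, assuming 252 trading days per year as above, confirms that the root-based daily rate compounds back to 1.1, whereas simply dividing the annual rate by 252 overshoots slightly:

```python
annual_rf = 0.1

daily_rf = (1.0 + annual_rf) ** (1.0 / 252) - 1   # 252nd root, as in the formula
naive = annual_rf / 252                           # simple division, for comparison

print(daily_rf)                 # roughly 0.000378
print((1 + daily_rf) ** 252)    # compounds back to about 1.1
print((1 + naive) ** 252)       # about 1.105, slightly more than 1.1
```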

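+ SR can vary widely depending on the **frequency** at which you sample
+ SR is conventionally reported as an annual measure, so a ratio computed from daily, weekly or monthly samples is scaled by K
+ SRannualized=K\*SR
+ K=$\sqrt{\text{number of samples per year}}$
+ daily samples: K=$\sqrt{252}$
+ weekly samples: K=$\sqrt{52}$
+ monthly samples: K=$\sqrt{12}$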
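
As a rough illustration of the scaling, the sketch below computes K for each frequency and then annualizes a per-day ratio. The mean and standard deviation are the daily figures reported by `compute_portfolio_statistics` earlier in this notebook; the dictionary name `samples_per_year` is only for this example.

```python
import math

samples_per_year = {"daily": 252, "weekly": 52, "monthly": 12}
K = {freq: math.sqrt(n) for freq, n in samples_per_year.items()}
for freq, k in K.items():
    print(freq, round(k, 2))    # daily 15.87, weekly 7.21, monthly 3.46

# Daily statistics reported earlier in this notebook
avg_daily_ret = 0.00039744149035923554
std_daily_ret = 0.010305580925831418
daily_rf = 1.1 ** (1.0 / 252) - 1

sr_daily = (avg_daily_ret - daily_rf) / std_daily_ret   # per-day ratio, not annualized
sr_annual = K["daily"] * sr_daily                       # scaled by sqrt(252)
print(sr_daily, sr_annual)
```

The annualized value should land at about 0.0295, in line with the Sharpe ratio reported above.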

In [61]:
daily_rf=1.1**(1.0/252)-1  
sharp_ratio=252**(1.0/2)*(avg_daily_ret-daily_rf)/std_daily_ret
print(sharp_ratio)

0.0295059101215


In [6]:
import pandas as pd
import os
import matplotlib.pyplot as plt   # needed by plot_data below
def symbol_to_path(symbol, base_dir="data"):
    """Return CSV file path given ticker symbol."""
    return os.path.join(base_dir, "{}.csv".format(str(symbol)))
########################################################################
def get_data(symbols, dates):
    """Read stock data (adjusted close) for given symbols from CSV files."""
    df = pd.DataFrame(index=dates)  # 1. create an empty df indexed by the requested dates
    
    if 'SPY' not in symbols:  # 2. add SPY for reference, if absent (without mutating the caller's list)
        symbols = ['SPY'] + list(symbols)

    for symbol in symbols:
        # 3. read the adjusted close for this symbol
        df_temp=pd.read_csv(symbol_to_path(symbol, base_dir="data"),
                           index_col='Date',
                           parse_dates=True,
                           usecols=['Date','Adj Close'],
                           na_values=['nan'])
        df_temp=df_temp.rename(columns={'Adj Close':symbol})  # 4. rename the 'Adj Close' column to the symbol name
        df=df.join(df_temp) 
        if symbol =='SPY':  # 5. drop dates where SPY is NaN, since SPY serves as the trading-day reference
            df=df.dropna(subset=['SPY'])
    return df
########################################################################
def normalize_data(df):
    return df/df.iloc[0,:]
########################################################################
def plot_selected(df, columns, start_index, end_index):
    """Plot the desired columns over index values in the given range."""
    plot_data(df.loc[start_index:end_index,columns],title='Selected data')

def plot_data(df, title="Stock prices"):
    """Plot stock prices with a custom title and meaningful axis labels."""
    ax = df.plot(title=title, fontsize=12)
    ax.set_xlabel("Date")
    ax.set_ylabel("Price")
    plt.show()
########################################################################
def get_bollinger_bands(rm,rstd):
    upper_band=rm+2*rstd
    lower_band=rm-2*rstd
    return upper_band,lower_band

def get_rolling_mean(values, window):
    """Return rolling mean of given values, using specified window size."""
    return values.rolling( window=window,center=False).mean()


def get_rolling_std(values, window):
    """Return rolling standard deviation of given values, using specified window size."""
    return values.rolling(window=window,center=False).std()
########################################################################

def compute_daily_returns(df):
    """Compute and return the daily return values."""
    daily_returns=df.copy()
    daily_returns.iloc[1:]=(df[1:]/df[:-1].values)-1
    daily_returns.iloc[0]=0
    return daily_returns
########################################################################
def fill_missing_values(df_data):
    """Fill missing values in place: forward first, then backward."""
    df_data.ffill(inplace=True)   # fill forward
    df_data.bfill(inplace=True)   # fill backward (covers gaps at the very start)
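
The order of the two fills matters: filling forward first carries the last known price into a gap, and the backward fill is then only needed for gaps at the very start, where no earlier price exists. A minimal sketch on a toy column (values invented for illustration):

```python
import numpy as np
import pandas as pd

# Toy price column with a gap at the start, in the middle and at the end
toy = pd.DataFrame({"X": [np.nan, 10.0, np.nan, 12.0, np.nan]})

filled = toy.ffill().bfill()    # forward first, then backward for the leading gap
print(filled["X"].tolist())     # [10.0, 10.0, 10.0, 12.0, 12.0]
```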