# Getting The Data

#### Brian Bahmanyar


___

[Quandl](https://www.quandl.com) provides free daily financial data, which is used in the analyses to come. It also offers a free, if somewhat lackluster, Python API.

In [24]:
import numpy as np
import pandas as pd
import Quandl

Below is a function written to serve as a wrapper around Quandl's Python API and provide some needed functionality.

In [50]:
def get_adj_close(tickers, start, end="", ratios=(), log_transforms=()):
    """
    Args:
        tickers (list): ticker symbols for which to collect daily adj. close prices
        start (string, format: 2013-01-01): start date from which to collect prices
        end (string, format: 2013-01-01): optional end date; today if not specified
        ratios (list): collection of tuples of tickers from 'tickers' list to calculate 
                price ratios for (the stock with larger mean is numerator)
        log_transforms (list): collection of tickers from 'tickers' to include additional 
                natural log transformed copies
    
    Returns (dataframe): all adj. close prices, ratios, and log transforms specified
    """ 
    result = {}
    
    for ticker in tickers:
        try:
            result[ticker] = Quandl.get('WIKI/'+ticker, trim_start=start, trim_end=end)['Adj. Close']
        except Quandl.DatasetNotFound:  # exception class exposed by the Quandl package
            print('ERROR:')
            print(ticker, 'is not a valid ticker')

    for ratio in ratios:
        try:
            ticker1, ticker2 = ratio
            if result[ticker1].mean() > result[ticker2].mean():
                result[ticker1+'/'+ticker2] = result[ticker1]/result[ticker2]
            else:
                result[ticker2+'/'+ticker1] = result[ticker2]/result[ticker1]
        except KeyError:
            print('ERROR:')
            print(ticker1, 'or', ticker2, 'is not in the list of specified tickers')
    
    for log_transform in log_transforms:
        try:
            result['ln('+log_transform+')'] = np.log(result[log_transform])
        except KeyError:
            print('ERROR:')
            print(log_transform, 'is not in the list of specified tickers')
    
    return pd.DataFrame(result).dropna() # drop NaNs because the stocks have different lengths of history

A copy of this function is saved in `api_wrapper.py` for use in other notebooks.
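To illustrate the ratio and `dropna` logic without calling Quandl, here is a minimal sketch on synthetic prices. The tickers `AAA` and `BBB`, their values, and the helper name `ratio_with_larger_mean_on_top` are hypothetical and not part of the wrapper; the sketch only mirrors its rule that the series with the larger mean becomes the numerator.

```python
import numpy as np
import pandas as pd

# Hypothetical price series with different lengths of history
dates = pd.date_range('2013-01-01', periods=4, freq='D')
prices = {
    'AAA': pd.Series([10.0, 11.0, 12.0, 13.0], index=dates),
    'BBB': pd.Series([110.0, 120.0, 130.0], index=dates[1:]),  # starts one day later
}

def ratio_with_larger_mean_on_top(series, ticker1, ticker2):
    """Return (column name, ratio series) with the larger-mean ticker as numerator."""
    if series[ticker1].mean() > series[ticker2].mean():
        return ticker1 + '/' + ticker2, series[ticker1] / series[ticker2]
    return ticker2 + '/' + ticker1, series[ticker2] / series[ticker1]

name, ratio = ratio_with_larger_mean_on_top(prices, 'AAA', 'BBB')
prices[name] = ratio
prices['ln(AAA)'] = np.log(prices['AAA'])

# Rows missing any column (here the first date, where BBB has no price) are dropped
demo = pd.DataFrame(prices).dropna()
```

Because `BBB` has the larger mean, the ratio column is named `BBB/AAA` regardless of the order in which the pair was passed, and the first date disappears from `demo` since `BBB` has no price there.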

***

### Usage

In [42]:
portfolio = get_adj_close( ['FB', 'AMZN', 'CMG'], 
                           start='2013-01-01', 
                           ratios=[('FB','AMZN'), ('FB','CMG')],
                           log_transforms=['FB', 'CMG'] )

In [49]:
portfolio.head()

| Date | AMZN | AMZN/FB | CMG | CMG/FB | FB | ln(CMG) | ln(FB) |
|---|---|---|---|---|---|---|---|
| 2013-01-02 | 257.31 | 9.189643 | 301.06 | 10.752143 | 28.00 | 5.707310 | 3.332205 |
| 2013-01-03 | 258.48 | 9.307886 | 300.95 | 10.837234 | 27.77 | 5.706944 | 3.323956 |
| 2013-01-04 | 259.15 | 9.010779 | 300.18 | 10.437413 | 28.76 | 5.704382 | 3.358986 |
| 2013-01-07 | 268.46 | 9.125085 | 299.59 | 10.183209 | 29.42 | 5.702415 | 3.381675 |
| 2013-01-08 | 266.38 | 9.166552 | 297.76 | 10.246387 | 29.06 | 5.696288 | 3.369363 |


### Tests

In [45]:
assert len(portfolio.columns) == 7
assert portfolio.isnull().sum().sum() == 0
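
A few further checks one might add are sketched here against the first two rows displayed by `portfolio.head()`, rather than against a live Quandl call. The `sample` frame is hand-copied for illustration and is an assumption, not part of the original tests; the tolerance allows for the rounding of the displayed values.

```python
import numpy as np
import pandas as pd

# First two rows copied from the displayed output of portfolio.head()
sample = pd.DataFrame(
    {
        'AMZN': [257.31, 258.48],
        'AMZN/FB': [9.189643, 9.307886],
        'CMG': [301.06, 300.95],
        'CMG/FB': [10.752143, 10.837234],
        'FB': [28.0, 27.77],
        'ln(CMG)': [5.70731, 5.706944],
        'ln(FB)': [3.332205, 3.323956],
    },
    index=pd.to_datetime(['2013-01-02', '2013-01-03']),
)

# Ratio columns should equal the quotient of their component price columns
assert np.allclose(sample['AMZN/FB'], sample['AMZN'] / sample['FB'], atol=1e-5)
assert np.allclose(sample['CMG/FB'], sample['CMG'] / sample['FB'], atol=1e-5)

# Log columns should equal the natural log of the underlying prices
assert np.allclose(sample['ln(FB)'], np.log(sample['FB']), atol=1e-5)
assert np.allclose(sample['ln(CMG)'], np.log(sample['CMG']), atol=1e-5)
```

The same assertions could be run on the full `portfolio` frame to confirm that every derived column is consistent with the prices it was computed from.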