### DATA

In [9]:
import pandas as pd

In [10]:
# Let's get the list of 30 companies the portfolio managers would like
# us to analyze

def get_tickers(path):
    companies = pd.read_csv(path)
    ticker_list = list(companies['Ticker'])
    print('Retrieved', str(len(ticker_list)),'ticker symbols.')
    return ticker_list


In [6]:
# Then, we will need to take the ticker symbol for each company
# and make a call to the Quandl API to retrieve the historical prices
# for that company's stock.

In [11]:
import quandl

In [12]:
def get_prices(ticker):
    print('Retrieving data for', ticker)
    quandl.ApiConfig.api_key = 'CDmDYdJsPfiXTBfxL_zs' 
    prices = quandl.get('WIKI/'+ ticker)['Adj. Close'].reset_index()
    prices['Ticker'] = ticker  # new column holding the ticker symbol
    return prices

In [13]:
data = []
ticker_list = get_tickers('/Users/ajiacovic/Downloads/companies.csv')

for ticker in ticker_list:
    prices = get_prices(ticker)
    data.append(prices)

Retrieved 30 ticker symbols.
Retrieving data for AAPL


LimitExceededError: (Status 429) (Quandl Error QELx01) You have exceeded the anonymous user limit of 50 calls per day. To make more calls today, please register for a free Quandl account and then include your API key with your requests.

In [2]:
#When this is run, you should see the following output.


### WRANGLE-TIME!


The first thing we are going to do is concatenate the list of data frames using the concat method 
so that they are all in a single data frame. Then, we are going to pivot the data using the 
pivot_table method so that each row represents a single date, each column represents a company, 
and the values in the pivot table are the company's adjusted closing price on that date.

In [14]:
def concat_pivot(data, rows, columns, values):
    df = pd.concat(data, sort=True)
    pivot = df.pivot_table(values = values,columns = columns, index = rows)
    return pivot
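
To make the reshaping concrete, here is a minimal sketch on made-up data. The tickers `AAA` and `BBB`, their prices, and the names `long_df` and `toy_pivot` are hypothetical; the two frames only imitate the shape that `get_prices` returns.

```python
import pandas as pd

# Two hypothetical per-ticker frames shaped like the output of get_prices
# (tickers and prices are made-up illustration values).
aaa = pd.DataFrame({
    'Date': pd.to_datetime(['2018-01-02', '2018-01-03']),
    'Adj. Close': [10.0, 11.0],
    'Ticker': 'AAA',
})
bbb = pd.DataFrame({
    'Date': pd.to_datetime(['2018-01-02', '2018-01-03']),
    'Adj. Close': [20.0, 19.0],
    'Ticker': 'BBB',
})

# Stack the frames into one long frame, then spread the tickers across
# columns with one row per date.
long_df = pd.concat([aaa, bbb], sort=True)
toy_pivot = long_df.pivot_table(values='Adj. Close', columns='Ticker', index='Date')
print(toy_pivot)
```

The long frame has one row per (date, ticker) pair; the pivot has one row per date and one column per ticker, which is the layout the later steps expect.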


Now we have company stock prices per day, but the metrics we need 
for our analysis need to be calculated from the stocks' returns or 
the percentage change in the stock's price from day to day. We can 
create another pivot table containing these returns pretty easily 
by using the pct_change method.


In [15]:
def compute_returns(df):
    # Percentage change between each row and the previous row
    returns = df.pct_change()
    return returns

returns = compute_returns(concat_pivot(data, 'Date', 'Ticker', 'Adj. Close'))
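
As a small illustration of what pct_change produces, the sketch below uses a single made-up price series (the ticker `AAA` and its prices are hypothetical). Note that the first row has no previous row to compare against, so its return is NaN.

```python
import pandas as pd

# Hypothetical prices for one made-up ticker on three consecutive days.
toy_prices = pd.DataFrame({'AAA': [100.0, 110.0, 99.0]})

# Each value is (current price / previous price) - 1.
toy_returns = toy_prices.pct_change()
print(toy_returns)
```

The second row is roughly 0.10 (a 10% gain) and the third roughly -0.10 (a 10% loss). pandas skips NaN values by default in mean and std, so the leading NaN does not affect the metrics computed later.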

### ANALYSIS-TIME

At this point, we have daily historical stock returns for each company of interest. Our data is now at a point where it can be analyzed, so let's jump right into it.

We are going to create a function that filters our returns to just the last X days and computes the mean and the standard deviation of each company's returns over that window. From there, we divide each company's mean by its standard deviation to get the average return per unit of risk, which is the metric the portfolio managers wanted to see.

In [41]:
def return_risk_ratio(df, days=30):
    means = pd.DataFrame(df.tail(days).mean())
    std = pd.DataFrame(df.tail(days).std())
    ratios = pd.concat([means, std], axis=1).reset_index()
    ratios.columns = ['Company', 'Mean', 'Std']
    ratios['Ratio'] = ratios['Mean']/ratios['Std']
    return ratios
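
The arithmetic behind the ratio can be checked by hand on a tiny example. This is only a sketch: the tickers and return values are made up, and it computes the ratio directly on pandas Series instead of calling return_risk_ratio.

```python
import pandas as pd

# Hypothetical daily returns for two made-up tickers.
toy_returns = pd.DataFrame({
    'AAA': [0.01, 0.03, 0.02],   # steady gains
    'BBB': [0.05, -0.03, 0.04],  # same mean, much more volatile
})

# Keep the last 3 rows, then divide mean return by standard deviation.
window = toy_returns.tail(3)
toy_ratio = window.mean() / window.std()
print(toy_ratio)
```

Both tickers have a mean return of 0.02, but `AAA` has a sample standard deviation of 0.01 (pandas uses `ddof=1` by default), giving a ratio of about 2.0, while the more volatile `BBB` scores well below that.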


Once we have this, we can sort in descending order and filter for just the top 10 companies that had the highest ratios.

In [None]:
ratios = return_risk_ratio(returns)
top10 = ratios.sort_values('Ratio', ascending=False).head(10)


Once we have this top 10 list, we can use the .corr method to compare the correlations of their returns over varying periods of time. This lets the portfolio managers limit their risk by not investing in stocks whose returns are too highly correlated with each other.

In [None]:
def corr_matrix(df, days=30):
    corr_matrix = df.tail(days).corr()
    return corr_matrix

target_list = returns[list(top10['Company'])]
correlation = corr_matrix(target_list)
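
To show how the correlation matrix reads, here is a minimal sketch with made-up returns for three hypothetical tickers, constructed so the results are easy to predict.

```python
import pandas as pd

# Hypothetical returns chosen to give extreme correlations.
toy_returns = pd.DataFrame({
    'AAA': [0.01, 0.02, 0.03, 0.04],
    'BBB': [0.02, 0.04, 0.06, 0.08],  # exactly twice AAA
    'CCC': [0.04, 0.03, 0.02, 0.01],  # AAA in reverse order
})

# Pairwise Pearson correlation between the columns.
toy_corr = toy_returns.corr()
print(toy_corr.round(2))
```

`AAA` and `BBB` move in lockstep, so their correlation is 1; `AAA` and `CCC` move in opposite directions, so theirs is -1. Holding both `AAA` and `BBB` would add little diversification, which is the situation the portfolio managers want to spot.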


### REPORTING

Now that we have completed the necessary steps to analyze the data, the next step is to produce and distribute the reports. First, we are going to put together a horizontal bar chart that shows the top 10 companies and their return vs. risk ratios.

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns

def barchart(df, x, y, length=8, width=14, title=""):
    df = df.sort_values(x, ascending=False)
    plt.figure(figsize=(width,length))
    chart = sns.barplot(data=df, x=x, y=y)
    plt.title(title + "\n", fontsize=16)
    return chart

bar_plot = barchart(top10, 'Ratio', 'Company', title='Stock Return vs. Risk Ratios')


Next, we will produce a correlation matrix heatmap that visually shows the correlations between each company's returns.

In [None]:
import numpy as np

def correlation_plot(corr, title=""):
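    # Boolean mask that hides the upper triangle (and diagonal) of the matrix;
    # the np.bool alias was removed in NumPy 1.24, so use the built-in bool
    mask = np.zeros_like(corr, dtype=bool)
    mask[np.triu_indices_from(mask)] = True

    plt.subplots(figsize=(15, 10))
    # 6 and 255 are the anchor hues for the negative and positive ends of the palette
    cmap = sns.diverging_palette(6, 255, as_cmap=True)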
    chart = sns.heatmap(corr, mask=mask, cmap=cmap, center=0, linewidths=.5, annot=True, fmt='.2f')
    plt.title(title, fontsize=16)
    return chart

corr_plot = correlation_plot(correlation, title='Stock Return Correlation')
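
The mask built inside correlation_plot can be inspected on its own. The sketch below uses a made-up 3x3 correlation matrix and only NumPy; the values are hypothetical.

```python
import numpy as np

# Hypothetical symmetric correlation matrix for three stocks.
corr = np.array([
    [1.0, 0.5, 0.2],
    [0.5, 1.0, -0.3],
    [0.2, -0.3, 1.0],
])

# Start with all False, then mark the upper triangle (including the diagonal).
mask = np.zeros_like(corr, dtype=bool)
mask[np.triu_indices_from(mask)] = True
print(mask)
```

Cells where the mask is True are not drawn by sns.heatmap, so only the lower triangle appears. Because a correlation matrix is symmetric and its diagonal is always 1, nothing is lost by hiding the rest.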


We will save each chart we create as a .PNG image file so that the portfolio managers can view them and make their investment decisions based on the information presented.

In [None]:
def save_viz(chart, title):
    fig = chart.get_figure()
    fig.savefig(title + '.png')


Now that we have most of the code base, it only takes a little bit of modification and reorganization to structure this so that it is repeatable for 90, 180, and 360 day time periods. In a Python (.py) file, we will perform all our imports first, then include each of the functions we wrote above, create new functions for each pipeline stage that wrap the rest of our code, and finally include an "if main" statement at the bottom that executes everything for each time period requested.

The complete code should look something like the following and should produce a total of six charts - a bar chart and a correlation heatmap for 90, 180, and 360 days.

In [None]:
import quandl
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

def get_tickers(path):
    companies = pd.read_csv(path)
    ticker_list = list(companies['Ticker'])
    print('Retrieved', str(len(ticker_list)), 'ticker symbols.')
    return ticker_list

def get_prices(ticker):
    print('Retrieving data for', ticker)
    quandl.ApiConfig.api_key = 'CDmDYdJsPfiXTBfxL_zs' 
    prices = quandl.get('WIKI/' + ticker)['Adj. Close'].reset_index()
    prices['Ticker'] = ticker
    return prices

def concat_pivot(data, rows, columns, values):
    df = pd.concat(data, sort=True)
    pivot = df.pivot_table(values=values, columns=columns, index=rows)
    return pivot

def compute_returns(df):
    returns = df.pct_change()
    return returns

def return_risk_ratio(df, days=30):
    means = pd.DataFrame(df.tail(days).mean())
    std = pd.DataFrame(df.tail(days).std())
    ratios = pd.concat([means, std], axis=1).reset_index()
    ratios.columns = ['Company', 'Mean', 'Std']
    ratios['Ratio'] = ratios['Mean']/ratios['Std']
    return ratios

def corr_matrix(df, days=30):
    corr_matrix = df.tail(days).corr()
    return corr_matrix

def barchart(df, x, y, length=8, width=14, title=""):
    df = df.sort_values(x, ascending=False)
    plt.figure(figsize=(width,length))
    chart = sns.barplot(data=df, x=x, y=y)
    plt.title(title + "\n", fontsize=16)
    return chart

def correlation_plot(corr, title=""):
    mask = np.zeros_like(corr, dtype=bool)
    mask[np.triu_indices_from(mask)] = True

    plt.subplots(figsize=(15, 10))
    cmap = sns.diverging_palette(6, 255, as_cmap=True)
    
    chart = sns.heatmap(corr, mask=mask, cmap=cmap, center=0, linewidths=.5, annot=True, fmt='.2f')
    plt.title(title, fontsize=16)
    return chart

def save_viz(chart, title):
    fig = chart.get_figure()
    fig.savefig(title + '.png')

def acquire():
    data = []

    ticker_list = get_tickers('/Users/ajiacovic/Downloads/companies.csv')

    for ticker in ticker_list:
        prices = get_prices(ticker)
        data.append(prices)
    return data

def wrangle(data):
    pivot = concat_pivot(data, 'Date', 'Ticker', 'Adj. Close')
    returns = compute_returns(pivot)
    return returns

def analyze(returns, days=30):
    ratios = return_risk_ratio(returns, days=days)
    top10 = ratios.sort_values('Ratio', ascending=False).head(10)
    
    target_list = returns[list(top10['Company'])]
    correlation = corr_matrix(target_list, days=days)
    return top10, correlation

def report(top10, correlation, day):
    # day is passed in explicitly instead of being read from a global variable
    bar_plot = barchart(top10, 'Ratio', 'Company', title='Stock Return vs. Risk Ratios - ' + str(day) + ' Days')
    save_viz(bar_plot, 'Return vs. Risk Top 10 - ' + str(day) + ' Days')
    
    corr_plot = correlation_plot(correlation, title='Stock Return Correlation - ' + str(day) + ' Days')
    save_viz(corr_plot, 'Correlation Plot - ' + str(day) + ' Days')

if __name__ == "__main__":
    data = acquire()
    returns = wrangle(data)

    num_days = [90,180,360]

    for day in num_days:
        top10, correlation = analyze(returns, days=day)
        report(top10, correlation, day)


We can save the Python file as stock_analysis.py and run it from the command line as follows.

$ python stock_analysis.py

