Ryan_Harty_rmh3337  
Hannah_Ho_hh24994  
Vinay_Khanijow_vpk97  
Derrick_Hung_dsh989

Instructions:

-Open the file in Jupyter Notebook or JupyterLab.

-The program asks for a benchmark. The user can select the S&P 500, Russell 3000 or NASDAQ, or enter their own index symbol.

-The program asks for the risk free rate to use, entered as an annual decimal (for example, 0.02 for 2%).

-The program asks for the number of stocks to analyze, the ticker symbol of each stock, and the dates of interest (start/end).

-The program downloads the price data from Yahoo.

-The program converts the price data to returns, runs the regression, interprets the output and writes the data to a multi-tabbed Excel file with one tab per stock's regression.
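
The return conversion and regression steps described above can be sketched without any third-party libraries. This is only an illustrative sketch, not the notebook's implementation: the helper names `pct_change` and `capm_fit` are hypothetical, the prices are made up, and the closed-form simple OLS formulas stand in for the statsmodels fit used in the program.

```python
from statistics import mean


def pct_change(prices):
    """Simple period-over-period returns from a list of prices."""
    return [curr / prev - 1 for prev, curr in zip(prices, prices[1:])]


def capm_fit(stock_prices, market_prices, rf_per_period):
    """Regress stock excess returns on market excess returns.

    Returns (alpha, beta) using the closed-form simple OLS solution:
    beta = cov(x, y) / var(x), alpha = mean(y) - beta * mean(x).
    """
    y = [r - rf_per_period for r in pct_change(stock_prices)]
    x = [r - rf_per_period for r in pct_change(market_prices)]
    x_bar, y_bar = mean(x), mean(y)
    cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    var_x = sum((xi - x_bar) ** 2 for xi in x)
    beta = cov_xy / var_x
    alpha = y_bar - beta * x_bar
    return alpha, beta


# Made-up prices for illustration only (not real market data).
market = [100.0, 101.0, 99.5, 102.0, 103.0, 101.5]

# Build a hypothetical stock whose returns are 1.5 times the market's.
stock = [50.0]
for r in pct_change(market):
    stock.append(stock[-1] * (1 + 1.5 * r))

alpha, beta = capm_fit(stock, market, rf_per_period=0.0)
print(f"alpha={alpha:.6f}, beta={beta:.6f}")
```

Because the hypothetical stock is constructed to move 1.5 times as much as the market, the fitted beta should come out close to 1.5 and alpha close to 0 when the risk free rate is zero.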

In [1]:
import pandas as pd
import pandas_datareader.data as web
import numpy as np
import random
import statsmodels.api as sm
from statsmodels import regression
%pylab inline

def linreg(x,y):
    
    #runs the regression
    x1 = sm.add_constant(x)
    model = regression.linear_model.OLS(y,x1).fit()
    
    #returns alpha (the intercept), beta (the slope) and the model summary
    return model.params.iloc[0], model.params.iloc[1], model.summary2()

# menu of benchmark options
def get_benchmark():
    
    benchmark = input('Which benchmark ticker would you like to use? \n1 for S&P500, 2 for Russell 3000, 3 for NASDAQ, or enter your own\n')
    
    if benchmark == '1':
        benchmark = '^GSPC'
        print('You chose S&P500')
        return benchmark
    elif benchmark == '2':
        benchmark = '^RUA'
        print('You chose Russell 3000')
        return benchmark
    elif benchmark == '3':
        benchmark = '^IXIC'
        print('You chose NASDAQ')
        return benchmark
    else:
        return benchmark

def betas():
    
    #Gather user information
    benchmark = get_benchmark()
    
    rf = float(input('What would you like to use for risk free rate? (Enter as a decimal. Ex: 2% would be 0.02)\n')) 
    rf_daily = np.power(1+rf,(1/365)) - 1
    print('Daily risk free rate:', rf_daily)
    #if you want to use a treasury bond, uncomment this code
#     rf = input('What would you like to use for risk free rate?\n IRX: 13 Week \n FVX: 5 Year \nTNX: 10 Year\n') 
#     rf = '^'+rf

    num_stocks = int(input('How many companies do you want to look at?\n'))
    
    stocks = [benchmark]
    
    #Gets the ticker symbols for the stocks
    while num_stocks != 0:
        s = input('please enter ticker symbol for one stock:\n')
        stocks.append(s)
        num_stocks -= 1
        
    start_year = input('What year do you want to start on?\nFormat: MM/DD/YYYY\n ')
    start_year_name = start_year.replace('/','-')
    
    end_year = input('What year do you want to end on?\nFormat: MM/DD/YYYY\n  ')
    end_year_name = end_year.replace('/','-')
    
    #scrape the info using yahoo finance 
    data = (web.DataReader(stocks,data_source='yahoo',start=start_year, 
                       end=end_year)['Adj Close'])
    
    #aggregate prices by month if you do not want to use daily returns
#     data = data.groupby(pd.Grouper(freq='MS'))[stocks].mean()
        
    #create and name an excel notebook
    writer = pd.ExcelWriter('capm {} to {}.xlsx'.format(start_year_name, end_year_name), engine='xlsxwriter')
    workbook  = writer.book
    
    for stock in stocks[1:]:
        
        #calculate the returns for the market and the stock
        #and subtract the daily risk free rate from each to get risk premiums
        returns = data[[benchmark, stock]].pct_change() - rf_daily
        returns = returns.dropna()
        print(returns)
        
        #create X and y for the regression
        y = returns[stock]
        X = returns[benchmark]
        
        #returns alpha, beta and the summary of regression
        a, b, model = linreg(X,y)
        
        #create a dataframe for the first part of the summary
        df = model.tables[0]
        
        #create a dataframe for the second part of the summary
        df1 = model.tables[1]
        
        #stack the two dataframes so there is only one dataframe to add to the excel spreadsheet
        table = pd.concat([df, df1])
        
        #write the dataframe to excel
        table.to_excel(writer, sheet_name='{}'.format(stock)) 
        
        #create plot of security characteristic line
        plt.figure(figsize = (20,10))
        plt.scatter(X,y, alpha=0.3)
        p = np.linspace(X.min(),X.max(), 100)
        #b = beta and a = alpha
        y1 = b*p + a
        plt.plot(p, y1, 'r', alpha=.9) 
        plt.grid(True, which ='both')
        plt.axhline(y=0, color = 'k')
        plt.axvline(x=0, color = 'k')
        plt.title('{} Security Characteristic Line'.format(stock))
        plt.xlabel('Market Return')
        plt.ylabel('{} return'.format(stock))
        plt.savefig('{} {} to {}.png'.format(stock, start_year_name, end_year_name))
        
        #save the plot to the spreadsheet
        worksheet = writer.sheets['{}'.format(stock)]
        worksheet.insert_image('A15', '{} {} to {}.png'.format(stock, start_year_name, end_year_name))
        
        #clears the plot so that each plot is unique for the designated stock
        plt.clf()
        
    #save and close the excel writer (close() also writes the file)
    writer.close()
    
    print('\nSee excel file and images.\n')

#run the program
betas()



Populating the interactive namespace from numpy and matplotlib


Which benchmark ticker would you like to use? 
1 for S&P500, 2 for Russell 3000, 3 for NASDAQ, or enter your own
 1


You chose S&P500


What would you like to use for risk free rate? (Enter as a decimal. Ex: 2% would be 0.02)
 .177


Daily risk free rate: 0.0004465896319580942


How many companies do you want to look at?
 1
please enter ticker symbol for one stock:
 spdw
What year do you want to start on?
Format: MM/DD/YYYY
  04/26/2007
What year do you want to end on?
Format: MM/DD/YYYY
   12/19/2019


Symbols        ^GSPC      spdw
Date                          
2007-04-27 -0.000567 -0.000141
2007-04-30 -0.008278 -0.002894
2007-05-01  0.002205 -0.003206
2007-05-02  0.006026  0.005088
2007-05-03  0.003878 -0.000141
...              ...       ...
2019-12-13 -0.000374  0.004360
2019-12-16  0.006701  0.009120
2019-12-17 -0.000111 -0.002973
2019-12-18 -0.000879 -0.002030
2019-12-19  0.002496 -0.001071

[3186 rows x 2 columns]





See excel file and images.



<Figure size 1440x720 with 0 Axes>
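
The daily risk free rate printed in the session above can be checked by hand. The sketch below assumes the same geometric conversion the program uses (365 periods per year); the function name `per_period_rate` is hypothetical, and the 252-trading-day variant is shown only as a common alternative convention, not something the notebook does.

```python
def per_period_rate(annual_rate, periods_per_year):
    """Per-period rate that compounds to annual_rate over one year."""
    return (1 + annual_rate) ** (1 / periods_per_year) - 1


annual = 0.177  # the value typed at the prompt in the session

# Calendar-day convention, matching the 1/365 exponent in the program.
calendar_daily = per_period_rate(annual, 365)

# Assumption: 252 trading days per year, a common alternative when the
# returns being regressed are trading-day returns.
trading_daily = per_period_rate(annual, 252)

print("365-day convention:", calendar_daily)
print("252-day convention:", trading_daily)
```

Compounding the 365-day figure over 365 periods recovers the annual rate. The 252-day figure is slightly larger because the same annual rate is spread over fewer periods.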

In [None]:
# Plot of returns