# Assignment 5: Calculating the beta (sensitivity to the market) of a stock.

We use 10 years of daily price data for Eicher Motors Ltd. (NSE: EICHERMOT) for the calculation, and the NSE's NIFTY index as the market, so that the Beta reflects EICHERMOT's sensitivity to NIFTY. Data for both the stock and the index were obtained from *Yahoo Finance*.  
#### Questions for this assignment:
1. Create a function which calculates the Beta of a stock given a dataframe object as an input parameter.  
   Your function should NOT use NumPy's .var() or .cov() methods. Instead, it should estimate the Beta manually (i.e. applying the formula for the Beta from scratch.)
2. Calculate the Beta of your stock using the covariance and variance functions / methods built in to NumPy.
3. Estimate the Beta of your stock using an appropriate module from SciPy. You may also use other packages, for instance, StatsModels.
4. Comment on why your Beta estimates may be different, even though you're using exactly the same dataset for all 3 preceding questions. Please think about why, even if your own Beta estimates were identical for all 3 cases.

In [1]:
import numpy as np
import pandas as pd

In [2]:
# Loading 10 years of 'EICHERMOT' and 'NIFTY' price data
df = pd.read_csv("EICHERMOT.NS.csv")
nse = pd.read_csv("^NSEI.csv")
df = df[['Date','Adj Close']]
df.rename(columns = {'Date' : 'date', 'Adj Close' : 'price_eicher'}, inplace = True)
# Positional assignment: assumes both files list the same dates in the same order
df['price_nifty'] = nse['Adj Close']
df.set_index(['date'], inplace = True)
df

date,price_eicher,price_nifty
2012-08-06,178.564651,5282.549805
2012-08-07,182.342163,5336.700195
2012-08-08,183.598373,5338.000000
2012-08-09,183.413879,5322.950195
2012-08-10,181.718414,5320.399902
...,...,...
2022-07-27,3057.399902,16641.800781
2022-07-28,3054.000000,16929.599609
2022-07-29,3093.449951,17158.250000
2022-08-01,3088.399902,17340.050781
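
Cell 2 pairs the two price series by row position, which only works if both CSV files contain exactly the same trading dates in the same order. A minimal sketch of a date-keyed join that does not rely on this, using small made-up frames in place of the Yahoo Finance files (the values and the names `eicher_raw`, `nifty_raw` and `aligned` are illustrative assumptions, not part of the original notebook):

```python
import pandas as pd

# Made-up stand-ins for the two CSV files; the index file is missing one date.
eicher_raw = pd.DataFrame({
    "Date": ["2012-08-06", "2012-08-07", "2012-08-08"],
    "Adj Close": [178.56, 182.34, 183.60],
})
nifty_raw = pd.DataFrame({
    "Date": ["2012-08-06", "2012-08-08"],
    "Adj Close": [5282.55, 5338.00],
})

# Inner join on the date so each row pairs prices from the same trading day.
aligned = (
    eicher_raw.rename(columns={"Date": "date", "Adj Close": "price_eicher"})
    .merge(
        nifty_raw.rename(columns={"Date": "date", "Adj Close": "price_nifty"}),
        on="date",
        how="inner",
    )
    .set_index("date")
)
print(aligned)
```

Only dates present in both frames survive the inner join, so a holiday or gap in one file cannot shift the other series by a row.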


In [3]:
# Finding daily price returns
returns_df = df.pct_change(1)
returns_df.dropna(inplace = True)
returns_df.rename(columns = {'price_eicher':'returns_eicher', 'price_nifty':'returns_nifty'}, inplace = True)
returns_df

date,returns_eicher,returns_nifty
2012-08-07,0.021155,0.010251
2012-08-08,0.006889,0.000244
2012-08-09,-0.001005,-0.002819
2012-08-10,-0.009244,-0.000479
2012-08-13,0.005221,0.005169
...,...,...
2022-07-27,0.011279,0.009582
2022-07-28,-0.001112,0.017294
2022-07-29,0.012917,0.013506
2022-08-01,-0.001632,0.010596


## Part 1:
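
For reference, this is the formula applied step by step in this part, written with sample estimators. The symbols $r_{s,t}$ (daily return of the stock), $r_{m,t}$ (daily return of the market) and $n$ (number of return observations) are notation introduced here for illustration:

$$
\beta = \frac{\operatorname{Cov}(r_s, r_m)}{\operatorname{Var}(r_m)}
      = \frac{\dfrac{1}{n-1}\sum_{t=1}^{n} (r_{s,t}-\bar{r}_s)(r_{m,t}-\bar{r}_m)}
             {\dfrac{1}{n-1}\sum_{t=1}^{n} (r_{m,t}-\bar{r}_m)^2}
$$

The $1/(n-1)$ factors cancel, so the Beta does not depend on the choice of denominator as long as the same one is used for both the covariance and the variance.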

In [4]:
# Calculating deviations from the mean
deviations = returns_df - returns_df.mean()
deviations.rename(columns = {'returns_eicher':'dev_eicher', 'returns_nifty':'dev_nifty'}, inplace = True)
deviations

date,dev_eicher,dev_nifty
2012-08-07,0.019772,0.009710
2012-08-08,0.005507,-0.000298
2012-08-09,-0.002388,-0.003361
2012-08-10,-0.010627,-0.001020
2012-08-13,0.003838,0.004628
...,...,...
2022-07-27,0.009896,0.009041
2022-07-28,-0.002495,0.016753
2022-07-29,0.011535,0.012965
2022-08-01,-0.003015,0.010054


In [5]:
# Calculating product of deviations
product = deviations['dev_eicher'] * deviations['dev_nifty']
deviations['product_dev'] = product
deviations

date,dev_eicher,dev_nifty,product_dev
2012-08-07,0.019772,0.009710,0.000192
2012-08-08,0.005507,-0.000298,-0.000002
2012-08-09,-0.002388,-0.003361,0.000008
2012-08-10,-0.010627,-0.001020,0.000011
2012-08-13,0.003838,0.004628,0.000018
...,...,...,...
2022-07-27,0.009896,0.009041,0.000089
2022-07-28,-0.002495,0.016753,-0.000042
2022-07-29,0.011535,0.012965,0.000150
2022-08-01,-0.003015,0.010054,-0.000030


In [6]:
# Calculating covariance, variance and beta
cov_eicher = deviations['product_dev'].sum() / (len(deviations['dev_eicher']) - 1)
squared_dev_nifty = deviations['dev_nifty'] ** 2
var_nifty = squared_dev_nifty.sum() / (len(deviations['dev_nifty']) - 1)
beta_eicher = cov_eicher / var_nifty

print('Covariance of EICHERMOT & NIFTY =', cov_eicher)
print('Variance of NIFTY =', var_nifty)
print('Beta of EICHERMOT =', beta_eicher)

Covariance of EICHERMOT & NIFTY = 0.00010767365606828264
Variance of NIFTY = 0.00011672859190361362
Beta of EICHERMOT = 0.9224274388334273
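
Question 1 asks for a function, so the steps in cells 4 to 6 can be wrapped into one. A minimal sketch, assuming the input DataFrame already holds daily returns; the function name `manual_beta`, its parameters and the toy data are illustrative choices, not part of the original notebook:

```python
import pandas as pd

def manual_beta(returns, stock_col, market_col):
    """Estimate beta as sample covariance / sample variance, without .var() or .cov()."""
    stock = returns[stock_col]
    market = returns[market_col]
    n = len(returns)

    # Deviations of each return from its column mean
    dev_stock = stock - stock.sum() / n
    dev_market = market - market.sum() / n

    # Sample covariance and variance, both with the n - 1 denominator
    covariance = (dev_stock * dev_market).sum() / (n - 1)
    variance = (dev_market ** 2).sum() / (n - 1)
    return covariance / variance

# Toy returns in which the stock moves exactly twice as much as the market
toy = pd.DataFrame({
    "returns_stock": [0.02, -0.01, 0.04, 0.00],
    "returns_market": [0.01, -0.005, 0.02, 0.00],
})
print(manual_beta(toy, "returns_stock", "returns_market"))
```

On the toy data the result is 2 up to floating-point rounding. With this notebook's data the call would be `manual_beta(returns_df, 'returns_eicher', 'returns_nifty')`.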


## Part 2:

In [7]:
# Generating a Covariance matrix using NumPy
cov_matrix = np.cov(returns_df['returns_eicher'], returns_df['returns_nifty'])
cov_matrix

array([[0.00044542, 0.00010767],
       [0.00010767, 0.00011673]])

In [8]:
# Finding covariances and variances from the matrix

# Covariance of EICHERMOT and NIFTY
cov_eicher = cov_matrix[0][1]
# Same value from the other off-diagonal entry (the matrix is symmetric)
cov_eicher2 = cov_matrix[1][0]

# Variance of NIFTY
var_nifty = cov_matrix[1][1]

# Variance of EICHERMOT (not needed for the Beta; shown for completeness)
var_eicher = cov_matrix[0][0]

# Beta of EICHERMOT
beta_eicher = cov_eicher / var_nifty

print('Covariance of EICHERMOT & NIFTY is', cov_eicher)
print('The other off-diagonal entry gives the same value:', cov_eicher2)
print('Variance of NIFTY is', var_nifty)
print('Beta of EICHERMOT is', beta_eicher)

Covariance of EICHERMOT & NIFTY is 0.00010767365606828264
The other off-diagonal entry gives the same value: 0.00010767365606828264
Variance of NIFTY is 0.00011672859190361364
Beta of EICHERMOT is 0.922427438833427
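
Question 2 also mentions NumPy's variance function. One detail worth keeping in mind when using it: `np.cov` divides by n - 1 by default, while `np.var` divides by n unless `ddof=1` is passed, so mixing the two defaults inflates the Beta by a factor of n / (n - 1). A small sketch on made-up returns (the arrays and variable names are illustrative assumptions):

```python
import numpy as np

# Made-up daily returns, used only to illustrate the ddof setting
stock = np.array([0.021, 0.007, -0.001, -0.009, 0.005])
market = np.array([0.010, 0.000, -0.003, -0.001, 0.005])
n = len(market)

cov_sm = np.cov(stock, market)[0, 1]   # sample covariance (n - 1 denominator)
var_sample = np.var(market, ddof=1)    # sample variance (n - 1 denominator)
var_population = np.var(market)        # default ddof=0 (n denominator)

beta_consistent = cov_sm / var_sample
beta_mixed = cov_sm / var_population   # larger by a factor of n / (n - 1)

print(beta_consistent, beta_mixed)
```

For a long daily sample the factor n / (n - 1) is close to 1, so the discrepancy is small, but it is a genuine difference in the estimate rather than rounding noise.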


## Part 3:

In [12]:
# Calculating the slope (i.e. Beta) using linregress
from scipy.stats import linregress

In [10]:
# 'returns_eicher' is the dependent variable, i.e. 'y'
# 'returns_nifty' is the independent variable, i.e. 'x'
statistical_test = linregress(y = returns_df['returns_eicher'], x = returns_df['returns_nifty'])
statistical_test

LinregressResult(slope=0.9224274388334269, intercept=0.0008834619163263924, rvalue=0.4722124465242419, pvalue=3.9400623778685043e-137, stderr=0.03469583884350522, intercept_stderr=0.0003752514059685537)

In [11]:
print('Beta of EICHERMOT is', statistical_test.slope)

Beta of EICHERMOT is 0.9224274388334269
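
The regression slope lands on the same number as the covariance-over-variance ratio because, for a simple regression with an intercept, the two are algebraically identical. Writing $y_t$ for the stock return and $x_t$ for the market return (notation introduced here for illustration), the least-squares estimates are:

$$
\hat{\beta} = \frac{\sum_{t} (x_t - \bar{x})(y_t - \bar{y})}{\sum_{t} (x_t - \bar{x})^2}
            = \frac{\operatorname{Cov}(x, y)}{\operatorname{Var}(x)},
\qquad
\hat{\alpha} = \bar{y} - \hat{\beta}\,\bar{x}
$$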


## Part 4:

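The Beta estimates from the manual formula, NumPy and `scipy.stats.linregress` agree to about 15 significant digits (0.9224274388334273, 0.922427438833427 and 0.9224274388334269). This is expected, because all three compute the same quantity: the covariance of the stock and market returns divided by the variance of the market returns. The minuscule differences in the last digits arise because each method carries out its floating-point operations in a different order, so rounding errors accumulate slightly differently. Larger differences could appear if the methods followed different conventions, for example a denominator of n for the variance but n - 1 for the covariance (`np.var` defaults to n, while `np.cov` defaults to n - 1), or if missing values were handled differently.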
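
The effect of operation order on the last digits can be seen with ordinary floats. A minimal sketch, independent of the notebook's data apart from the three Beta values copied from the outputs above, showing that mathematically equal expressions can differ in the final bits and that a tolerance-based comparison is the appropriate check:

```python
import math

# Addition is not associative in floating point: grouping changes the last bits
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right)  # False
print(left, right)

# The three Beta estimates reported in Parts 1, 2 and 3
betas = [0.9224274388334273, 0.922427438833427, 0.9224274388334269]

# Not bit-for-bit equal, but equal within a tight relative tolerance
all_close = all(math.isclose(b, betas[0], rel_tol=1e-12) for b in betas)
print(all_close)  # True
```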