# #8 Calculating the Beta as a Regression

The code below follows the article [CAPM Analysis: Calculating stock Beta as a Regression with Python](https://medium.com/python-data/capm-analysis-calculating-stock-beta-as-a-regression-in-python-c82d189db536); the original script is also available on [GitHub](https://github.com/PyDataBlog/Python-for-Data-Science/blob/master/Tutorials/Beta%20Tutorial/Beta.py).
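In regression form, beta is the slope obtained by regressing the stock's return on the market's return. As a sketch, with $r_{i,t}$ the MSFT return and $r_{m,t}$ the SPY return in month $t$ (these symbols are chosen here for illustration), the fitted model is:

$$
r_{i,t} = \alpha + \beta \, r_{m,t} + \varepsilon_t
$$

Here $\beta$ measures how strongly the stock moves with the market and $\alpha$ is the intercept. Note that the code regresses raw returns, whereas the textbook CAPM is stated in returns in excess of the risk-free rate.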

In [1]:
import pandas as pd
import statsmodels.api as sm
import yfinance as yf

msft = yf.download("MSFT", start='2010-01-01', end='2020-01-31', interval='1mo', auto_adjust=False)
sp500 = yf.download("SPY", start='2010-01-01', end='2020-01-31', interval='1mo', auto_adjust=False)

df = pd.concat([msft['Adj Close'], sp500['Adj Close']], axis=1)
df.head()



Date,Adj Close,Adj Close
2010-01-01,21.462662,83.190315
2010-02-01,21.835859,85.785446
2010-03-01,22.412386,90.634773
2010-04-01,23.368872,92.415909
2010-05-01,19.741875,85.073029


In [2]:
df.columns = ['MSFT', 'SPY']
df.head()

Date,MSFT,SPY
2010-01-01,21.462662,83.190315
2010-02-01,21.835859,85.785446
2010-03-01,22.412386,90.634773
2010-04-01,23.368872,92.415909
2010-05-01,19.741875,85.073029


In [3]:
ret = df.pct_change(1)
ret.head()

Date,MSFT,SPY
2010-01-01,,
2010-02-01,0.017388,0.031195
2010-03-01,0.026403,0.056529
2010-04-01,0.042677,0.019652
2010-05-01,-0.155206,-0.079455
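
`pct_change(1)` computes the simple return between consecutive rows, i.e. the current price divided by the previous price, minus one, which is why the first row is `NaN`. As a quick check, the sketch below reproduces the first two MSFT returns by hand from the adjusted closes printed earlier. The helper name `simple_returns` is illustrative and not part of pandas.

```python
# Adjusted closes for MSFT copied from the price table above (Jan-Mar 2010).
msft_prices = [21.462662, 21.835859, 22.412386]


def simple_returns(prices):
    """Return p_t / p_{t-1} - 1 for each consecutive pair of prices."""
    return [curr / prev - 1 for prev, curr in zip(prices, prices[1:])]


msft_returns = simple_returns(msft_prices)
for r in msft_returns:
    print(round(r, 6))
```

The printed values should agree with the `MSFT` column for 2010-02-01 and 2010-03-01, up to rounding of the displayed prices.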


In [4]:
ret = ret.dropna(axis=0)
ret.head()

Date,MSFT,SPY
2010-02-01,0.017388,0.031195
2010-03-01,0.026403,0.056529
2010-04-01,0.042677,0.019652
2010-05-01,-0.155206,-0.079455
2010-06-01,-0.104115,-0.056231


In [6]:
X = ret['SPY']
y = ret['MSFT']

X1 = sm.add_constant(X)

model = sm.OLS(y, X1)

result = model.fit()
result.summary()

Dep. Variable:,MSFT,R-squared:,0.406
Model:,OLS,Adj. R-squared:,0.401
Method:,Least Squares,F-statistic:,80.7
Date:,"Thu, 22 Jun 2023",Prob (F-statistic):,5.05e-15
Time:,13:48:27,Log-Likelihood:,197.52
No. Observations:,120,AIC:,-391.0
Df Residuals:,118,BIC:,-385.5
Df Model:,1,,
Covariance Type:,nonrobust,,

,coef,std err,t,P>|t|,[0.025,0.975]
const,0.0067,0.005,1.481,0.141,-0.002,0.016
SPY,1.0641,0.118,8.983,0.000,0.830,1.299

Omnibus:,6.589,Durbin-Watson:,2.276
Prob(Omnibus):,0.037,Jarque-Bera (JB):,11.211
Skew:,0.063,Prob(JB):,0.00368
Kurtosis:,4.492,Cond. No.,27.6
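
The `SPY` coefficient in the summary (1.0641) is the beta estimate. For a regression with one regressor and an intercept, the OLS slope equals the sample covariance of the two return series divided by the sample variance of the market return. The sketch below illustrates that identity in plain Python on the five monthly returns shown in `ret.head()`. Because it uses only five observations, its result is illustrative and will not match the full-sample estimate. The function name `beta_from_returns` is hypothetical.

```python
# First five monthly returns copied from ret.head() above.
spy = [0.031195, 0.056529, 0.019652, -0.079455, -0.056231]
msft = [0.017388, 0.026403, 0.042677, -0.155206, -0.104115]


def beta_from_returns(market, stock):
    """Return (alpha, beta) using beta = Cov(stock, market) / Var(market)."""
    n = len(market)
    mean_m = sum(market) / n
    mean_s = sum(stock) / n
    cov = sum((m - mean_m) * (s - mean_s) for m, s in zip(market, stock)) / (n - 1)
    var = sum((m - mean_m) ** 2 for m in market) / (n - 1)
    beta = cov / var
    alpha = mean_s - beta * mean_m  # the fitted line passes through the means
    return alpha, beta


alpha, beta = beta_from_returns(spy, msft)

# OLS residuals sum to zero and are orthogonal to the regressor.
residuals = [s - (alpha + beta * m) for m, s in zip(spy, msft)]
print(f"alpha={alpha:.4f}, beta={beta:.4f}")
```

The two residual conditions in the last comment are the normal equations of least squares, so checking them confirms that the covariance ratio is indeed the OLS slope for this sample.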


Alternatively, use SciPy's `linregress` function:

In [7]:
from scipy import stats
slope, intercept, r_value, p_value, std_err = stats.linregress(X, y)

print(slope)

beta = slope

1.0640900572886889


In [10]:
stats.linregress(X, y)

LinregressResult(slope=1.0640900572886889, intercept=0.006673306994599174, rvalue=0.6372835216790184, pvalue=5.051067511493539e-15, stderr=0.11845415011200548, intercept_stderr=0.0045069790406032095)
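
Both approaches fit the same model, so their outputs should agree: the squared `rvalue` is the R-squared in the statsmodels summary, and `slope / stderr` is the t-statistic reported for `SPY`. The check below uses only the numbers printed above; the variable names are illustrative.

```python
# Values copied from the LinregressResult printed above.
slope = 1.0640900572886889
stderr = 0.11845415011200548
rvalue = 0.6372835216790184

r_squared = rvalue ** 2   # compare with R-squared in the summary: 0.406
t_stat = slope / stderr   # compare with t for SPY in the summary: 8.983
f_stat = t_stat ** 2      # with a single regressor, F = t^2; compare with 80.7

print(round(r_squared, 3), round(t_stat, 3), round(f_stat, 1))
```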