## This section's objectives are:

1. Single index model
2. Capital Asset Pricing Model (CAPM)
3. Arbitrage Pricing Theory (APT)

Let's get historical adjusted closing prices for Tesla (TSLA) and General Electric (GE) from Yahoo Finance using the *yfinance* library. The period we want is Jan 16, 2019 to Dec 31, 2019. For the market, we download the adjusted closing price of the S\&P 500 index over the same period. Then compute the returns of the individual stocks and of the S\&P 500 index; you should end up with daily returns from Jan 17, 2019 to Dec 30, 2019.

In [12]:
#Prepare the data we need to use in the following tasks:

import pandas as pd
import numpy as np
import yfinance as yf

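# auto_adjust=False keeps the separate 'Adj Close' field in the download
df = yf.download("TSLA GE ^GSPC", start="2019-01-16", end="2019-12-31", auto_adjust=False)

# Keep only the adjusted close prices, selecting columns by ticker name.
# yfinance orders the ticker columns alphabetically (GE, TSLA, ^GSPC),
# so renaming by position could swap the TSLA and GE labels.
df = df['Adj Close'][['TSLA', 'GE', '^GSPC']]
df.columns = ['TSLA', 'GE', 'SP500']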

#Calculate daily percentage change as the return
daily_ret = df[['TSLA', 'GE', 'SP500']].pct_change()
daily_ret = daily_ret.iloc[1:,]

daily_ret.index = daily_ret.index.date

print(daily_ret)
#daily_ret.info()

[*********************100%%**********************]  3 of 3 completed

                TSLA        GE     SP500
2019-01-17  0.017817  0.003641  0.007591
2019-01-18 -0.008753 -0.129711  0.013183
2019-01-22 -0.044150 -0.011050 -0.014157
2019-01-23  0.008083 -0.037903  0.002203
2019-01-24  0.005727  0.013631  0.001376
...              ...       ...       ...
2019-12-23  0.011786  0.033605  0.000866
2019-12-24  0.002688  0.014384 -0.000195
2019-12-26  0.003575  0.013380  0.005128
2019-12-27 -0.004453 -0.001300  0.000034
2019-12-30 -0.008944 -0.036433 -0.005781

[240 rows x 3 columns]
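
As a quick check of what `pct_change()` computes, here is a minimal sketch on made-up prices (the values are illustrative, not market data). The simple return on day t is P_t / P_{t-1} - 1, so the first price has no return and its row is dropped; N prices give N - 1 returns.

```python
import pandas as pd

# Hypothetical adjusted closing prices (illustrative values only)
prices = pd.Series(
    [100.0, 102.0, 99.96],
    index=pd.to_datetime(["2019-01-16", "2019-01-17", "2019-01-18"]),
)

# pct_change() computes P_t / P_{t-1} - 1; the first entry is NaN and is dropped
simple_ret = prices.pct_change().iloc[1:]

# The same returns computed by hand with shift()
manual_ret = (prices / prices.shift(1) - 1).iloc[1:]

print(simple_ret)
```

Both approaches give a return of about 2% on the second day and about -2% on the third.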





# I: Single Index Model

Let's start with the single index model. Please follow these steps:

1. First, you should have the daily return data for TSLA, GE, and S&P 500 in the dataframe *daily_ret*. 

2. We want to check the Single Index Model, which is as follows: $$  R_{it} = \alpha_{i} + \beta_{i}R_{Mt} + e_{it}  $$ We have the returns for the stocks ($ i $ is TSLA or GE) and the returns for the market ($ R_{Mt} $ is the S\&P 500 return in our model). Now we need to run the regression for each stock. 

3. To do this, we first import a useful package, *statsmodels.api*. Its *sm.OLS* function runs the OLS regression. Do not forget to add a constant to the independent variables (the intercept), because *sm.OLS* does not include one by default. You can use print(model.summary()) to show your regression results.

4. The values that matter to us are the coefficients for the intercept and the market, the p-values of these coefficients, and the adjusted R-squared. Explain what these values mean. 

In [13]:
#First we need to install the statsmodels package:
import sys
!{sys.executable} -m pip install --upgrade statsmodels



In [14]:
import statsmodels.api as sm

## print(sm.add_constant(daily_ret.SP500))

In [15]:
print(sm.add_constant(daily_ret.SP500))

            const     SP500
2019-01-17    1.0  0.007591
2019-01-18    1.0  0.013183
2019-01-22    1.0 -0.014157
2019-01-23    1.0  0.002203
2019-01-24    1.0  0.001376
...           ...       ...
2019-12-23    1.0  0.000866
2019-12-24    1.0 -0.000195
2019-12-26    1.0  0.005128
2019-12-27    1.0  0.000034
2019-12-30    1.0 -0.005781

[240 rows x 2 columns]


## Let's try a regression for the single index model on TSLA:

In [16]:
import statsmodels.formula.api as smf

In [17]:
res_TSLA = smf.ols(formula="TSLA ~ SP500", data=daily_ret).fit()

In [19]:
# statsmodels.api is a useful tool in Python for regression analysis
import statsmodels.api as sm

# First we look at the linear regression for Tesla.
# Dependent variable: TSLA returns; regressors: a constant (intercept) and the S&P 500 return
Y = daily_ret.TSLA
X = sm.add_constant(daily_ret.SP500)

# Equivalent call with statsmodels.formula.api: ols(formula="TSLA ~ SP500", data=daily_ret).fit()
res_TSLA = sm.OLS(Y, X).fit()
print(res_TSLA.summary())

# Other useful attributes of the fitted model:
# print(res_TSLA.params)
# print(res_TSLA.pvalues)
# print(res_TSLA.resid.mean())

                            OLS Regression Results                            
Dep. Variable:                   TSLA   R-squared:                       0.218
Model:                            OLS   Adj. R-squared:                  0.214
Method:                 Least Squares   F-statistic:                     66.23
Date:                Wed, 27 Mar 2024   Prob (F-statistic):           2.24e-14
Time:                        18:31:21   Log-Likelihood:                 571.73
No. Observations:                 240   AIC:                            -1139.
Df Residuals:                     238   BIC:                            -1132.
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const      -3.794e-05      0.001     -0.026      0.9

In [None]:
**Interpretation**
#The R^2 of 0.218 (adjusted 0.214) means the market explains about 22% of the variation in TSLA's daily returns,
#which is a reasonable fit for a single stock.
#The F-statistic indicates that the regression model is overall significant.
#The Durbin-Watson statistic suggests no autocorrelation in the residuals.
#Of particular importance is the estimated beta coefficient, which stands at 1.5751 and is statistically significant.
#A beta above 1 means the stock is more volatile than the market index,
#so it falls into the category of high-risk stocks.
#The p-value of beta indicates significance at the 1% level, while the intercept (alpha)
#is not statistically different from zero (t = -0.026).

In [20]:
res_GE = sm.OLS(daily_ret.GE, sm.add_constant(daily_ret.SP500)).fit()
print(res_GE.summary())

                            OLS Regression Results                            
Dep. Variable:                     GE   R-squared:                       0.092
Model:                            OLS   Adj. R-squared:                  0.088
Method:                 Least Squares   F-statistic:                     24.07
Date:                Wed, 27 Mar 2024   Prob (F-statistic):           1.73e-06
Time:                        18:32:43   Log-Likelihood:                 506.30
No. Observations:                 240   AIC:                            -1009.
Df Residuals:                     238   BIC:                            -1002.
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const          0.0001      0.002      0.059      0.9

In [None]:
**Interpretation**
#Although the model has been fitted, the R^2 of 0.092 (adjusted 0.088) is low:
#the market explains only about 9% of the variation in GE's daily returns.
#The F-statistic indicates that the regression model is overall significant.
#The Durbin-Watson statistic suggests no autocorrelation in the residuals.
#Of particular importance is the estimated beta coefficient, which stands at 1.2470 and is statistically significant.
#A beta above 1 suggests that the stock falls into the category of high-risk stocks.
#However, in comparison to Tesla, it is less sensitive to the market because its beta estimate is lower.
#The p-value of beta indicates significance at the 1% level, while the intercept (alpha)
#is not statistically different from zero (t = 0.059).
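
The link between beta and R^2 in the single index model can be made concrete with a short sketch. It uses simulated returns (an assumption, not the TSLA or GE data) and only NumPy. It shows that the OLS slope equals Cov(R_i, R_M) / Var(R_M), and that total variance splits into a systematic part, beta^2 * Var(R_M), and an idiosyncratic part, Var(e). R^2 is the systematic share, which is why a stock can have a high beta and still a low R^2.

```python
import numpy as np

rng = np.random.default_rng(1)
r_m = rng.normal(0.0005, 0.01, size=240)               # simulated market returns
r_i = 0.0002 + 1.3 * r_m + rng.normal(0.0, 0.02, 240)  # simulated stock returns

# OLS slope of the single index model equals Cov(R_i, R_M) / Var(R_M)
beta = np.cov(r_i, r_m, ddof=1)[0, 1] / np.var(r_m, ddof=1)
alpha = r_i.mean() - beta * r_m.mean()
resid = r_i - alpha - beta * r_m

# Variance decomposition: total = systematic + idiosyncratic
total_var = np.var(r_i, ddof=1)
systematic_var = beta**2 * np.var(r_m, ddof=1)
idio_var = np.var(resid, ddof=1)

# R-squared is the systematic share of total variance
r_squared = systematic_var / total_var
print(f"beta = {beta:.3f}, R-squared = {r_squared:.3f}")
```

With one regressor, this R^2 also equals the squared correlation between the stock and the market.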

# II: CAPM

We need more data. Let's get it first and then continue with the rest. Please follow these steps:

1. We still need to use the daily return data of TSLA and GE. They are still in the dataframe *daily_ret*.

2. Now it's time to get the risk-free returns. We are going to get them from Kenneth French's web page. There are many ways to download data from a website in Python. Luckily, a very convenient package, *pandas_datareader*, can do this for us. Its *get_data_famafrench* function fetches Fama-French factor data from the Kenneth French data library and returns it as a dictionary of pandas dataframes.

3. Use *get_data_famafrench('F-F_Research_Data_5_Factors_2x3_daily', startdate, enddate)[0]* to get the Fama/French 5 Factors (2x3, daily), which include the three factors we need. Keep the daily values that match our daily stock returns. Keep the columns named Mkt-RF (the market excess return, i.e. the market return minus the risk-free return), SMB (Small Minus Big), HML (High Minus Low), and RF (the risk-free rate). Divide each value by 100, because the library reports them in percent. **(Note: Here, we are using the market return estimates from the Kenneth French website. This is actually the value-weighted return of all CRSP firms incorporated in the US and listed on the NYSE, AMEX, or NASDAQ. This estimate of the market return is widely used in research, and it is highly correlated with the S&P 500.)**  

4. Compute the excess returns for TSLA and GE (stock returns minus the risk-free rate). Store them in new columns named *TSLA_excess* and *GE_excess*. 
5. We have the excess returns for the stocks and the excess returns for the market. Now, let's see what we want to estimate. 
6. The CAPM model is as follows:$$E[{R_{i}}] = R_{F} + \beta_{i}E[R_{M} - R_{F}]$$  $$E[{R_{i}}-R_{F} ] = \beta_{i}E[R_{M} - R_{F}] $$ So, we need to regress the excess returns of the stocks on the market excess returns. **(Note: In the regression, we include a constant $ a_{i} $, so we are estimating: $$  R_{it} - R_{F}  = a_{i} + \beta_{i}[R_{Mt} - R_{F}] + {e}_{it}  $$ where $ a_{i} $ is Jensen's alpha.)**
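
Steps 3 and 4 above can be sketched without a network call. The rows below are made-up values for illustration (not downloaded data), and the frame names `factors_pct`, `stock_ret`, and `merged` are assumptions of this example. The sketch shows the percent-to-decimal conversion, the inner join on dates, and the excess-return column.

```python
import pandas as pd

dates = pd.to_datetime(["2019-01-17", "2019-01-18", "2019-01-22"])

# Hypothetical factor rows in percent, the unit used by the data library
factors_pct = pd.DataFrame(
    {"Mkt-RF": [0.75, 1.29, -1.53], "RF": [0.010, 0.010, 0.010]}, index=dates
)

# Hypothetical stock returns, already in decimals
stock_ret = pd.DataFrame({"TSLA": [0.0178, -0.0088, -0.0442]}, index=dates)

# Percent -> decimal, then keep only the dates present in both tables
factors = (factors_pct / 100).rename(columns={"Mkt-RF": "mkt_excess"})
merged = factors.join(stock_ret, how="inner")

# Excess return = stock return minus the risk-free rate
merged["TSLA_excess"] = merged["TSLA"] - merged["RF"]
print(merged)
```

Forgetting the division by 100 would leave the factors 100 times larger than the stock returns, which shrinks the estimated betas by the same factor.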

In [21]:
#First we need to install the pandas_datareader package:
import sys
!{sys.executable} -m pip install --upgrade pandas_datareader



In [28]:
#import the library to easily get FF factor data
import pandas_datareader as pdr
#import famafrench.famafrench as ff

ff = pdr.get_data_famafrench('F-F_Research_Data_5_Factors_2x3_daily',start=2019, end=2020)[0]
# the [0] is because the returned object is a dictionary,
# and key 0 is the first dataframe, which contains the data we need


ff.rename(columns={"Mkt-RF":"mkt_excess"}, inplace=True)
ff = ff/100
ff = ff.join(daily_ret,how='inner')

# convert to excess returns in prep for regressions
for stock in ['TSLA', 'GE']:    
    ff[stock+'_excess'] = ff[stock] - ff['RF']

print(ff)



            mkt_excess     SMB     HML     RMW     CMA       RF      TSLA  \
2019-01-17      0.0075  0.0009 -0.0024  0.0006  0.0000  0.00010  0.017817   
2019-01-18      0.0129 -0.0029  0.0012  0.0020 -0.0027  0.00010 -0.008753   
2019-01-22     -0.0153 -0.0035  0.0032  0.0026  0.0034  0.00010 -0.044150   
2019-01-23      0.0015 -0.0041 -0.0013  0.0035  0.0029  0.00010  0.008083   
2019-01-24      0.0023  0.0048 -0.0001 -0.0015 -0.0019  0.00010  0.005727   
...                ...     ...     ...     ...     ...      ...       ...   
2019-12-23      0.0010  0.0016 -0.0035 -0.0013  0.0034  0.00007  0.011786   
2019-12-24      0.0001  0.0037  0.0000 -0.0028  0.0004  0.00007  0.002688   
2019-12-26      0.0048 -0.0056  0.0000  0.0025 -0.0018  0.00007  0.003575   
2019-12-27     -0.0010 -0.0055 -0.0007  0.0025  0.0010  0.00007 -0.004453   
2019-12-30     -0.0057  0.0028  0.0056  0.0011  0.0038  0.00007 -0.008944   

                  GE     SP500  TSLA_excess  GE_excess  
2019-01-17  0.0036

In [29]:
import statsmodels.api as sm

In [30]:
y=ff.TSLA_excess
x=ff.mkt_excess
res_TSLA = sm.OLS(y, sm.add_constant(x)).fit()
print(res_TSLA.summary())

                            OLS Regression Results                            
Dep. Variable:            TSLA_excess   R-squared:                       0.232
Model:                            OLS   Adj. R-squared:                  0.229
Method:                 Least Squares   F-statistic:                     71.98
Date:                Wed, 27 Mar 2024   Prob (F-statistic):           2.33e-15
Time:                        20:54:36   Log-Likelihood:                 573.98
No. Observations:                 240   AIC:                            -1144.
Df Residuals:                     238   BIC:                            -1137.
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const      -7.004e-06      0.001     -0.005      0.9

In [None]:
**Interpretation**
#The R^2 of 0.232 (adjusted 0.229) means the market excess return explains about 23% of the variation
#in TSLA's daily excess returns, which is a reasonable fit for a single stock.
#The F-statistic indicates that the regression model is overall significant.
#The Durbin-Watson statistic suggests no autocorrelation in the residuals.
#Of particular importance is the estimated beta coefficient, which stands at 1.5506 and is statistically significant.
#A beta above 1 suggests that the stock falls into the category of high-risk stocks.
#The p-value of beta indicates significance at the 1% level, while Jensen's alpha (the constant)
#is not statistically different from zero (t = -0.005).

In [26]:
y = ff.GE_excess
x = ff.mkt_excess
# Store the GE fit under its own name so it does not overwrite the TSLA result
res_GE = sm.OLS(y, sm.add_constant(x)).fit()
print(res_GE.summary())

                            OLS Regression Results                            
Dep. Variable:              GE_excess   R-squared:                       0.104
Model:                            OLS   Adj. R-squared:                  0.100
Method:                 Least Squares   F-statistic:                     27.66
Date:                Wed, 27 Mar 2024   Prob (F-statistic):           3.23e-07
Time:                        20:53:42   Log-Likelihood:                 507.92
No. Observations:                 240   AIC:                            -1012.
Df Residuals:                     238   BIC:                            -1005.
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const       8.883e-05      0.002      0.047      0.9

In [None]:
**Interpretation**
#Although the model has been fitted, the R^2 of 0.104 (adjusted 0.100) is low:
#the market excess return explains only about 10% of the variation in GE's daily excess returns.
#The F-statistic indicates that the regression model is overall significant.
#The Durbin-Watson statistic suggests no autocorrelation in the residuals.
#Of particular importance is the estimated beta coefficient, which stands at 1.2657 and is statistically significant.
#A beta above 1 suggests that the stock falls into the category of high-risk stocks.
#However, in comparison to Tesla, it is less sensitive to the market because its beta estimate is lower.
#The p-value of beta indicates significance at the 1% level, while Jensen's alpha (the constant)
#is not statistically different from zero (t = 0.047).
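
The intercepts in Part I and Part II are related. As a short derivation, treating $R_F$ as constant over the sample (an assumption; the daily rate moves slightly in the data), start from the CAPM regression and move $R_F$ to the right-hand side:

$$  R_{it} - R_{F} = a_{i} + \beta_{i}[R_{Mt} - R_{F}] + e_{it}  $$

$$  R_{it} = \underbrace{a_{i} + (1 - \beta_{i})R_{F}}_{\alpha_{i}} + \beta_{i}R_{Mt} + e_{it}  $$

So the single index intercept is $ \alpha_{i} = a_{i} + (1 - \beta_{i})R_{F} $. With a daily $ R_{F} $ of about 0.0001, the two intercepts can differ only slightly, and the slope is the same $ \beta_{i} $ in both forms. The estimates above are in line with this (for TSLA, 1.5751 in Part I versus 1.5506 in Part II), keeping in mind that Part II also uses a different market proxy than the S&P 500.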

# III: APT

1. Continue to use the excess returns for the stocks, the market excess return (Mkt-RF), SMB, and HML that we obtained previously.

2. We are going to run a factor model as follows:
$$  R_{it}-R_F = a_{i} + \beta_{1}(Mkt-RF)_t + \beta_{2}SMB_t + \beta_{3}HML_t + e_{it}   $$

3. Let's estimate this model for each stock. **Note: Use the three columns (Mkt-RF), SMB, and HML together as the independent variables.** 

4. Find coefficients $\beta_{1} $, $\beta_{2} $ and $\beta_{3} $. 
5. Are they significant? What do they mean? Discuss it for each stock. Check the constant.
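
To see what the three-factor regression computes, here is a minimal NumPy sketch on simulated factors. The factor series, the true loadings 1.2, 0.5, and -0.3, and all variable names are assumptions for illustration. It estimates the coefficients by least squares and then forms the classical (nonrobust) standard errors and t-statistics, the same quantities that appear in the `coef`, `std err`, and `t` columns of a statsmodels summary.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 240

# Simulated factor returns (assumed): market excess, SMB, HML
F = rng.normal(0.0, [0.01, 0.005, 0.005], size=(n, 3))
true_betas = np.array([1.2, 0.5, -0.3])
y = F @ true_betas + rng.normal(0.0, 0.015, size=n)   # simulated excess returns

# Design matrix with a leading column of ones for the intercept
X = np.column_stack([np.ones(n), F])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# Classical standard errors: sigma^2 * (X'X)^-1, with n - 4 residual degrees of freedom
resid = y - X @ coef
df_resid = n - X.shape[1]
sigma2 = resid @ resid / df_resid
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
t_stats = coef / se

for name, b, t in zip(["const", "mkt_excess", "SMB", "HML"], coef, t_stats):
    print(f"{name:>10}: coef = {b: .4f}, t = {t: .2f}")
```

A coefficient is usually called significant at the 5% level when its absolute t-statistic exceeds roughly 2. A positive SMB loading means the stock behaves more like small firms, and a positive HML loading means it behaves more like value firms.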

In [31]:
x=ff[['mkt_excess', 'SMB', 'HML']]
y=ff.TSLA_excess
res_TSLA = sm.OLS(y,sm.add_constant(x)).fit()
print(res_TSLA.summary())

                            OLS Regression Results                            
Dep. Variable:            TSLA_excess   R-squared:                       0.268
Model:                            OLS   Adj. R-squared:                  0.258
Method:                 Least Squares   F-statistic:                     28.75
Date:                Wed, 27 Mar 2024   Prob (F-statistic):           7.00e-16
Time:                        20:57:06   Log-Likelihood:                 579.65
No. Observations:                 240   AIC:                            -1151.
Df Residuals:                     236   BIC:                            -1137.
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const          0.0005      0.001      0.332      0.7

In [None]:
**Interpretation**
#The R^2 rises to 0.268 (adjusted 0.258) from 0.232 (adjusted 0.229) under the CAPM,
#so adding SMB and HML improves the fit modestly, although most of the variation remains unexplained.
#The F-statistic indicates that the regression model is overall significant.
#The Durbin-Watson statistic suggests no autocorrelation in the residuals.
#Of particular importance is the estimated market beta, which stands at 1.4429 and is statistically significant.
#A market beta above 1 suggests that the stock falls into the category of high-risk stocks.
#The p-value of the market beta indicates significance at the 1% level, while the constant
#is not statistically different from zero (t = 0.332).

In [32]:
x = ff[['mkt_excess', 'SMB', 'HML']]
y = ff.GE_excess

res_GE = sm.OLS(y,sm.add_constant(x)).fit()
print(res_GE.summary())

                            OLS Regression Results                            
Dep. Variable:              GE_excess   R-squared:                       0.119
Model:                            OLS   Adj. R-squared:                  0.108
Method:                 Least Squares   F-statistic:                     10.60
Date:                Wed, 27 Mar 2024   Prob (F-statistic):           1.46e-06
Time:                        20:57:20   Log-Likelihood:                 509.89
No. Observations:                 240   AIC:                            -1012.
Df Residuals:                     236   BIC:                            -997.9
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const          0.0004      0.002      0.191      0.8

In [None]:
**Interpretation**
#The R^2 rises to 0.119 (adjusted 0.108) from 0.104 (adjusted 0.100) under the CAPM,
#so the improvement is small and the fit remains weak.
#The F-statistic indicates that the regression model is overall significant.
#The Durbin-Watson statistic suggests no autocorrelation in the residuals.
#Of particular importance is the estimated market beta, which stands at 1.1216 and is statistically significant.
#A market beta above 1 suggests that the stock falls into the category of high-risk stocks.
#GE has a lower market beta than TSLA, so it is less exposed to market risk and may suit more risk-averse investors.
#The p-value of the market beta indicates significance at the 1% level, while the constant
#is not statistically different from zero (t = 0.191).

# IV: APT continued

We are going to build a portfolio and compute its betas.

1. Create a portfolio with $w_{1} = 0.5$ in TSLA and $w_{2} =0.5$ in GE.
2. Estimate the following factor model for this portfolio $ p $:
$$  R_{pt} -R_F = a_{p} + \beta_{1}(Mkt-RF)_t + \beta_{2}SMB_t + \beta_{3}HML_t + e_{pt}  $$
3. Now, you have beta values for the market excess return, SMB, and HML. 
4. Compare these values with the individual betas you estimated in the previous task for the market excess return, SMB, and HML. Is the portfolio beta the weighted average of the betas estimated for each stock?  
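
The question in step 4 can be checked on simulated data. The sketch below is an illustration under stated assumptions: the factor series, the two stocks' loadings, and the helper `ols` are made up for this example. Because OLS coefficients are linear in the dependent variable, regressing a weighted sum of returns on the same factors gives the weighted sum of the individual coefficients, for the intercept as well as for each beta.

```python
import numpy as np

def ols(y, X):
    """Return OLS coefficients of y on X (X already includes a constant)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(3)
n = 240
F = rng.normal(0.0, 0.01, size=(n, 3))           # simulated Mkt-RF, SMB, HML
X = np.column_stack([np.ones(n), F])

# Excess returns of two simulated stocks with different factor loadings
ex_a = F @ np.array([1.4, 0.8, -0.5]) + rng.normal(0.0, 0.02, n)
ex_b = F @ np.array([1.1, 0.2, 0.4]) + rng.normal(0.0, 0.02, n)

w_a, w_b = 0.5, 0.5
ex_port = w_a * ex_a + w_b * ex_b                # portfolio excess return

coef_port = ols(ex_port, X)
coef_avg = w_a * ols(ex_a, X) + w_b * ols(ex_b, X)
print(np.allclose(coef_port, coef_avg))          # True
```

The same arithmetic applies to the market betas reported in Part III: 0.5 × 1.4429 + 0.5 × 1.1216 ≈ 1.2823.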

In [33]:
import statsmodels.api as sm

In [34]:
ff['port_excess']=0.5*ff['TSLA']+0.5*ff['GE']-ff['RF']
x=ff[['mkt_excess', 'SMB','HML']]
y=ff.port_excess
res_port= sm.OLS(y, sm.add_constant(x)).fit()
print(res_port.summary())

                            OLS Regression Results                            
Dep. Variable:            port_excess   R-squared:                       0.282
Model:                            OLS   Adj. R-squared:                  0.273
Method:                 Least Squares   F-statistic:                     30.95
Date:                Wed, 27 Mar 2024   Prob (F-statistic):           6.60e-17
Time:                        20:58:53   Log-Likelihood:                 615.13
No. Observations:                 240   AIC:                            -1222.
Df Residuals:                     236   BIC:                            -1208.
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const          0.0004      0.001      0.341      0.7

In [None]:
#The R^2 of 0.282 (adjusted 0.273) is higher than for either stock alone, although most of the variation remains unexplained.
#Since R^2 has improved but the variable HML is not statistically significant,
#it is recommended to conduct heteroscedasticity tests or examine the behavior of the residuals.
#The F-statistic indicates that the regression model is overall significant.
#The Durbin-Watson statistic suggests no autocorrelation in the residuals.
#Of particular importance is the estimated market beta, which stands at 1.2823 and is statistically significant.
#SMB is statistically significant, but the intercept and HML are not.
#A market beta above 1 suggests that the portfolio falls into the high-risk category.
#The p-value of the market beta indicates significance at the 1% level.
***Suggestion***
#The results show that diversifying by mixing different assets in a portfolio is a sound approach.

In [41]:
#annualized excess return (regression intercept) of the portfolio, assuming 252 trading days
print('Excess Return of portfolio annually(%) =', 0.0004*252*100)

Excess Return of portfolio annually(%) = 10.08


In [45]:
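#standard error of the constant, annualized the same way:
print('Standard Error of portfolio(%) =', 0.001*252*100)
#At about 25%, the annualized standard error is much larger than the annualized intercept of about 10%,
#so the intercept is estimated imprecisely and is not statistically different from zero.
#The high t-statistic and low p-value on the market excess return show that the market beta is
#statistically significant, so a Type I error on that coefficient is unlikely.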

Standard Error of portfolio(%) = 25.2
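
The two annualizations above can be put side by side in a small sketch. It assumes 252 trading days per year and uses the rounded intercept and standard error shown in the summary table, so its t-statistic (0.4) differs a little from the reported 0.341. Scaling an estimate by a constant scales its standard error by the same constant, so the t-statistic does not change with annualization.

```python
TRADING_DAYS = 252            # assumed number of trading days per year

alpha_daily = 0.0004          # rounded intercept from the portfolio regression
se_daily = 0.001              # rounded standard error of the intercept

# Multiplying an estimate by a constant scales its standard error by that constant
alpha_annual = alpha_daily * TRADING_DAYS
se_annual = se_daily * TRADING_DAYS

t_daily = alpha_daily / se_daily
t_annual = alpha_annual / se_annual

# Approximate 95% confidence interval for the annualized intercept
ci_low = alpha_annual - 1.96 * se_annual
ci_high = alpha_annual + 1.96 * se_annual
print(f"annualized alpha = {alpha_annual:.2%}, 95% CI = [{ci_low:.2%}, {ci_high:.2%}]")
```

The interval contains zero, which matches the high p-value on the constant in the regression output.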


## End