## Capital Asset Pricing Model 

#### This notebook covers the Capital Asset Pricing Model (CAPM) and the definitions related to it.


In [11]:
import yfinance as yf
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from pandas_datareader import data as wb
import datetime as dt

### 1. Portfolio Risk
<b>Portfolio Risk</b>: Two types of risk usually affect a portfolio (stocks, other assets, ...).
<br>The first is <b>systematic risk</b>: the risk that comes from general economic conditions such as inflation, interest rates, and exchange rates. In other words, macroeconomic factors that cannot be predicted.
<br>The second is <b>non-systematic risk</b>, or diversifiable risk: risk that is unique to a specific company or industry (labor issues, regulatory changes, shortages of raw materials, ...). This kind of risk can be reduced, and in the limit largely eliminated, through diversification.
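
The effect of diversification can be illustrated with a small calculation. The sketch below assumes an equally weighted portfolio of `n` assets that all share the same volatility `sigma` and the same pairwise correlation `rho`; both values are illustrative and are not taken from the data used later in this notebook. The helper name `equal_weight_portfolio_variance` is hypothetical.

```python
import numpy as np

def equal_weight_portfolio_variance(n, sigma, rho):
    """Variance of an equally weighted portfolio of n assets that share
    the same volatility (sigma) and the same pairwise correlation (rho)."""
    # Covariance matrix: sigma^2 on the diagonal, rho * sigma^2 elsewhere.
    cov = np.full((n, n), rho * sigma**2)
    np.fill_diagonal(cov, sigma**2)
    w = np.full(n, 1.0 / n)
    return float(w @ cov @ w)

sigma, rho = 0.02, 0.3  # illustrative daily volatility and correlation
for n in (1, 5, 20, 100):
    var_p = equal_weight_portfolio_variance(n, sigma, rho)
    print(f"n={n:3d}  portfolio variance={var_p:.6f}")

# As n grows, the variance approaches rho * sigma^2: the part that
# diversification cannot remove under these assumptions.
systematic_floor = rho * sigma**2
print(f"floor = {systematic_floor:.6f}")
```

Under these assumptions the variance falls quickly as assets are added and then flattens out near `rho * sigma**2`, which plays the role of the systematic component.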

### 2. Covariance and Correlation
This section is a very brief overview. Consider two random variables $X_p$ and $X_q$ (for example, the returns on two assets):
<br>
1.<b>Covariance</b>: A key determinant of portfolio risk is the extent to which
the returns on the two assets vary either <b>in tandem or in opposition</b>. The <b>sign</b> of the covariance is positive when the deviations from the means tend to have the same sign (both returns are above or below their means at the same time) and negative when they tend to have opposite signs.
<p style="text-align: center;">
    $Cov(X_p,X_q)=\sum f_i [x_{i,p}-E(X_p)] [x_{i,q}-E(X_q)]$
          </p>
$f_i$ is the probability of scenario $i$, so the covariance is a probability-weighted average of the products of the deviations.
<br>
2.<b>Correlation ($\rho$)</b>: measures the strength of the linear relationship between two variables.
<br>i. Correlation ranges over $[-1, 1]$.
<br>ii. A correlation of -1 indicates that one asset's return varies perfectly inversely with the other's. Conversely, a correlation of +1 indicates perfect positive correlation.
<br>
<p style="text-align: center;">
    $Corr(X_p,X_q)=\frac{Cov(X_p,X_q)}{\sigma_p \sigma_q}$
          </p>
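
As a worked example of the two formulas above, the sketch below computes a probability-weighted covariance and the resulting correlation for two assets over three scenarios. The scenarios, probabilities, and returns are made up for illustration.

```python
import numpy as np

# Hypothetical scenarios (recession, normal, boom) with made-up
# probabilities and returns for two assets p and q.
f = np.array([0.25, 0.50, 0.25])     # scenario probabilities, sum to 1
x_p = np.array([-0.10, 0.05, 0.20])  # returns of asset p per scenario
x_q = np.array([0.08, 0.04, -0.02])  # returns of asset q per scenario

mean_p = np.sum(f * x_p)             # E(X_p)
mean_q = np.sum(f * x_q)             # E(X_q)

# Cov(X_p, X_q) = sum_i f_i [x_ip - E(X_p)] [x_iq - E(X_q)]
cov_pq = np.sum(f * (x_p - mean_p) * (x_q - mean_q))

# Standard deviations use the same probability weights.
sigma_p = np.sqrt(np.sum(f * (x_p - mean_p) ** 2))
sigma_q = np.sqrt(np.sum(f * (x_q - mean_q) ** 2))

# Corr = Cov / (sigma_p * sigma_q), always between -1 and 1.
corr_pq = cov_pq / (sigma_p * sigma_q)
print("covariance :", cov_pq)
print("correlation:", corr_pq)
```

Here asset q does well exactly when asset p does badly, so the covariance is negative and the correlation is close to -1.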

In [11]:
# Pharmaceutical sector: five of the biggest companies on the market,
# plus the S&P 500 index (^GSPC) as the market proxy.
# pct_change (used below) computes the percentage change between
# consecutive values of a series.

tickers = ['LLY', 'NVO', 'JNJ', 'PFE', 'AZN', '^GSPC']

def get_data(tickers, start, end):
    """Download daily closing prices, one column per ticker."""
    lst = []
    # end = dt.datetime.now()
    # start = dt.date(end.year - 1, end.month, end.day)
    for symbol in tickers:
        ticker = yf.download(symbol, start, end)
        lst.append(ticker['Close'])
    # Build the frame once, after the loop, instead of on every iteration.
    df = pd.DataFrame(lst).transpose()
    df.columns = tickers
    return df

    
start='2023-10-14'
end='2024-10-14'   
df=get_data(tickers,start,end)
df



Date,LLY,NVO,JNJ,PFE,AZN,^GSPC
2023-10-16,616.640015,101.150002,157.529999,33.270000,67.820000,4373.629883
2023-10-17,608.309998,101.160004,156.089996,32.750000,69.279999,4373.200195
2023-10-18,607.239990,100.570000,152.729996,31.410000,65.239998,4314.600098
2023-10-19,590.799988,97.660004,152.320007,31.190001,64.419998,4278.000000
2023-10-20,584.640015,96.290001,153.000000,30.650000,63.830002,4224.160156
...,...,...,...,...,...,...
2024-10-07,898.400024,117.769997,159.529999,29.200001,76.870003,5695.939941
2024-10-08,913.719971,117.199997,159.690002,29.180000,76.870003,5751.129883
2024-10-09,919.739990,117.000000,160.649994,30.190001,77.510002,5792.040039
2024-10-10,910.690002,117.529999,160.509995,29.340000,76.870003,5780.049805


In [12]:
# Convert prices to daily returns, then build the variance-covariance matrix
df=df.pct_change()
df.cov()

Unnamed: 0,LLY,NVO,JNJ,PFE,AZN,^GSPC
LLY,0.000353,0.000224,-1.1e-05,-1.1e-05,3.6e-05,5.5e-05
NVO,0.000224,0.000375,-6e-06,-2e-06,5.1e-05,5.5e-05
JNJ,-1.1e-05,-6e-06,9.4e-05,6.9e-05,3.6e-05,4e-06
PFE,-1.1e-05,-2e-06,6.9e-05,0.000249,2.6e-05,1.5e-05
AZN,3.6e-05,5.1e-05,3.6e-05,2.6e-05,0.000155,2e-05
^GSPC,5.5e-05,5.5e-05,4e-06,1.5e-05,2e-05,6.2e-05


In [13]:
df.var()

LLY      0.000353
NVO      0.000375
JNJ      0.000094
PFE      0.000249
AZN      0.000155
^GSPC    0.000062
dtype: float64

In [14]:
df.corr()

Unnamed: 0,LLY,NVO,JNJ,PFE,AZN,^GSPC
LLY,1.0,0.614644,-0.061669,-0.035547,0.154872,0.372398
NVO,0.614644,1.0,-0.032313,-0.005521,0.21229,0.358798
JNJ,-0.061669,-0.032313,1.0,0.452069,0.300642,0.057688
PFE,-0.035547,-0.005521,0.452069,1.0,0.133356,0.118706
AZN,0.154872,0.21229,0.300642,0.133356,1.0,0.199025
^GSPC,0.372398,0.358798,0.057688,0.118706,0.199025,1.0


In [15]:
df.mean()

LLY      0.001835
NVO      0.000874
JNJ      0.000146
PFE     -0.000405
AZN      0.000606
^GSPC    0.001176
dtype: float64

That concludes the brief overview and introduction.
<br>The following sections cover portfolios of risky assets, the Sharpe ratio, beta, and the CAPM.


### 3. Portfolios of Risky Assets
These are the three main properties of a portfolio of two risky assets, $b$ and $s$:
<br><b>1.</b> The rate of return on a portfolio of two assets is the weighted average of the returns on the component securities, with the investment proportions as weights.
<p style="text-align: center;">
    $r_p= w_b r_b+ w_s r_s$
          </p>
$w_i$ is the proportion of the portfolio invested in asset $i$, that is, how much you allocate to that asset relative to the total amount invested in the portfolio: $w_i= \frac{\text{amount invested in } i}{\text{total amount invested}}$
<br>$r_i$ is the rate of return on asset $i$
<br><b>2.</b> The expected rate of return on a portfolio is the weighted average of the expected returns on the component securities, with the portfolio proportions as weights.
<p style="text-align: center;">
    $E(r_p)= w_b E(r_b) + w_s E(r_s)$
          </p>
<br><b>3.</b>The variance of the rate of return on a two-risky-asset portfolio is
<p style="text-align: center;">
    $\sigma_p^2=(w_b \sigma_b)^2 + (w_s \sigma_s)^2 + 2(w_b \sigma_b)(w_s \sigma_s) \rho_{bs}$
          </p>

So these three rules are direct applications of basic probability rules: linearity of expectation for the first two, and the variance of a weighted sum for the third.
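
A short numerical check can make the rules concrete. The sketch below uses illustrative weights, expected returns, volatilities, and correlation (none of them estimated from the data in this notebook). It computes the expected portfolio return as a weighted average, the portfolio variance with the standard two-asset formula $\sigma_p^2=(w_b \sigma_b)^2 + (w_s \sigma_s)^2 + 2(w_b \sigma_b)(w_s \sigma_s)\rho_{bs}$, and then confirms the variance against the equivalent matrix form $w^\top \Sigma w$.

```python
import numpy as np

# Illustrative inputs (not estimated from the data in this notebook).
w_b, w_s = 0.4, 0.6            # weights, must sum to 1
er_b, er_s = 0.05, 0.10        # expected returns E(r_b), E(r_s)
sigma_b, sigma_s = 0.08, 0.20  # standard deviations
rho_bs = 0.3                   # correlation between b and s

# Expected portfolio return: weighted average of expected returns.
er_p = w_b * er_b + w_s * er_s

# Portfolio variance from the two-asset formula.
var_p = (
    (w_b * sigma_b) ** 2
    + (w_s * sigma_s) ** 2
    + 2 * (w_b * sigma_b) * (w_s * sigma_s) * rho_bs
)

# Same quantity in matrix form, w' Sigma w.
cov_bs = rho_bs * sigma_b * sigma_s
Sigma = np.array([[sigma_b**2, cov_bs], [cov_bs, sigma_s**2]])
w = np.array([w_b, w_s])
var_p_matrix = float(w @ Sigma @ w)

print("expected return:", er_p)
print("variance       :", var_p)
print("std deviation  :", np.sqrt(var_p))
```

Because the correlation is below 1, the portfolio standard deviation comes out lower than the weighted average of the two individual standard deviations.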

---! After reading several sources to gain a better understanding of this model, I realized that it is essentially statistics with a touch of optimization, so if you have a solid base in probability and statistics, you already have about 80% of this model ---

### 4. Capital Asset Pricing Model

<b>Sharpe Ratio</b>: measures the performance of a security compared to a risk-free asset, after adjusting for its risk. It is the excess return ($\bar{r_i} - r_f$) per unit of risk ($\sigma_i$) of an investment:
<p style="text-align: center;">
    $Sharpe = \frac{\text{Risk premium}}{\sigma_i}=\frac{\bar{r_i} -r_f}{\sigma_i}$
          </p>
<br>When Sharpe > 1, GOOD risk-adjusted returns
<br>When Sharpe > 2, VERY GOOD risk-adjusted returns
<br>When Sharpe > 3, EXCELLENT risk-adjusted returns
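
Thresholds like these are usually quoted for annualized ratios, so the time scale of the inputs matters. The sketch below is one possible way to compute an annualized Sharpe ratio from daily returns. It assumes 252 trading days per year and converts the annual risk-free rate to a daily one by simple division; both are common conventions rather than anything fixed by this notebook. The function name `annualized_sharpe` and the simulated returns are hypothetical.

```python
import numpy as np

TRADING_DAYS = 252  # assumed convention, not taken from the notebook

def annualized_sharpe(daily_returns, annual_risk_free):
    """Annualized Sharpe ratio from a sequence of daily simple returns."""
    daily_returns = np.asarray(daily_returns, dtype=float)
    daily_rf = annual_risk_free / TRADING_DAYS  # simple approximation
    excess = daily_returns - daily_rf
    # The mean scales with time, the standard deviation with its square root.
    return np.sqrt(TRADING_DAYS) * excess.mean() / excess.std(ddof=1)

# Synthetic daily returns, for illustration only.
rng = np.random.default_rng(0)
simulated = rng.normal(loc=0.0008, scale=0.01, size=TRADING_DAYS)
print(annualized_sharpe(simulated, annual_risk_free=0.0463))
```

Keeping the return and the risk-free rate on the same time scale before subtracting them is the important design choice here.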

<b>Beta</b>: This is a widely used measure of <b>systematic</b> (market) risk, the part that diversification cannot remove. <b>Beta</b> is the ratio of the covariance between the stock's return and the market's return to the variance of the market's return. In other words, it measures how much risk the investment adds to a portfolio that looks like the market:
<p style="text-align: center;">
    $\beta= \frac{\sigma_{i,m}}{\sigma_{m}^2}$
          </p>
<br>When beta = 0, the stock's returns have no linear relationship with the market's.
<br>When beta < 1, the stock is defensive (less prone to extreme highs and lows than the market)
<br>When beta > 1, the stock is aggressive (more prone to extreme highs and lows than the market)
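
To see the definition at work, the sketch below builds synthetic daily returns for a stock with a known beta of 1.5 and recovers it in two ways: as covariance over market variance, and as the slope of a least-squares regression of the stock's returns on the market's. All numbers are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic daily returns: the stock is built with a "true" beta of 1.5
# plus idiosyncratic noise. All numbers are illustrative.
true_beta = 1.5
r_m = rng.normal(0.0005, 0.01, size=1000)
r_i = 0.0002 + true_beta * r_m + rng.normal(0.0, 0.005, size=1000)

# Beta as covariance with the market over the market variance.
cov_im = np.cov(r_i, r_m, ddof=1)[0, 1]
var_m = np.var(r_m, ddof=1)
beta_cov = cov_im / var_m

# The same number is the slope of an OLS regression of r_i on r_m.
slope, intercept = np.polyfit(r_m, r_i, deg=1)

print("beta (cov/var):", beta_cov)
print("beta (OLS)    :", slope)
```

The two estimates agree up to floating-point error, and both land near the value used to generate the data.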

<b>CAPM</b>: provides a precise prediction of the relationship we should observe between the risk of an asset and its expected return. It gives us a benchmark rate of return for evaluating possible investments and helps us make an educated guess about the expected return on assets that have not yet been traded in the marketplace.
<br>
The model assumes that (i) markets for securities are perfectly competitive and equally profitable to all investors, and (ii) investors choose investment portfolios in the same manner.
<br>
<b>Expected Return (CAPM)</b>: the expected return of a security given the systematic risk taken, as measured by beta. It is the return an investor should expect as compensation for bearing the extra risk of holding this security.
<p style="text-align: center;">
    $\bar{r_i}=r_f+\beta_i(\bar{r_m} -r_f)$
          </p>
$\bar{r_i}= E(r_i)$: Expected return of a security $i$
<br>$\bar{r_m}=E(r_m)$: Expected market return 
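
The formula only makes sense when $r_f$ and $\bar{r_m}$ are expressed over the same period. The sketch below is a minimal illustration under stated assumptions: the inputs are made up, the daily market mean is annualized by simple multiplication with 252 trading days, and the helper name `capm_expected_return` is hypothetical.

```python
TRADING_DAYS = 252  # assumed number of trading days per year

def capm_expected_return(beta, annual_risk_free, annual_market_return):
    """CAPM expected annual return: r_f + beta * (E(r_m) - r_f)."""
    return annual_risk_free + beta * (annual_market_return - annual_risk_free)

# Illustrative inputs: an average daily market return converted to an
# annual figure by simple scaling, and an annual risk-free rate.
mean_daily_market = 0.0005
annual_market = mean_daily_market * TRADING_DAYS  # about 12.6% per year
annual_rf = 0.04

for beta in (0.0, 0.5, 1.0, 1.5):
    expected = capm_expected_return(beta, annual_rf, annual_market)
    print(f"beta={beta:.1f}  expected annual return={expected:.4f}")
```

With beta = 0 the expected return equals the risk-free rate, and with beta = 1 it equals the market return, which is a quick sanity check on any implementation.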

 
 


In [35]:
# Calculate the covariance of each asset with the market 
cov_i= df.cov()
cov_i=cov_i['^GSPC']
cov_i=cov_i.drop(cov_i.index[-1])
cov_i.rename('cov_i')

LLY    0.000055
NVO    0.000055
JNJ    0.000004
PFE    0.000015
AZN    0.000020
Name: cov_i, dtype: float64

In [53]:
# Calculate the standard deviation of each asset's returns (market column dropped)
std_i= df.std()
std_i=std_i.drop(std_i.index[-1])
std_i

LLY    0.018776
NVO    0.019376
JNJ    0.009701
PFE    0.015772
AZN    0.012446
dtype: float64

In [40]:
# Calculate the variance of the market return
var_market=df['^GSPC'].var()
var_market

6.217133957039135e-05

In [42]:
#beta
beta= cov_i/var_market
beta

LLY    0.886759
NVO    0.881686
JNJ    0.070975
PFE    0.237450
AZN    0.314157
Name: ^GSPC, dtype: float64

In [46]:
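# Risk-free rate on 19/10/2024 in the US market: 4.63% (an annual rate)
riskfree= 0.0463
# Caution: df holds daily returns, so df['^GSPC'].mean() is a daily figure
# while riskfree is annual. The two should be put on the same time scale
# (e.g. by annualizing the daily mean) before the result is interpreted.
riskpremium=(df['^GSPC'].mean())-riskfree
# Apply the CAPM formula
capm=riskfree+beta*riskpremium
capm.rename("CAPM")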

LLY    0.006286
NVO    0.006514
JNJ    0.043097
PFE    0.035585
AZN    0.032124
Name: CAPM, dtype: float64

In [55]:
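# Now we calculate the Sharpe ratio
# Caution: this divides the CAPM expected return by the standard deviation
# without first subtracting riskfree, so it is not the excess-return ratio
# from the Sharpe formula; capm also mixes an annual rate with daily returns.
sharpe= capm/std_i
sharpe.rename('Sharpe')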

LLY    0.334774
NVO    0.336218
JNJ    4.442571
PFE    2.256193
AZN    2.581037
Name: Sharpe, dtype: float64

In [12]:
# Wrap the download logic in a reusable class
class get_data():
    """Download closing prices and convert them to daily returns."""
    def __init__(self,tickers, market,start_date,end_date):
        self.tickers=tickers
        self.market=market
        self.start_date=start_date
        self.end_date=end_date
    def get_data_ticker(self):
        # One column of daily returns per ticker
        df=pd.DataFrame()
        for i in self.tickers:
            df[i]=yf.download(i,self.start_date,self.end_date)['Close']
        df=df.pct_change()
        return df
    def get_data_market(self):
        # One column of daily returns per market index
        df=pd.DataFrame()
        for i in self.market:
            # Download the current symbol i, not the whole self.market list
            df[i]= yf.download(i,self.start_date,self.end_date)['Close']
        df=df.pct_change()
        return df
    

In [327]:
start='2023-10-14'
end='2024-10-14'
tickers=['LLY','NVO','JNJ','PFE','AZN']
market=["^GSPC"]
a= get_data(tickers, market,start,end)
tc=a.get_data_ticker()
mr=a.get_data_market()
tc = tc.dropna(how='all')
mr=mr.dropna(how='all')
mr




Date,^GSPC
2023-10-17,-0.000098
2023-10-18,-0.013400
2023-10-19,-0.008483
2023-10-20,-0.012585
2023-10-23,-0.001686
...,...
2024-10-07,-0.009586
2024-10-08,0.009689
2024-10-09,0.007113
2024-10-10,-0.002070


In [182]:
bis=pd.concat([tc,mr], axis=1, join="inner")
bis=bis.cov()

bis=bis.take([-1], axis=1)
bis
#cov_i=cov_i.drop(cov_i.index[-1])

Unnamed: 0,Close
LLY,5.5e-05
NVO,5.5e-05
JNJ,4e-06
PFE,1.5e-05
AZN,2e-05
Close,6.2e-05


In [13]:
class capm():
    def __init__(self,tickers,market,start,end,riskfree):
        self.tickers=tickers
        self.market=market
        self.start=start
        self.end=end
        self.riskfree=riskfree
        a=get_data(tickers,market,start,end)
        self.df_mrk=a.get_data_market()
        self.df_tck=a.get_data_ticker()
    def beta(self):
        mrk_var= self.df_mrk.var()
        cov_bis= pd.concat([self.df_mrk,self.df_tck],axis=1,join='inner') # concatenate the two frames to compute the covariance of each ticker with the market
        cov_bis=cov_bis.cov()
        cov=cov_bis.take([0],axis=1)
        cov= cov.drop(cov.index[0])
        beta= cov/mrk_var
        return beta
    def capm(self):
        riskpre=self.df_mrk.mean() - self.riskfree
        beta=self.beta()
        beta_bis= beta.multiply(riskpre)
        capm= self.riskfree+beta_bis
        return capm 
    def sharpe(self):
        bis= self.capm() - self.riskfree
        tck_std= self.df_tck.std()
        tck_std=pd.DataFrame(tck_std)
        df=pd.concat([bis,tck_std],axis=1,join='inner',ignore_index=True)
        re=pd.DataFrame( df[0]/df[1])
        return re
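
# The concat-then-covariance logic used for beta can be checked offline,
# without downloading anything. The sketch below is a hypothetical standalone
# helper (beta_from_returns is not part of the classes above) run on synthetic
# returns, where one asset is built to have a beta of exactly 2.

```python
import numpy as np
import pandas as pd

def beta_from_returns(asset_returns, market_returns):
    """Beta of every column of asset_returns against market_returns.

    Hypothetical helper, not part of the classes in this notebook.
    """
    joined = pd.concat(
        [market_returns.rename("market"), asset_returns], axis=1, join="inner"
    ).dropna()
    cov = joined.cov()
    # Covariance of each asset with the market over the market variance.
    return cov.loc[asset_returns.columns, "market"] / cov.loc["market", "market"]

# Synthetic daily returns on a business-day index (made-up numbers).
rng = np.random.default_rng(1)
dates = pd.bdate_range("2024-01-01", periods=250)
market = pd.Series(rng.normal(0.0005, 0.01, size=250), index=dates)
assets = pd.DataFrame(
    {
        "DOUBLE": 2.0 * market,                                    # beta of exactly 2
        "NOISY": 0.5 * market + rng.normal(0.0, 0.004, size=250),  # beta near 0.5
    },
    index=dates,
)

betas = beta_from_returns(assets, market)
print(betas)
```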


In [14]:

start='2023-10-14'
end='2024-10-14'
tickers=['LLY','NVO','JNJ','PFE','AZN']
market=["^GSPC"]
b=capm(tickers, market,start,end,0.025)
capm_test=b.capm()
capm_test
sharpe=b.sharpe()
sharpe



Unnamed: 0,0
LLY,-1.12521
NVO,-1.084118
JNJ,-0.174305
PFE,-0.358673
AZN,-0.601359


Try this with some companies in the energy sector


In [1]:
# Assumes the get_data and capm classes above were saved to a local capm.py module
from capm import get_data, capm

In [2]:
start='2023-10-14'
end='2024-10-14'
tickers=['HES','FANG','PSX','OKE','COP','CVX']
market=["^GSPC"]

In [3]:
a= get_data(tickers, market,start,end)
b=capm(tickers, market,start,end,0.0463)



In [4]:
tc=a.get_data_ticker()
mr=a.get_data_market()
tc = tc.dropna(how='all')
mr=mr.dropna(how='all')
print('Historical Data of the components :', tc)
print('Historical Data of the market :', mr)


Historical Data of the components :                  HES      FANG       PSX       OKE       COP       CVX
Date                                                                  
2023-10-17  0.016170  0.006332  0.019055 -0.007992  0.001197  0.013241
2023-10-18  0.005792  0.008607  0.020543 -0.012516  0.009007  0.007936
2023-10-19  0.006728 -0.005885 -0.009118  0.008304 -0.001659  0.001066
2023-10-20 -0.018425 -0.014445 -0.014324 -0.023551 -0.018357 -0.013424
2023-10-23 -0.010551 -0.006968 -0.004316 -0.012430 -0.021925 -0.036864
...              ...       ...       ...       ...       ...       ...
2024-10-07 -0.000428  0.025127 -0.000866 -0.004095  0.001572  0.002521
2024-10-08 -0.020958 -0.028722 -0.044549 -0.013075 -0.034170 -0.015683
2024-10-09  0.008301 -0.000103  0.018061  0.012286  0.001534  0.006050
2024-10-10  0.006860  0.001600  0.012619 -0.000633  0.008020  0.006482
2024-10-11  0.002582  0.004638 -0.000513  0.020277 -0.012247  0.005311

[249 rows x 6 columns]
Historical Data o




In [5]:
beta=b.beta()
capm=b.capm()
sharpe=b.sharpe()


In [6]:
print('Beta:',beta)
print('CAPM Expected Return:', capm)
print('Sharpe Ratio:', sharpe)

Beta:          Close
HES   0.535323
FANG  0.449959
PSX   0.495477
OKE   0.559441
COP   0.321759
CVX   0.429959
CAPM Expected Return:          Close
HES   0.022144
FANG  0.025996
PSX   0.023942
OKE   0.021056
COP   0.031781
CVX   0.026898
Sharpe Ratio:              0
HES  -1.561566
FANG -1.221065
PSX  -1.405606
OKE  -2.071511
COP  -1.076846
CVX  -1.508495
