# Capital Asset Pricing Model (CAPM) Example

The Capital Asset Pricing Model (CAPM) is a financial model used to determine the expected return on an investment based on its risk relative to the overall market. The CAPM formula is:

$$
E(R_i) = R_f + \beta_i (E(R_m) - R_f)
$$

**Expected return = risk-free rate + risk premium**

Where:
- $E(R_i)$  = Expected return of the investment
- $R_f$  = Risk-free rate
- $\beta_i$  = Beta of the investment (measure of how much the investment’s returns move with the market)
- $E(R_m)$  = Expected return of the market
- $E(R_m) - R_f$  = Market risk premium
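
As a minimal sketch, the formula translates directly into a one-line function. The function name `capm_expected_return` and the input values are illustrative assumptions, not part of the analysis that follows.

```python
def capm_expected_return(risk_free_rate, beta, market_return):
    """CAPM: E(R_i) = R_f + beta_i * (E(R_m) - R_f). All rates are annual decimals."""
    market_risk_premium = market_return - risk_free_rate
    return risk_free_rate + beta * market_risk_premium


# Hypothetical inputs: 3% risk-free rate, beta of 1.2, 12% expected market return
example = capm_expected_return(risk_free_rate=0.03, beta=1.2, market_return=0.12)
print(f"{example:.2%}")  # 13.80%
```

With these inputs the market risk premium is 9%, and a beta of 1.2 scales it to 10.8% on top of the 3% risk-free rate.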


In [2]:
import yfinance as yf
import pandas as pd
import numpy as np
import statsmodels.api as sm
import datetime

In [3]:
# 1) Download data for Apple (AAPL) and the S&P 500 (^GSPC)
stock_ticker = "AAPL"
market_ticker = "^GSPC"

end_date = datetime.date.today()
start_date = end_date - datetime.timedelta(days=3*365)  # 3 years of data

df_stock = yf.download(stock_ticker, start=start_date, end=end_date, auto_adjust=False)
df_market = yf.download(market_ticker, start=start_date, end=end_date, auto_adjust=False)

[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed


In [4]:
df_stock.head()

Price,Adj Close,Close,High,Low,Open,Volume
Ticker,AAPL,AAPL,AAPL,AAPL,AAPL,AAPL
Date,,,,,,
2022-02-25,162.221436,164.850006,165.119995,160.869995,163.839996,91974200
2022-02-28,162.487137,165.119995,165.419998,162.429993,163.059998,95056600
2022-03-01,160.597733,163.199997,166.600006,161.970001,164.699997,83474400
2022-03-02,163.904205,166.559998,167.360001,162.949997,164.389999,79724800
2022-03-03,163.579437,166.229996,168.910004,165.550003,168.470001,76678400


In [5]:
df_market.head()

Price,Adj Close,Close,High,Low,Open,Volume
Ticker,^GSPC,^GSPC,^GSPC,^GSPC,^GSPC,^GSPC
Date,,,,,,
2022-02-25,4384.649902,4384.649902,4385.339844,4286.830078,4298.379883,5177060000
2022-02-28,4373.939941,4373.939941,4388.839844,4315.120117,4354.169922,6071370000
2022-03-01,4306.259766,4306.259766,4378.450195,4279.540039,4363.140137,5846230000
2022-03-02,4386.540039,4386.540039,4401.47998,4322.560059,4322.560059,5337870000
2022-03-03,4363.490234,4363.490234,4416.779785,4345.560059,4401.310059,5039890000


The `pct_change()` function in pandas calculates the percent change between the current row’s value and the previous row’s value.

For example, if your data has two consecutive days of adjusted closing prices, `price[t-1]` on day `t-1` and `price[t]` on day `t`, then:

$$
\text{pct change}[t] = \frac{\text{price}[t] - \text{price}[t-1]}{\text{price}[t-1]}
$$

So, if you have:

- `price[t-1] = 100`
- `price[t] = 110`

then

$$
\text{pct change}[t] = \frac{110 - 100}{100} = 0.10 \quad \text{or} \quad 10\%
$$
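
The same arithmetic can be checked in pandas. This is a small sketch on made-up prices (the three values are assumptions chosen to match the example above); it compares `pct_change()` with the formula written out by hand.

```python
import pandas as pd

# Hypothetical prices for three consecutive days
prices = pd.Series([100.0, 110.0, 99.0])

returns = prices.pct_change()          # first value is NaN: no previous price
manual = prices / prices.shift(1) - 1  # same calculation written out

print(returns.tolist())
print(manual.tolist())
```

The first entry is `NaN` because day 0 has no previous price, which is why the notebook later drops missing rows before running the regression.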

You calculate percentage returns so you can analyze how much a stock (or market index) went up or down (in percentage terms) each period. This is crucial in finance because:

1. Returns Are Comparable Over Time:
	- Looking at absolute prices can be misleading, especially if the stock splits or pays dividends.
	- By using `pct_change()`, you’re converting price differences into percentage gains or losses, which are directly comparable across time (and even across different stocks).
2. Key to Performance Analysis:
	- Most portfolio and risk models (including the CAPM) rely on returns rather than raw prices.
	- For example, CAPM requires you to compare the stock’s returns to market returns to find beta.
3. Using Adjusted Close:
	- If you want your returns to reflect the true experience of someone holding the stock, including dividends or splits, you use the adjusted closing price.
	- The adjusted close is retroactively adjusted so that it’s on a consistent basis before/after dividends or splits, giving an accurate reflection of performance.

In short,

```Python
df_stock["Returns"] = df_stock["Adj Close"].pct_change()
df_market["Returns"] = df_market["Adj Close"].pct_change()
```
adds a `Returns` column to each DataFrame showing the day-to-day percentage changes in the adjusted prices. This is usually the right choice for time-series analysis and models like CAPM.

If you use `df_stock["Close"].pct_change()`, you’re calculating returns on the unadjusted prices. This might cause misleading spikes or drops on dividend or split dates, so analysts typically prefer the adjusted version for accurate return calculations.

In [6]:
# 2) Calculate daily returns
df_stock["Returns"] = df_stock["Adj Close"].pct_change()
df_market["Returns"] = df_market["Adj Close"].pct_change()
# For comparison only: returns on the unadjusted close (displayed below)
df_stock["Close"].pct_change()

Ticker,AAPL
Date,
2022-02-25,
2022-02-28,0.001638
2022-03-01,-0.011628
2022-03-02,0.020588
2022-03-03,-0.001981
...,...
2025-02-14,0.012711
2025-02-18,-0.000531
2025-02-19,0.001636
2025-02-20,0.003920


When a company pays dividends or does a stock split, the raw closing price on any given date can look very different from what you’d expect if you’re measuring the true performance of holding that stock. The “Adj Close” (adjusted close) is a version of the closing price that has been retroactively adjusted to reflect these corporate actions, so that historical prices are consistent over time.

Here’s why it matters:
1. Dividends: If a stock pays a dividend, part of the company’s value is effectively handed back to shareholders as cash.
	- The next day, you might see the stock price drop roughly by the dividend amount.
	- The raw close might show a drop, but if you were holding the stock, you received that dividend, so your total return is not just the drop in price.
2.	Stock Splits: When a company splits its stock (e.g., 2-for-1 split), the number of shares doubles and the price roughly halves.
	- If you look at the raw historical price before the split, it may appear artificially high compared to after the split.
	- The adjusted close will factor in the split so that the price history is on a comparable scale.

By adjusting historical prices for these events, the “Adj Close” series reflects the true value of holding the stock over time, including reinvested dividends and after accounting for splits. This is crucial for calculating accurate returns.
- If you ignore splits and dividends (using the raw “Close”), you might see big “jumps” or “drops” on certain days that don’t truly represent gains or losses for a long-term holder.
- If you use “Adj Close,” your time series is on a consistent basis, so those artificial jumps disappear from the return calculations.
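
To make the split case concrete, here is a small sketch on synthetic prices (all numbers are assumptions, and only the split adjustment is modeled, not dividends). It shows the spurious return that raw closes produce on the split date.

```python
import pandas as pd

# Hypothetical raw closes around a 2-for-1 split that takes effect on day 3
raw_close = pd.Series([200.0, 202.0, 101.0, 102.0])

# Split adjustment: divide every pre-split price by the split ratio
split_factor = pd.Series([2.0, 2.0, 1.0, 1.0])
adj_close = raw_close / split_factor

raw_returns = raw_close.pct_change()
adj_returns = adj_close.pct_change()

print(raw_returns.round(4).tolist())  # day 3 shows a spurious -50% return
print(adj_returns.round(4).tolist())  # day 3 shows 0%: the holder lost nothing
```

A regression on the raw series would treat that -50% day as a real loss, which would distort any beta estimated from it.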

In older versions of yfinance, you’d see both "Close" and "Adj Close" columns side by side. With the newer default setting (`auto_adjust=True`), you get a single "Close" column where the data is already adjusted for splits and dividends under the hood. That’s why you might not see "Adj Close" anymore: it’s effectively built into "Close". The download cell in this notebook passes `auto_adjust=False`, which is why both columns appear in the output above.
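
One way to write code that tolerates either layout is to pick the price column at runtime. The helper below, `pick_price_column`, is a hypothetical name introduced for illustration, and it is exercised here only on synthetic frames with flat column labels.

```python
import pandas as pd


def pick_price_column(df):
    """Return the name of the adjusted price column (hypothetical helper).

    Uses "Adj Close" when the frame has it (auto_adjust=False) and falls back
    to "Close", which already holds adjusted prices when auto_adjust=True.
    """
    return "Adj Close" if "Adj Close" in df.columns else "Close"


# Synthetic frames standing in for the two download layouts
unadjusted_layout = pd.DataFrame({"Adj Close": [98.0, 99.0], "Close": [100.0, 101.0]})
adjusted_layout = pd.DataFrame({"Close": [98.0, 99.0]})

print(pick_price_column(unadjusted_layout))  # Adj Close
print(pick_price_column(adjusted_layout))    # Close
```

Note that the fallback cannot tell whether a lone "Close" column is actually adjusted; it simply assumes the data came from a download with `auto_adjust=True`.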

In [7]:
# 3) Merge on same dates and drop NaNs
df_merged = pd.DataFrame({
    "Stock": df_stock["Returns"],
    "Market": df_market["Returns"]
}).dropna()

df_merged.head()

Date,Stock,Market
2022-02-28,0.001638,-0.002443
2022-03-01,-0.011628,-0.015474
2022-03-02,0.020589,0.018643
2022-03-03,-0.001981,-0.005255
2022-03-04,-0.018408,-0.007934


## Regression

You have two variables:
- Y = Stock returns (the dependent variable)
- X = Market returns (the independent variable)

Adding a column of 1s to $X$ (the `sm.add_constant` call in the regression cell) lets the regression estimate two parameters, an intercept and a slope:


$$Y = \alpha + \beta X + \varepsilon,$$


where:
- $\alpha$ is the intercept or constant term,
- $\beta$ is the slope with respect to the market returns,
- $\varepsilon$ is the error term.
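
To see what the column of 1s does, the following sketch fits the same kind of model with NumPy's least-squares solver instead of statsmodels. The data are synthetic (an exact line with intercept 0.001 and slope 1.5, both assumptions), so the effect of dropping the constant is easy to spot.

```python
import numpy as np

# Hypothetical returns that follow y = 0.001 + 1.5 * x exactly
x = np.array([-0.02, -0.01, 0.00, 0.01, 0.03])
y = 0.001 + 1.5 * x

# Design matrix: a column of 1s (for alpha) next to x (for beta)
design = np.column_stack([np.ones_like(x), x])

coefficients = np.linalg.lstsq(design, y, rcond=None)[0]
alpha_hat, beta_hat = coefficients

# Without the column of 1s the fitted line is forced through the origin
beta_no_intercept = np.linalg.lstsq(x.reshape(-1, 1), y, rcond=None)[0][0]

print(alpha_hat, beta_hat)   # recovers the intercept and slope
print(beta_no_intercept)     # slope is biased because alpha is forced to 0
```

With the constant included, the fit recovers both parameters; without it, the slope has to absorb the missing intercept.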

### Using the formula for $\beta$ in a simple linear regression of Y on X:

$$\beta = \frac{\text{Cov}(Y, X)}{\text{Var}(X)},$$

where $\text{Cov}(Y, X)$ is the covariance between $Y$ (stock returns) and $X$ (market returns), and $\text{Var}(X)$ is the variance of $X$.

### Solving for $\alpha$:

$$\alpha = \overline{Y} \;-\; \beta \,\overline{X},$$

where $\overline{Y}$ and $\overline{X}$ are the means of $Y$ and $X$ respectively.

This is precisely what ordinary least squares would give you in a single-variable (univariate) regression. The next cell estimates both parameters with statsmodels’ OLS on `df_merged["Stock"]` and `df_merged["Market"]`; a later cell reproduces them from these formulas using pandas alone.



In [8]:
# 4) Perform a linear regression to find beta
X = df_merged["Market"]
Y = df_merged["Stock"]
X = sm.add_constant(X)  # Adds a constant term to the predictor

model = sm.OLS(Y, X).fit()
alpha, beta = model.params
print("Alpha (Intercept):", alpha)
print("Beta (Slope)     :", beta)
# print(model.summary())

Alpha (Intercept): 0.00012411739362783517
Beta (Slope)     : 1.195827577221563


### 1. Simple Regression Model

We have one independent variable $X$ (e.g., market returns) and one dependent variable $Y$ (e.g., stock returns). The model is:


$$Y = \alpha + \beta X + \varepsilon,$$

where:
- $\alpha$ is the intercept or constant term,
- $\beta$ is the slope with respect to the market returns,
- $\varepsilon$ is the error term.

Goal: Find $\alpha$ and $\beta$ that make the line “fit” the data best in the least squares sense.

---

### 2. Deriving the Formulas via Minimizing Errors

The ordinary least squares (OLS) method says: choose $\alpha$ and $\beta$ to minimize the sum of squared residuals:


$$\text{SSE} = \sum_{i=1}^n (Y_i - [\alpha + \beta X_i])^2.$$


To do that, you set the partial derivatives of SSE with respect to $\alpha$ and $\beta$ to zero.

1. Partial derivative w.r.t. $\alpha$:

$$\frac{\partial}{\partial \alpha} \text{SSE}
= -2 \sum (Y_i - \alpha - \beta X_i)
= 0 \quad \Longrightarrow \quad \alpha = \overline{Y} - \beta\,\overline{X}.$$

This says the best-fit line should pass through $(\overline{X}, \overline{Y})$ — the means of $X$ and $Y$.

2. Partial derivative w.r.t. $\beta$:

$$\frac{\partial}{\partial \beta} \text{SSE}
= -2 \sum X_i \,(Y_i - \alpha - \beta X_i)
= 0.$$

Plugging in $\alpha = \overline{Y} - \beta\,\overline{X}$ and rearranging leads to

$$\sum (X_i - \overline{X})(Y_i - \overline{Y})
= \beta \, \sum (X_i - \overline{X})^2.$$

Divide both sides by $\sum (X_i - \overline{X})^2$:

$$\beta
= \frac{\sum (X_i - \overline{X})(Y_i - \overline{Y})}
{\sum (X_i - \overline{X})^2}.$$


And that’s exactly:


$$\beta
= \frac{\text{Cov}(Y, X)}{\text{Var}(X)},$$


because by definition,
- $\mathrm{Cov}(Y, X) = \frac{1}{n-1}\sum (X_i - \overline{X})(Y_i - \overline{Y}),$
- $\mathrm{Var}(X) = \frac{1}{n-1}\sum (X_i - \overline{X})^2.$

So the $\frac{1}{n-1}$ factor cancels out in the fraction, leaving


$$\beta
= \frac{\sum (X_i - \overline{X})(Y_i - \overline{Y})}
{\sum (X_i - \overline{X})^2}
= \frac{\mathrm{Cov}(Y, X)}{\mathrm{Var}(X)}.$$
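
The derivation can be checked numerically. The sketch below uses five made-up return pairs (an assumption, not the AAPL data), computes the closed-form estimates, and confirms that both partial derivatives of SSE are zero at that point, up to floating-point error.

```python
# Hypothetical daily returns (xs = market, ys = stock)
xs = [0.010, -0.020, 0.015, 0.003, -0.007]
ys = [0.014, -0.021, 0.020, 0.001, -0.010]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Closed-form OLS estimates from the derivation
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
s_xx = sum((x - mean_x) ** 2 for x in xs)
beta_hat = s_xy / s_xx
alpha_hat = mean_y - beta_hat * mean_x

# First-order conditions: both sums should vanish at the optimum
residuals = [y - alpha_hat - beta_hat * x for x, y in zip(xs, ys)]
d_alpha = -2 * sum(residuals)
d_beta = -2 * sum(x * r for x, r in zip(xs, residuals))

print(f"beta={beta_hat:.4f}, alpha={alpha_hat:.6f}")
print(f"dSSE/dalpha={d_alpha:.2e}, dSSE/dbeta={d_beta:.2e}")
```

The names `beta_hat` and `alpha_hat` are used here so the sketch does not overwrite the notebook's own `alpha` and `beta` variables.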

---
### 3. Intuitive Explanation
1.	Slope Means “Change in $Y$ per Unit of $X$”. If $Y$ goes up by 1.5 on average every time $X$ goes up by 1, the slope is 1.5.
2.	Covariance Tells You How $X$ and $Y$ Move Together. $\mathrm{Cov}(Y, X)$ is large and positive if $X$ and $Y$ move in the same direction most of the time and those moves are large in magnitude.
    - If $X$ goes up when $Y$ goes up, the product $(X_i - \overline{X})(Y_i - \overline{Y})$ is often positive.
    - The bigger that product is on average, the bigger the covariance.
3.	Variance Tells You How “Spread Out” $X$ Is. $\mathrm{Var}(X)$ measures how much $X$ fluctuates around its mean.
4.	Ratio = “Joint Movement” / “$X$’s Own Spread”. The slope $\beta = \frac{\mathrm{Cov}(Y, X)}{\mathrm{Var}(X)}$ can be seen as:

$$\text{Slope}
= \frac{\text{How much Y and X move together}}{\text{How much X moves by itself}}.$$

- If X hardly moves at all (tiny variance), a small absolute covariance can lead to a large slope (because a small change in $X$ is associated with a bigger relative change in $Y$).
- If X has a massive variance, the same covariance leads to a smaller slope (it takes a bigger move in X to produce that same difference in $Y$).
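
The effect of the denominator can be illustrated with the covariance and variance printed by the notebook for AAPL and the S&P 500. The doubled variance is a hypothetical what-if, not something observed in the data.

```python
# Values printed by the notebook for AAPL vs. the S&P 500
cov_stock_market = 0.00013907462256440648
var_market = 0.00011629989574879891

slope = cov_stock_market / var_market

# Hypothetical what-if: same covariance, but the market is twice as variable
slope_if_market_noisier = cov_stock_market / (2 * var_market)

print(f"{slope:.4f}")                    # 1.1958
print(f"{slope_if_market_noisier:.4f}")  # 0.5979
```

Holding the covariance fixed, doubling the variance of $X$ halves the slope.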


In [9]:
mean_stock  = df_merged["Stock"].mean()
mean_market = df_merged["Market"].mean()

var_market = df_merged["Market"].var()  # Variance of Market
cov_sm = df_merged[["Stock", "Market"]].cov().iloc[0, 1]  # Cov(Stock, Market)

beta = cov_sm / var_market
alpha = mean_stock - beta * mean_market

print("Mean Stock Return :", mean_stock)
print("Mean Market Return:", mean_market)
print("Variance of Market:", var_market)
print("Cov(Stock,Market) :", cov_sm)

print("Beta  =", beta)
print("Alpha =", alpha)

Mean Stock Return : 0.0006979957398525363
Mean Market Return: 0.0004799005786085602
Variance of Market: 0.00011629989574879891
Cov(Stock,Market) : 0.00013907462256440648
Beta  = 1.1958275772215623
Alpha = 0.00012411739362783582


In [10]:
# 5) Estimate the annualized market return
annual_market_return = mean_market * 252  # ~252 trading days in a year
risk_free_rate = 0.03  # 3% annual risk-free rate (placeholder)
print("Annual Market Return (approx.): {:.2%}".format(annual_market_return))

Annual Market Return (approx.): 12.09%


In [11]:
# 6) Compute AAPL's expected return using CAPM
expected_return_aapl = risk_free_rate + beta * (annual_market_return - risk_free_rate)
print("Annual Market Return (approx.): {:.2%}".format(annual_market_return))
print("AAPL Expected Return (CAPM)   : {:.2%}".format(expected_return_aapl))

Annual Market Return (approx.): 12.09%
AAPL Expected Return (CAPM)   : 13.87%


In [12]:
print(f"{risk_free_rate:.2f} + {beta:.2f} * ({annual_market_return:.2f} - {risk_free_rate:.2f}) = {expected_return_aapl:.2f}")
print(f"{risk_free_rate:.2f} + {beta:.2f} * ({annual_market_return - risk_free_rate:.2f}) = {expected_return_aapl:.2f}")


print(f"{risk_free_rate:.2f} + {1:.2f} * ({annual_market_return - risk_free_rate:.2f}) = {risk_free_rate + 1 *  (annual_market_return - risk_free_rate):.2f}")



0.03 + 1.20 * (0.12 - 0.03) = 0.14
0.03 + 1.20 * (0.09) = 0.14
0.03 + 1.00 * (0.09) = 0.12


### 1. CAPM Formula

The Capital Asset Pricing Model (CAPM) states that the expected return $E(R_i)$ of a stock (or asset) $i$ is:


$$E(R_i)
= R_f + \beta_i \, \bigl( E(R_m) - R_f \bigr),$$


where:
- $R_f$  = Risk-free rate (e.g., return on short-term Treasury bills).
- $E(R_m)$  = Expected return of the market (often proxied by a broad index like the S&P 500).
- $\beta_i$ = the beta of the stock $i$. It measures how sensitive $R_i$ is to $R_m$.
- $E(R_m) - R_f$  is the market risk premium (the extra return investors expect over the risk-free rate).

Interpretation: If a stock carries more systematic risk than the overall market ($\beta > 1$), its expected return should be higher than the market’s, given a positive market risk premium. If it carries less ($\beta < 1$), investors accept a lower expected return than the market’s because they bear less market risk.

---

### 2. Where the Formula Comes From (Intuitive Explanation)
1.	Investors Need Compensation for Time & Risk
	- They can always earn $R_f$ by investing in “safe” government bonds (theoretically risk-free).
	- If an asset is riskier, investors demand a premium above $R_f$.
2.	One-Factor Model
	- CAPM assumes that the main source of systematic **(non-diversifiable)** risk is the entire market’s fluctuation.
	- Hence, we only need one factor: the market return.
	- The size of that exposure to the market’s ups and downs is what $\beta$ measures.
3. Market Risk Premium
	- The difference $E(R_m) - R_f$ is how much more the market is expected to return above the risk-free rate.
	- Each stock’s expected return scales with this market risk premium based on $\beta$.
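
The distinction between systematic and diversifiable risk can be sketched with a simplified one-factor setup. The assumptions here are strong and purely illustrative: every stock has the same beta and the same idiosyncratic variance, idiosyncratic shocks are uncorrelated across stocks, and the portfolio is equally weighted. The function name and the variance values are hypothetical.

```python
def portfolio_variance(n_stocks, beta, market_var, idio_var):
    """Variance of an equally weighted portfolio under a one-factor model.

    Assumes every stock has the same beta and the same idiosyncratic variance,
    and that idiosyncratic shocks are uncorrelated across stocks.
    """
    systematic = beta ** 2 * market_var   # unaffected by diversification
    idiosyncratic = idio_var / n_stocks   # shrinks as 1/N
    return systematic + idiosyncratic


# Hypothetical daily variances
market_var, idio_var = 0.0001, 0.0004

for n in (1, 10, 100, 1000):
    total = portfolio_variance(n, beta=1.0, market_var=market_var, idio_var=idio_var)
    print(n, f"{total:.6f}")
```

As the number of stocks grows, the total approaches the systematic term alone, which is the part of risk that CAPM says investors are paid for.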

---

### 3. Why $\beta$ Is Central in CAPM
1.	Measure of Systematic Risk:
$\beta$ captures the part of the stock’s movement that cannot be eliminated by diversification.
	- For instance, if the market tanks, a high-$\beta$ stock typically falls even more.
	- Low-$\beta$ stocks move less than the market but still move in the same general direction.
2.	Determines Required Return:
	- In the CAPM view, the only risk investors need compensation for is systematic risk (market risk), measured by $\beta$.
	- Idiosyncratic (firm-specific) risk can be **diversified away**, so CAPM does not reward that kind of risk.
3.	Slope in Regression:
	- Empirically, $\beta$ is often found by regressing the stock’s returns on the market’s returns, i.e.

        $$R_{\text{stock}} = \alpha + \beta \, R_{\text{market}} + \varepsilon.$$
    -  The slope $\beta$ tells you how many percentage points the stock typically moves for each 1% move in the market.
4. Example:
	-  If $\beta$ = 1.2, when the market is up 1%, we’d expect the stock to be up about 1.2%. If the market is down 1%, the stock might be down 1.2% (on average).
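
Using the alpha and beta printed by the regression cell earlier, the fitted line gives the average stock move associated with a given market move. The function name `predicted_stock_return` is a hypothetical helper for illustration.

```python
# Estimates printed by the regression cell (daily returns, in decimals)
alpha_hat = 0.00012411739362783517
beta_hat = 1.195827577221563


def predicted_stock_return(market_return):
    """Fitted value of the regression line for a given market return."""
    return alpha_hat + beta_hat * market_return


for market_move in (-0.01, 0.0, 0.01):
    print(f"market {market_move:+.2%} -> stock {predicted_stock_return(market_move):+.2%}")
```

For a 1% market move in either direction, the fitted stock move is roughly 1.2%, shifted slightly by the small daily alpha. These are averages along the regression line; any single day also includes the error term $\varepsilon$.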

---

### 4. Putting It All Together

- CAPM says a stock’s expected return is risk-free return plus a premium for taking on market risk.
- $\beta$ is how you quantify how much market risk the stock carries.
- $\beta > 1$: amplifies market moves, thus higher expected return.
- $\beta < 1$: dampens market moves, so lower expected return.
- If $\beta = 0$, the stock’s expected return equals the risk-free rate (no market exposure).

Hence, $\beta$ is the heart of CAPM: it’s the scaling factor that turns the market’s risk premium into the stock’s expected risk premium. Investors in a high-beta stock expect higher returns as compensation for higher systematic risk.
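
The summary points can be checked by sweeping beta while keeping the notebook's other inputs fixed: the 3% placeholder risk-free rate and the annualized market return computed from the mean daily return. The beta values in the loop are arbitrary illustrations.

```python
risk_free_rate = 0.03  # placeholder used in the notebook
annual_market_return = 0.0004799005786085602 * 252  # mean daily return * 252

expected_by_beta = {}
for b in (0.0, 0.5, 1.0, 1.5):
    # CAPM: risk-free rate plus beta times the market risk premium
    expected_by_beta[b] = risk_free_rate + b * (annual_market_return - risk_free_rate)
    print(f"beta={b:.1f}: {expected_by_beta[b]:.2%}")
```

A beta of 0 returns the risk-free rate, a beta of 1 returns the market's expected return, and values in between or above scale the premium linearly.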