# Assignment 1: Testing the CAPM

The CAPM implies that
\begin{equation*}
E(R_{i}-R_{f})=\beta _{i}E(R_{m}-R_{f}),\qquad \beta _{i}=\frac{\mathrm{cov}(R_{i},R_{m})}{\mathrm{var}(R_{m})},
\end{equation*}

where $R_i$ is the return on a stock, and $R_m$ is the market return. Suppose we have time series on $n$ different stock or portfolio returns $\{R_{it},i=1,\ldots ,n\}_{t=1}^{T}$ and on a "market return" (value-weighted index) $\{R_{mt}\}_{t=1}^{T}$. We also have observations on a risk-free interest rate $\{R_{ft}\}_{t=1}^{T}$, from which we construct the excess returns $r_{it}=R_{it}-R_{ft}$ and $r_{mt}=R_{mt}-R_{ft}$. Now $\beta_i$ can be estimated from the time-series regression
\begin{equation*}
r_{it}=\alpha _{i}+\beta _{i}r_{mt}+\varepsilon _{it},\qquad t=1,\ldots ,T.
\end{equation*}
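As a quick illustration of why this regression estimates $\beta_i$, the sketch below checks on synthetic data that the OLS slope coincides with the sample ratio of covariance to variance. The simulated returns, the seed, and the "true" beta of 1.2 are assumptions made purely for illustration.

```python
import numpy as np

# Synthetic excess returns (illustrative only, not real market data).
rng = np.random.default_rng(0)
T = 500
r_m = rng.normal(0.03, 1.0, size=T)                     # market excess return
r_i = 0.01 + 1.2 * r_m + rng.normal(0.0, 0.5, size=T)   # stock with true beta 1.2

# Sample analogue of cov(r_i, r_m) / var(r_m).
beta_moment = np.cov(r_i, r_m, ddof=1)[0, 1] / np.var(r_m, ddof=1)

# OLS slope from regressing r_i on a constant and r_m.
X = np.column_stack([np.ones(T), r_m])
alpha_ols, beta_ols = np.linalg.lstsq(X, r_i, rcond=None)[0]

print(beta_moment, beta_ols)  # the two estimates agree up to rounding
```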

One way to test the CAPM is as follows: denote by $\bar{r}_{i}$ and $\hat{\beta}_{i}$ the average excess return and estimated $\beta$ of stock $i$. If the model $E(r_{i})=\beta_{i}E(r_{m})$ is valid, then the points $(\hat{\beta}_{i},\bar{r}_{i})$, $i=1,\ldots ,n$, should lie, up to sampling error, on a line with zero intercept and slope $\lambda =E(r_{m})$. This line is called the *security market line*, and $\lambda$ is known as the *market risk premium*.

We can estimate $\lambda $ by OLS in the cross-section regression
\begin{equation*}
\bar{r}_{i}=\lambda \hat{\beta}_{i}+\alpha _{i},\qquad i=1,\ldots ,n.
\end{equation*}
Note that $\hat{\beta}_{i}$ is the regressor, $\lambda$ the coefficient, and $\alpha_i$ the error term.
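For reference, because this regression has a single regressor and no intercept, minimizing $\sum_{i=1}^{n}(\bar{r}_{i}-\lambda \hat{\beta}_{i})^{2}$ over $\lambda$ gives the OLS estimator in closed form:
\begin{equation*}
\hat{\lambda}=\frac{\sum_{i=1}^{n}\hat{\beta}_{i}\bar{r}_{i}}{\sum_{i=1}^{n}\hat{\beta}_{i}^{2}}.
\end{equation*}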

The assignment is to estimate the CAPM betas of the 30 constituent stocks of the Dow (using the return on the Dow as the market return, and the 3-month T-bill rate as the risk-free rate), then estimate the above cross-section regression, and finally plot the security market line superimposed on a scatter plot of $(\hat{\beta}_{i},\bar{r}_{i})$.

**Import the relevant libraries**:

**Obtain, from Yahoo Finance, the daily adjusted closing prices on the Dow (^DJI) from 1/1/2010 to today. Convert them into percentage log returns $r_t=100\log(P_t/P_{t-1})$ and store them in a DataFrame `df`, which has the date as index and 'DJIA' as column name (not ^DJI; this creates a problem for the statsmodels package)**:
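The return transformation can be sketched as follows. The download step is left out; the four prices below are hypothetical stand-ins for the Yahoo Finance series, chosen only to show the mechanics.

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices standing in for the downloaded ^DJI series.
dates = pd.to_datetime(["2010-01-04", "2010-01-05", "2010-01-06", "2010-01-07"])
prices = pd.Series([100.0, 102.0, 101.0, 101.0], index=dates, name="DJIA")

# r_t = 100 * log(P_t / P_{t-1}); the first day has no predecessor, so it is NaN.
df = (100 * np.log(prices / prices.shift(1))).to_frame()
print(df)
```

Naming the Series `DJIA` before calling `to_frame()` is one way to get the required column name without a separate rename.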

**Obtain, from the FRED database, daily data on the 3-month T-bill rate (DTB3) for the same period and divide them by 365 to convert the annualized percentage rate to a daily rate**:

**The list `tickers` below contains the ticker symbols of the 30 constituent stocks of the Dow. Obtain the adjusted closing prices for all of them (using a `for` loop), convert them into percentage log returns, and store them in `df`, using the ticker as column name**:

```python
tickers = ["AXP", "AAPL", "BA", "CAT", "CSCO", "CVX", "XOM", "GE", "GS", "HD", "IBM", "INTC", "JNJ", "KO", "JPM",
           "MCD", "MMM", "MRK", "MSFT", "NKE", "PFE", "PG", "TRV", "UNH", "UTX", "VZ", "V", "WMT", "DIS", "DWDP"]
```

**Convert the raw returns in `df` to excess returns by subtracting the T-bill rate from all columns:**
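A minimal sketch of the subtraction, using made-up numbers. The variable name `rf` for the daily T-bill series is an assumption, as are all values. The point to note is that a plain `df - rf` would try to match the Series index against the column labels of `df`, so the row-wise alignment has to be requested explicitly.

```python
import pandas as pd

dates = pd.to_datetime(["2010-01-05", "2010-01-06", "2010-01-07"])
# Hypothetical daily percentage returns and a daily T-bill rate (already divided by 365).
df = pd.DataFrame({"DJIA": [0.50, -0.20, 0.10], "AXP": [1.00, -0.40, 0.30]}, index=dates)
rf = pd.Series([0.01, 0.01, 0.02], index=dates, name="DTB3")

# axis=0 aligns rf with the row index, so each day's rate is subtracted from every column.
excess = df.sub(rf, axis=0)
print(excess)
```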

**Drop all rows from `df` that contain at least one NaN**:

**Use a `for` loop to estimate a CAPM time series regression for each stock, and store the estimated slope coefficient in a list**. Hint: use string interpolation to construct the regression equation.
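One possible shape for this loop, run here on synthetic data with two tickers and assumed "true" betas so that the result can be sanity-checked. It uses the statsmodels formula interface, with an f-string supplying the string interpolation mentioned in the hint.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic excess returns for two tickers (illustrative only).
rng = np.random.default_rng(1)
T = 1000
df = pd.DataFrame({"DJIA": rng.normal(0.03, 1.0, size=T)})
true_betas = {"AXP": 1.3, "KO": 0.6}   # assumed values, not estimates from real data
for ticker, b in true_betas.items():
    df[ticker] = b * df["DJIA"] + rng.normal(0.0, 0.5, size=T)

betas = []
for ticker in true_betas:
    # The f-string builds formulas such as "AXP ~ DJIA"; an intercept is included by default.
    fit = smf.ols(f"{ticker} ~ DJIA", data=df).fit()
    betas.append(fit.params["DJIA"])

print(betas)
```

With the real data, the loop would iterate over `tickers` instead of the keys of the hypothetical `true_betas` dictionary.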

**Create a new dataframe that has `tickers` as index, and two columns: `beta`, containing the 30 estimated betas, and `meanret`, containing the mean excess returns of the 30 stocks**:
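A small sketch of how such a DataFrame could be assembled, with a shortened ticker list and hypothetical numbers. The name `capm` is an assumption. Assigning the column means as a Series lets pandas match them to the index by ticker rather than by position.

```python
import pandas as pd

tickers = ["AXP", "KO", "V"]     # shortened list for illustration
betas = [1.3, 0.6, 1.1]          # hypothetical estimates, in the same order as tickers
df = pd.DataFrame({"DJIA": [0.5, -0.2, 0.1],
                   "AXP": [0.9, -0.3, 0.3],
                   "KO": [0.2, -0.1, 0.2],
                   "V": [0.6, -0.3, 0.0]})

capm = pd.DataFrame({"beta": betas}, index=tickers)
capm["meanret"] = df[tickers].mean()   # column means, aligned to the index by ticker
print(capm)
```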

**Estimate the security market line by a cross-sectional regression (without intercept), and print a summary of the result**:
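A sketch of the no-intercept regression on simulated inputs. The 30 betas, the assumed slope of 0.04, and the noise level are all invented for illustration. In the statsmodels formula interface, appending `- 1` to the right-hand side drops the intercept.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated betas and mean excess returns (not estimates from real data).
rng = np.random.default_rng(2)
n = 30
beta = rng.uniform(0.5, 1.5, size=n)
capm = pd.DataFrame({"beta": beta,
                     "meanret": 0.04 * beta + rng.normal(0.0, 0.01, size=n)})

# "- 1" removes the intercept, so the fitted line is forced through the origin.
sml = smf.ols("meanret ~ beta - 1", data=capm).fit()
lambda_hat = sml.params["beta"]

# Closed-form check for a regression through the origin.
lambda_check = (capm["beta"] * capm["meanret"]).sum() / (capm["beta"] ** 2).sum()

print(sml.summary())
```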

**Make a scatter plot of $(\hat{\beta}_{i},\bar{r}_{i})$ and overlay it with a red regression line. Add a title and legend, and label the axes**:
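One way the plot could be put together with matplotlib, again on hypothetical values. The labels and title are placeholders, and the slope is computed in closed form here only to keep the sketch self-contained.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical betas and mean excess returns (not estimates from real data).
capm = pd.DataFrame({"beta": [0.6, 0.9, 1.1, 1.3],
                     "meanret": [0.020, 0.035, 0.030, 0.050]},
                    index=["KO", "JNJ", "V", "AXP"])
lambda_hat = (capm["beta"] * capm["meanret"]).sum() / (capm["beta"] ** 2).sum()

fig, ax = plt.subplots()
ax.scatter(capm["beta"], capm["meanret"], label="Stocks")

# The security market line passes through the origin with slope lambda_hat.
grid = np.linspace(0.0, capm["beta"].max() * 1.1, 50)
ax.plot(grid, lambda_hat * grid, color="red", label="Security market line")

ax.set_xlabel("Estimated beta")
ax.set_ylabel("Mean excess return (% per day)")
ax.set_title("Security market line")
ax.legend()
```

Starting the grid at zero makes the zero-intercept restriction visible in the figure.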