# Beta Regression Example

In this example, we'll use linear regression to estimate the Beta for General Electric (GE) vs. the S&P 500 using monthly data.

The file "Stock Market Data.txt" is a tab-delimited text file containing monthly data on GE and the S&P 500 from the start of 1990 to the end of October 2018.  The columns are:     
Column 1: The date as text, e.g., 1/1/2007.  
Column 2: The "adjusted" month end closing price for the S&P 500.  
Column 3: The "adjusted" month end closing price for GE.

This file was created using data from Yahoo Finance, which provides "adjusted closing prices".  These aren't the actual closing prices on the day; they are back-adjusted for dividends and stock splits.  The adjustment is made in such a way that you can calculate the return for a stock over a given period (here, one month) as:

$\large \text{Return}=\frac{\text{Adj. Close}_t}{\text{Adj. Close}_{t-1}}-1$
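As a quick check of the formula, here is a minimal sketch that applies it to three made-up adjusted closing prices. The numbers are illustrative only and are not taken from the data file.

```python
import pandas as pd

# Made-up adjusted closing prices, for illustration only.
adj_close = pd.Series([100.0, 110.0, 99.0])

# Return_t = AdjClose_t / AdjClose_{t-1} - 1
# shift(1) moves each price down one row, so row t holds the price from t-1.
returns = adj_close / adj_close.shift(1) - 1

# The first return is NaN because there is no prior price to compare against.
print(returns)
```

The second value is roughly 0.10 (a 10% gain) and the third roughly -0.10 (a 10% loss).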

### Step 1

Pull the data from "Stock Market Data.txt" into a pandas dataframe called stocks.  You'll need to import the correct packages first.  Make sure to import pandas and statsmodels.formula.api in addition to numpy and pyplot.  Show the first five lines of the dataframe.
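One way to approach this is sketched below. So that the snippet runs without the data file, it reads a small in-memory stand-in. The header names `DATE` and `SP500_ADJ_CLOSE` are assumptions (only `GE_ADJ_CLOSE` appears in this notebook), the sketch assumes the file has a header row, and the prices are made up.

```python
import io

import pandas as pd

# Later steps would also need (conventional aliases):
#   import numpy as np
#   import matplotlib.pyplot as plt
#   import statsmodels.formula.api as smf

# With the real file, the read would look like this:
#   stocks = pd.read_csv("Stock Market Data.txt", sep="\t")

# Stand-in for the file: assumed header names and made-up prices.
sample = (
    "DATE\tSP500_ADJ_CLOSE\tGE_ADJ_CLOSE\n"
    "1/1/2007\t100.0\t20.0\n"
    "2/1/2007\t102.0\t21.0\n"
    "3/1/2007\t101.0\t20.5\n"
    "4/1/2007\t104.0\t22.0\n"
    "5/1/2007\t103.0\t21.5\n"
    "6/1/2007\t105.0\t22.5\n"
)

# sep="\t" tells pandas the columns are tab-delimited.
stocks = pd.read_csv(io.StringIO(sample), sep="\t")

# head() returns the first five rows by default.
print(stocks.head())
```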

### Step 2

Change the date column to the datetime64 type and then set it as a DatetimeIndex instead of a column.  Show the .info() on the stocks dataframe.
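A minimal sketch of the conversion, using a tiny made-up dataframe in place of the real one. The column name `DATE` and the month/day/year ordering of the text dates are assumptions; check both against the actual file.

```python
import pandas as pd

# Made-up stand-in for the stocks dataframe; "DATE" is an assumed column name.
stocks = pd.DataFrame({
    "DATE": ["1/1/2007", "2/1/2007", "3/1/2007"],
    "SP500_ADJ_CLOSE": [100.0, 102.0, 101.0],
    "GE_ADJ_CLOSE": [20.0, 21.0, 20.5],
})

# Parse the text dates; the explicit format assumes month/day/year.
stocks["DATE"] = pd.to_datetime(stocks["DATE"], format="%m/%d/%Y")

# Move the parsed dates from a column into the index.
stocks = stocks.set_index("DATE")

# info() should now report a DatetimeIndex and two float columns.
stocks.info()
```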

### Step 3

Plot GE's adjusted close over time.
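A rough sketch of such a plot with matplotlib, again on made-up prices. It assumes the dataframe already has a datetime index and a `GE_ADJ_CLOSE` column.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up prices on a datetime index, standing in for the real data.
stocks = pd.DataFrame(
    {"GE_ADJ_CLOSE": [20.0, 21.0, 20.5, 22.0]},
    index=pd.to_datetime(["2007-01-01", "2007-02-01", "2007-03-01", "2007-04-01"]),
)

fig, ax = plt.subplots()

# Dates on the x-axis, adjusted close on the y-axis.
ax.plot(stocks.index, stocks["GE_ADJ_CLOSE"])
ax.set_xlabel("Date")
ax.set_ylabel("GE adjusted close")
ax.set_title("GE adjusted closing price")

# In a notebook the figure renders when the cell finishes;
# in a script, call plt.show() to display it.
```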

### Step 4

Calculate returns for the S&P 500 and for GE.  Hint:  to get the lagged (t-1) price on a column in a dataframe, you can use .shift().  For example,  
`stocks['GE_lag']=stocks['GE_ADJ_CLOSE'].shift()`  
will create a column of stock prices one period back.  Create columns for lagged prices for the S&P and GE prior to calculating the returns so you can take a look at how this works.

Calculate the returns below.
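The sketch below shows the lag-then-divide pattern on three made-up months. The column names `SP500_ADJ_CLOSE`, `SP500_lag`, `SP500_RET`, and `GE_RET` are hypothetical names chosen for illustration.

```python
import pandas as pd

# Made-up prices; column names other than GE_ADJ_CLOSE are assumptions.
stocks = pd.DataFrame(
    {
        "SP500_ADJ_CLOSE": [100.0, 102.0, 99.96],
        "GE_ADJ_CLOSE": [20.0, 21.0, 18.9],
    },
    index=pd.to_datetime(["2007-01-01", "2007-02-01", "2007-03-01"]),
)

# shift() moves each price down one row, giving the price one period back.
stocks["SP500_lag"] = stocks["SP500_ADJ_CLOSE"].shift()
stocks["GE_lag"] = stocks["GE_ADJ_CLOSE"].shift()

# Return = current adjusted close / previous adjusted close - 1.
stocks["SP500_RET"] = stocks["SP500_ADJ_CLOSE"] / stocks["SP500_lag"] - 1
stocks["GE_RET"] = stocks["GE_ADJ_CLOSE"] / stocks["GE_lag"] - 1

# The first row's lag and return are NaN: there is no earlier month.
print(stocks)
```

Once the lag columns have served their purpose, `stocks['GE_ADJ_CLOSE'].pct_change()` computes the same period-over-period return in a single call.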

### Step 5

Use stocks.describe() to get summary statistics on the returns.

### Step 6

Plot the return on GE (y-axis) vs. the return on the S&P 500 (x-axis).

### Step 7

Run a linear regression of GE's return (y) on the S&P 500's return (x).  Provide a basic regression summary.  What is your estimate of GE's beta with respect to the market?
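For reference, a regression of this kind can be sketched with the statsmodels formula API as follows. The returns here are simulated with a known slope of 1.2 purely so the snippet runs on its own; they say nothing about GE's actual beta. The column names `GE_RET` and `SP500_RET` are hypothetical.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated returns (NOT real data): the true slope is set to 1.2 by construction.
rng = np.random.default_rng(0)
sp_ret = rng.normal(0.005, 0.04, size=200)
ge_ret = 0.001 + 1.2 * sp_ret + rng.normal(0.0, 0.02, size=200)
returns = pd.DataFrame({"SP500_RET": sp_ret, "GE_RET": ge_ret})

# "y ~ x" regresses y on x; an intercept is included automatically.
model = smf.ols("GE_RET ~ SP500_RET", data=returns).fit()
print(model.summary())

# Beta is the slope coefficient on the market return; alpha is the intercept.
beta = model.params["SP500_RET"]
alpha = model.params["Intercept"]
print(f"beta = {beta:.3f}, alpha = {alpha:.4f}")
```

The slope from a one-variable regression equals the covariance of the two return series divided by the variance of the market return, which gives a simple way to sanity-check the fitted beta.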

### Step 8

Repeat the plot above but add the regression line to the diagram.
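A possible shape for this plot is sketched below on simulated returns (not the real data). To keep the snippet self-contained it fits the line with `np.polyfit` at degree 1, which yields the same least-squares slope and intercept as an ordinary linear regression; with a fitted statsmodels result you would use its coefficients instead.

```python
import matplotlib.pyplot as plt
import numpy as np

# Simulated returns (NOT real data), with a true slope of 1.2 by construction.
rng = np.random.default_rng(0)
sp_ret = rng.normal(0.005, 0.04, size=200)
ge_ret = 0.001 + 1.2 * sp_ret + rng.normal(0.0, 0.02, size=200)

# Degree-1 polyfit returns [slope, intercept] of the least-squares line.
slope, intercept = np.polyfit(sp_ret, ge_ret, deg=1)

# Evaluate the fitted line across the observed range of market returns.
x_line = np.linspace(sp_ret.min(), sp_ret.max(), 100)
y_line = intercept + slope * x_line

fig, ax = plt.subplots()
ax.scatter(sp_ret, ge_ret, s=10, alpha=0.6, label="Monthly returns (simulated)")
ax.plot(x_line, y_line, color="red", label=f"Fit: slope = {slope:.2f}")
ax.set_xlabel("S&P 500 return")
ax.set_ylabel("GE return")
ax.legend()

# In a notebook the figure renders when the cell finishes;
# in a script, call plt.show() to display it.
```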