In [2]:
## Load in the required packages

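library(quantmod)   ## getSymbols() for downloading price data
library(tidyverse)  ## tibbles and data manipulation
library(reshape2)   ## reshaping data between wide and long formats
library(glmnet)     ## ridge and lasso regression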
 

<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Description" data-toc-modified-id="Description-0"><strong><em>Description</em></strong></a></span></li><li><span><a href="#Read-in-Functions" data-toc-modified-id="Read-in-Functions-1"><strong><em>Read in Functions</em></strong></a></span><ul class="toc-item"><li><span><a href="#Yahoo-Finance-Function:" data-toc-modified-id="Yahoo-Finance-Function:-1.1"><strong><em>Yahoo Finance Function:</em></strong></a></span></li></ul></li></ul></div>

## ***Description*** 

We want to predict S&P 500 returns using a minimum of 10 features. The steps are:

1. Download the data.
2. Define the outcome variable as the next-period return of the S&P 500.
3. Find 10 feature variables that can serve as predictors of S&P 500 returns.
4. Build a ridge regression and a lasso regression using R's glmnet package, and calibrate each model by selecting the lambda value that minimizes the out-of-sample MSE.
5. Summarize the key insights from the two models.
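
The modeling steps above can be sketched roughly as follows. This is a minimal illustration on simulated data, not the notebook's actual pipeline: the feature matrix, the return series, the 70/30 chronological split, and the `pick_lambda` helper are all assumptions made for the example. Only `glmnet()` and its `predict()` method come from the glmnet package.

```r
library(glmnet)

set.seed(42)

## Simulated stand-ins for real data: 300 periods, 10 hypothetical features
n <- 300
p <- 10
features <- matrix(rnorm(n * p), nrow = n, ncol = p,
                   dimnames = list(NULL, paste0("feature_", 1:p)))
sp500_return <- rnorm(n, mean = 0, sd = 0.01)

## Outcome: next-period return, so row t's features pair with the return at t + 1
x <- features[-n, ]
y <- sp500_return[-1]

## Chronological split (assumed 70/30): fit on the early rows, hold out the rest
n_train <- floor(0.7 * nrow(x))
train <- seq_len(n_train)
test <- (n_train + 1):nrow(x)

## Hypothetical helper: fit a lambda path on the training rows, then
## score every lambda on the held-out rows and keep the best one
pick_lambda <- function(alpha) {
    fit <- glmnet(x[train, ], y[train], alpha = alpha)
    preds <- predict(fit, newx = x[test, ])   ## one column per lambda
    mse <- colMeans((preds - y[test])^2)
    list(lambda = fit$lambda[which.min(mse)], mse = min(mse))
}

ridge <- pick_lambda(alpha = 0)  ## alpha = 0 gives the ridge penalty
lasso <- pick_lambda(alpha = 1)  ## alpha = 1 gives the lasso penalty
```

A chronological split is used here rather than random folds because the observations are ordered in time. Whether that matches the calibration intended for this project is a judgment call.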


## ***Read in Functions***

We begin by writing the functions responsible for reading in the data.

### ***Yahoo Finance Function:*** 

First, we write a function that pulls stock data from Yahoo Finance using quantmod. 
The function returns only the date and the adjusted close.


In [3]:
Read_Yahoo_Data <- function(start = as.Date('2020-01-01'), 
                           end = Sys.Date(), ticker){
    ## Function to read in data from yahoo finance
    ## Note: as.Date() expects 'YYYY-MM-DD' by default, so the default
    ## start date is written in that format
    
    ## Bring in data from yahoo finance, from the start to the end date
    df <- getSymbols(ticker, 
                    src = 'yahoo',
                    auto.assign = FALSE, 
                    from = start, 
                    to = end) 
    
    
    ## Add in the date column
    dates <- index(df) 
    df <- as_tibble(df) 
    df$date <- dates
    
    ## Rename all the columns 
    names(df) <- c('open', 'high', 'low', 'close', 'volume', 'adjusted', 'date') 
    
   
    
    ## We really only need the adjusted close and the date column
    return(df[,c('date', 'adjusted')]) 
    
}
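
Since the function returns only a date and an adjusted close, one natural next step is converting those prices to period returns. The sketch below assumes a small made-up price table in place of a live download, and `to_returns` is a hypothetical helper that is not part of the original notebook.

```r
## Toy stand-in for the output of Read_Yahoo_Data (made-up prices)
prices <- data.frame(
    date = as.Date('2020-01-01') + 0:3,
    adjusted = c(100, 102, 101, 103.02)
)

## Hypothetical helper: simple return r_t = P_t / P_{t-1} - 1
## The first date is dropped because it has no prior price
to_returns <- function(df) {
    data.frame(
        date = df$date[-1],
        ret = df$adjusted[-1] / df$adjusted[-nrow(df)] - 1
    )
}

returns <- to_returns(prices)
```

The result has one fewer row than the input, with each return dated at the end of its period.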