## Lasso, Ridge, & Elastic Net Regression

Based on this [YouTube video](https://www.youtube.com/watch?v=ctmNq7FgbvI). The original code is [here](https://github.com/StatQuest/ridge_lasso_elastic_net_demo/blob/master/ridge_lass_elastic_net_demo.R).

In [None]:
library(glmnet)
set.seed(42)

### Create a dataset for testing

In [None]:
n = 1000 # 1000 samples
p = 5000 # 5000 parameters to estimate
real_p = 15 # 15 params will help predict the outcome, the others will just be random noise

x = matrix(rnorm(n*p), nrow=n, ncol=p) # Random matrix with n*p values, spread across n rows and p cols

In [None]:
# Apply will return a vector of 1,000 values that are the sums of the first 15 columns in x
# This way only the first 15 params have anything to do with the outcome of interest
y = apply(x[,1:real_p], 1, sum) + rnorm(n) # + rnorm(n) adds a little noise to the sums
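
The row-wise sum is easy to check on a small matrix. The sketch below uses a hypothetical 4 x 6 matrix `x_small` and a hypothetical `real_p_small` (neither is part of the notebook) and suggests that `apply(..., 1, sum)` gives the same values as base R's `rowSums()`, which is a common shorthand for this step.

```r
# Toy stand-in for x: 4 samples, 6 columns (illustrative only)
x_small <- matrix(1:24, nrow = 4, ncol = 6)
real_p_small <- 3  # pretend only the first 3 columns matter

# Sum across each row, restricted to the first real_p_small columns
sums_apply   <- apply(x_small[, 1:real_p_small], 1, sum)
sums_rowsums <- rowSums(x_small[, 1:real_p_small])

print(sums_apply)
print(all.equal(as.numeric(sums_apply), as.numeric(sums_rowsums)))
```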

### Train-test split

In [None]:
# First param gives the range to sample from (1 to n), second gives the number of samples to draw (66% of n, roughly 2/3)
train_rows = sample(1:n, .66*n)

x.train = x[train_rows,] # Keep only the sampled rows of x for training
x.test = x[-train_rows,] # Drop the sampled rows of x to get the test set

# Repeat with y
y.train = y[train_rows]
y.test = y[-train_rows]
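
To see how positive and negative indexing split the data, here is a minimal sketch on a toy vector. The names `n_small`, `train_rows_small`, `test_rows_small`, and `v` are hypothetical and only serve the illustration; the point is that the two pieces never overlap and together cover every row.

```r
set.seed(42)
n_small <- 10
train_rows_small <- sample(1:n_small, 6)   # 6 of the 10 indices, without replacement
test_rows_small  <- setdiff(1:n_small, train_rows_small)

v <- (1:n_small) * 10                      # toy data, one value per row
v_train <- v[train_rows_small]             # positive index keeps the sampled rows
v_test  <- v[-train_rows_small]            # negative index drops the sampled rows

print(length(intersect(train_rows_small, test_rows_small)))  # no shared indices
print(identical(v_test, v[test_rows_small]))
```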

### Ridge Regression

###### Fit

See the [documentation](https://www.rdocumentation.org/packages/glmnet/versions/4.1-1/topics/cv.glmnet) for `cv.glmnet()` and the [documentation](https://www.rdocumentation.org/packages/glmnet/versions/4.1-1/topics/glmnet) for `glmnet()`. `cv.glmnet()` wraps `glmnet()` in k-fold cross-validation to find the best value of lambda.
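
For reference, the objective that `glmnet()` minimizes for `family='gaussian'`, as stated in the package documentation, can be written as follows. Here $N$ is the number of training samples, $\beta_0$ the intercept, and $\beta$ the coefficient vector; the notation is a sketch and does not appear in the original notebook.

$$
\min_{\beta_0,\,\beta}\; \frac{1}{2N}\sum_{i=1}^{N}\left(y_i - \beta_0 - x_i^{\top}\beta\right)^2 \;+\; \lambda\left[\frac{1-\alpha}{2}\,\lVert\beta\rVert_2^2 + \alpha\,\lVert\beta\rVert_1\right]
$$

With $\alpha = 0$ only the squared (Ridge) penalty remains, with $\alpha = 1$ only the absolute-value (Lasso) penalty remains, and values in between mix the two (Elastic Net). The cross-validation in `cv.glmnet()` searches over $\lambda$ for a fixed $\alpha$.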

In [None]:
# When alpha is set to 0, cv.glmnet() does a Ridge regression

alpha0.fit = cv.glmnet(
    x=x.train,
    y=y.train,
    type.measure='mse',
    nfolds=10,
    alpha=0,
    family='gaussian' # This arg is passed through to glmnet()
)

alpha0.fit

Note that the coefficients drop off in magnitude from V16 onward. Ridge shrinks them toward zero but does not set them exactly to zero.

In [None]:
coef(alpha0.fit)

###### Predict
[Documentation](https://www.rdocumentation.org/packages/glmnet/versions/1.1-1/topics/predict.glmnet) for `predict()`

In [None]:
alpha0.predicted = predict(
    object=alpha0.fit,
    newx=x.test,
    s=alpha0.fit$lambda.1se
)
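
The `s` argument picks which lambda from the cross-validation path is used for prediction. The sketch below, on a small synthetic problem with hypothetical names (`x_demo`, `y_demo`, `fit_demo`), compares the two values that `cv.glmnet()` stores: `lambda.min` and `lambda.1se`. It assumes the `glmnet` package is installed.

```r
library(glmnet)
set.seed(42)

# Small synthetic problem (illustrative only): 100 samples, 20 predictors,
# of which only the first 3 drive the outcome
x_demo <- matrix(rnorm(100 * 20), nrow = 100, ncol = 20)
y_demo <- rowSums(x_demo[, 1:3]) + rnorm(100)

fit_demo <- cv.glmnet(x = x_demo, y = y_demo, alpha = 1, family = "gaussian")

# lambda.min minimizes the cross-validated error; lambda.1se is the largest
# lambda whose error is within one standard error of that minimum
print(c(lambda.min = fit_demo$lambda.min, lambda.1se = fit_demo$lambda.1se))

# s also accepts these two names as strings
pred_1se <- predict(fit_demo, newx = x_demo, s = "lambda.1se")
pred_min <- predict(fit_demo, newx = x_demo, s = "lambda.min")
print(dim(pred_1se))  # one row per sample, one column of predictions
```

Using `lambda.1se`, as the notebook does, favors a slightly simpler model at the cost of a little cross-validated error.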

###### Evaluate

In [None]:
mean((y.test - alpha0.predicted)^2)
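
The same mean squared error expression is repeated for each model, so it could be wrapped in a small helper. The function `mse()` and the toy vectors below are hypothetical and not part of the original notebook; the arithmetic is simple enough to check by hand.

```r
# Hypothetical helper: as.numeric() flattens the one-column matrix
# that predict() returns
mse <- function(actual, predicted) {
  mean((actual - as.numeric(predicted))^2)
}

actual_toy    <- c(1, 2, 3, 4)
predicted_toy <- matrix(c(1.5, 2, 2, 5), ncol = 1)

# Errors are -0.5, 0, 1, -1, so the squared errors are 0.25, 0, 1, 1
mse_toy <- mse(actual_toy, predicted_toy)
print(mse_toy)  # (0.25 + 0 + 1 + 1) / 4 = 0.5625
```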

### Lasso Regression

###### Fit

In [None]:
# When alpha is set to 1, cv.glmnet() does a Lasso regression

alpha1.fit = cv.glmnet(
    x=x.train,
    y=y.train,
    type.measure='mse',
    nfolds=10,
    alpha=1,
    family='gaussian' # This arg is passed through to glmnet()
)

alpha1.fit

Note the coefficients are mostly zero from V16 onward.

In [None]:
coef(alpha1.fit)
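
Why Lasso produces exact zeros while Ridge only shrinks can be illustrated in a textbook special case. Assuming an orthonormal design and the penalties λ/2·β² (Ridge) and λ·|β| (Lasso), the Ridge solution rescales each least-squares coefficient and the Lasso solution soft-thresholds it. This is a sketch under those assumptions, not how `glmnet()` computes its fit; `b_ols`, `ridge_shrink`, and `lasso_shrink` are hypothetical names and the coefficient values are made up.

```r
b_ols  <- c(1.0, 0.8, 0.05, -0.03)  # made-up least-squares coefficients
lambda <- 0.1

# Ridge: proportional shrinkage, never exactly zero for a nonzero input
ridge_shrink <- function(b, lambda) b / (1 + lambda)

# Lasso: soft-thresholding, anything with |b| <= lambda becomes zero
lasso_shrink <- function(b, lambda) sign(b) * pmax(abs(b) - lambda, 0)

ridge_b <- ridge_shrink(b_ols, lambda)
lasso_b <- lasso_shrink(b_ols, lambda)

print(ridge_b)  # all four stay nonzero
print(lasso_b)  # the two small coefficients are set to zero
```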

###### Predict

In [None]:
alpha1.predicted = predict(
    object=alpha1.fit,
    newx=x.test,
    s=alpha1.fit$lambda.1se
)

###### Evaluate

In [None]:
mean((y.test - alpha1.predicted)^2)

### ElasticNet Regression

###### Fit

In [None]:
# When alpha is between 0 and 1 (here 0.5), cv.glmnet() does an Elastic Net regression, a mix of the Ridge and Lasso penalties

alpha0.5.fit = cv.glmnet(
    x=x.train,
    y=y.train,
    type.measure='mse',
    nfolds=10,
    alpha=0.5,
    family='gaussian' # This arg is passed through to glmnet()
)

alpha0.5.fit

###### Predict

In [None]:
alpha0.5.predicted = predict(
    object=alpha0.5.fit,
    newx=x.test,
    s=alpha0.5.fit$lambda.1se
)

###### Evaluate

In [None]:
mean((y.test - alpha0.5.predicted)^2)

### Hyperparameter Tuning for `alpha`

In [None]:
# Initialize an empty list to store the fitted models
list.of.fits = list()

###### Fit

In [None]:
# Loop through 11 values of alpha: 0, 0.1, ..., 1

for (i in 0:10) {
    print(paste0("Fitting at alpha = ", i/10))
    
    # Name the element
    fit.name = paste0("alpha", i/10)
    
    # Train the model
    list.of.fits[[fit.name]] = cv.glmnet(
        x=x.train,
        y=y.train,
        type.measure='mse',
        nfolds=10,
        alpha=i/10,
        family='gaussian' # This arg is passed through to glmnet()
    )
}

###### Predict

In [None]:
# Loop through the same 11 values of alpha: 0, 0.1, ..., 1

results = data.frame() # Initialize empty df

for (i in 0:10) {
    print(paste0("Predicting at alpha = ", i/10))
    
    # Name the element
    fit.name = paste0("alpha", i/10)
    
    # Predict
    predicted = predict(
        object=list.of.fits[[fit.name]],
        newx=x.test,
        s=list.of.fits[[fit.name]]$lambda.1se
    )
        
    mse = mean((y.test - predicted)^2)
        
    temp = data.frame(alpha=i/10, mse=mse, fit.name=fit.name)
    print(temp)
    
    results = rbind(results, temp)
    
}

In [None]:
print(results)
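
Rather than reading the table by eye, the row with the lowest `mse` can be pulled out with `which.min()`. The sketch below uses a hypothetical `results_toy` data frame with placeholder numbers that are not output from this notebook; it only mirrors the column layout of `results`.

```r
# Placeholder values for illustration, NOT results from the notebook
results_toy <- data.frame(
  alpha    = c(0, 0.5, 1),
  mse      = c(3, 2, 1),
  fit.name = c("alpha0", "alpha0.5", "alpha1")
)

# which.min() returns the index of the smallest mse
best <- results_toy[which.min(results_toy$mse), ]
print(best)
```

Applied to the real table, the same line would be `results[which.min(results$mse), ]`.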

### Conclusion

Since `mse` is lowest at `alpha=1`, **Lasso** is still our best model! The exact numbers can vary from run to run because of randomness, but `alpha=1` should give the lowest `mse` or come within a fraction of a point of it.