# Machine Learning for Aerospace Engineers
### **Homework 2: SVD, Regularization and the Pseudoinverse**


### General Homework Policies 
* You must not import any Python libraries unless the question explicitly asks you to do so. 
* Some exercises may have a free response question at the end. We are not looking for anything more than a short paragraph for these questions. In some cases, just a sentence will suffice. 

***
## Exercise 4: Regularization to Combat Instability from Perturbation
**10 Points**

Now we'll explore regularization as a strategy to combat the behavior observed in **Exercise 3**. We have provided some code and set $\varepsilon=0.01$. We've defined a `lambda_vals` array which contains 100 logarithmically spaced values of $\lambda$ from $10^{-9}$ to $10^5$. To get credit for this exercise: 

* Compute the SVD of X using `np.linalg.svd(X)`. 
* Loop through the `lambda_vals` vector of regularization parameters to try out. 
* For each loop iteration, solve for $\beta_\varepsilon$ using the regularized Normal Equations (do not use `np.linalg.inv()` ): 
$$ \beta_{\varepsilon} = (X^\top X + \lambda I)^{-1} X^\top Y $$ 
* For each loop iteration, compute the error between $X \beta_\varepsilon$ and $Y$. Append this value to the `errors` list. 
$$ \text{error} = ||X \beta_{\varepsilon} - Y||_2^2 $$ 

* For each loop iteration, compute the 2-norm of $\beta_{\varepsilon}$ and append it to the `norms` list. 
* Make a plot with the error values on the y-axis and the norms of $\beta$ on the x-axis using the `plt.loglog` function for logarithmically scaled x and y axes. Add appropriate axis labels, a grid, and a title. 
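
The regularized solve in the steps above can be written in terms of the SVD, which avoids forming an explicit inverse. The following is a minimal sketch on a made-up $3 \times 2$ system: the matrix `A`, the vector `b`, and the helper name `ridge_via_svd` are illustrative assumptions and not part of the assignment. It relies on the identity $(X^\top X + \lambda I)^{-1} X^\top = V \,\mathrm{diag}\left(\frac{\sigma_i}{\sigma_i^2 + \lambda}\right) U^\top$ for the reduced SVD $X = U \Sigma V^\top$.

```python
import numpy as np

def ridge_via_svd(A, b, lam):
    """Solve (A^T A + lam*I) beta = A^T b using the reduced SVD of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Each singular direction is scaled by s / (s^2 + lam) instead of 1 / s.
    filt = s / (s**2 + lam)
    return Vt.T @ (filt * (U.T @ b))

# Illustrative data only (not the homework's X and Y).
A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
b = np.array([1.0, 2.0, 3.0])
lam = 0.1

beta_svd = ridge_via_svd(A, b, lam)

# Cross-check against a direct linear solve of the regularized normal equations.
beta_ref = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ b)

residual_sq = np.sum((A @ beta_svd - b) ** 2)  # squared 2-norm of the residual
beta_norm = np.linalg.norm(beta_svd)           # 2-norm of the solution
```

Because the SVD is computed once, the same `U`, `s`, and `Vt` could be reused for every $\lambda$ in a sweep; only the filter factors change.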

In [None]:
## CODE PROVIDED; DO NOT MODIFY ---------------------

import matplotlib.pyplot as plt
import numpy as np

# We have set the epsilon to 0.01 
eps = 0.01
# The perturbed X 
X = np.array([
    [1, 100], 
    [0, eps]
])

# The perturbed Y 
Y = np.array([
    1, eps 
])

lambda_vals = np.logspace(-9, 5, 100)

errors = []
norms = []

## BEGIN YOUR CODE: ---------------------------------

# Compute the SVD of X 

# Loop through the values in lambda_vals 
for lambda_val in lambda_vals:
    # Solve for beta using the regularized normal equations above (via the SVD, not np.linalg.inv)

    # Compute the error: the squared 2-norm of the residual X beta - Y 

    # Append the error to the errors list and the 2-norm of beta to the norms list

# Plot the 2-norms of beta on the x-axis and the errors on the y-axis on a log-log plot

### **Free Response:** Explain the trend in this plot. What does this tell us about how we should choose our $\lambda$ to get as close to the original $\beta^\dagger$ as we can?   

*Your response here*

***
## Exercise 5: Overfitting Revisited 

**20 Points**

We can also use regularization when we have an extremely expressive model. Recall the 1-D Bernoulli equation: 

$$ P_1 + \frac{1}{2} \rho v_1^2 + \rho g h_1 = P_2 + \frac{1}{2} \rho v_2^2 + \rho g h_2 $$ 
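
Rearranging the equation above for $P_2$ makes the structure of the target explicit. This is a worked rearrangement added for illustration; the second line assumes $h_1 = h_2$, which the assignment does not state.

$$ P_2 = P_1 + \frac{1}{2} \rho \left( v_1^2 - v_2^2 \right) + \rho g \left( h_1 - h_2 \right) $$ 

$$ P_2 = P_1 + \frac{1}{2} \rho v_1^2 - \frac{1}{2} \rho v_2^2 \qquad (\text{assuming } h_1 = h_2) $$ 

Under that assumption, $P_2$ is a linear combination of the degree-1 monomial $P_1$ and the degree-3 monomials $\rho v_1^2$ and $\rho v_2^2$.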

Let's say we wish to approximate the pressure $P_2$ and we have experimental measurements of $P_1, \rho, v_1$, and $v_2$. We have given you some starter code to generate the data you need. Do not modify this code. You've been given an `X_df` and `Y_df` which contain the input data and output data. To get full credit for this exercise, you must: 

* Load in the `BernoulliData.csv` file into your notebook as a pandas dataframe. 

* Create an `X_df` containing the `P1`, `rho`, `v1`, `v2` columns. Create a `Y_df` containing the `P2` column. 

* Split the `X_df` and `Y_df` dataframes into train and test sets, with 80% of the data used to train a linear regression model and the remaining 20% held out for testing. Name these new arrays `Xtrain`, `Xtest`, `Ytrain`, `Ytest`. You may copy the code we used in Homework 1. Set `random_state=42` for repeatability. 

* Import the `PolynomialFeatures` class from the `sklearn` library. Use this class to transform your data to have degree-3 polynomial features. You may also use the code from Homework 1. 

* Define a function called `mean_absolute_error` which takes in a `Ytrue` and a `Yhat` and computes the Mean Absolute Error between them. 

* Compute a `beta_hat` using the normal equations (no regularization).

* Compute a `Yhat_train` vector to store the model's predictions for the `Xtrain` data. 

* Compute a `Yhat_test` vector to store the model's predictions for the `Xtest` data. 

* Print out the 2-norm of `beta_hat`.

* Print out the MAE between `Ytrain` and `Yhat_train` using your `mean_absolute_error` function. 

* Print out the MAE between `Ytest` and `Yhat_test` using your `mean_absolute_error` function. 

* Plot a scatterplot with `Yhat_train` and `Ytrain` on the X and Y axes, respectively. Color these points blue. On the same plot, plot a scatterplot of `Yhat_test` and `Ytest` on the X and Y axes, respectively. Add appropriate axis labels, a grid, **a legend**, and a title. 
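
The error metric and the unregularized fit described in the list above can be sketched with NumPy alone. This is a hedged illustration on synthetic stand-in data, not `BernoulliData.csv`: the arrays `x`, `y`, `Phi`, and the manual 80/20 split are assumptions made for the example, and the real exercise uses the split and `PolynomialFeatures` code from Homework 1.

```python
import numpy as np

def mean_absolute_error(Ytrue, Yhat):
    """Average absolute difference between targets and predictions."""
    Ytrue = np.asarray(Ytrue, dtype=float).ravel()
    Yhat = np.asarray(Yhat, dtype=float).ravel()
    return np.mean(np.abs(Ytrue - Yhat))

# Synthetic stand-in data: y = 2 + 3*x plus small noise.
rng = np.random.default_rng(42)
x = rng.uniform(0.0, 1.0, size=50)
y = 2.0 + 3.0 * x + 0.01 * rng.standard_normal(50)

# Design matrix with a bias column, then a manual 80/20 split.
Phi = np.column_stack([np.ones_like(x), x])
Phi_train, Phi_test = Phi[:40], Phi[40:]
y_train, y_test = y[:40], y[40:]

# Normal equations without regularization, solved without an explicit inverse.
beta_hat = np.linalg.solve(Phi_train.T @ Phi_train, Phi_train.T @ y_train)

# Predictions on the training and held-out inputs.
Yhat_train = Phi_train @ beta_hat
Yhat_test = Phi_test @ beta_hat

print("||beta_hat||_2 =", np.linalg.norm(beta_hat))
print("train MAE =", mean_absolute_error(y_train, Yhat_train))
print("test MAE  =", mean_absolute_error(y_test, Yhat_test))
```

Flattening both inputs with `ravel()` guards against accidental broadcasting when one argument has shape `(n, 1)` and the other has shape `(n,)`.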

In [None]:
import pandas as pd 

# Load the 'BernoulliData.csv' file into your workspace as a pandas dataframe 

# Create an `X_df` containing the `P1`, `rho`, `v1`, `v2` columns. Create a `Y_df` containing the `P2` column. 


# Split the X_df and Y_df into Xtrain, Xtest, Ytrain, Ytest numpy arrays 

# Use PolynomialFeatures from sklearn.preprocessing to create degree-3 polynomial features 

# Create a function called mean_absolute_error() to compute the MAE between some Ytrue and Yhat vectors

# Compute a beta_hat using ordinary least-squares/the normal equations without any regularization 

# Use this beta_hat to make predictions on the training and testing inputs. Call these predictions Yhat_train and Yhat_test

# Print out the 2-norm of beta_hat 

# Print out the training and testing MAE for the model 

# Make a scatterplot containing the Yhat_train and Yhat_test on the X-Axis and Ytrain and Ytest on the Y-Axis 

### **Free Response:** In your own words, interpret the MAE for the training and testing data. How did our simple linear regression model perform? Why? Be specific. What does the 2-norm of $\hat{\beta}$ tell us? 

*Your response here*

***
## Exercise 6: Regularization to Combat Overfitting

**20 Points**

Now that we know our model isn't performing well on unseen data, let's employ regularization to fix it. To get full credit for this exercise, you must: 

* Define a `lambda_val` variable as our regularization parameter. 

* Compute a new `beta_hat` using the regularized normal equations, i.e., $\hat{\beta} = (X^\top X + \lambda I)^{-1} X^\top Y$. 

* Compute a new `Yhat_train` vector to store the model's predictions for the `Xtrain` data. 

* Compute a new `Yhat_test` vector to store the model's predictions for the `Xtest` data. 

* Print out the new 2-norm of `beta_hat`.

* Print out the new MAE between `Ytrain` and `Yhat_train` using your `mean_absolute_error` function. 

* Print out the new MAE between `Ytest` and `Yhat_test` using your `mean_absolute_error` function. 

* Plot a scatterplot with `Yhat_train` and `Ytrain` on the X and Y axes, respectively. Color these points blue. On the same plot, plot a scatterplot of `Yhat_test` and `Ytest` on the X and Y axes, respectively. Add appropriate axis labels, a grid, **a legend**, and a title. 

* Repeat this process using different values of `lambda_val` until the training MAE is roughly equal to the test MAE (within $\pm 3$ of each other is fine).
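
The trial-and-error search over `lambda_val` described above can be sketched as a small loop. This is an illustration on a synthetic stand-in problem (noisy samples of a sine with degree-7 polynomial features); the data, the helper `ridge_fit`, and the candidate values of `lambda_val` are assumptions for the example and are not tuned for the Bernoulli data.

```python
import numpy as np

def mean_absolute_error(Ytrue, Yhat):
    """Average absolute difference between targets and predictions."""
    return np.mean(np.abs(np.asarray(Ytrue) - np.asarray(Yhat)))

def ridge_fit(X, Y, lambda_val):
    """Solve (X^T X + lambda I) beta = X^T Y without forming an inverse."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lambda_val * np.eye(n_features), X.T @ Y)

def target(x):
    return np.sin(np.pi * x)

# Synthetic stand-in data: 15 noisy training points and 15 noisy test points.
rng = np.random.default_rng(42)
x_train = rng.uniform(-1.0, 1.0, size=15)
x_test = rng.uniform(-1.0, 1.0, size=15)
Ytrain = target(x_train) + 0.2 * rng.standard_normal(15)
Ytest = target(x_test) + 0.2 * rng.standard_normal(15)

# Columns 1, x, x^2, ..., x^7.
Xtrain = np.vander(x_train, 8, increasing=True)
Xtest = np.vander(x_test, 8, increasing=True)

beta_norms = []
for lambda_val in [1e-6, 1e-3, 1e-1, 1e1]:
    beta_hat = ridge_fit(Xtrain, Ytrain, lambda_val)
    train_mae = mean_absolute_error(Ytrain, Xtrain @ beta_hat)
    test_mae = mean_absolute_error(Ytest, Xtest @ beta_hat)
    beta_norms.append(np.linalg.norm(beta_hat))
    print(f"lambda={lambda_val:g}  ||beta||={beta_norms[-1]:.3f}  "
          f"train MAE={train_mae:.3f}  test MAE={test_mae:.3f}")
```

The 2-norm of `beta_hat` does not increase as `lambda_val` grows. The train and test MAE values depend on the data, so the sketch prints them for inspection instead of assuming a particular trend.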

In [None]:
# Compute a new beta_hat using the regularized normal equations 

# Use this beta_hat to make predictions on the training and testing inputs. Call these predictions Yhat_train and Yhat_test


# Print out the 2-norm of beta_hat 

# Print out the training and testing MAE for the model 

# Make a scatterplot containing the Yhat_train and Yhat_test on the X-Axis and Ytrain and Ytest on the Y-Axis 

### **Free Response:** 
* **What value of `lambda_val` did you end up choosing?** 
* **Is this larger or smaller than you expected?** 
* **How does our training MAE with regularization compare to our training MAE without regularization?** 
* **What is the 2-norm of the regularized $\hat{\beta}$? How does it compare to the unregularized $\hat{\beta}$?** 
* **What ingredient of the regression problem does regularization change? Give justification using the theory from class.** 

*Your response here*