# Q3
## Q3.1
We saw regularization using L1 and L2 penalties. A third commonly used penalty is ElasticNet, which is a weighted combination of the two. With a squared-error loss, the metric would look like:
\begin{equation}
    Err(Y, \hat{Y}) = \sum (Y-\hat{Y})^2 + \alpha ( \beta L_1 + (1-\beta) L_2 )
\end{equation}  
Here, $\alpha$ behaves as it did before: it balances the focus between the prediction error and the regularization term. The additional parameter, $\beta$, sets the relative weight of the L1 and L2 terms. Setting $\beta = 1$ recovers pure L1 regularization, and setting $\beta = 0$ recovers pure L2 regularization.  
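
To make the two limiting cases explicit, assume the penalties are written in terms of the model coefficients $w_j$ (a notation introduced here for illustration, not used above), with $L_1 = \sum_j |w_j|$ and $L_2 = \sum_j w_j^2$. Under that assumption the metric reduces to:
\begin{equation}
    \beta = 1: \quad Err(Y, \hat{Y}) = \sum (Y-\hat{Y})^2 + \alpha \sum_j |w_j|
\end{equation}
\begin{equation}
    \beta = 0: \quad Err(Y, \hat{Y}) = \sum (Y-\hat{Y})^2 + \alpha \sum_j w_j^2
\end{equation}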
  
For the noisy quadratic data that we saw in Section 3, include `sklearn`'s ElasticNet regression model in the comparison and examine the behaviour of your model with different values of $\alpha$ and $\beta$.  
Between L1, L2, and ElasticNet, which one performs best? (You may need to increase the range of x or the amount of noise to see an appreciable difference.)  

Note: `sklearn` uses `l1_ratio` as the parameter name for $\beta$. 
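
As a starting point, the sketch below checks the claim that $\beta = 1$ is the same as L1 regularization, and then sweeps `l1_ratio` at a fixed `alpha`. The data is a stand-in noisy quadratic (an assumption, not the Section 3 dataset), and the polynomial degree, `alpha` value, and `l1_ratio` values are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Stand-in noisy quadratic data (assumed; not the Section 3 dataset)
rng = np.random.default_rng(0)
x = (rng.random((200, 1)) - 0.5) * 10
y = (2 * x**2 - x + 1 + 5 * rng.standard_normal(x.shape)).ravel()

# Degree-5 polynomial features, standardized so the penalty treats them evenly
X = PolynomialFeatures(degree=5, include_bias=False).fit_transform(x)
X = StandardScaler().fit_transform(X)

alpha = 1.0
lasso = Lasso(alpha=alpha, max_iter=10_000).fit(X, y)
enet_as_lasso = ElasticNet(alpha=alpha, l1_ratio=1.0, max_iter=10_000).fit(X, y)
print("l1_ratio=1 matches Lasso:", np.allclose(lasso.coef_, enet_as_lasso.coef_))

# Sweep l1_ratio (beta in the equation above) at a fixed alpha
nonzero_counts = {}
for l1_ratio in [0.1, 0.5, 0.9]:
    mdl = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, max_iter=10_000).fit(X, y)
    nonzero_counts[l1_ratio] = int(np.sum(mdl.coef_ != 0))
    print(f"l1_ratio={l1_ratio}: {nonzero_counts[l1_ratio]} nonzero coefficients")
```

The L2 end is not compared against `Ridge` here because `sklearn` scales the `ElasticNet` and `Ridge` objectives differently, so equal `alpha` values do not correspond directly.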

## Q3.2
This question requires using a Python dictionary. A brief explanation is provided.  
  
Fit the following data using one of the regularized linear regression models:

In [None]:
import numpy as np
from matplotlib import pyplot as plt
x = (np.random.rand(500,1)-0.5)*5
y = (x-3)*(x-0.1)*(x+2)*(4*x+4) + 15*np.random.randn(*x.shape)
plt.plot(x,y,'.')

You can (should) use `sklearn.model_selection.GridSearchCV` to get the optimal value of your regularization weight.  
The function expects a dictionary for its `param_grid` parameter. Python recognizes `{`braces`}` as declaring a dictionary. Values are stored in a dictionary under a key. For example, `my_dict = {'chicken': 4}` would store the data `4` under the key `chicken` in the `my_dict` dictionary. You can get to the stored data using brackets, like you would access an array with indices:  
`my_dict['chicken']`  
  
The dictionary you need to define must use the name of the parameter you're optimizing as a key, mapped to a list of values across which you would like to optimize. For example, if you supplied:  
`my_param_dict = {'alpha': [0.1, 0.5, 1, 10]}`  
The `GridSearchCV.fit` method would fit your model to your data using each of those values separately. The results are then stored in your fitted object under `.cv_results_`. See the [documentation](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html?highlight=gridsearchcv#sklearn.model_selection.GridSearchCV) for another example.  
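
When the dictionary has more than one key, the grid covers every combination of the listed values. A minimal sketch of that expansion, using `sklearn.model_selection.ParameterGrid` to list the combinations without fitting anything (the `l1_ratio` values here are assumed for illustration):

```python
from sklearn.model_selection import ParameterGrid

# Two parameters: 4 alpha values x 3 l1_ratio values
my_param_dict = {'alpha': [0.1, 0.5, 1, 10], 'l1_ratio': [0.2, 0.5, 0.8]}

grid = ParameterGrid(my_param_dict)
print(len(grid))  # number of parameter combinations

# Each entry is a plain dictionary with one value per key
for params in grid:
    print(params)
```

Since every combination is fitted once per cross-validation fold, the cost of the search grows quickly as keys and values are added.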

In [None]:
# Dictionary example
my_dict = {'chicken': 4}
print(my_dict['chicken'])

In [None]:
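# Example use of GridSearchCV
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import GridSearchCV

my_param_dict = {'l1_ratio':[0,0.1,0.4,0.8,1]}
# pick a model (ElasticNet is used here because it accepts l1_ratio)
mdl = ElasticNet()

gsv = GridSearchCV(mdl, param_grid=my_param_dict)
# cv_results_ only exists after fitting; x and y come from the data cell above
gsv.fit(x, y.ravel())
print(gsv.cv_results_)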
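
One possible way to put the pieces together is sketched below: a pipeline that expands `x` into polynomial features, standardizes them, and fits `ElasticNet`, with `GridSearchCV` searching over both `alpha` and `l1_ratio`. The polynomial degree, the grid values, and the use of 5-fold cross-validation are assumptions for illustration, not part of the question. Note that inside a pipeline built with `make_pipeline`, parameter names are prefixed with the lowercase step name and a double underscore.

```python
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Same recipe as the data cell above, with a seeded generator for repeatability
rng = np.random.default_rng(0)
x = (rng.random((500, 1)) - 0.5) * 5
y = (x - 3) * (x - 0.1) * (x + 2) * (4 * x + 4) + 15 * rng.standard_normal(x.shape)

# Degree 6 is an assumed choice, deliberately above the true degree of 4
pipe = make_pipeline(
    PolynomialFeatures(degree=6, include_bias=False),
    StandardScaler(),
    ElasticNet(max_iter=100_000),
)

# Keys are "<step name>__<parameter name>"
param_grid = {
    'elasticnet__alpha': [0.01, 0.1, 1, 10],
    'elasticnet__l1_ratio': [0.1, 0.5, 0.9],
}

gsv = GridSearchCV(pipe, param_grid=param_grid, cv=5)
gsv.fit(x, y.ravel())

# Best combination, then the mean cross-validated score for every combination
print(gsv.best_params_)
for params, score in zip(gsv.cv_results_['params'], gsv.cv_results_['mean_test_score']):
    print(params, round(score, 3))
```

The scores reported here use the estimator's default `score` method, which for regression models is $R^2$; pass a `scoring` argument to `GridSearchCV` if a different metric is preferred.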