### Codio Activity 13.6: Use L1 Regularization to Select Features

**Expected Time = 90 minutes** 

**Total Points = 60** 

This activity focuses on using the L1 regularization penalty to select features in a classification setting. You will explore how the coefficient values change as you increase regularization. Be sure to use the `liblinear` solver in your models throughout.

### Index

- [Problem 1](#-Problem-1)
- [Problem 2](#-Problem-2)
- [Problem 3](#-Problem-3)
- [Problem 4](#-Problem-4)
- [Problem 5](#-Problem-5)

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt


from sklearn.preprocessing import StandardScaler, PolynomialFeatures
from sklearn.linear_model import LogisticRegression
from sklearn.feature_selection import SelectFromModel
from sklearn.pipeline import Pipeline
from sklearn.model_selection import train_test_split, GridSearchCV


import seaborn as sns

### The Data

For this exercise you will use the built-in Titanic passenger dataset from seaborn. Here, you will only use the numeric features. The data is loaded and prepared below. We will use a single set of `X` and `y`, with no train/test split, to explore the effect of added regularization.

In [None]:
data = sns.load_dataset('titanic').dropna()

In [None]:
data.head()

In [None]:
X, y = data.select_dtypes(np.number).drop('survived', axis = 1), data.survived

[Back to top](#-Index)

### Problem 1

#### Scaling the Data

**10 Points**

Because we are using regularization, it is important to have each of the features represented on the same scale. To do so, instantiate a `StandardScaler` and assign it to the `scaler` variable. Next, call the `fit_transform` method of `scaler` with argument `X` to create `X_scaled` below.
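As a rough illustration of what standardization does, the sketch below scales a small made-up array. The names `toy`, `toy_scaler`, and `toy_scaled` are hypothetical and are not part of the graded solution.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: two features on very different scales
toy = np.array([[1.0, 1000.0],
                [2.0, 2000.0],
                [3.0, 3000.0],
                [4.0, 4000.0]])

toy_scaler = StandardScaler()
toy_scaled = toy_scaler.fit_transform(toy)

# After scaling, each column has mean close to 0 and standard deviation close to 1
print(toy_scaled.mean(axis=0))
print(toy_scaled.std(axis=0))
```

Because the L1 penalty acts on coefficient magnitudes, features measured in large units would otherwise be penalized differently from features measured in small units.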

In [None]:
### GRADED

scaler = ''
X_scaled = ''

# YOUR CODE HERE
raise NotImplementedError()

# Answer check
X_scaled.mean()

### Problem 2

#### `C` values to explore

**20 Points**

Next, you want to create an array of different `C` values to explore. Remember that `C` is the inverse of the regularization strength, so small values of `C` correspond to strong regularization.
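To see the inverse relationship concretely, here is a minimal sketch on synthetic data, not the Titanic data. The dataset from `make_classification`, the three `C` values, and all variable names are assumptions chosen for illustration only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 6 features, 2 of them informative
X_toy, y_toy = make_classification(n_samples=200, n_features=6,
                                   n_informative=2, n_redundant=0,
                                   n_clusters_per_class=1, random_state=0)
X_toy = StandardScaler().fit_transform(X_toy)

# Count the nonzero coefficients for a few illustrative C values
nonzero_counts = {}
for C in [1e-4, 1e-1, 10.0]:
    model = LogisticRegression(penalty='l1', solver='liblinear', C=C,
                               random_state=0, max_iter=1000)
    model.fit(X_toy, y_toy)
    nonzero_counts[C] = int(np.sum(model.coef_[0] != 0))

print(nonzero_counts)
```

With a very small `C` the penalty dominates and the L1 solution drives coefficients to exactly zero; as `C` grows, more coefficients are allowed to be nonzero.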

Below, use a `for` loop to iterate over the values of `Cs`. Inside the `for` loop, instantiate a `LogisticRegression` classifier with the L1 penalty, the `liblinear` solver, `random_state=42`, and `max_iter=1000`. Assign it to `lgr` and fit it to `X_scaled` and `y`.

Finally, append the coefficients of the model (`lgr.coef_[0]`) as a list to `coef_list`.


In [None]:
Cs = np.logspace(-5, .5)

In [None]:
### GRADED

coef_list = []

# YOUR CODE HERE
raise NotImplementedError()

### ANSWER CHECK
coef_list[0]

[Back to top](#-Index)

### Problem 3

#### DataFrame of Coefficients

**10 Points**

Next, create a dataframe, `coef_df`, based on the coefficients in `coef_list`.  Set the index of this dataframe to the `Cs` values.  Assign the column names of the new dataframe from the columns of `X`.
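The general pattern of turning a list of coefficient rows into a labeled dataframe can be sketched with made-up numbers. The values and the names `toy_Cs`, `toy_coefs`, `toy_columns`, and `toy_df` below are hypothetical and are not results from the Titanic data.

```python
import numpy as np
import pandas as pd

# Hypothetical values: three C settings and two made-up feature names
toy_Cs = np.array([0.01, 0.1, 1.0])
toy_coefs = [[0.0, 0.0],
             [0.0, 0.4],
             [0.2, 0.9]]
toy_columns = ['feature_a', 'feature_b']

# One row per C value, one column per feature
toy_df = pd.DataFrame(toy_coefs, index=toy_Cs, columns=toy_columns)
print(toy_df)
```

Indexing by the `C` values makes it straightforward to plot each column against `C` later on.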

In [None]:
### GRADED

coef_df = ''

# YOUR CODE HERE
raise NotImplementedError()

### ANSWER CHECK
coef_df.head()

[Back to top](#-Index)

### Problem 4

#### Visualizing the Results

**10 Points**

Below, the coefficients are plotted as regularization increases. Based on this plot, which feature seems more important: `age` or `parch`? Assign your answer as a string to `ans4` below.

<center>
    <img src = 'images/coefl1.png' />
</center>

In [None]:
# plt.figure(figsize = (12, 5))
# plt.semilogx(coef_df)
# plt.gca().invert_xaxis()
# plt.grid()
# plt.legend(list(coef_df.columns));
# plt.title('Increasing Regularization on Titanic Features')
# plt.xlabel("Increasing 1/C")
# plt.savefig('images/coefl1.png')

In [None]:
### GRADED

ans4 = ''

# YOUR CODE HERE
raise NotImplementedError()

### ANSWER CHECK
print(ans4)

[Back to top](#-Index)

### Problem 5

#### Using `SelectFromModel`

**10 Points**

In a similar manner, you can use `SelectFromModel` together with `LogisticRegression` to select features based on coefficient values.  

Below, create an instance of the `SelectFromModel` selector with a `LogisticRegression(C = 0.1, penalty = 'l1', solver = 'liblinear', random_state = 43)` as the estimator and assign it to the `selector` variable. Call the `fit_transform` method of `selector` on `X_scaled` and `y` to select the two most important features.

Assign their names as a list to `best_features` below.

In [None]:
### GRADED

selector = ''
best_features = ''
# YOUR CODE HERE
raise NotImplementedError()

### ANSWER CHECK
print(selector.get_feature_names_out())