# Penalized logistic regression

In this notebook, you will learn how to use penalized logistic regression with the Lasso (L1), Ridge (L2) and Elastic Net (L1 + L2) penalties.

Added to the cost function, these penalties help you train less complex models and limit overfitting.

# Package imports

In [None]:
# Dataset for our classification example
from sklearn.datasets import load_breast_cancer

# StandardScaler standardizes the features (zero mean, unit variance)
from sklearn.preprocessing import StandardScaler

# train_test_split randomly splits the data into a train set and a test set
from sklearn.model_selection import train_test_split

# Logistic regression algorithm
from sklearn.linear_model import LogisticRegression

# Performance metrics
from sklearn.metrics import (
    accuracy_score,
    precision_recall_curve,
    f1_score,
    roc_auc_score,
    roc_curve,
    confusion_matrix,
)

# matplotlib to create graphics
import matplotlib.pyplot as plt

# numpy to work with vectors, matrices and tensors
import numpy as np

# Data import

In [None]:
# Feature matrix and target vector for our classification example
breast_cancer = load_breast_cancer()
X_classif = breast_cancer.data
y_classif = breast_cancer.target

Use the scikit-learn function *train_test_split* to randomly split your dataset into a training set and a test set.

Use a random_state of 123 and keep 10% of your dataset for the test set.

Feel free to use the [doc](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html).
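
As an illustration of the call described above, here is a minimal sketch on toy data. The arrays `X_demo` and `y_demo` and the output names are hypothetical and are not part of the exercise; only `train_test_split`, `test_size` and `random_state` come from scikit-learn.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data (hypothetical): 20 samples, 3 features, binary target
X_demo = np.arange(60, dtype=float).reshape(20, 3)
y_demo = np.array([0, 1] * 10)

# Hold out 10% of the rows; random_state makes the shuffle reproducible
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.1, random_state=123
)

print(X_tr.shape, X_te.shape)  # (18, 3) (2, 3)
```

The function returns the four arrays in the order train features, test features, train targets, test targets, which is the order expected by the unpacking in the next cell.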

In [None]:
# Use the function train_test_split to create your train and test sets
X_train_classif, X_test_classif, y_train_classif, y_test_classif = None

# Step 1 : Data standardization

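Before fitting a penalized linear model, it is important to standardize the features: the penalty acts on the size of the coefficients, so features measured on different scales would be penalized unequally.

Standardization also makes the coefficients comparable across features and helps the solver converge.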

Feel free to use the [doc](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html).
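
The sketch below shows the usual fit-on-train, transform-both pattern on toy data. The arrays and variable names are hypothetical. It also suggests why, in the expected answers further down, the test set is not exactly centered: the test rows are rescaled with statistics learned on the training rows.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy data (hypothetical): two features on very different scales
X_tr_demo = np.array([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0], [4.0, 400.0]])
X_te_demo = np.array([[2.5, 250.0], [5.0, 500.0]])

scaler_demo = StandardScaler()
scaler_demo.fit(X_tr_demo)                   # learns mean_ and scale_ from the training rows only
X_tr_std = scaler_demo.transform(X_tr_demo)  # zero mean, unit standard deviation
X_te_std = scaler_demo.transform(X_te_demo)  # reuses the training statistics

print(X_tr_std.mean(axis=0), X_tr_std.std(axis=0))
print(X_te_std.mean(axis=0), X_te_std.std(axis=0))
```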

In [None]:
# Initialize the StandardScaler
scaler = None

# Fit the StandardScaler on the training set
None

# Standardization of the training set
X_train_classif_norm = None

# Standardization of the test set
X_test_classif_norm = None

In [None]:
print('Mean of the training set : '+str(X_train_classif_norm.mean(axis=0)))
print('Standard deviation of the training set : '+str(X_train_classif_norm.std(axis=0)))

print('Mean of the testing set : '+str(X_test_classif_norm.mean(axis=0)))
print('Standard deviation of the testing set : '+str(X_test_classif_norm.std(axis=0)))

Expected output:

Mean of the training set : [-5.13911830e-16 -3.05745013e-16  8.53483950e-16  1.10003152e-15
  1.54650598e-15  7.12049777e-16 -3.42607887e-17  1.31622144e-16
 -4.95350289e-15  7.66292411e-15  1.53913340e-15  1.52915874e-15
  2.89265140e-16 -2.73435788e-16  1.46914815e-15  1.42854478e-15
  1.64798730e-17 -4.59701721e-16 -1.22119112e-15 -1.23165367e-16
  7.01695646e-16  3.83720833e-15 -9.50628465e-16  2.82326246e-16
  4.31078784e-15 -3.93619599e-16  2.29850861e-16  1.19966970e-16
 -4.12918397e-15  3.31635761e-15]


Standard deviation of the training set : [1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
 1. 1. 1. 1. 1. 1.]


Mean of the testing set : [ 0.05124296 -0.18874973  0.03751984  0.03937833 -0.08482864 -0.18283252
 -0.07809671 -0.05164212 -0.19793932 -0.29018854  0.16691644 -0.05227523
  0.14396937  0.1740289   0.00614669 -0.25922462 -0.16597387 -0.12024983
 -0.11072004 -0.29499964  0.04435633 -0.07442118  0.0294974   0.01545994
 -0.02918828 -0.1089334  -0.05586924 -0.0412101  -0.06118219 -0.18438358]

 
Standard deviation of the testing set : [0.91480598 0.66909629 0.90978263 1.03432425 0.99367801 0.87525274
 0.90633893 0.90440753 0.71909467 0.84083134 1.57807166 0.87352948
 1.60204167 1.78225409 1.01625717 0.75721494 0.65688383 0.93341027
 0.99874551 0.56807441 0.84475039 0.80330342 0.84141439 0.89898207
 0.95859295 1.1139185  1.010088   0.95613581 1.03214445 1.24999311]

# Step 2 : Model initialization

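For a classic (unpenalized) logistic regression, there is no regularization hyperparameter to choose.

Note that scikit-learn's LogisticRegression applies an L2 penalty by default, so to obtain an unpenalized model you have to disable it explicitly with the parameter *penalty* (`penalty=None` in recent versions, `penalty='none'` in older ones).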

Feel free to use the [doc](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html).

In [None]:
reg = None

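In the case of lasso (L1) regression, you have to choose the strength of the regularization.

In scikit-learn's LogisticRegression there is no *alpha* parameter: the regularization is controlled by *C*, the inverse of the regularization strength. The smaller C is, the stronger the penalty.

$ J(w) = C\sum^m_{i=1}-[y^{(i)}\log(\hat{y}^{(i)})+(1-y^{(i)})\log(1-\hat{y}^{(i)})]+\sum^n_{j=1}|w_j|$ 

Here $\hat{y}^{(i)}$ is the predicted probability of the positive class for sample $i$.

For logistic regression in scikit-learn, you use the same class for every penalty and simply specify the type of penalty you want with the parameter *penalty*.

For this example, initialize the regression with penalty='l1', a C of 0.8, solver='saga' and a random_state of 123.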

Feel free to use the [doc](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html).
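
The following is a minimal sketch of an L1-penalized fit, run on synthetic data rather than on the notebook's dataset. The names `X_syn`, `y_syn` and `l1_model`, as well as the `max_iter` value, are assumptions made for the illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic data (hypothetical stand-in for the notebook's dataset)
X_syn, y_syn = make_classification(
    n_samples=300, n_features=10, n_informative=3, n_redundant=0, random_state=0
)
X_syn = StandardScaler().fit_transform(X_syn)

# penalty="l1" selects the lasso penalty; "saga" is one of the solvers that supports it.
# max_iter is raised (an assumption) to give the solver room to converge.
l1_model = LogisticRegression(
    penalty="l1", C=0.8, solver="saga", max_iter=5000, random_state=123
)
l1_model.fit(X_syn, y_syn)

print(l1_model.coef_.shape)
print(np.round(l1_model.coef_, 3))
```

For a binary problem, `coef_` holds one row with one coefficient per feature.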

In [None]:
lasso = None

In the case of ridge (L2) regression, you also have to choose the strength of the regularization.

As for the lasso, scikit-learn's LogisticRegression controls it with *C*, the inverse of the regularization strength: the smaller C is, the stronger the penalty.

$ J(w) = C\sum^m_{i=1}-[y^{(i)}\log(\hat{y}^{(i)})+(1-y^{(i)})\log(1-\hat{y}^{(i)})]+\frac{1}{2}\sum^n_{j=1}w_j^2$ 

For logistic regression in scikit-learn, you use the same class for every penalty and simply specify the type of penalty you want with the parameter *penalty*.

For this example, initialize the regression with penalty='l2', a C of 0.8 and a random_state of 123.

Feel free to use the [doc](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html).

In [None]:
ridge = None

In the case of elastic net regression, you have to choose the strength of the regularization and a mixing ratio.

The strength is controlled by *C*, the inverse of the regularization strength: the smaller C is, the stronger the penalty.

The ratio (parameter *l1_ratio*) is the mixing parameter between ridge (ratio=0) and lasso (ratio=1).

$ J(w) = C\sum^m_{i=1}-[y^{(i)}\log(\hat{y}^{(i)})+(1-y^{(i)})\log(1-\hat{y}^{(i)})]+\frac{1-\text{ratio}}{2}\sum^n_{j=1}w_j^2 + \text{ratio}\sum^n_{j=1}|w_j|$ 

For logistic regression in scikit-learn, you use the same class for every penalty and simply specify the type of penalty you want with the parameter *penalty*.

For this example, initialize the regression with penalty='elasticnet', a C of 0.8, an l1_ratio of 0.5, solver='saga' and a random_state of 123.

Feel free to use the [doc](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html).
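
To make the role of the mixing ratio more concrete, the sketch below fits an elastic net model for three values of `l1_ratio` on synthetic data and counts the coefficients that are exactly zero. The data, the variable names and the `max_iter` value are assumptions made for the illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic data (hypothetical stand-in for the notebook's dataset)
X_syn, y_syn = make_classification(
    n_samples=300, n_features=10, n_informative=3, n_redundant=0, random_state=0
)
X_syn = StandardScaler().fit_transform(X_syn)

coefs_by_ratio = {}
for ratio in (0.0, 0.5, 1.0):
    # l1_ratio=0.0 is a pure L2 penalty, l1_ratio=1.0 a pure L1 penalty
    enet_model = LogisticRegression(
        penalty="elasticnet", l1_ratio=ratio, C=0.8,
        solver="saga", max_iter=5000, random_state=123,
    )
    enet_model.fit(X_syn, y_syn)
    coefs_by_ratio[ratio] = enet_model.coef_
    n_zero = int(np.sum(enet_model.coef_ == 0))
    print("l1_ratio =", ratio, "->", n_zero, "coefficients exactly zero")
```

The elastic net penalty requires the 'saga' solver in scikit-learn, which is why the solver is set explicitly.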

In [None]:
elasticnet = None

# Step 3 : Model training

You must train the four models.

In [None]:
# Classic logistic regression
None

In [None]:
# Lasso regression
None

In [None]:
# Ridge regression
None

In [None]:
# ElasticNet regression
None

# Step 4 : Model validation

Your models are now trained. Use each of the four models to predict the class labels and the class probabilities for your training and testing sets.
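
As a reminder of the difference between the two kinds of prediction, here is a minimal sketch on synthetic data. The model `clf_demo` and the data are hypothetical; `predict`, `predict_proba` and `classes_` are the scikit-learn names.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic data (hypothetical stand-in for the notebook's dataset)
X_syn, y_syn = make_classification(n_samples=200, n_features=5, random_state=0)
clf_demo = LogisticRegression().fit(X_syn, y_syn)

labels = clf_demo.predict(X_syn)       # hard class labels, shape (200,)
proba = clf_demo.predict_proba(X_syn)  # one column per class, shape (200, 2)
proba_pos = proba[:, 1]                # probability of the class clf_demo.classes_[1]

print(labels[:5])
print(np.round(proba[:5], 3))
```

Each row of `predict_proba` sums to 1, and the column of the positive class is the score usually passed to ranking metrics such as the AUC.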

In [None]:
# Classic logistic regression class prediction
x_train_reg_prediction = None

x_test_reg_prediction = None

# Classic logistic regression probability 
x_train_reg_prediction_proba = None

x_test_reg_prediction_proba = None

In [None]:
# Lasso regression
x_train_lasso_prediction = None

x_test_lasso_prediction = None

# Lasso regression probability 
x_train_lasso_prediction_proba = None

x_test_lasso_prediction_proba = None

In [None]:
# Ridge regression
x_train_ridge_prediction = None

x_test_ridge_prediction = None

# Ridge regression probability 
x_train_ridge_prediction_proba = None

x_test_ridge_prediction_proba = None

In [None]:
# ElasticNet regression
x_train_elasticnet_prediction = None

x_test_elasticnet_prediction = None

# ElasticNet regression probability 
x_train_elasticnet_prediction_proba = None

x_test_elasticnet_prediction_proba = None

Compute the AUC of each model on the training and testing sets.
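
The AUC can be computed with `roc_auc_score`, which takes the true labels and a score for the positive class. The tiny example below uses hypothetical labels and scores: three of the four (positive, negative) pairs are ranked correctly, so the AUC is 3/4.

```python
from sklearn.metrics import roc_auc_score

# Hypothetical labels and positive-class scores
y_true_demo = [0, 0, 1, 1]
y_score_demo = [0.1, 0.4, 0.35, 0.8]

auc_demo = roc_auc_score(y_true_demo, y_score_demo)
print(auc_demo)  # 0.75
```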

In [None]:
# Classic logistic regression
auc_train = None

auc_test = None

print('AUC for the training set : '+str(auc_train))

print('AUC for the testing set : '+str(auc_test))

In [None]:
# Lasso regression
auc_train = None

auc_test = None

print('AUC for the training set : '+str(auc_train))

print('AUC for the testing set : '+str(auc_test))

In [None]:
# Ridge regression
auc_train = None

auc_test = None

print('AUC for the training set : '+str(auc_train))

print('AUC for the testing set : '+str(auc_test))

In [None]:
# ElasticNet regression
auc_train = None

auc_test = None

print('AUC for the training set : '+str(auc_train))

print('AUC for the testing set : '+str(auc_test))

# Step 5 : Impact of the regularization term on the coefficients

Impact of the regularization term for the Lasso regression.

In [None]:
for alpha_values in [0.01, 0.1, 0.2, 0.5, 1] :
  # Initialize your model with an L1 penalty, C equal to alpha_values, solver='saga' and a random_state of 123
  lasso = None

  # Train your model using X_train_classif_norm and y_train_classif
  None
  print('Alpha = '+str(alpha_values))
  print(lasso.coef_)

Impact of the regularization term for the Ridge regression.

In [None]:
for alpha_values in [0.01, 0.1, 1, 10] :
  # Initialize your model with an L2 penalty, C equal to alpha_values and a random_state of 123
  ridge = None

  # Train your model using X_train_classif_norm and y_train_classif
  None
  print('Alpha = '+str(alpha_values))
  print(ridge.coef_)

Impact of the regularization term and the l1_ratio for the elasticnet regression.

In [None]:
for alpha_values, ratio_values in zip([0.01, 0.01, 0.1, 0.5, 1, 1], [0, 1, 0.5, 0.5, 1, 0]) :
  # Initialize your model with an elastic net penalty, C equal to alpha_values, l1_ratio equal to ratio_values, solver='saga' and a random_state of 123
  elasticnet = None

  # Train your model using X_train_classif_norm and y_train_classif
  None
  print('Alpha = '+str(alpha_values))
  print('ratio_values = '+str(ratio_values))
  print(elasticnet.coef_)
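
To summarize the effect studied in this step, the sketch below sweeps the parameter C of an L1-penalized model on synthetic data and records the L1 norm and the number of non-zero coefficients. The data, the variable names and the `max_iter` value are assumptions made for the illustration. Since C is the inverse of the regularization strength, the coefficients are expected to shrink as C decreases.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic data (hypothetical stand-in for the notebook's dataset)
X_syn, y_syn = make_classification(
    n_samples=300, n_features=10, n_informative=3, n_redundant=0, random_state=0
)
X_syn = StandardScaler().fit_transform(X_syn)

l1_norms = {}
for C_value in [0.01, 0.1, 1.0]:
    sweep_model = LogisticRegression(
        penalty="l1", C=C_value, solver="saga", max_iter=5000, random_state=123
    )
    sweep_model.fit(X_syn, y_syn)
    # Sum of absolute coefficients: the quantity the L1 penalty shrinks
    l1_norms[C_value] = float(np.abs(sweep_model.coef_).sum())
    n_nonzero = int(np.count_nonzero(sweep_model.coef_))
    print("C =", C_value, "| L1 norm =", round(l1_norms[C_value], 3), "| non-zero =", n_nonzero)
```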