# Penalized logistic regression

In this notebook, you will learn how to use penalized logistic regression with the Lasso (L1), Ridge (L2) and Elastic Net (L1 + L2) penalties.

These penalties, added to the cost function, help you train less complex models and thereby limit overfitting.

# Importing the packages

In [1]:
# Dataset used for the classification example
from sklearn.datasets import load_breast_cancer

# Scaler used to standardize the features
from sklearn.preprocessing import StandardScaler

# train_test_split randomly splits the data
# into a training set and a test set
from sklearn.model_selection import train_test_split

# Logistic regression estimator
from sklearn.linear_model import LogisticRegression

# Performance metrics
from sklearn.metrics import accuracy_score, precision_recall_curve, f1_score, roc_auc_score, roc_curve, confusion_matrix

# matplotlib, used to create plots
import matplotlib.pyplot as plt

# numpy, used to work with vectors, matrices and tensors
import numpy as np

# Loading the data

In [2]:
# Features and target for the classification example
breast_cancer = load_breast_cancer()
X_classif = breast_cancer.data
y_classif = breast_cancer.target

Use the scikit-learn function *train_test_split* to split your dataset into two random subsets.

Use a random_state of 123 and keep 10% of your dataset for the test set.

Feel free to consult the [documentation](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html).

In [3]:
# Use train_test_split to create the training and test sets
X_train_classif, X_test_classif, y_train_classif, y_test_classif = train_test_split(X_classif, y_classif,
                                                                    test_size=0.10,
                                                                    random_state=123)
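As a quick sanity check, the sketch below repeats the split and prints the resulting shapes. It assumes the breast cancer dataset bundled with scikit-learn (569 samples, 30 features); the shortened variable names are used only for this illustration.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# Load the same dataset as above, directly as (features, target)
X, y = load_breast_cancer(return_X_y=True)

# Hold out 10% of the rows for testing, with a fixed seed for reproducibility
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.10, random_state=123)

# Number of rows and columns in each subset
print(X_train.shape, X_test.shape)
```

The two subsets should together contain every row of the original dataset exactly once.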

# Step 1: Standardizing the data

Before fitting a penalized linear model, the features should be standardized.

Because the penalty acts on the size of the coefficients, putting every feature on the same scale makes the coefficients comparable and also helps the solver converge.

Feel free to consult the [documentation](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html).

In [4]:
# Instantiate the StandardScaler
scaler = StandardScaler()

# Fit the StandardScaler on the training set only
scaler.fit(X_train_classif)

# Standardize the training set
X_train_classif_norm = scaler.transform(X_train_classif)

# Standardize the test set with the training-set statistics
X_test_classif_norm = scaler.transform(X_test_classif)

In [5]:
print('Mean of the training set : '+str(X_train_classif_norm.mean(axis=0)))
print('Standard deviation of the training set : '+str(X_train_classif_norm.std(axis=0)))

print('Mean of the testing set : '+str(X_test_classif_norm.mean(axis=0)))
print('Standard deviation of the testing set : '+str(X_test_classif_norm.std(axis=0)))


Expected output:

Mean of the training set : [-5.13911830e-16 -3.05745013e-16  8.53483950e-16  1.10003152e-15
  1.54650598e-15  7.12049777e-16 -3.42607887e-17  1.31622144e-16
 -4.95350289e-15  7.66292411e-15  1.53913340e-15  1.52915874e-15
  2.89265140e-16 -2.73435788e-16  1.46914815e-15  1.42854478e-15
  1.64798730e-17 -4.59701721e-16 -1.22119112e-15 -1.23165367e-16
  7.01695646e-16  3.83720833e-15 -9.50628465e-16  2.82326246e-16
  4.31078784e-15 -3.93619599e-16  2.29850861e-16  1.19966970e-16
 -4.12918397e-15  3.31635761e-15]


Standard deviation of the training set : [1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
 1. 1. 1. 1. 1. 1.]


Mean of the testing set : [ 0.05124296 -0.18874973  0.03751984  0.03937833 -0.08482864 -0.18283252
 -0.07809671 -0.05164212 -0.19793932 -0.29018854  0.16691644 -0.05227523
  0.14396937  0.1740289   0.00614669 -0.25922462 -0.16597387 -0.12024983
 -0.11072004 -0.29499964  0.04435633 -0.07442118  0.0294974   0.01545994
 -0.02918828 -0.1089334  -0.05586924 -0.0412101  -0.06118219 -0.18438358]


Standard deviation of the testing set : [0.91480598 0.66909629 0.90978263 1.03432425 0.99367801 0.87525274
 0.90633893 0.90440753 0.71909467 0.84083134 1.57807166 0.87352948
 1.60204167 1.78225409 1.01625717 0.75721494 0.65688383 0.93341027
 0.99874551 0.56807441 0.84475039 0.80330342 0.84141439 0.89898207
 0.95859295 1.1139185  1.010088   0.95613581 1.03214445 1.24999311]
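The test-set means and standard deviations are not exactly 0 and 1 because the scaler reuses the statistics estimated on the training set. The sketch below reproduces the computation by hand on a small made-up array (the values are illustrative, not taken from the dataset) and compares it with StandardScaler.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy data standing in for the training and test sets (illustrative values)
X_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 60.0]])
X_test = np.array([[2.0, 30.0], [4.0, 30.0]])

# Statistics are estimated on the training set only
mu = X_train.mean(axis=0)
sigma = X_train.std(axis=0)  # population standard deviation (ddof=0)

# The same training statistics are applied to both sets
X_train_manual = (X_train - mu) / sigma
X_test_manual = (X_test - mu) / sigma

# A StandardScaler fitted on the training set gives the same values
scaler = StandardScaler().fit(X_train)
print(np.allclose(X_train_manual, scaler.transform(X_train)))
print(np.allclose(X_test_manual, scaler.transform(X_test)))

# The standardized test set is generally not centered at zero
print(X_test_manual.mean(axis=0))
```

Fitting the scaler on the training set alone keeps information from the test set out of the preprocessing step.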

# Step 2: Initializing the models

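For the baseline logistic regression, no hyperparameter is chosen explicitly.

You simply instantiate the class with its default settings. Note that scikit-learn's LogisticRegression applies an L2 penalty with C=1.0 by default, so this baseline is already regularized.

Feel free to consult the [documentation](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html).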


In [6]:
reg = LogisticRegression()

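For Lasso (L1) logistic regression, you must choose the regularization strength.

In scikit-learn's LogisticRegression it is set through the parameter C, the inverse of the regularization strength (C plays the role of 1/alpha): the smaller C is, the stronger the regularization.

$ J(w) = C\sum^m_{i=1}\left[-y^{(i)}\log\hat{y}^{(i)}-(1-y^{(i)})\log(1-\hat{y}^{(i)})\right]+\sum^n_{j=1}|w_j|$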
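To make the role of C concrete, here is a minimal numpy sketch of an L1-penalized logistic cost, written as C times the summed log loss plus the L1 norm of the weights, which is the form given in the scikit-learn user guide. The helper name `l1_penalized_cost` and the toy data are hypothetical and only serve the illustration.

```python
import numpy as np

def l1_penalized_cost(w, b, X, y, C):
    """C * (sum of log losses) + L1 norm of the weights (intercept not penalized)."""
    z = X @ w + b
    # With p = sigmoid(z): -y*log(p) - (1-y)*log(1-p) == log(1 + exp(z)) - y*z
    # np.logaddexp(0, z) computes log(1 + exp(z)) in a numerically stable way
    log_loss = np.logaddexp(0.0, z) - y * z
    return C * log_loss.sum() + np.abs(w).sum()

# Tiny made-up dataset: 4 samples, 2 features, labels in {0, 1}
X = np.array([[0.5, -1.0], [1.5, 0.3], [-0.7, 0.8], [-1.2, -0.4]])
y = np.array([1, 1, 0, 0])

# With all weights at zero, every predicted probability is 0.5
# and the penalty vanishes, so the cost is C * m * log(2)
cost_zero = l1_penalized_cost(np.zeros(2), 0.0, X, y, C=0.8)
print(cost_zero)

# Same weights, two values of C: a smaller C gives the data term less weight
w = np.array([1.0, -0.5])
cost_small_C = l1_penalized_cost(w, 0.0, X, y, C=0.1)
cost_large_C = l1_penalized_cost(w, 0.0, X, y, C=1.0)
print(cost_small_C, cost_large_C)
```

Since the penalty term is not scaled by C, lowering C makes the penalty relatively more important, which is why small values of C produce sparser models.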

For logistic regression in scikit-learn, the same class handles every penalty; you simply specify the type of penalty you want with the *penalty* parameter.

For this example, initialize the model with C=0.8, solver='saga' (the default solver does not support the L1 penalty) and a random_state of 123.

Feel free to consult the [documentation](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html).

In [7]:
lasso = LogisticRegression(penalty='l1', C=0.8, random_state=123, solver='saga')

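For Ridge (L2) logistic regression, you must also choose the regularization strength.

As before, it is set through the parameter C, the inverse of the regularization strength: the smaller C is, the stronger the regularization.

$ J(w) = C\sum^m_{i=1}\left[-y^{(i)}\log\hat{y}^{(i)}-(1-y^{(i)})\log(1-\hat{y}^{(i)})\right]+\frac{1}{2}\sum^n_{j=1}w_j^2$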

For logistic regression in scikit-learn, the same class handles every penalty; you simply specify the type of penalty you want with the *penalty* parameter.

For this example, initialize the model with C=0.8 and a random_state of 123.

Feel free to consult the [documentation](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html).

In [8]:
ridge = LogisticRegression(penalty='l2', C=0.8, random_state=123)

For Elastic Net logistic regression, you must choose two values: the regularization strength and the mixing ratio.

The regularization strength is set through the parameter C, the inverse of the regularization strength: the smaller C is, the stronger the regularization.

The ratio (parameter l1_ratio) is the mixing parameter between Ridge (ratio=0) and Lasso (ratio=1).

$ J(w) = C\sum^m_{i=1}\left[-y^{(i)}\log\hat{y}^{(i)}-(1-y^{(i)})\log(1-\hat{y}^{(i)})\right]+\frac{1-ratio}{2}\sum^n_{j=1}w_j^2 + ratio\sum^n_{j=1}|w_j|$
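The penalty part of this cost can be checked numerically. The sketch below, which uses a hypothetical helper `elasticnet_penalty` and made-up weights, shows that a ratio of 0 gives the pure L2 term and a ratio of 1 gives the pure L1 term, matching scikit-learn's convention for l1_ratio.

```python
import numpy as np

def elasticnet_penalty(w, l1_ratio):
    """Mix of the L2 term (weight 1 - l1_ratio) and the L1 term (weight l1_ratio)."""
    l2_term = 0.5 * np.sum(w ** 2)
    l1_term = np.sum(np.abs(w))
    return (1.0 - l1_ratio) * l2_term + l1_ratio * l1_term

# Illustrative weight vector
w = np.array([0.5, -2.0, 0.0])

ridge_only = elasticnet_penalty(w, 0.0)   # pure Ridge penalty: 0.5 * sum of squares
lasso_only = elasticnet_penalty(w, 1.0)   # pure Lasso penalty: sum of absolute values
half_mix = elasticnet_penalty(w, 0.5)     # average of the two terms
print(ridge_only, lasso_only, half_mix)
```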

For logistic regression in scikit-learn, the same class handles every penalty; you simply specify the type of penalty you want with the *penalty* parameter.

For this example, initialize the model with C=0.8, l1_ratio=0.5, solver='saga' and a random_state of 123.

Feel free to consult the [documentation](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html).

In [9]:
elasticnet = LogisticRegression(penalty='elasticnet', C=0.8, l1_ratio=0.5,
                                random_state=123, solver='saga')

# Step 3: Training the models


Train the four models.

In [10]:
# Baseline logistic regression (default settings)
reg.fit(X_train_classif_norm, y_train_classif)

In [11]:
# Lasso regression
lasso.fit(X_train_classif_norm, y_train_classif)



In [12]:
# Ridge regression
ridge.fit(X_train_classif_norm, y_train_classif)

In [13]:
# ElasticNet regression
elasticnet.fit(X_train_classif_norm, y_train_classif)
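After fitting, it can be worth checking whether the solver stopped because it converged or because it reached its iteration limit. The sketch below rebuilds the Lasso model with the same settings and inspects the fitted attribute `n_iter_`; the shortened variable names are used only for this illustration, and whether the limit is actually reached depends on the installed scikit-learn version.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Rebuild the standardized training set used in this notebook
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.10, random_state=123)
X_train_norm = StandardScaler().fit(X_train).transform(X_train)

# Same settings as the Lasso model above
model = LogisticRegression(penalty='l1', C=0.8, random_state=123, solver='saga')
model.fit(X_train_norm, y_train)

# n_iter_ holds the number of iterations actually run by the solver
iterations = int(model.n_iter_[0])
print('Iterations used:', iterations, 'of', model.max_iter)
if iterations >= model.max_iter:
    print('The solver stopped at max_iter; a larger max_iter may be needed.')
```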



# Step 4: Evaluating the models

Your models are now trained. Use each of the four models to predict the class labels and the class probabilities on the training and test sets.

In [14]:
# Baseline logistic regression: predicted class labels
x_train_reg_prediction = reg.predict(X_train_classif_norm)

x_test_reg_prediction = reg.predict(X_test_classif_norm)

# Baseline logistic regression: predicted probability of class 1
x_train_reg_prediction_proba = reg.predict_proba(X_train_classif_norm)[:, 1]

x_test_reg_prediction_proba = reg.predict_proba(X_test_classif_norm)[:, 1]

In [15]:
# Lasso regression
x_train_lasso_prediction = lasso.predict(X_train_classif_norm)

x_test_lasso_prediction = lasso.predict(X_test_classif_norm)

# Lasso regression: predicted probability of class 1
x_train_lasso_prediction_proba = lasso.predict_proba(X_train_classif_norm)[:, 1]

x_test_lasso_prediction_proba = lasso.predict_proba(X_test_classif_norm)[:, 1]

In [16]:
# Ridge regression
x_train_ridge_prediction = ridge.predict(X_train_classif_norm)

x_test_ridge_prediction = ridge.predict(X_test_classif_norm)

# Ridge regression: predicted probability of class 1
x_train_ridge_prediction_proba = ridge.predict_proba(X_train_classif_norm)[:, 1]

x_test_ridge_prediction_proba = ridge.predict_proba(X_test_classif_norm)[:, 1]

In [17]:
# ElasticNet regression
x_train_elasticnet_prediction = elasticnet.predict(X_train_classif_norm)

x_test_elasticnet_prediction = elasticnet.predict(X_test_classif_norm)

# ElasticNet regression: predicted probability of class 1
x_train_elasticnet_prediction_proba = elasticnet.predict_proba(X_train_classif_norm)[:, 1]

x_test_elasticnet_prediction_proba = elasticnet.predict_proba(X_test_classif_norm)[:, 1]
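The two kinds of prediction are closely related: for a binary logistic regression, the label returned by `predict` corresponds to thresholding the class-1 probability at 0.5. The sketch below checks this on a model fitted with default settings on the full standardized dataset, a simplification made only to keep the example short.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Fit a default logistic regression on the standardized data (illustration only)
X, y = load_breast_cancer(return_X_y=True)
X_norm = StandardScaler().fit_transform(X)
model = LogisticRegression().fit(X_norm, y)

# Class labels as returned by the model
labels = model.predict(X_norm)

# Probability of class 1, thresholded at 0.5
proba_class1 = model.predict_proba(X_norm)[:, 1]
labels_from_proba = (proba_class1 > 0.5).astype(int)

print(np.array_equal(labels, labels_from_proba))
```

The probabilities carry more information than the labels, which is why the AUC is computed from them.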

Compute the AUC for each model.

In [18]:
# Baseline logistic regression
auc_train = roc_auc_score(y_train_classif, x_train_reg_prediction_proba)

auc_test = roc_auc_score(y_test_classif, x_test_reg_prediction_proba)

print('AUC for the training set : '+str(auc_train))

print('AUC for the testing set : '+str(auc_test))

AUC for the training set : 0.9989821381665354
AUC for the testing set : 0.9797979797979799


In [19]:
# Lasso regression
auc_train = roc_auc_score(y_train_classif, x_train_lasso_prediction_proba)

auc_test = roc_auc_score(y_test_classif, x_test_lasso_prediction_proba)

print('AUC for the training set : '+str(auc_train))

print('AUC for the testing set : '+str(auc_test))

AUC for the training set : 0.9984732072498029
AUC for the testing set : 0.9810606060606061


In [20]:
# Ridge regression
auc_train = roc_auc_score(y_train_classif, x_train_ridge_prediction_proba)

auc_test = roc_auc_score(y_test_classif, x_test_ridge_prediction_proba)

print('AUC for the training set : '+str(auc_train))

print('AUC for the testing set : '+str(auc_test))

AUC for the training set : 0.9989493039138428
AUC for the testing set : 0.9797979797979799


In [21]:
# ElasticNet regression
auc_train = roc_auc_score(y_train_classif, x_train_elasticnet_prediction_proba)

auc_test = roc_auc_score(y_test_classif, x_test_elasticnet_prediction_proba)

print('AUC for the training set : '+str(auc_train))

print('AUC for the testing set : '+str(auc_test))

AUC for the training set : 0.9985388757551878
AUC for the testing set : 0.9797979797979799
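The AUC can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below computes it that way on four made-up scores and compares the result with `roc_auc_score`; the helper `auc_by_pairs` is hypothetical and far less efficient than the library function.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def auc_by_pairs(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    # Score difference for every (positive, negative) pair
    diff = pos[:, None] - neg[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (len(pos) * len(neg))

# Illustrative labels and scores
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

pairwise_auc = auc_by_pairs(y_true, scores)
library_auc = roc_auc_score(y_true, scores)
print(pairwise_auc, library_auc)
```

In this toy example three of the four positive/negative pairs are ordered correctly, so both computations give 0.75.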


# Step 5: Impact of the regularization term on the coefficients

Impact of the regularization term for Lasso regression.

In [22]:
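# The values below are passed as C, the inverse of the regularization strength:
# a small value means strong regularization
for alpha_values in [0.01, 0.1, 0.2, 0.5, 1]:
  lasso = LogisticRegression(penalty='l1', C=alpha_values,
                             random_state=123, solver='saga')
  lasso.fit(X_train_classif_norm, y_train_classif)
  print('Alpha = '+str(alpha_values))
  print(lasso.coef_)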

Alpha = 0.01
[[ 0.          0.          0.          0.          0.          0.
   0.          0.          0.          0.          0.          0.
   0.          0.          0.          0.          0.          0.
   0.          0.         -0.26424952  0.         -0.16705349  0.
   0.          0.          0.         -0.53127883  0.          0.        ]]
Alpha = 0.1
[[-0.14507022  0.         -0.12041656  0.          0.          0.
   0.         -0.51510292  0.          0.         -0.24093316  0.
   0.          0.          0.          0.          0.          0.
   0.          0.         -0.91869054 -0.67675036 -0.70933674 -0.38701145
  -0.26333977  0.         -0.09230818 -0.83571792 -0.18593369  0.        ]]




Alpha = 0.2
[[-0.32047349 -0.1616604  -0.26753238 -0.2378532   0.          0.
   0.         -0.4705726   0.          0.         -0.51161577  0.
   0.         -0.17075759  0.          0.          0.          0.
   0.          0.02929312 -0.9082276  -0.79872879 -0.71663174 -0.61963469
  -0.55769254  0.         -0.25854068 -0.78820223 -0.32990701  0.        ]]
Alpha = 0.5
[[-0.44444855 -0.40696847 -0.40738401 -0.43583663  0.          0.
  -0.35461078 -0.58429947  0.          0.12569606 -0.76592696  0.00801643
  -0.31523647 -0.56021762  0.          0.28871981  0.          0.
   0.12038324  0.241992   -0.89618699 -0.92048883 -0.73840128 -0.76301502
  -0.84437162  0.         -0.54846451 -0.8332136  -0.5696944   0.        ]]
Alpha = 1
[[-0.49956233 -0.52281989 -0.47361553 -0.51902645  0.          0.
  -0.6015459  -0.70249323  0.          0.25768842 -0.91244215  0.17541411
  -0.50873419 -0.73945657 -0.14828455  0.51037357  0.          0.
   0.26310448  0.41094537 -0.93100073 -1.09753053 -0.786
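Rather than reading the coefficient arrays by eye, the sparsity induced by the L1 penalty can be summarized by counting the non-zero coefficients for each value of C. The sketch below does this; raising `max_iter` to 5000 is an assumption made here to give the saga solver room to converge, so the counts may differ slightly from the output shown above.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Rebuild the standardized training set used in this notebook
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.10, random_state=123)
X_train_norm = StandardScaler().fit(X_train).transform(X_train)

nonzero_counts = {}
for C in [0.01, 0.1, 1]:
    # max_iter is raised compared with the original cell (assumption, see above)
    model = LogisticRegression(penalty='l1', C=C, random_state=123,
                               solver='saga', max_iter=5000)
    model.fit(X_train_norm, y_train)
    # Number of features that the model actually keeps
    nonzero_counts[C] = int(np.count_nonzero(model.coef_))
    print('C =', C, '-> non-zero coefficients:', nonzero_counts[C])
```

The smaller C is, the fewer features keep a non-zero coefficient.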



Impact of the regularization term for Ridge regression.

In [23]:
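# The values below are passed as C, the inverse of the regularization strength:
# a small value means strong regularization
for alpha_values in [0.01, 0.1, 1, 10]:
  ridge = LogisticRegression(penalty='l2', C=alpha_values,
                             random_state=123, solver='saga')
  ridge.fit(X_train_classif_norm, y_train_classif)
  print('Alpha = '+str(alpha_values))
  print(ridge.coef_)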

Alpha = 0.01
[[-0.22332807 -0.18183998 -0.22109315 -0.21005999 -0.08444644 -0.09918108
  -0.16677098 -0.21736004 -0.06543112  0.08783398 -0.16256353  0.02234146
  -0.13821362 -0.15414772  0.01046901  0.02683516  0.02003664 -0.05175978
   0.04688977  0.0771214  -0.24694482 -0.22193547 -0.23774733 -0.21891494
  -0.1751445  -0.1403803  -0.18043599 -0.24268607 -0.16691238 -0.07579839]]
Alpha = 0.1
[[-0.40740949 -0.41973909 -0.39714213 -0.39459714 -0.12862023 -0.02642965
  -0.374073   -0.43650351 -0.07770733  0.24929369 -0.44283409  0.0979864
  -0.31796138 -0.38646379 -0.09230164  0.24820669  0.03532078 -0.07610843
   0.16401597  0.2510048  -0.51996549 -0.59656969 -0.47592709 -0.46861481
  -0.44791095 -0.15665607 -0.41588697 -0.4977698  -0.43455213 -0.14280643]]
Alpha = 1
[[-0.56050357 -0.61516366 -0.54299684 -0.58942542 -0.14051486  0.17581356
  -0.72922041 -0.73140609 -0.06449194  0.43366429 -0.91011559  0.30619026
  -0.60449697 -0.79908697 -0.29597873  0.66939875 -0.01959067 -0.10291682
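Unlike the L1 penalty, the L2 penalty shrinks the coefficients without setting them exactly to zero. One way to summarize this is the L2 norm of the coefficient vector for each value of C, as in the sketch below. It relies on scikit-learn's default penalty (L2) and default solver with a raised `max_iter`, both of which are choices made for this illustration rather than the settings of the cell above.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Rebuild the standardized training set used in this notebook
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.10, random_state=123)
X_train_norm = StandardScaler().fit(X_train).transform(X_train)

coef_norms = {}
for C in [0.01, 0.1, 1, 10]:
    # L2 is scikit-learn's default penalty, so only C is varied here
    model = LogisticRegression(C=C, max_iter=5000)
    model.fit(X_train_norm, y_train)
    # Overall size of the coefficient vector
    coef_norms[C] = float(np.linalg.norm(model.coef_))
    print('C =', C, '-> L2 norm of the coefficients:', round(coef_norms[C], 3))
```

The norm grows with C, which reflects the weaker regularization.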




Impact of the regularization term and of the L1 ratio for Elastic Net regression.


In [24]:
# alpha_values is passed as C, the inverse of the regularization strength;
# l1_ratio=0 is a pure Ridge penalty and l1_ratio=1 a pure Lasso penalty
for alpha_values, ratio_values in zip([0.01, 0.01, 0.1, 0.5, 1, 1], [0, 1, 0.5, 0.5, 1, 0]):
  elasticnet = LogisticRegression(penalty='elasticnet', C=alpha_values, l1_ratio=ratio_values,
                                  random_state=123, solver='saga')
  elasticnet.fit(X_train_classif_norm, y_train_classif)
  print('Alpha = '+str(alpha_values))
  print('ratio_values = '+str(ratio_values))
  print(elasticnet.coef_)

Alpha = 0.01
ratio_values = 0
[[-0.22332807 -0.18183998 -0.22109315 -0.21005999 -0.08444644 -0.09918108
  -0.16677098 -0.21736004 -0.06543112  0.08783398 -0.16256353  0.02234146
  -0.13821362 -0.15414772  0.01046901  0.02683516  0.02003664 -0.05175978
   0.04688977  0.0771214  -0.24694482 -0.22193547 -0.23774733 -0.21891494
  -0.1751445  -0.1403803  -0.18043599 -0.24268607 -0.16691238 -0.07579839]]
Alpha = 0.01
ratio_values = 1
[[ 0.          0.          0.          0.          0.          0.
   0.          0.          0.          0.          0.          0.
   0.          0.          0.          0.          0.          0.
   0.          0.         -0.26424952  0.         -0.16705349  0.
   0.          0.          0.         -0.53127883  0.          0.        ]]




Alpha = 0.1
ratio_values = 0.5
[[-0.35273939 -0.24863102 -0.33070369 -0.29041355  0.          0.
  -0.10552724 -0.42305855  0.          0.         -0.33737042  0.
  -0.0894243  -0.18854359  0.          0.          0.          0.
   0.          0.07626549 -0.61630964 -0.5688378  -0.53301465 -0.45161405
  -0.4706329   0.         -0.28159144 -0.60600002 -0.32294222  0.        ]]
Alpha = 0.5
ratio_values = 0.5
[[-0.47015469 -0.48502174 -0.44855412 -0.47625106  0.          0.
  -0.53415911 -0.6353275   0.          0.23255182 -0.78423208  0.13857422
  -0.448386   -0.63366039 -0.11859366  0.41524951  0.          0.
   0.22320067  0.34225448 -0.81729677 -0.95199687 -0.69924996 -0.73819185
  -0.81963894  0.         -0.65601491 -0.81021444 -0.68447462 -0.02014517]]




Alpha = 1
ratio_values = 1
[[-0.49956233 -0.52281989 -0.47361553 -0.51902645  0.          0.
  -0.6015459  -0.70249323  0.          0.25768842 -0.91244215  0.17541411
  -0.50873419 -0.73945657 -0.14828455  0.51037357  0.          0.
   0.26310448  0.41094537 -0.93100073 -1.09753053 -0.78600855 -0.85133288
  -0.94370559  0.         -0.7493958  -0.91069814 -0.77234385  0.        ]]
Alpha = 1
ratio_values = 0
[[-0.56050357 -0.61516366 -0.54299684 -0.58942542 -0.14051486  0.17581356
  -0.72922041 -0.73140609 -0.06449194  0.43366429 -0.91011559  0.30619026
  -0.60449697 -0.79908697 -0.29597873  0.66939875 -0.01959067 -0.10291682
   0.36127102  0.59986868 -0.89478149 -1.14127237 -0.78070502 -0.85613101
  -0.85323908 -0.08753569 -0.8184763  -0.84851447 -0.84370488 -0.24482282]]


