### L1 and L2 (Lasso and Ridge) Regularization Feature Selection

### Embedded Method 

Like the Wrapper Method, the Embedded Method makes use of a machine learning model. 
The difference between the two is that the Embedded Method examines the training iterations of the ML model 
and ranks each feature by how much it contributed to the model's training 
(e.g. LASSO and Ridge regularization).

* Lasso regression, or L1 regularization
* Ridge regression, or L2 regularization
* Elastic net, or L1/L2 regularization

##### Details

L1 regularization shrinks some of the coefficients to exactly zero, which means the corresponding 
features are multiplied by zero when estimating the target. 
They add nothing to the final prediction, so these features can be removed.

L2 regularization, on the other hand, shrinks coefficients toward zero but does not set them exactly to zero, 
which is why only L1 is normally used for feature selection.

L1/L2 regularization (elastic net) combines the L1 and L2 penalties. 
Because it includes the L1 penalty, it can also leave some features with a zero coefficient, similar to L1.
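
The difference between the three penalties can be seen on a small synthetic regression problem. The sketch below is only an illustration: the data set from `make_regression`, the `alpha` values and the `l1_ratio` are assumptions chosen for the demo, not values used elsewhere in this notebook.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet, Lasso, Ridge

# Synthetic data (assumed for illustration): 10 features, only 3 informative
X, y = make_regression(n_samples=200, n_features=10, n_informative=3,
                       noise=5.0, random_state=0)

models = {
    "lasso (L1)": Lasso(alpha=1.0),
    "ridge (L2)": Ridge(alpha=1.0),
    "elastic net (L1/L2)": ElasticNet(alpha=1.0, l1_ratio=0.5),
}

# Count how many coefficients each penalty sets to exactly zero
zero_counts = {}
for name, model in models.items():
    model.fit(X, y)
    zero_counts[name] = int(np.sum(model.coef_ == 0))
    print(f"{name}: {zero_counts[name]} of {X.shape[1]} coefficients are exactly zero")
```

With settings like these, the lasso typically zeroes out uninformative features while ridge keeps every coefficient non-zero, although the exact counts depend on the data and on `alpha`.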

In [3]:
import pandas as pd
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt  
%matplotlib inline

In [42]:
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.feature_selection import SelectKBest, SelectPercentile
from sklearn.feature_selection import SelectFromModel

In [5]:
from sklearn.linear_model import LinearRegression, LogisticRegression

In [18]:
from sklearn.preprocessing import LabelEncoder

### Load the Dataset

In [6]:
titanic = sns.load_dataset("titanic")

In [7]:
titanic.head()

Unnamed: 0,survived,pclass,sex,age,sibsp,parch,fare,embarked,class,who,adult_male,deck,embark_town,alive,alone
0,0,3,male,22.0,1,0,7.25,S,Third,man,True,,Southampton,no,False
1,1,1,female,38.0,1,0,71.2833,C,First,woman,False,C,Cherbourg,yes,False
2,1,3,female,26.0,0,0,7.925,S,Third,woman,False,,Southampton,yes,True
3,1,1,female,35.0,1,0,53.1,S,First,woman,False,C,Southampton,yes,False
4,0,3,male,35.0,0,0,8.05,S,Third,man,True,,Southampton,no,True


In [8]:
titanic.isnull().sum()

survived         0
pclass           0
sex              0
age            177
sibsp            0
parch            0
fare             0
embarked         2
class            0
who              0
adult_male       0
deck           688
embark_town      2
alive            0
alone            0
dtype: int64

In [9]:
titanic.shape

(891, 15)

In [12]:
titanic.drop(["deck", "age"], axis=1, inplace=True)

In [14]:
titanic = titanic.dropna()

In [15]:
titanic.isnull().sum()

survived       0
pclass         0
sex            0
sibsp          0
parch          0
fare           0
embarked       0
class          0
who            0
adult_male     0
embark_town    0
alive          0
alone          0
dtype: int64

In [17]:
titanic.dtypes

survived          int64
pclass            int64
sex              object
sibsp             int64
parch             int64
fare            float64
embarked         object
class          category
who              object
adult_male         bool
embark_town      object
alive            object
alone              bool
dtype: object

In [21]:
cols = titanic.select_dtypes(exclude=['int64', 'float64']).columns

In [22]:
cols

Index(['sex', 'embarked', 'class', 'who', 'adult_male', 'embark_town', 'alive',
       'alone'],
      dtype='object')

In [25]:
le = LabelEncoder()

# Encode every non-numeric column as integer codes
for col in cols:
    titanic[col] = le.fit_transform(titanic[col])
titanic.head()

Unnamed: 0,survived,pclass,sex,sibsp,parch,fare,embarked,class,who,adult_male,embark_town,alive,alone
0,0,3,1,1,0,7.25,2,2,1,1,2,0,0
1,1,1,0,1,0,71.2833,0,0,2,0,0,1,0
2,1,3,0,0,0,7.925,2,2,2,0,2,1,1
3,1,1,0,1,0,53.1,2,0,2,0,2,1,0
4,0,3,1,0,0,8.05,2,2,1,1,2,0,1
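
To read the encoded table it helps to know how `LabelEncoder` assigns its codes: it sorts the distinct labels and numbers them from zero. The short sketch below uses a made-up list of labels (an assumption for illustration) to show why `male` became `1` and `female` became `0` in the `sex` column.

```python
from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
codes = le.fit_transform(["male", "female", "female", "male"])

# classes_ holds the distinct labels in sorted order; the index is the code
mapping = {label: int(code) for code, label in enumerate(le.classes_)}
print(mapping)
print(list(codes))
```

Note that these integer codes carry an arbitrary ordering, which a linear model will treat as numeric magnitude.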


In [27]:
titanic.dtypes

survived         int64
pclass           int64
sex              int32
sibsp            int64
parch            int64
fare           float64
embarked         int32
class            int32
who              int32
adult_male       int64
embark_town      int32
alive            int32
alone            int64
dtype: object

In [32]:
titanic.corr()['alive'].sort_values(ascending=False)

alive          1.000000
survived       1.000000
who            0.323191
fare           0.255290
parch          0.083151
sibsp         -0.034040
embark_town   -0.169718
embarked      -0.169718
alone         -0.206207
class         -0.335549
pclass        -0.335549
sex           -0.541585
adult_male    -0.555520
Name: alive, dtype: float64
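
The correlations above show that `alive` duplicates `survived`, and that `class` and `embark_town` carry the same information as `pclass` and `embarked`. One way to spot such redundant pairs programmatically is sketched below; the small DataFrame and the `0.999` cutoff are assumptions made for the example, not part of the Titanic data.

```python
import pandas as pd

# Toy frame (assumed for illustration) with two pairs of duplicated columns
df = pd.DataFrame({
    "survived": [0, 1, 1, 1, 0, 0],
    "alive":    [0, 1, 1, 1, 0, 0],
    "pclass":   [3, 1, 3, 1, 3, 2],
    "class":    [2, 0, 2, 0, 2, 1],
    "fare":     [7.25, 71.28, 7.93, 53.10, 8.05, 13.00],
})

corr = df.corr().abs()
cols = list(corr.columns)

# Collect each pair of columns whose absolute correlation is (almost) 1
redundant_pairs = [
    (a, b)
    for i, a in enumerate(cols)
    for b in cols[i + 1:]
    if corr.loc[a, b] > 0.999
]
print(redundant_pairs)
```

Keeping only one column from each such pair avoids leaking the target into the features and avoids feeding the model the same signal twice.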

In [33]:
data = titanic[["pclass", "sex", "sibsp", "parch", "embarked", "who", "alone"]]

In [34]:
data.head()

Unnamed: 0,pclass,sex,sibsp,parch,embarked,who,alone
0,3,1,1,0,2,1,0
1,1,0,1,0,0,2,0
2,3,0,0,0,2,2,1
3,1,0,1,0,2,2,0
4,3,1,0,0,2,1,1


In [35]:
X = data.copy()
y = titanic['survived']

In [50]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=43)

In [51]:
X.shape, y.shape

((889, 7), (889,))

### Estimating the Coefficients of Linear Regression

In [52]:
sel = SelectFromModel(LinearRegression())

In [53]:
sel.fit(X_train, y_train)

SelectFromModel(estimator=LinearRegression(copy_X=True, fit_intercept=True,
                                           n_jobs=None, normalize=False),
                max_features=None, norm_order=1, prefit=False, threshold=None)

In [54]:
sel.get_support()

array([False,  True, False, False, False, False, False])

In [55]:
sel.estimator_.coef_

array([-0.12325546, -0.57379285, -0.05811876, -0.05730094, -0.03440765,
       -0.10291758, -0.15058774])

In [57]:
np.mean(np.abs(sel.estimator_.coef_))

0.15719728224511445

In [58]:
features = X_train.columns[sel.get_support()]
features

Index(['sex'], dtype='object')
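
Only `sex` is kept because, for an estimator without an L1 penalty, `SelectFromModel` uses the mean absolute coefficient as its default threshold and keeps features at or above it. The sketch below reproduces that rule by hand with the coefficients printed above; the column list is copied from `data`, and the variable names are just for this example.

```python
import numpy as np

columns = ["pclass", "sex", "sibsp", "parch", "embarked", "who", "alone"]
coef = np.array([-0.12325546, -0.57379285, -0.05811876, -0.05730094,
                 -0.03440765, -0.10291758, -0.15058774])

importance = np.abs(coef)          # importance = absolute coefficient
threshold = importance.mean()      # default threshold: mean importance
mask = importance >= threshold     # keep features at or above the threshold

selected = [c for c, keep in zip(columns, mask) if keep]
print(round(float(threshold), 4), selected)
```

`alone` (0.1506) falls just below the mean of about 0.1572, which is why it is dropped along with the other small coefficients.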

In [59]:
X_Train_reg = sel.transform(X_train)
X_Test_reg = sel.transform(X_test)

In [60]:
X_Train_reg.shape, X_Test_reg.shape

((622, 1), (267, 1))

### Build the ML model

In [69]:
def randomforest(X_train, X_test, y_train, y_test):
    # Fit a random forest on the given features and print its test accuracy
    rf = RandomForestClassifier(n_estimators=100, n_jobs=-1, random_state=0)
    rf.fit(X_train, y_train)
    y_pred = rf.predict(X_test)
    print("Accuracy on test set:", accuracy_score(y_test, y_pred))

In [70]:
%%time
randomforest(X_Train_reg, X_Test_reg, y_train, y_test)

Accuracy on test set: 0.8164794007490637
Wall time: 398 ms


In [71]:
%%time
randomforest(X_train, X_test, y_train, y_test)

Accuracy on test set: 0.8239700374531835
Wall time: 454 ms


### Logistic Regression coefficient with L1 regularization

In [76]:
sel = SelectFromModel(LogisticRegression(penalty="l1", C=0.05, solver='liblinear'))
sel.fit(X_train, y_train)
sel.get_support()

array([ True,  True, False, False, False,  True, False])

In [77]:
sel.estimator_.coef_

array([[-0.1401699 , -1.11224209,  0.        ,  0.        ,  0.        ,
         0.34498233,  0.        ]])
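
In `LogisticRegression`, `C` is the inverse of the regularization strength, so a smaller `C` means a stronger L1 penalty and usually more coefficients at exactly zero. The sketch below illustrates this on synthetic data; the data set from `make_classification` and the grid of `C` values are assumptions chosen for the demo.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic data (assumed for illustration): 10 features, 3 informative
X, y = make_classification(n_samples=300, n_features=10, n_informative=3,
                           n_redundant=0, random_state=0)

# Count the surviving (non-zero) coefficients for each penalty strength
nonzero_by_C = {}
for C in (0.01, 0.1, 1.0, 10.0):
    clf = LogisticRegression(penalty="l1", C=C, solver="liblinear",
                             random_state=0)
    clf.fit(X, y)
    nonzero_by_C[C] = int(np.count_nonzero(clf.coef_))
    print(f"C={C}: {nonzero_by_C[C]} non-zero coefficients")
```

Choosing `C` therefore controls how aggressive the selection is; in practice it is a hyperparameter worth tuning rather than fixing in advance.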

In [78]:
features = X_train.columns[sel.get_support()]
features

Index(['pclass', 'sex', 'who'], dtype='object')

In [81]:
X_Train_l1 = sel.transform(X_train)
X_Test_l1 = sel.transform(X_test)

In [82]:
X_Train_l1.shape

(622, 3)

In [84]:
%%time
randomforest(X_Train_l1, X_Test_l1, y_train, y_test)

Accuracy on test set: 0.8239700374531835
Wall time: 382 ms


### Logistic Regression coefficient with L2 regularization

In [85]:
sel = SelectFromModel(LogisticRegression(penalty="l2", C=0.05, solver='liblinear'))
sel.fit(X_train, y_train)
sel.get_support()

array([False,  True, False, False, False,  True, False])

In [86]:
features = X_train.columns[sel.get_support()]
features

Index(['sex', 'who'], dtype='object')

In [87]:
X_Train_l2 = sel.transform(X_train)
X_Test_l2 = sel.transform(X_test)

In [89]:
%%time
randomforest(X_Train_l2, X_Test_l2, y_train, y_test)

Accuracy on test set: 0.8052434456928839
Wall time: 418 ms
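
The select-then-train steps used throughout this notebook can also be chained into a single scikit-learn `Pipeline`, so that the selector is fitted only on the training data and applied automatically at prediction time. This is a minimal sketch on synthetic data; the data set and the step names `"select"` and `"forest"` are assumptions for the example, while the estimator settings mirror the L1 cell above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

# Synthetic data (assumed for illustration): 10 features, 3 informative
X, y = make_classification(n_samples=1000, n_features=10, n_informative=3,
                           n_redundant=0, n_clusters_per_class=1,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=43)

# Step 1 selects features with L1 logistic regression, step 2 trains the forest
pipe = Pipeline([
    ("select", SelectFromModel(
        LogisticRegression(penalty="l1", C=0.05, solver="liblinear"))),
    ("forest", RandomForestClassifier(n_estimators=100, random_state=0)),
])
pipe.fit(X_train, y_train)

n_selected = int(pipe.named_steps["select"].get_support().sum())
test_accuracy = pipe.score(X_test, y_test)
print(f"{n_selected} features selected, test accuracy {test_accuracy:.3f}")
```

Wrapping both steps this way also makes it straightforward to cross-validate the whole procedure, including the choice of `C`.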
