##  L1 REGULARIZATION (simple model)

In data_reg.csv, you'll find data for a bunch of points including six predictor variables and one outcome variable. Use sklearn's Lasso class to fit a linear regression model to the data, while also using L1 regularization to control for model complexity.

In [9]:
#Add import statements
import numpy as np
import pandas as pd
from sklearn.linear_model import Lasso
import matplotlib.pyplot as plt

## Load in the data

In [2]:
'''
Split the data so that the six predictor features (first six columns) are stored in X, 
and the outcome feature (last column) is stored in y.
'''
train_data = pd.read_csv('data_reg.csv',header=None)
X = train_data.iloc[:,:-1]
y = train_data.iloc[:,-1]
X

|     | 0 | 1 | 2 | 3 | 4 | 5 |
|-----|---|---|---|---|---|---|
| 0 | 1.25664 | 2.04978 | -6.23640 | 4.71926 | -4.26931 | 0.20590 |
| 1 | -3.89012 | -0.37511 | 6.14979 | 4.94585 | -3.57844 | 0.00640 |
| 2 | 5.09784 | 0.98120 | -0.29939 | 5.85805 | 0.28297 | -0.20626 |
| 3 | 0.39034 | -3.06861 | -5.63488 | 6.43941 | 0.39256 | -0.07084 |
| 4 | 5.84727 | -0.15922 | 11.41246 | 7.52165 | 1.69886 | 0.29022 |
| ... | ... | ... | ... | ... | ... | ... |
| 95 | -4.58240 | -1.27825 | 7.55098 | 8.83930 | -3.80318 | 0.04386 |
| 96 | -10.00364 | 2.66002 | -4.26776 | -3.73792 | -0.72349 | -0.24617 |
| 97 | -4.32624 | -2.30314 | -8.16044 | 4.46366 | -3.33569 | -0.01655 |
| 98 | -1.90167 | -0.15858 | -10.43466 | 4.89762 | -0.64606 | -0.14519 |


## Fit data using linear regression with Lasso regularization
 - Create an instance of sklearn's Lasso class and assign it to the variable lasso_reg. You don't need to set any parameter values: use the default values for this code.
 - Use the Lasso object's .fit() method to fit the regression model to the data.

In [3]:
#Create the linear regression model with lasso regularization.
lasso_reg = Lasso()

#Fit the model.
lasso_reg.fit(X, y)

Lasso(alpha=1.0, copy_X=True, fit_intercept=True, max_iter=1000,
      normalize=False, positive=False, precompute=False, random_state=None,
      selection='cyclic', tol=0.0001, warm_start=False)

## Inspect the coefficients of the regression model

In [4]:
#Retrieve and print out the coefficients from the regression model.
reg_coef = lasso_reg.coef_
print(reg_coef)

[ 0.          2.35793224  2.00441646 -0.05511954 -3.92808318  0.        ]


### For which of the predictor features (X) has the lasso regularization step zeroed the corresponding coefficient?
##### The printed coefficients show that the first and the last predictors have been zeroed.
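Reading zeros off a printed array gets tedious with more features. A minimal sketch of doing it programmatically follows. Because data_reg.csv is not bundled here, it uses a synthetic stand-in (an assumption), and the names `X_demo`, `y_demo` and `lasso_demo` are hypothetical.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Synthetic stand-in for data_reg.csv (assumption): 100 rows, 6 predictors,
# where only columns 1 and 3 actually influence the outcome.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 6))
y_demo = 5.0 * X_demo[:, 1] - 4.0 * X_demo[:, 3] + rng.normal(scale=0.1, size=100)

# Same default settings as in the cell above (alpha=1.0).
lasso_demo = Lasso().fit(X_demo, y_demo)

# Column indices whose coefficient is exactly zero, and those that survived.
zeroed = np.flatnonzero(lasso_demo.coef_ == 0)
kept = np.flatnonzero(lasso_demo.coef_ != 0)
print("zeroed features:", zeroed)
print("kept features:", kept)
```

The same two `np.flatnonzero` lines should work unchanged on `lasso_reg.coef_` from the cell above.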

##  L1 REGULARIZATION (complex/polynomial model)

In [5]:
from sklearn.preprocessing import PolynomialFeatures

In [6]:
# Create a transformer that generates polynomial features up to degree 4.
poly_feat = PolynomialFeatures(degree = 4)

# Expand X into those polynomial features (powers and interaction terms).
X_poly = poly_feat.fit_transform(X)
# pd.DataFrame(X_poly)
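To get a feel for how much the expansion grows the model, the sketch below counts the columns produced for six inputs at degree 4. It uses a placeholder array of zeros rather than the real data (only the shape matters), and `X_shape_demo` is a hypothetical name.

```python
from math import comb

import numpy as np
from sklearn.preprocessing import PolynomialFeatures

n_features, degree = 6, 4

# Placeholder rows (assumption): the values are irrelevant, only the shape matters.
X_shape_demo = np.zeros((3, n_features))
X_shape_demo_poly = PolynomialFeatures(degree=degree).fit_transform(X_shape_demo)

# Number of monomials of total degree <= 4 in 6 variables,
# including the constant (bias) column that PolynomialFeatures adds by default.
expected_cols = comb(n_features + degree, degree)
print(X_shape_demo_poly.shape, expected_cols)
```

The first generated column is the constant term. Since Lasso fits its own intercept by default, the coefficient for that column comes out as 0.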

In [7]:
#Create the linear regression model with lasso regularization.
#Note: tol=1 is far looser than the default of 1e-4, so the solver stops early.
lasso_reg1 = Lasso(max_iter=2000,tol=1)
#Fit the model.
lasso_reg1.fit(X_poly, y)

Lasso(alpha=1.0, copy_X=True, fit_intercept=True, max_iter=2000,
      normalize=False, positive=False, precompute=False, random_state=None,
      selection='cyclic', tol=1, warm_start=False)

In [8]:
reg_coef1 = lasso_reg1.coef_
print(reg_coef1)

[ 0.00000000e+00  0.00000000e+00  3.00406732e+00  1.78623934e+00
 -2.30803479e-01 -3.71384816e+00  0.00000000e+00 -1.95309557e-02
 -0.00000000e+00 -1.46557498e-03  4.36299806e-03 -0.00000000e+00
  0.00000000e+00 -0.00000000e+00 -2.45727864e-02 -1.01845590e-02
  0.00000000e+00  0.00000000e+00  2.16431684e-03  4.85729985e-04
 -0.00000000e+00  0.00000000e+00  3.10117551e-03 -7.60551752e-03
 -0.00000000e+00  0.00000000e+00 -0.00000000e+00  0.00000000e+00
 -1.79015576e-04 -6.60026412e-03  2.79593429e-03  2.13528975e-03
 -1.85709370e-03  1.02252022e-01 -8.94814737e-04 -0.00000000e+00
 -2.71088626e-03  5.56251694e-03  0.00000000e+00 -7.99245716e-05
 -8.05436897e-05  2.53141781e-03  0.00000000e+00 -2.98481699e-05
  2.48767581e-03  0.00000000e+00 -1.77469974e-03  0.00000000e+00
 -0.00000000e+00 -1.41137351e-03 -5.42600005e-03  1.96914071e-03
 -9.85972197e-03  0.00000000e+00 -2.52634618e-04  2.47306094e-04
  4.12408140e-04  0.00000000e+00 -2.05851988e-04  0.00000000e+00
 -0.00000000e+00  1.79069

#### Here you can see that many of the coefficients are 0

#### For L2 regularization, use sklearn's Ridge class instead of Lasso
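As a rough illustration of that swap, the sketch below fits both estimators on the same data and counts exact zeros. The data is synthetic (an assumption, since data_reg.csv is not bundled here), and the `*_syn` names are hypothetical.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Synthetic data (assumption): 6 predictors, only columns 1 and 3 matter.
rng = np.random.default_rng(0)
X_syn = rng.normal(size=(100, 6))
y_syn = 5.0 * X_syn[:, 1] - 4.0 * X_syn[:, 3] + rng.normal(scale=0.1, size=100)

# Both estimators use their default alpha=1.0.
lasso_syn = Lasso().fit(X_syn, y_syn)
ridge_syn = Ridge().fit(X_syn, y_syn)

# Count the coefficients that are exactly zero under each penalty.
lasso_zeros = int(np.sum(lasso_syn.coef_ == 0))
ridge_zeros = int(np.sum(ridge_syn.coef_ == 0))
print("Lasso coefficients:", np.round(lasso_syn.coef_, 3), "zeros:", lasso_zeros)
print("Ridge coefficients:", np.round(ridge_syn.coef_, 3), "zeros:", ridge_zeros)
```

On data like this, Ridge typically leaves the irrelevant coefficients small but nonzero, while Lasso sets them to exactly zero.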

## WHICH ONE TO USE (L1 VS L2)

<table style="width:100%">
  <tr>
    <th>L1</th>
    <th>L2</th>
  </tr>
  <tr>
    <td>Computationally inefficient (unless the data is sparse). It looks simpler because nothing is squared, but the absolute value is harder to differentiate (it has no derivative at 0).</td>
    <td>Computationally efficient, because squared terms have simple derivatives.</td>
  </tr>
  <tr>
      <td>Its main benefit is FEATURE SELECTION. For example, if you have 1000 columns and only 10 are relevant, L1 detects the irrelevant columns and sets their coefficients to 0. The code above shows this: regularization removes features from the model by zeroing their coefficients.</td>
      <td>No feature selection (treats all columns similarly).</td>
  </tr>
</table>
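
For reference, the two rows of the table can be tied back to the objectives that scikit-learn documents for `Lasso` and `Ridge`. Here $n$ is the number of samples, $w$ the coefficient vector and $\alpha$ the `alpha` parameter.

$$
\text{Lasso (L1):}\quad \min_{w}\ \frac{1}{2n}\lVert y - Xw\rVert_2^2 + \alpha\lVert w\rVert_1
\qquad\qquad
\text{Ridge (L2):}\quad \min_{w}\ \lVert y - Xw\rVert_2^2 + \alpha\lVert w\rVert_2^2
$$

The penalties differ in how they behave near zero:

$$
\frac{d}{dw_j}\lvert w_j\rvert = \operatorname{sign}(w_j)\quad (w_j \neq 0),
\qquad
\frac{d}{dw_j}\,w_j^2 = 2w_j
$$

The L1 derivative is undefined at $w_j = 0$ and keeps a constant magnitude everywhere else, so the penalty can push a coefficient all the way to exactly zero. The L2 derivative shrinks along with $w_j$, so coefficients get smaller but rarely reach zero.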