# 1. Lasso Regression (L1 penalty)
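$$
L_{lasso}(w, b) = \sum_{i = 1}^{N}{(y_{i} - (w \cdot x_{i} + b))^{2}} + \alpha\sum_{j = 1}^{p}{|w_{j}|}
$$

where $N$ is the number of training samples and $p$ is the number of features (one coefficient $w_{j}$ per feature).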

    * The penalty term minimizes the sum of the absolute values of the coefficients
    * A subset of the coefficients is forced to be exactly zero ==> automatic feature selection
    * Regularization is controlled by the parameter alpha: the larger alpha is, the more coefficients are driven to zero
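
To make the effect of alpha concrete, here is a minimal sketch on synthetic data (the data, the `lasso_objective` helper, and the alpha values are assumptions chosen for illustration, not part of the original notebook). Note that scikit-learn's `Lasso` scales the squared-error term by 1/(2N), so its alpha is not numerically comparable with the alpha in the formula above, although the behaviour is the same: a larger alpha leaves fewer non-zero coefficients.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Synthetic data (an assumption for illustration): only the first two
# of ten features actually influence the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
true_w = np.zeros(10)
true_w[:2] = [5.0, -3.0]
y = X @ true_w + rng.normal(scale=0.5, size=200)


def lasso_objective(X, y, w, b, alpha):
    """Objective minimized by sklearn's Lasso:
    (1 / (2 * n_samples)) * ||y - (Xw + b)||^2 + alpha * ||w||_1
    """
    residual = y - (X @ w + b)
    return residual @ residual / (2 * len(y)) + alpha * np.abs(w).sum()


alphas = [0.01, 0.1, 1.0, 10.0]
nonzero_counts = []
for alpha in alphas:
    model = Lasso(alpha=alpha).fit(X, y)
    n_nonzero = int(np.sum(model.coef_ != 0))
    nonzero_counts.append(n_nonzero)
    objective = lasso_objective(X, y, model.coef_, model.intercept_, alpha)
    print(f"alpha={alpha:<5} non-zero coefficients={n_nonzero:<2} objective={objective:.3f}")
```

With a strong enough penalty every coefficient is shrunk to zero and the model predicts only the intercept, which is why alpha usually has to be tuned rather than fixed by hand.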
### 1.1 When to use Lasso Regression?
    * Only a few features (x) have a medium/large effect on the target variable (y) ==> use Lasso Regression
    * Many features (x) each contribute a small/medium effect on the target variable (y) ==> use Ridge Regression
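
The rule of thumb above can be sketched on a synthetic problem where only a few features matter. Everything in this example (the `make_regression` settings, `alpha=1.0` for both models, the variable names) is an assumption for illustration; it is not a benchmark.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# 50 features, but only 5 of them carry signal (the "few strong features" case).
X, y, true_coef = make_regression(
    n_samples=200, n_features=50, n_informative=5,
    noise=10.0, coef=True, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

lasso = Lasso(alpha=1.0).fit(X_train, y_train)
ridge = Ridge(alpha=1.0).fit(X_train, y_train)

# Lasso sets many coefficients exactly to zero; Ridge only shrinks them.
n_lasso = int(np.sum(lasso.coef_ != 0))
n_ridge = int(np.sum(ridge.coef_ != 0))
print("truly informative features :", int(np.sum(true_coef != 0)))
print("non-zero coefficients, Lasso:", n_lasso)
print("non-zero coefficients, Ridge:", n_ridge)
print("test R-squared, Lasso: {:.3f}".format(r2_score(y_test, lasso.predict(X_test))))
print("test R-squared, Ridge: {:.3f}".format(r2_score(y_test, ridge.predict(X_test))))
```

Ridge keeps every feature in the model, so it gives no indication of which features are irrelevant, whereas the zero coefficients of Lasso act as a built-in feature selector.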


# 2. Coding side of Lasso Regression (sklearn) 

In [2]:
from sklearn.model_selection import train_test_split
from sklearn import datasets

"""
Dataset load and split
"""
X_diabetes, y_diabetes = datasets.load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X_diabetes, y_diabetes, random_state=0)


"""
##########################
"""

"""
Feature Normalization
"""
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()

X_train_scaled = scaler.fit_transform(X_train)
# Reuse the scaling learned on the training set; refitting on the test set would leak test statistics
X_test_scaled = scaler.transform(X_test)


"""
#########################
"""


"""
Lasso Regression training
"""

from sklearn.linear_model import Lasso

params = {
    "alpha": 2.0,           # default = 1.0; strength of the L1 penalty
    "fit_intercept": True,  # default = True
    "max_iter": 1000,       # default = 1000; maximum number of coordinate-descent iterations
    "tol": 1e-4,            # default = 1e-4; tolerance used to stop the optimization
}

# Lasso has no "solver" parameter (that belongs to Ridge); scaling is done above with MinMaxScaler
l_lasso = Lasso(**params).fit(X_train_scaled, y_train)

"""
############################
"""


print("Model parameters : ")
print("#"*20)
print("Lasso Regression fit_intercept param : {} ".format(l_lasso.fit_intercept))
print("-"*100)
print("Lasso Regression coef : \n{}".format(l_lasso.coef_))
print("-"*100)
print("number of non zero features is {} ".format(sum(l_lasso.coef_ != 0)))
print()
print()


"""
Model evaluation
"""

from sklearn.metrics import r2_score

print("Model evaluation :")
print("#"*20)
# Predict on the scaled features, because the model was trained on X_train_scaled
y_predict_train = l_lasso.predict(X_train_scaled)
y_predict_test = l_lasso.predict(X_test_scaled)
# r2_score expects its arguments in the order (y_true, y_pred)
print("R-squared score for test model is {:.3f} ".format(r2_score(y_test, y_predict_test)))



Model parameters : 
####################
Lasso Regression fit_intercept param : True 
----------------------------------------------------------------------------------------------------
Lasso Regression coef : 
[  0.          -0.         140.74817495  27.05138828  -0.
  -0.         -11.34003209   0.         110.71886545   0.        ]
----------------------------------------------------------------------------------------------------
number of non zero features is 4 


The run recorded for this notebook reported an R-squared of -162.996 on the test set. That figure came from calling predict on the unscaled X_test and passing the r2_score arguments as (y_pred, y_true), so it does not measure the fitted model. Re-run the evaluation cell above to obtain the actual test score.
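
Since alpha controls both the sparsity and the quality of the fit, it helps to compare several values. The sketch below is one possible way to do that, assuming the same diabetes split as above; the list of alphas, the `max_iter` value, and the use of `make_pipeline` (which fits the scaler on the training data only and applies it automatically at prediction time) are choices made for illustration.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import Lasso
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

X, y = datasets.load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

alphas = [0.1, 0.5, 2.0, 10.0, 20.0]  # illustrative values, not tuned
results = []
for alpha in alphas:
    # The pipeline scales with statistics learned from the training set only
    model = make_pipeline(MinMaxScaler(), Lasso(alpha=alpha, max_iter=10000))
    model.fit(X_train, y_train)

    coef = model.named_steps["lasso"].coef_
    n_nonzero = int(np.sum(coef != 0))
    r2_train = r2_score(y_train, model.predict(X_train))
    r2_test = r2_score(y_test, model.predict(X_test))
    results.append((alpha, n_nonzero, r2_train, r2_test))

    print("alpha = {:5.1f} | non-zero features = {:2d} | R2 train = {:.3f} | R2 test = {:.3f}"
          .format(alpha, n_nonzero, r2_train, r2_test))
```

In practice the test set should be kept for the final evaluation only; selecting alpha with cross-validation (for example `LassoCV`) on the training data is the usual approach.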
