# Lasso Regression

The aim is to find the coefficients that minimize the sum of squared errors while applying a penalty to the size of those coefficients.

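- Lasso regression uses an L1 penalty (the sum of the absolute values of the coefficients).
- Ridge regression uses an L2 penalty (the sum of the squared coefficients).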
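
Written out, the two penalized objectives look as follows. This is the usual textbook formulation, and the symbols are introduced here rather than taken from the notes above: $n$ observations, $p$ predictors, intercept $\beta_0$, and penalty weight $\lambda \ge 0$.

$$
\hat{\beta}^{\text{lasso}} = \arg\min_{\beta_0,\,\beta} \; \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Big)^2 + \lambda \sum_{j=1}^{p} |\beta_j|
$$

$$
\hat{\beta}^{\text{ridge}} = \arg\min_{\beta_0,\,\beta} \; \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Big)^2 + \lambda \sum_{j=1}^{p} \beta_j^2
$$

Note that scikit-learn's `Lasso` scales the squared-error term by $1/(2n)$ and calls the penalty weight `alpha`, so its `alpha` is not numerically identical to $\lambda$ in the form above.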

- Lasso was proposed to address a drawback of Ridge regression: Ridge keeps every variable in the model, whether it is relevant or not.
- Like Ridge, Lasso shrinks the coefficients toward zero.
- Unlike Ridge, when lambda is large enough the L1 penalty sets some coefficients exactly to zero, so Lasso also performs variable selection.
- Choosing lambda correctly is very important; cross-validation (CV) is used here as well. In scikit-learn, lambda is the `alpha` parameter.
- Neither Ridge nor Lasso is superior to the other in general.
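
The selection effect described in the list above can be seen in a small sketch on synthetic data. Everything here is an illustrative assumption and not part of the insurance analysis: the names `X_demo`, `y_demo`, `true_coef`, the penalty value `alpha=0.5`, and the data, in which only the first three of ten features influence the target.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Synthetic data (assumption): 10 features, only the first 3 matter
rng = np.random.default_rng(0)
n_samples, n_features = 200, 10
X_demo = rng.normal(size=(n_samples, n_features))
true_coef = np.zeros(n_features)
true_coef[:3] = [3.0, -2.0, 1.5]
y_demo = X_demo @ true_coef + rng.normal(scale=0.5, size=n_samples)

# Fit both penalized models with the same (arbitrary) penalty weight
lasso_demo = Lasso(alpha=0.5).fit(X_demo, y_demo)
ridge_demo = Ridge(alpha=0.5).fit(X_demo, y_demo)

# Count coefficients that are exactly zero in each model
n_zero_lasso = int(np.sum(lasso_demo.coef_ == 0))
n_zero_ridge = int(np.sum(ridge_demo.coef_ == 0))
print("zero coefficients, lasso:", n_zero_lasso)
print("zero coefficients, ridge:", n_zero_ridge)
```

Lasso is expected to drop most of the irrelevant features, while Ridge only shrinks them and leaves all ten in the model.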

<img src="https://i.ibb.co/Nswq4kn/Whats-App-Image-2020-08-17-at-23-08-10.jpg" />

In [None]:
# import the necessary packages
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import boxcox
from sklearn.linear_model import Lasso, LassoCV
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split, cross_val_score

In [None]:
# load data
data = "../input/insurance/insurance.csv"
df = pd.read_csv(data)

# show the first 6 rows
df.head(6)

## Model

In [None]:
# one-hot encode the categorical columns
df_encode = pd.get_dummies(data=df, columns=['sex', 'smoker', 'region'])
df_encode.head()

In [None]:
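# Normalize the target.
# The Box-Cox outputs (transformed values y_bc, estimated lambda lam, 95% confidence
# interval ci) are not used further; the transform actually applied is the natural
# log, which is the Box-Cox transform at lambda = 0.
y_bc, lam, ci = boxcox(df_encode['charges'], alpha=0.05)
df_encode['charges'] = np.log(df_encode['charges'])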

df_encode.head()

In [None]:
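# separate the features from the (log-transformed) target
X = df_encode.drop("charges", axis=1)
y = df_encode["charges"]
# hold out 30% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# fit Lasso with its default penalty (alpha=1.0)
lasso_model = Lasso().fit(X_train, y_train)
lasso_model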

In [None]:
print("intercept: ", lasso_model.intercept_)
print("coef: ", lasso_model.coef_)
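
The raw `coef_` array is easier to read when each value is paired with its column name. The sketch below shows one way to do that; the toy frame `X_toy`, the target `y_toy`, and `alpha=0.1` are made-up stand-ins for `X_train`, `y_train`, and the fitted model, chosen only so the example runs on its own.

```python
import pandas as pd
from sklearn.linear_model import Lasso

# Hypothetical toy data standing in for X_train / y_train
X_toy = pd.DataFrame({
    "age": [19, 25, 33, 41, 52, 60],
    "bmi": [27.9, 22.1, 30.4, 25.0, 33.3, 28.6],
    "children": [0, 1, 3, 2, 0, 1],
})
y_toy = pd.Series([7.4, 7.9, 8.5, 8.9, 9.4, 9.8])

toy_model = Lasso(alpha=0.1).fit(X_toy, y_toy)

# Label each coefficient with its feature name
coef_table = pd.Series(toy_model.coef_, index=X_toy.columns, name="coefficient")
# Features Lasso kept in the model (nonzero coefficient)
selected = coef_table[coef_table != 0].index.tolist()

print(coef_table)
print("selected features:", selected)
```

Applied to the notebook's own objects, the same pattern would be `pd.Series(lasso_model.coef_, index=X_train.columns)`.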

In [None]:
# coefficients for different lambda values (lambda is called alpha in scikit-learn)

alphas = 10**np.linspace(10, -2, 100) * 0.5
lasso = Lasso()
coefs = []

for a in alphas:
    lasso.set_params(alpha=a)
    lasso.fit(X_train,y_train)
    coefs.append(lasso.coef_)

In [None]:
# plot each coefficient against the alpha value it was fitted with
ax = plt.gca()
ax.plot(alphas, coefs)
ax.set_xscale("log")
plt.axis("tight")
plt.xlabel("alpha")
plt.ylabel("coefficients")
plt.show()

## Prediction

In [None]:
# use lasso_model here; `lasso` holds whatever alpha the loop above fitted last
lasso_model.predict(X_test)[0:10]

In [None]:
y_pred = lasso_model.predict(X_test)
# test RMSE, measured on the log scale because charges was log-transformed
np.sqrt(mean_squared_error(y_test, y_pred))
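
Because the target was log-transformed, this RMSE is in log units. To report an error in the original units of `charges`, one option is to undo the transform with `np.exp` before scoring. The sketch below uses invented values (`y_test_log`, `y_pred_log`) in place of the notebook's `y_test` and `y_pred`.

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Hypothetical log-scale targets and predictions (stand-ins for y_test / y_pred)
y_test_log = np.log(np.array([1500.0, 8000.0, 12000.0, 40000.0]))
y_pred_log = np.log(np.array([1800.0, 7000.0, 15000.0, 30000.0]))

# RMSE on the log scale, as computed in the notebook
rmse_log = np.sqrt(mean_squared_error(y_test_log, y_pred_log))

# RMSE in original units: invert the log with exp before scoring
rmse_original = np.sqrt(
    mean_squared_error(np.exp(y_test_log), np.exp(y_pred_log))
)

print("RMSE (log scale):", rmse_log)
print("RMSE (original units):", rmse_original)
```

The two numbers are not comparable to each other; the log-scale value reflects relative error, while the back-transformed value is dominated by the largest charges.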

## Model Tuning

In [None]:
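# The `normalize` argument was removed from LassoCV in scikit-learn 1.2;
# scale the features explicitly (e.g. with StandardScaler) if that is needed.
# With no `alphas` given, LassoCV builds its own grid of candidate alphas.
lasso_cv_model = LassoCV(cv=10, max_iter=100000)
lasso_cv_model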

In [None]:
lasso_cv_model.fit(X_train, y_train)

In [None]:
lasso_cv_model.alpha_

In [None]:
# refit with the alpha chosen by cross-validation and score on the test set
lasso_tuned = Lasso(alpha=lasso_cv_model.alpha_).fit(X_train, y_train)
y_pred = lasso_tuned.predict(X_test)
np.sqrt(mean_squared_error(y_test, y_pred))
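
Since the L1 penalty treats all coefficients alike, features on very different scales are penalized unevenly. One common way to handle this in current scikit-learn versions is to put a `StandardScaler` in front of `LassoCV` in a pipeline, so the scaling is learned from the training data only. The sketch below runs on synthetic data; `X_demo`, `y_demo`, and the feature scales are assumptions for illustration, and the result is not numerically equivalent to the old `normalize=True` option.

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic features on deliberately different scales (assumption)
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(150, 5)) * np.array([1.0, 10.0, 100.0, 0.1, 5.0])
y_demo = 2.0 * X_demo[:, 0] + 0.3 * X_demo[:, 1] + rng.normal(scale=0.5, size=150)

# Standardize the features, then choose alpha by 10-fold cross-validation
scaled_lasso_cv = make_pipeline(
    StandardScaler(),
    LassoCV(cv=10, max_iter=100000),
)
scaled_lasso_cv.fit(X_demo, y_demo)

# make_pipeline names each step after its lowercased class name
best_alpha = scaled_lasso_cv.named_steps["lassocv"].alpha_
r2 = scaled_lasso_cv.score(X_demo, y_demo)

print("selected alpha:", best_alpha)
print("R^2 on the fitted data:", r2)
```

With the notebook's data, the same pipeline could be fitted on `X_train, y_train` and used directly for `predict(X_test)`, which keeps the scaling and the tuned model together.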