# Lasso Regression

Least Absolute Shrinkage and Selection Operator Regression (usually called Lasso
Regression) is another regularized version of Linear Regression. Just like Ridge
Regression, it adds a regularization term to the cost function, but it uses the ℓ1 norm
of the weight vector instead of half the square of the ℓ2 norm.

![image.png](attachment:image.png)

![image-2.png](attachment:image-2.png)
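To make the cost function concrete, here is a minimal sketch in NumPy. It assumes the cost is the MSE plus α times the ℓ1 norm of the weights, and that the bias term (stored here as `theta[0]`) is not regularized. The function name `lasso_cost` and the tiny dataset are made up for illustration. Note that scikit-learn's `Lasso` scales the squared-error term by 1/(2n), so its objective differs from this one by a constant factor on the error term.

```python
import numpy as np

def lasso_cost(theta, X, y, alpha):
    """MSE plus alpha times the l1 norm of the weights (bias theta[0] excluded)."""
    X_b = np.c_[np.ones(len(X)), X]          # prepend a column of ones for the bias
    residuals = X_b @ theta - y
    mse = np.mean(residuals ** 2)
    l1_penalty = alpha * np.sum(np.abs(theta[1:]))
    return mse + l1_penalty

# Hypothetical data that the line y = 2x fits perfectly
X = np.array([[1.0], [2.0], [3.0]])
y = np.array([2.0, 4.0, 6.0])
theta = np.array([0.0, 2.0])                 # bias 0, weight 2

print(lasso_cost(theta, X, y, alpha=0.0))    # no penalty: cost is the MSE, here 0.0
print(lasso_cost(theta, X, y, alpha=0.5))    # penalty adds 0.5 * |2| = 1.0
```

Even with a perfect fit, the penalty makes the cost grow with the size of the weights, which is what pushes the optimizer toward smaller (and eventually zero) weights.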

An important characteristic of Lasso Regression is that it tends to completely eliminate the weights of the least important features (i.e., set them to zero).

For example, when a Lasso model is trained on high-degree polynomial features, all the weights for the high-degree polynomial features can end up equal to zero. In other words, Lasso Regression automatically performs feature selection and outputs a sparse model (i.e., with few nonzero feature weights).
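The sparsity effect can be checked directly by counting zero weights. The sketch below is one possible setup, not the one behind the plots: the noisy quadratic data, the degree of 10, the value `alpha=0.1`, and the helper name `fit_poly` are all assumptions chosen for illustration. It fits Lasso and Ridge on the same standardized polynomial features and compares how many weights are exactly zero.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Hypothetical noisy quadratic data
rng = np.random.default_rng(42)
X = rng.uniform(-3, 3, size=(100, 1))
y = 0.5 * X[:, 0] ** 2 + X[:, 0] + 2 + rng.normal(size=100)

def fit_poly(regressor, degree=10):
    """Fit the regressor on standardized polynomial features and return its weights."""
    model = make_pipeline(
        PolynomialFeatures(degree=degree, include_bias=False),
        StandardScaler(),
        regressor,
    )
    model.fit(X, y)
    return model[-1].coef_

lasso_coef = fit_poly(Lasso(alpha=0.1, max_iter=100_000))
ridge_coef = fit_poly(Ridge(alpha=0.1))

# Count the weights that are exactly zero in each model
print("Lasso zero weights:", np.sum(lasso_coef == 0), "of", lasso_coef.size)
print("Ridge zero weights:", np.sum(ridge_coef == 0), "of", ridge_coef.size)
```

Ridge shrinks weights toward zero without making them exactly zero, whereas Lasso typically zeroes out several of them, which is the feature selection behavior described above.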

![image.png](attachment:image.png)

 * On the top-left plot, the background contours (ellipses) represent an unregularized MSE cost function (α = 0).
 * The white circles show the Batch Gradient Descent (BGD) path with that cost function.
 * The foreground contours (diamonds) represent the ℓ1 penalty.
 * The triangles show the BGD path for this penalty only (α → ∞).
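The path for the ℓ1 penalty alone can be sketched numerically. This is only an illustration, not the code behind the figure: the starting point (2.0, 0.5) and the step size are assumptions, and because the ℓ1 norm is not differentiable at zero, the sketch uses `np.sign` as a subgradient.

```python
import numpy as np

theta = np.array([2.0, 0.5])   # hypothetical starting point (θ1, θ2)
eta = 0.125                    # step size; a power of two keeps the arithmetic exact
path = [theta.copy()]

for _ in range(16):
    # Subgradient of |θ1| + |θ2| is sign(θ); np.sign(0) is 0, so a weight
    # that reaches zero stays there
    theta = theta - eta * np.sign(theta)
    path.append(theta.copy())

print(path[4])    # θ2 has reached zero while θ1 is still 1.5
print(path[-1])   # both weights are zero
```

Each step lowers both weights by the same amount regardless of their size, so the smaller weight hits zero first and the path then runs along that axis. This is the mechanism by which the ℓ1 penalty eliminates weights entirely.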
 
 ![image-2.png](attachment:image-2.png)
 
 

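```python
from sklearn.linear_model import Lasso
import numpy as np

# Quadratic toy data; X is random and unseeded, so results vary between runs
X = np.random.randn(100, 1)
y = 0.5 * X**2 + 10 * X + 0.5

lasso_reg = Lasso()  # default regularization strength, alpha=1.0
lasso_reg.fit(X, y)
lasso_reg.predict([[1.5]])
```

```
array([14.56609276])
```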
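The strength of the regularization is controlled by the `alpha` parameter of `Lasso`. As a rough sketch of its effect, the loop below refits the same kind of data with three values of `alpha`. The seed and the specific values are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Same kind of toy data as above, with a fixed (arbitrary) seed
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 1))
y = 0.5 * X[:, 0] ** 2 + 10 * X[:, 0] + 0.5

coefs = {}
for alpha in (0.01, 1.0, 20.0):
    model = Lasso(alpha=alpha).fit(X, y)
    coefs[alpha] = model.coef_[0]
    print(f"alpha={alpha}: weight={model.coef_[0]:.3f}")
```

The learned weight shrinks as `alpha` grows, and for a large enough `alpha` it becomes exactly zero, so the model falls back to predicting a constant.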