#### About

> Lasso Regression

Lasso (Least Absolute Shrinkage and Selection Operator) regression is a form of linear regression that adds L1 regularization to the cost function, which also makes it useful for feature selection.

In this technique, the cost function is modified so that the model minimizes the squared prediction error plus a penalty proportional to the sum of the absolute values of the coefficients. Equivalently, it minimizes the squared error subject to a constraint on that sum. The penalty can drive some coefficients exactly to zero, which results in a sparse model with fewer non-zero coefficients and can help prevent overfitting and improve generalization performance.
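The penalized and constrained views of the problem can be written side by side. This is only a sketch in the notation of the cost function used in this notebook; the feature count n and the budget t are symbols introduced here for illustration and are not part of the original text.

$$
\min_{\theta} \; \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 + \alpha \sum_{j=1}^{n} |\theta_j|
$$

$$
\min_{\theta} \; \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 \quad \text{subject to} \quad \sum_{j=1}^{n} |\theta_j| \le t
$$

A larger alpha in the first form corresponds to a smaller budget t in the second, so both describe the same trade-off between fit and coefficient size.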

> Mathematics

> Cost Function

J(theta) = (1 / (2 * m)) * sum((h_theta(x^(i)) - y^(i))^2) + alpha * sum(|theta_j|)

where:

J(theta) is the cost function
m is the number of training examples
h_theta(x^(i)) is the hypothesis function, i.e. the model's prediction for the i-th training example
y^(i) is the actual output for the i-th training example
alpha is the regularization parameter that controls the strength of the regularization

The first term in the cost function is half the mean squared error between the predicted and actual outputs, while the second term is alpha times the sum of the absolute values of the coefficients (the L1 penalty). The penalty grows with the magnitude of the coefficients, so minimizing the cost shrinks them towards zero and can set some of them exactly to zero, resulting in a simpler model with fewer non-zero coefficients.
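To make the two terms concrete, here is a minimal sketch of the cost function in plain Python. It assumes a linear hypothesis h_theta(x) = theta . x with no separate intercept, so every coefficient is penalized; the function name `lasso_cost` and the toy data are made up for illustration.

```python
def lasso_cost(X, y, theta, alpha):
    """Lasso cost J(theta) for a linear hypothesis h_theta(x) = theta . x.

    Assumption: no separate intercept term, so every coefficient is penalized.
    """
    m = len(y)
    # Squared-error term: (1 / (2m)) * sum of squared residuals
    squared_error = sum(
        (sum(t * x_j for t, x_j in zip(theta, row)) - target) ** 2
        for row, target in zip(X, y)
    ) / (2 * m)
    # L1 penalty: alpha * sum of absolute coefficient values
    l1_penalty = alpha * sum(abs(t) for t in theta)
    return squared_error + l1_penalty


# Toy data (hypothetical): y = 2 * x1, and the second feature is irrelevant
X = [[1.0, 0.0], [2.0, 1.0], [3.0, 0.0]]
y = [2.0, 4.0, 6.0]

cost_sparse = lasso_cost(X, y, theta=[2.0, 0.0], alpha=0.1)
cost_dense = lasso_cost(X, y, theta=[2.0, 0.5], alpha=0.1)
print(cost_sparse, cost_dense)
```

On this toy data the sparse coefficient vector has the lower cost, because the extra non-zero coefficient adds both prediction error and L1 penalty.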

> Gradient Descent

theta_j = theta_j - (lr / m) * sum((h_theta(x^(i)) - y^(i)) * x_j^(i)) - alpha * lr * sign(theta_j)

where:

lr is the learning rate
x_j^(i) is the j-th feature of the i-th training example
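sign(theta_j) is the sign of the j-th coefficient (taken as 0 when theta_j is 0); it acts as a subgradient of |theta_j|, which is not differentiable at zero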
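The update rule can be sketched as a small loop in plain Python. This is an illustrative implementation, not the solver scikit-learn uses (its `Lasso` estimator relies on coordinate descent). The penalty step here is lr * alpha * sign(theta_j), which is what differentiating J(theta) above gives. The names `sign` and `lasso_subgradient_step`, the toy data, and the hyperparameter values are assumptions made for the example, and no intercept is fitted.

```python
def sign(value):
    """Return -1, 0, or 1; 0 is the subgradient choice used at value == 0."""
    return (value > 0) - (value < 0)


def lasso_subgradient_step(X, y, theta, alpha, lr):
    """Apply one simultaneous update to all coefficients (no intercept)."""
    m = len(y)
    # Residuals h_theta(x^(i)) - y^(i) for a linear hypothesis
    residuals = [
        sum(t * x_j for t, x_j in zip(theta, row)) - target
        for row, target in zip(X, y)
    ]
    new_theta = []
    for j, theta_j in enumerate(theta):
        # Gradient of the squared-error term with respect to theta_j
        grad_error = sum(r * row[j] for r, row in zip(residuals, X)) / m
        # Subgradient of the L1 penalty is alpha * sign(theta_j)
        new_theta.append(theta_j - lr * (grad_error + alpha * sign(theta_j)))
    return new_theta


# Toy data (hypothetical): y = 2 * x1, and the second feature is irrelevant
X = [[1.0, 0.0], [2.0, 1.0], [3.0, 0.0]]
y = [2.0, 4.0, 6.0]

theta = [0.0, 0.0]
for _ in range(500):
    theta = lasso_subgradient_step(X, y, theta, alpha=0.1, lr=0.05)
print(theta)
```

With a fixed learning rate, the coefficient of the irrelevant feature typically hovers close to zero rather than settling at exactly zero, which is one reason specialized solvers are preferred in practice.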



In [5]:
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Lasso
from sklearn.metrics import accuracy_score

In [6]:
# Load the Iris dataset
iris = load_iris()
X = iris.data
y = iris.target

In [7]:
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)


In [8]:

# Create a Lasso regression model
alpha = 0.1
lasso = Lasso(alpha=alpha)


In [9]:
# Fit the model on the training data
lasso.fit(X_train, y_train)


In [10]:
# Predict continuous values for the test set
y_pred = lasso.predict(X_test)


In [11]:
# Round the continuous predictions to the nearest integer to use them as class labels
y_pred = y_pred.round().astype(int)


In [12]:
# Compare the rounded predictions with the true class labels
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy: {:.4f}".format(accuracy))

Accuracy: 0.9667


Note that Lasso regression is intended for regression problems with continuous target variables and is not commonly used for classification problems like the Iris dataset. For illustrative purposes, this example rounds the predicted values to the nearest integer and treats them as class labels. In practice, classification algorithms such as logistic regression or support vector machines are typically used for classification tasks.
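The rounding step can be looked at in isolation with a small sketch. The helper `to_class_labels` and the sample values are hypothetical, and the clipping step is an addition that the cells above do not perform; it is included only to show that a regression output can fall outside the valid label range.

```python
def to_class_labels(predictions, n_classes):
    """Map continuous regression outputs to integer labels in [0, n_classes - 1]."""
    labels = []
    for value in predictions:
        label = round(value)  # nearest integer; exact ties go to the even integer
        # Clip out-of-range values (assumption: not done in the notebook cells above)
        label = min(max(label, 0), n_classes - 1)
        labels.append(label)
    return labels


# Made-up regression outputs, including values outside the 0..2 label range
raw_predictions = [-0.2, 0.4, 1.4, 1.6, 2.7]
labels = to_class_labels(raw_predictions, n_classes=3)
print(labels)
```

Because the labels are treated as points on a numeric scale, this approach implicitly assumes the classes are ordered, which is another reason a dedicated classifier is usually the better choice.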