### 🔹 Lasso Regression  

Lasso Regression is another **regularized version of Linear Regression** that adds an $L_1$ penalty term to the cost function.  
Unlike Ridge, Lasso can shrink some coefficients **exactly to zero**, which makes it useful for **feature selection**.  

The main idea:  
- Lasso adds an $L_1$ penalty, the sum of the absolute values of the coefficients, to the **Ordinary Least Squares (OLS)** cost function.  
- This penalty encourages sparsity: as the penalty grows, more coefficients become exactly zero.  
- As a result, Lasso effectively selects a subset of features. Among strongly correlated features, it tends to keep one and drop the rest.  
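
The sparsity effect described above can be seen on a small synthetic example. This is a minimal sketch, not part of the notebook's dataset: the data, the variable names, and the choice of `alpha=0.5` are assumptions made for illustration. Only the first two of five features influence the target.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Synthetic data (assumed for illustration): 5 features, only 2 are informative
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=200)

lasso = Lasso(alpha=0.5).fit(X, y)
ridge = Ridge(alpha=0.5).fit(X, y)

# Count coefficients that are exactly zero in each model
n_zero_lasso = int(np.sum(lasso.coef_ == 0))
n_zero_ridge = int(np.sum(ridge.coef_ == 0))

print("Lasso coefficients:", np.round(lasso.coef_, 3))
print("Ridge coefficients:", np.round(ridge.coef_, 3))
print("exact zeros -> Lasso:", n_zero_lasso, "| Ridge:", n_zero_ridge)
```

Ridge shrinks the uninformative coefficients toward zero but leaves them nonzero, while Lasso is expected to remove at least some of them entirely.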

Mathematically, Lasso minimizes the following cost function:  

$$
J(\beta) = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 + \lambda \sum_{j=1}^{p} |\beta_j|
$$  

where:  
- $y_i$ is the true value,  
- $\hat{y}_i$ is the predicted value,  
- $\beta_j$ are the model coefficients,  
- $\lambda \geq 0$ is the **regularization strength** (controls sparsity): $\lambda = 0$ recovers OLS, and larger values set more coefficients to zero.  
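
To make the formula concrete, the cost can be evaluated directly with NumPy. This is a small sketch: the helper `lasso_cost` and the toy arrays are hypothetical names and values chosen for illustration, with the intercept passed separately and left unpenalized.

```python
import numpy as np

def lasso_cost(X, y, beta, intercept, lam):
    """J(beta) = sum of squared residuals + lam * sum(|beta_j|)."""
    residuals = y - (X @ beta + intercept)
    return float(np.sum(residuals ** 2) + lam * np.sum(np.abs(beta)))

# Toy data (assumed): beta = [1, 2] fits y exactly
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 2.0, 3.0])
beta = np.array([1.0, 2.0])

cost_no_penalty = lasso_cost(X, y, beta, intercept=0.0, lam=0.0)
cost_with_penalty = lasso_cost(X, y, beta, intercept=0.0, lam=0.5)

print(cost_no_penalty)    # squared error only
print(cost_with_penalty)  # squared error plus 0.5 * (|1| + |2|)
```

Note that scikit-learn's `Lasso` minimizes a rescaled objective, $\frac{1}{2n}\sum_i (y_i - \hat{y}_i)^2 + \alpha \sum_j |\beta_j|$, so its `alpha` corresponds to $\lambda / (2n)$ in the formula above.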

Lasso Regression helps us to:  
- Perform **feature selection** by setting irrelevant coefficients to zero.  
- Reduce overfitting when there are many predictors.  
- Improve interpretability by producing a simpler model.  

In this notebook, we will implement **Lasso Regression**, tune the regularization parameter $\lambda$ (called `alpha` in scikit-learn), and analyze which features remain in the model 🚀.  
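
One possible way to tune the regularization strength and see which features remain is `LassoCV`, which picks `alpha` by cross-validation. The following is a sketch on synthetic stand-in data; the arrays, the feature names `f0` to `f3`, and `cv=5` are assumptions and do not come from `dataset.csv`.

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumed): only f0 and f2 drive the target
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(300, 4))
y_demo = 40.0 * X_demo[:, 0] + 10.0 * X_demo[:, 2] + rng.normal(scale=5.0, size=300)
feature_names = ["f0", "f1", "f2", "f3"]

# Lasso is sensitive to feature scale, so standardize first
X_scaled = StandardScaler().fit_transform(X_demo)

# LassoCV searches a grid of alphas with 5-fold cross-validation
model = LassoCV(cv=5, random_state=0).fit(X_scaled, y_demo)

# Features whose coefficient was not shrunk to exactly zero
kept = [name for name, coef in zip(feature_names, model.coef_) if coef != 0]
print("selected alpha:", model.alpha_)
print("features kept:", kept)
```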


# --------------------------------------------------------------------------

# import dataset

In [1]:
# from google.colab import files, drive

# up = files.upload()
# drive.mount('/content/drive')

In [2]:
import pandas as pd

df = pd.read_csv('dataset.csv')
df.head(3)

Unnamed: 0,A,B,C,T
0,2.0,4,8.5,196
1,2.4,4,9.6,221
2,1.5,4,5.9,136


In [3]:
# df.info()

# cleaning

In [4]:
# clean the data

# encoding

In [5]:
# encode the data

# define x, y

In [6]:
import numpy as np

x = df[['A', 'B', 'C']].values
y = df['T'].values

# splitting

In [18]:
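# # finding best random state
# # NOTE: choosing the seed with the highest test R2 gives an optimistic estimate; treat this as exploratory only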

# from sklearn.model_selection import train_test_split
# from sklearn.linear_model import Lasso
# l = Lasso()
# from sklearn.preprocessing import StandardScaler
# sc = StandardScaler()
# from sklearn.metrics import r2_score

# import time
# t1 = time.time()
# lst = []
# for i in range(1,10):
#     x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.25, random_state=i) 
#     x_train = sc.fit_transform(x_train)
#     x_test = sc.transform(x_test)
#     l.fit(x_train, y_train)
#     yhat_test = l.predict(x_test)
#     r2 = r2_score(y_test, yhat_test)
#     lst.append(r2)
               
# t2 = time.time()
# print(f"run time: {round((t2 - t1) / 60 , 0)} min")
# print(f"R2_score = {round(max(lst),2)}")
# print(f"random_state = {np.argmax(lst) + 1}")
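
Rather than searching for the single split that scores best, a more stable estimate can be obtained by averaging over many splits. The sketch below uses repeated K-fold cross-validation on synthetic stand-in data; the data, `n_splits=5`, and `n_repeats=3` are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import RepeatedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumed), three features like the notebook's A, B, C
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(200, 3))
y_demo = 30.0 * X_demo[:, 0] + 15.0 * X_demo[:, 1] + rng.normal(scale=10.0, size=200)

# The pipeline refits the scaler inside every fold, so no test data leaks into scaling
pipe = make_pipeline(StandardScaler(), Lasso(alpha=1.0))
cv = RepeatedKFold(n_splits=5, n_repeats=3, random_state=0)
scores = cross_val_score(pipe, X_demo, y_demo, cv=cv, scoring="r2")

print(f"R2 mean = {scores.mean():.3f}, std = {scores.std():.3f}")
```

The spread of the scores shows how much R2 depends on the split, which a single "best" `random_state` hides.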

In [8]:
from sklearn.model_selection import train_test_split

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.25, random_state=42)

# scaling

In [9]:
from sklearn.preprocessing import StandardScaler

sc = StandardScaler().fit(x_train)
x_train = sc.transform(x_train)
x_test = sc.transform(x_test)

# fit the model

In [10]:
# # K-fold cross validation

# from sklearn.linear_model import Lasso
# from sklearn.model_selection import GridSearchCV

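# parameters = {
#     'alpha': [0.01, 0.1, 1.0, 10.0],    # example values (assumed), adjust to the data
#     'max_iter': [1000, 10000]           # example values (assumed)
# }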

# ls = Lasso()
# gs = GridSearchCV(estimator=ls, param_grid=parameters, cv=5)

# gs.fit(x_train, y_train)

# best_params = gs.best_params_
# print(best_params)
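
A runnable variant of the grid search could look like the sketch below. It wraps the scaler and the model in a `Pipeline` so that scaling is refit inside each fold. The synthetic data, the step names `scaler` and `lasso`, and the `alpha` grid are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumed)
rng = np.random.default_rng(7)
X_demo = rng.normal(size=(200, 3))
y_demo = 25.0 * X_demo[:, 0] - 12.0 * X_demo[:, 2] + rng.normal(scale=8.0, size=200)

pipe = Pipeline([("scaler", StandardScaler()), ("lasso", Lasso(max_iter=10_000))])

# Parameters of a pipeline step are addressed as <step name>__<parameter>
param_grid = {"lasso__alpha": [0.01, 0.1, 1.0, 10.0]}

search = GridSearchCV(pipe, param_grid=param_grid, cv=5, scoring="r2")
search.fit(X_demo, y_demo)

print("best params:", search.best_params_)
print("best CV R2:", round(search.best_score_, 3))
```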

In [11]:
# default parameters of Lasso
# alpha=1.0, fit_intercept=True, precompute=False, copy_X=True, max_iter=1000, tol=0.0001, 
# warm_start=False, positive=False, random_state=None, selection='cyclic'

In [12]:
from sklearn.linear_model import Lasso

lr = Lasso()  # default alpha=1.0; adjust alpha to change the regularization strength
lr.fit(x_train, y_train)

# predict test data

In [13]:
yhat_test = lr.predict(x_test)

# Evaluate the model

In [14]:
from sklearn.metrics import r2_score

print("r2-score (train data): %0.4f" % r2_score(y_train, lr.predict(x_train)))
print("r2-score (test data): %0.4f" % r2_score(y_test, yhat_test))

r2-score (train data): 0.8605
r2-score (test data): 0.8710


In [15]:
from sklearn.metrics import mean_squared_error
from sklearn.metrics import mean_absolute_error

yhat_train = lr.predict(x_train)  # predict once and reuse below

print(f"MSE (train data): {mean_squared_error(y_train, yhat_train)}")
print(f"RMSE (train data): {np.sqrt(mean_squared_error(y_train, yhat_train))}")
print(f"MAE (train data): {mean_absolute_error(y_train, yhat_train)}")
print('------------')
print(f"MSE (test data): {mean_squared_error(y_test, yhat_test)}")
print(f"RMSE (test data): {np.sqrt(mean_squared_error(y_test, yhat_test))}")
print(f"MAE (test data): {mean_absolute_error(y_test, yhat_test)}")

MSE (train data): 553.2544705084507
RMSE (train data): 23.521362003686153
MAE (train data): 16.927795711636126
------------
MSE (test data): 535.085519167531
RMSE (test data): 23.131915596584967
MAE (test data): 17.299951600964057


# save the model

In [16]:
# import joblib

# joblib.dump(lr, 'l_model.pkl')

# load the model

In [17]:
# import joblib

# lr = joblib.load('l_model.pkl')
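
The save and load steps can be checked with a quick round trip. This is a sketch under assumptions: it trains a small stand-in model on synthetic data and writes to a temporary directory rather than the notebook's working folder.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import Lasso

# Small stand-in model on synthetic data (assumed for illustration)
rng = np.random.default_rng(3)
X_demo = rng.normal(size=(50, 3))
y_demo = 2.0 * X_demo[:, 0] + rng.normal(scale=0.1, size=50)
model = Lasso(alpha=0.1).fit(X_demo, y_demo)

# Save to a temporary file, then load it back
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "l_model.pkl")
    joblib.dump(model, path)
    restored = joblib.load(path)

# The restored model should reproduce the original predictions
same_predictions = np.allclose(model.predict(X_demo), restored.predict(X_demo))
print("predictions match after reload:", same_predictions)
```

Because the model in this notebook is trained on standardized inputs, the fitted `StandardScaler` would need to be saved alongside it (or both wrapped in one pipeline) for the reloaded model to be usable on raw data.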