## 1. Dataset Description (UCI ML Repository)

- **Dataset Name:** Wine Quality – Red Wine

- **Target Variable:** `quality` (integer score between 0 and 10)

- **Features:** 11 physicochemical properties, such as acidity, residual sugar, pH, and alcohol

- **Problem Type:** Regression

**We will implement:**

- Linear Regression

- Ridge Regression (L2 Regularization)

- Lasso Regression (L1 Regularization)

- ElasticNet Regression (L1 + L2)

## 2. Import Required Libraries

In [1]:
import pandas as pd
import numpy as np

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

from sklearn.linear_model import LinearRegression, Ridge, Lasso, ElasticNet
from sklearn.metrics import mean_squared_error, r2_score


## 3. Load the UCI Dataset

In [7]:
df = pd.read_csv("winequality-red.csv")
df.head()

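|  | fixed acidity | volatile acidity | citric acid | residual sugar | chlorides | free sulfur dioxide | total sulfur dioxide | density | pH | sulphates | alcohol | quality |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 7.4 | 0.7 | 0.0 | 1.9 | 0.076 | 11.0 | 34.0 | 0.9978 | 3.51 | 0.56 | 9.4 | 5 |
| 1 | 7.8 | 0.88 | 0.0 | 2.6 | 0.098 | 25.0 | 67.0 | 0.9968 | 3.2 | 0.68 | 9.8 | 5 |
| 2 | 7.8 | 0.76 | 0.04 | 2.3 | 0.092 | 15.0 | 54.0 | 0.997 | 3.26 | 0.65 | 9.8 | 5 |
| 3 | 11.2 | 0.28 | 0.56 | 1.9 | 0.075 | 17.0 | 60.0 | 0.998 | 3.16 | 0.58 | 9.8 | 6 |
| 4 | 7.4 | 0.7 | 0.0 | 1.9 | 0.076 | 11.0 | 34.0 | 0.9978 | 3.51 | 0.56 | 9.4 | 5 |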
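
One caveat when reproducing the load step: the copy of `winequality-red.csv` distributed by the UCI repository is semicolon-delimited, so a plain `pd.read_csv` call on that file would return a single wide column. The sketch below assumes that layout and uses a small in-memory sample (hypothetical, three columns only) in place of the real file.

```python
import io
import pandas as pd

# Hypothetical three-column excerpt in the semicolon-delimited layout
# used by the original UCI file; it stands in for the real CSV here.
sample = io.StringIO(
    '"fixed acidity";"pH";"quality"\n'
    "7.4;3.51;5\n"
    "7.8;3.2;5\n"
    "11.2;3.16;6\n"
)

# sep=";" tells pandas to split on semicolons instead of commas.
df_sample = pd.read_csv(sample, sep=";")

# Quick sanity checks worth running right after loading.
print(df_sample.shape)               # (rows, columns)
print(df_sample.isna().sum().sum())  # total number of missing values
print(df_sample.dtypes)
```

If the local copy is already comma-separated, as the output above suggests, the default separator works and `sep` can be omitted.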


## 4. Feature and Target Split

In [8]:
X = df.drop("quality", axis=1)
y = df["quality"]

In [9]:
X

|  | fixed acidity | volatile acidity | citric acid | residual sugar | chlorides | free sulfur dioxide | total sulfur dioxide | density | pH | sulphates | alcohol |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 7.4 | 0.700 | 0.00 | 1.9 | 0.076 | 11.0 | 34.0 | 0.99780 | 3.51 | 0.56 | 9.4 |
| 1 | 7.8 | 0.880 | 0.00 | 2.6 | 0.098 | 25.0 | 67.0 | 0.99680 | 3.20 | 0.68 | 9.8 |
| 2 | 7.8 | 0.760 | 0.04 | 2.3 | 0.092 | 15.0 | 54.0 | 0.99700 | 3.26 | 0.65 | 9.8 |
| 3 | 11.2 | 0.280 | 0.56 | 1.9 | 0.075 | 17.0 | 60.0 | 0.99800 | 3.16 | 0.58 | 9.8 |
| 4 | 7.4 | 0.700 | 0.00 | 1.9 | 0.076 | 11.0 | 34.0 | 0.99780 | 3.51 | 0.56 | 9.4 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1594 | 6.2 | 0.600 | 0.08 | 2.0 | 0.090 | 32.0 | 44.0 | 0.99490 | 3.45 | 0.58 | 10.5 |
| 1595 | 5.9 | 0.550 | 0.10 | 2.2 | 0.062 | 39.0 | 51.0 | 0.99512 | 3.52 | 0.76 | 11.2 |
| 1596 | 6.3 | 0.510 | 0.13 | 2.3 | 0.076 | 29.0 | 40.0 | 0.99574 | 3.42 | 0.75 | 11.0 |
| 1597 | 5.9 | 0.645 | 0.12 | 2.0 | 0.075 | 32.0 | 44.0 | 0.99547 | 3.57 | 0.71 | 10.2 |


In [10]:
y

0       5
1       5
2       5
3       6
4       5
       ..
1594    5
1595    6
1596    6
1597    5
1598    6
Name: quality, Length: 1599, dtype: int64

## 5. Train-Test Split

In [11]:
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
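
To make the split concrete: with `test_size=0.2` and the 1599 rows shown above, scikit-learn holds out 320 rows and trains on the remaining 1279. The sketch below checks this on a placeholder array of the same length (the array contents are made up; only the row count matters).

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data with the same number of rows as the wine dataset.
X_demo = np.arange(1599 * 2).reshape(1599, 2)
y_demo = np.arange(1599)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

# The test size is rounded up: ceil(0.2 * 1599) = 320.
print(X_tr.shape, X_te.shape)  # (1279, 2) (320, 2)
```

Fixing `random_state` makes the split reproducible, so the metrics reported in the following sections can be regenerated exactly.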

## 6. Feature Scaling (Important for Regularization)

In [12]:
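scaler = StandardScaler()

# Fit the scaler on the training split only, then apply the same
# transformation to the test split to avoid data leakage.
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)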
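
One way to make the fit-on-train, transform-on-test ordering hard to get wrong is to wrap the scaler and the model in a pipeline. The sketch below does this on synthetic data (`make_regression` stands in for the wine features, which is an assumption of the example) and compares the result with the manual steps used in the cell above.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the wine data: 11 features, noisy linear target.
X_syn, y_syn = make_regression(
    n_samples=500, n_features=11, noise=10.0, random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_syn, y_syn, test_size=0.2, random_state=42
)

# fit() learns the scaling from X_tr only; score() reuses it on X_te.
pipe = make_pipeline(StandardScaler(), Ridge(alpha=1.0))
pipe.fit(X_tr, y_tr)
r2_pipe = pipe.score(X_te, y_te)
print("Pipeline test R2:", r2_pipe)

# Equivalent manual steps, matching the cell above.
scaler_demo = StandardScaler()
X_tr_s = scaler_demo.fit_transform(X_tr)
X_te_s = scaler_demo.transform(X_te)
r2_manual = Ridge(alpha=1.0).fit(X_tr_s, y_tr).score(X_te_s, y_te)
print("Manual test R2:  ", r2_manual)
```

Both routes should give the same score; the pipeline simply removes the chance of calling `fit_transform` on the test split by mistake.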

## 7. Linear Regression Model

In [13]:
lr = LinearRegression()
lr.fit(X_train_scaled, y_train)

y_pred_lr = lr.predict(X_test_scaled)

print("Linear Regression")
print("MSE:", mean_squared_error(y_test, y_pred_lr))
print("R2 Score:", r2_score(y_test, y_pred_lr))

Linear Regression
MSE: 0.39002514396395493
R2 Score: 0.4031803412796219
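
Because MSE is expressed in squared quality points, its square root (RMSE) is often easier to read. A small sketch using the MSE printed above:

```python
import math

# Test-set MSE reported by the linear regression cell above.
mse_lr = 0.39002514396395493

# RMSE is in the same units as the target (quality points).
rmse_lr = math.sqrt(mse_lr)
print(round(rmse_lr, 4))  # 0.6245
```

In other words, the predictions are off by roughly 0.6 quality points in the root-mean-square sense.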


## 8. Ridge Regression (L2 Regularization)

In [14]:
ridge = Ridge(alpha=1.0)
ridge.fit(X_train_scaled, y_train)

y_pred_ridge = ridge.predict(X_test_scaled)

print("Ridge Regression")
print("MSE:", mean_squared_error(y_test, y_pred_ridge))
print("R2 Score:", r2_score(y_test, y_pred_ridge))

Ridge Regression
MSE: 0.39003800591460774
R2 Score: 0.4031606598177524
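
The Ridge metrics are almost identical to those of plain linear regression, which suggests that `alpha=1.0` applies very little shrinkage on this data. The sketch below, run on synthetic data (an assumption; these are not the wine features), illustrates how larger `alpha` values shrink the coefficient vector.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.preprocessing import StandardScaler

X_syn, y_syn = make_regression(
    n_samples=300, n_features=11, noise=5.0, random_state=0
)
X_syn = StandardScaler().fit_transform(X_syn)

# L2 norm of the fitted coefficients for increasing penalty strength.
alphas = [0.1, 1.0, 10.0, 100.0, 1000.0]
norms = [
    float(np.linalg.norm(Ridge(alpha=a).fit(X_syn, y_syn).coef_))
    for a in alphas
]
for a, n in zip(alphas, norms):
    print(f"alpha={a:>7}: ||coef|| = {n:.3f}")
```

Repeating the same loop with `X_train_scaled` and `y_train` would show how strongly `alpha` has to grow before Ridge departs from the baseline on the wine data.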


## 9. Lasso Regression (L1 Regularization)

In [15]:
lasso = Lasso(alpha=0.01)
lasso.fit(X_train_scaled, y_train)

y_pred_lasso = lasso.predict(X_test_scaled)

print("Lasso Regression")
print("MSE:", mean_squared_error(y_test, y_pred_lasso))
print("R2 Score:", r2_score(y_test, y_pred_lasso))

Lasso Regression
MSE: 0.39301501591123317
R2 Score: 0.39860521482224376
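
What distinguishes Lasso from Ridge is that the L1 penalty can set coefficients exactly to zero. The sketch below counts zeroed coefficients at several `alpha` values on synthetic data in which only 4 of the 11 features are informative (the data and the `alpha` grid are assumptions chosen for illustration).

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

X_syn, y_syn = make_regression(
    n_samples=300, n_features=11, n_informative=4,
    noise=5.0, random_state=0,
)
X_syn = StandardScaler().fit_transform(X_syn)

# Count coefficients that Lasso sets exactly to zero as alpha grows.
zero_counts = {}
for alpha in [0.01, 1.0, 10.0, 1e6]:
    model = Lasso(alpha=alpha, max_iter=10_000).fit(X_syn, y_syn)
    zero_counts[alpha] = int(np.sum(model.coef_ == 0.0))
    print(f"alpha={alpha}: {zero_counts[alpha]} of 11 coefficients are zero")
```

For the model fitted above, `np.sum(lasso.coef_ == 0)` gives the corresponding count on the wine features.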


## 10. ElasticNet Regression (L1 + L2)

In [16]:
elastic = ElasticNet(alpha=0.01, l1_ratio=0.5)
elastic.fit(X_train_scaled, y_train)

y_pred_elastic = elastic.predict(X_test_scaled)

print("ElasticNet Regression")
print("MSE:", mean_squared_error(y_test, y_pred_elastic))
print("R2 Score:", r2_score(y_test, y_pred_elastic))

ElasticNet Regression
MSE: 0.3922395521482107
R2 Score: 0.3997918358018384
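
In scikit-learn's `ElasticNet`, `alpha` sets the overall penalty strength and `l1_ratio` sets how much of it goes to the L1 term, so `l1_ratio=0.5` mixes the two penalties. As a sanity check on that reading, the sketch below (synthetic data, an assumption) shows that `l1_ratio=1.0` reproduces Lasso.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet, Lasso
from sklearn.preprocessing import StandardScaler

X_syn, y_syn = make_regression(
    n_samples=300, n_features=11, noise=5.0, random_state=0
)
X_syn = StandardScaler().fit_transform(X_syn)

# With l1_ratio=1.0 the L2 term vanishes and ElasticNet reduces to Lasso.
enet_as_lasso = ElasticNet(alpha=0.01, l1_ratio=1.0).fit(X_syn, y_syn)
lasso_ref = Lasso(alpha=0.01).fit(X_syn, y_syn)
same_as_lasso = bool(np.allclose(enet_as_lasso.coef_, lasso_ref.coef_))
print("Matches Lasso:", same_as_lasso)

# With l1_ratio=0.5 part of the penalty is L2, so the fit differs.
enet_mixed = ElasticNet(alpha=0.01, l1_ratio=0.5).fit(X_syn, y_syn)
max_diff = float(np.abs(enet_mixed.coef_ - lasso_ref.coef_).max())
print("Max coefficient difference vs. Lasso:", max_diff)
```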


## 11. Model Comparison Summary

| Model             | Regularization | Purpose                                                   | Test MSE | Test R2 |
| ----------------- | -------------- | --------------------------------------------------------- | -------- | ------- |
| Linear Regression | None           | Baseline model                                            | 0.3900   | 0.4032  |
| Ridge             | L2             | Reduces overfitting, keeps all features                   | 0.3900   | 0.4032  |
| Lasso             | L1             | Feature selection by shrinking some coefficients to zero  | 0.3930   | 0.3986  |
| ElasticNet        | L1 + L2        | Balance between Ridge and Lasso                           | 0.3922   | 0.3998  |
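
A comparison like the one above is easiest to keep consistent when every model goes through the same fit-and-score loop. A minimal sketch, assuming synthetic data in place of the wine features and the same hyperparameters as the cells above:

```python
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet, Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X_syn, y_syn = make_regression(
    n_samples=500, n_features=11, noise=10.0, random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_syn, y_syn, test_size=0.2, random_state=42
)
scaler_demo = StandardScaler()
X_tr_s = scaler_demo.fit_transform(X_tr)
X_te_s = scaler_demo.transform(X_te)

models = {
    "Linear Regression": LinearRegression(),
    "Ridge": Ridge(alpha=1.0),
    "Lasso": Lasso(alpha=0.01),
    "ElasticNet": ElasticNet(alpha=0.01, l1_ratio=0.5),
}

# Fit each model on the same split and collect its test metrics.
rows = []
for name, model in models.items():
    pred = model.fit(X_tr_s, y_tr).predict(X_te_s)
    rows.append({
        "Model": name,
        "MSE": mean_squared_error(y_te, pred),
        "R2": r2_score(y_te, pred),
    })

results = pd.DataFrame(rows).set_index("Model")
print(results)
```

Swapping in `X_train_scaled`, `X_test_scaled`, `y_train`, and `y_test` from the earlier cells would produce the same comparison for the wine data.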


## 12. Key Learning Outcomes

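- Linear Regression serves as the baseline against which the regularized models are compared.

- Regularization penalizes large coefficients, which can reduce overfitting. On this dataset, with the `alpha` values used above, it made little difference: all four models reached a test R2 of about 0.40.

- Lasso can perform feature selection because its L1 penalty drives some coefficients exactly to zero.

- Feature scaling is strongly recommended for Ridge, Lasso, and ElasticNet: the penalty depends on coefficient magnitude, so features on different scales would otherwise be penalized unevenly.

- R2 Score measures the share of the target's variance that the model explains; MSE measures the average squared prediction error.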
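
The point about scaling can be checked directly: ordinary least squares gives the same predictions whatever units a feature is measured in, whereas a penalized model does not, because the penalty acts on coefficient size. A sketch on synthetic data (the data, the 0.001 rescaling factor, and `alpha=100` are illustrative assumptions):

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)
X_a = rng.normal(size=(200, 3))
y_demo = 3.0 * X_a[:, 0] - 2.0 * X_a[:, 1] + rng.normal(scale=0.5, size=200)

# Same data, but the first feature is expressed in different units.
X_b = X_a.copy()
X_b[:, 0] *= 0.001

def fitted_predictions(model_cls, X, **kwargs):
    """Fit on X and return in-sample predictions."""
    return model_cls(**kwargs).fit(X, y_demo).predict(X)

ols_same = bool(np.allclose(
    fitted_predictions(LinearRegression, X_a),
    fitted_predictions(LinearRegression, X_b),
))
ridge_same = bool(np.allclose(
    fitted_predictions(Ridge, X_a, alpha=100.0),
    fitted_predictions(Ridge, X_b, alpha=100.0),
))
print("OLS unchanged by rescaling:  ", ols_same)
print("Ridge unchanged by rescaling:", ridge_same)
```

Standardizing the features first removes this dependence on units, which is why the scaling step precedes the regularized models in this notebook.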
