In this use case, we'll explore Elastic Net Regression, which combines L1 (Lasso) and L2 (Ridge) regularization, to predict a wine-related target from chemical features. The code uses the wine dataset that ships with scikit-learn (load_wine). Its target is a cultivar class label (0, 1, or 2) rather than a quality rating, and the walkthrough treats that label as a numeric target stored in a column named QUALITY. The goal is to create an Elastic Net Regression model and evaluate its performance.

Step 1: Problem Definition

Our objective is to build a predictive model to estimate wine quality based on multiple chemical attributes while controlling overfitting using Elastic Net Regression.

Step 2: Data Collection and Preprocessing

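We will use the wine dataset returned by load_wine in scikit-learn's datasets module. It contains 178 samples with 13 chemical features each, and its target is one of three cultivar classes (0, 1, 2). This is not the separate UCI Wine Quality dataset, which is not bundled with scikit-learn.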

In [2]:
# Import necessary libraries
import numpy as np
import pandas as pd
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

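# Load scikit-learn's wine dataset (13 chemical features, cultivar class target)
wine = load_wine()

# Convert it to a DataFrame
data = pd.DataFrame(wine.data, columns=wine.feature_names)
# The target is the cultivar class (0, 1, 2); it is used here as a numeric target
data['QUALITY'] = wine.target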

# Separate features and target
X = data.drop('QUALITY', axis=1)
y = data['QUALITY']

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Standardize features (fit the scaler on the training set only, then reuse it on the test set)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
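
Before modelling, it can help to check what load_wine actually returns. The following is a minimal sketch, not part of the original notebook, that prints the dataset size and the distinct target values; the variable names are illustrative.

```python
import numpy as np
from sklearn.datasets import load_wine

# Load the same dataset used in the cell above
wine = load_wine()

# Number of rows (wines) and columns (chemical features)
n_samples, n_features = wine.data.shape

# Distinct target values and how many samples carry each one
classes, counts = np.unique(wine.target, return_counts=True)

print(f"Samples: {n_samples}, features: {n_features}")
print(f"Target values: {classes.tolist()}")
print(f"Samples per target value: {counts.tolist()}")
```

Because the target takes only three integer values, the regression below predicts a continuous approximation of a class label.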


Step 3: Elastic Net Regression

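Elastic Net Regression adds both an L1 (Lasso) and an L2 (Ridge) penalty to the least-squares loss. The L1 part can shrink some coefficients exactly to zero, which acts as feature selection, while the L2 part stabilizes the fit when features are correlated. We'll create an Elastic Net model with alpha=1.0 and l1_ratio=0.5 and train it on the standardized training features.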
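
For reference, the objective that scikit-learn's ElasticNet minimizes can be written as follows, where n is the number of training samples, w the coefficient vector, alpha corresponds to \alpha, and l1_ratio corresponds to \rho (the symbol \rho is chosen here for illustration):

```latex
\min_{w} \; \frac{1}{2n} \lVert y - Xw \rVert_2^2
  + \alpha \, \rho \, \lVert w \rVert_1
  + \frac{\alpha (1 - \rho)}{2} \lVert w \rVert_2^2
```

With \rho = 1 the penalty reduces to Lasso, and as \rho approaches 0 it approaches a pure Ridge penalty. Larger \alpha means stronger shrinkage overall.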

In [3]:
# Import Elastic Net Regression
from sklearn.linear_model import ElasticNet

# Create an Elastic Net model
elastic_net_model = ElasticNet(alpha=1.0, l1_ratio=0.5)

# Train the Elastic Net model
elastic_net_model.fit(X_train_scaled, y_train)
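
To see the feature-selection effect of the L1 penalty, one option is to inspect the fitted coefficients. The sketch below is self-contained and repeats the loading, splitting, and scaling steps from above with the same settings; the names coefficients and n_zero are illustrative.

```python
import pandas as pd
from sklearn.datasets import load_wine
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Recreate the training data exactly as in the preprocessing step
wine = load_wine()
X_train, X_test, y_train, y_test = train_test_split(
    wine.data, wine.target, test_size=0.2, random_state=42
)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)

# Fit the same Elastic Net configuration
model = ElasticNet(alpha=1.0, l1_ratio=0.5)
model.fit(X_train_scaled, y_train)

# Pair each coefficient with its feature name and count exact zeros
coefficients = pd.Series(model.coef_, index=wine.feature_names)
n_zero = int((coefficients == 0).sum())

print(coefficients.sort_values(key=abs, ascending=False))
print(f"Coefficients set exactly to zero: {n_zero} of {len(coefficients)}")
```

Features whose coefficient is exactly zero do not contribute to the predictions at this penalty strength.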


Step 4: Model Evaluation

We'll evaluate the Elastic Net model on the test data using Mean Squared Error (MSE) and R-squared (R2) score, which are common regression evaluation metrics.
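
For n test samples with true values y_i, predictions \hat{y}_i, and mean of the true values \bar{y}, the two metrics are defined as:

```latex
\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
\qquad
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
```

A lower MSE is better. An R^2 of 1 means perfect predictions, and an R^2 of 0 means the model does no better than always predicting \bar{y}.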

In [4]:
from sklearn.metrics import mean_squared_error, r2_score

# Predictions for the Elastic Net model
y_pred_elastic_net = elastic_net_model.predict(X_test_scaled)

# Calculate MSE and R2 for the Elastic Net model
mse_elastic_net = mean_squared_error(y_test, y_pred_elastic_net)
r2_elastic_net = r2_score(y_test, y_pred_elastic_net)

print("Elastic Net Regression:")
print(f"MSE: {mse_elastic_net}")
print(f"R-squared (R2): {r2_elastic_net}")


Elastic Net Regression:
MSE: 0.46678484011115856
R-squared (R2): 0.19979741695229936
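
An R2 of about 0.20 means the model explains roughly 20% of the variance in the test target. The fixed alpha=1.0 is a fairly strong penalty on standardized features and may be a contributor. A possible next step, sketched here under the assumption that cross-validated tuning is acceptable for this use case, is to let ElasticNetCV choose alpha and l1_ratio. The candidate l1_ratio values and the 5-fold setting are illustrative choices, not taken from the original notebook.

```python
from sklearn.datasets import load_wine
from sklearn.linear_model import ElasticNetCV
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Same data and split as in the walkthrough
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Scaling and model in one pipeline; ElasticNetCV searches an automatic
# grid of alpha values for each candidate l1_ratio using 5-fold CV
cv_model = make_pipeline(
    StandardScaler(),
    ElasticNetCV(l1_ratio=[0.1, 0.5, 0.9], cv=5, random_state=42),
)
cv_model.fit(X_train, y_train)

# Retrieve the fitted estimator to read the selected hyperparameters
enet = cv_model.named_steps["elasticnetcv"]
y_pred = cv_model.predict(X_test)

print(f"Selected alpha: {enet.alpha_:.4f}, l1_ratio: {enet.l1_ratio_}")
print(f"MSE: {mean_squared_error(y_test, y_pred):.4f}")
print(f"R-squared (R2): {r2_score(y_test, y_pred):.4f}")
```

The selected values and resulting scores depend on the data split, so they should be compared against the fixed-parameter results above rather than assumed to be better.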
