# MLP Regression for Car Fuel Efficiency 🚗💨

---

This notebook implements a complete **Deep Learning Regression** solution in PyTorch that predicts a continuous numerical value, **car fuel efficiency (Miles Per Gallon, or MPG)**, from structured, tabular data (the UCI Auto MPG dataset). It shows how standard preprocessing tools from `pandas` and `scikit-learn` combine with the model building and training concepts from **Chapter 13: Going Deeper**.

### 1. Advanced Data Preparation and Pipeline Setup 🧹

The notebook dedicates significant effort to preparing the messy, real-world data, a critical step often overlooked in simplified examples:

* **External Data Handling:** Data is first loaded and manipulated using **`pandas`** and **`numpy`**.
* **Handling Categorical Features:** The **`Origin`** feature is processed using **One-Hot Encoding** (`torch.nn.functional.one_hot`), which converts nominal values into binary vectors and prevents the model from inferring a spurious ordinal relationship between origins.
* **Feature Scaling (Standardization):** The continuous features (e.g., Horsepower, Weight) are preprocessed using **`sklearn.preprocessing.StandardScaler`**, fitted on the training split only and then applied to both splits. Standardization centers each feature at a mean of zero and scales it to unit variance, so features with large raw ranges (such as Weight) do not dominate the gradient updates. This promotes **faster, more stable convergence**.
* **Final PyTorch Pipeline:** The processed NumPy arrays are converted into **PyTorch Tensors**. These are then combined into a **`TensorDataset`** and wrapped in a **`DataLoader`** to efficiently stream mini-batches during training.
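
The standardization step can be illustrated on a tiny array. This is a minimal sketch, assuming two invented features that play the role of Horsepower and Weight; the names `X_train_demo`, `X_test_demo`, and `scaler_demo` are hypothetical and not part of the notebook. The scaler is fitted on the training rows only and reused for the test rows, so no test-set statistics leak into training.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Invented values for illustration: columns stand in for Horsepower and Weight
X_train_demo = np.array([[100.0, 2000.0],
                         [150.0, 3000.0],
                         [200.0, 4000.0]])
X_test_demo = np.array([[150.0, 5000.0]])

scaler_demo = StandardScaler()
scaler_demo.fit(X_train_demo)  # mean and std come from the training split only

X_train_scaled = scaler_demo.transform(X_train_demo)
X_test_scaled = scaler_demo.transform(X_test_demo)

print(X_train_scaled.mean(axis=0))  # approximately [0, 0]
print(X_train_scaled.std(axis=0))   # approximately [1, 1]
print(X_test_scaled)                # test rows can fall outside the training range
```

After scaling, both columns live on the same scale even though the raw weights were roughly twenty times larger than the raw horsepower values.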

### 2. Multilayer Perceptron (MLP) Architecture for Regression 📐

The model is specifically designed for a regression task:

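* **Architecture:** A **Multilayer Perceptron** (MLP) is built as a custom `nn.Module` subclass whose layers are held in an `nn.ModuleList`: two hidden layers (8 and 4 units by default), each followed by ReLU, applied to the multi-feature input vector produced by preprocessing.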
* **Output Unit:** Since the goal is to predict a single continuous value (MPG), the final layer of the network contains **only 1 output unit**.
* **Activation Function Choice:** Crucially, the final output layer **does not use a non-linear activation function** (like ReLU or Sigmoid). This ensures the model's output is not restricted to a specific range (like 0 to 1), allowing it to predict any continuous MPG value.
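
The architecture described above can also be sketched with `nn.Sequential`. This is an illustrative equivalent, not the notebook's own class; it assumes 9 input features (6 numeric columns plus 3 one-hot origin columns) and hidden layers of 8 and 4 units. The names `model_demo` and `x_demo` are hypothetical.

```python
import torch
from torch import nn

n_features = 9  # assumption: 6 numeric columns + 3 one-hot origin columns

model_demo = nn.Sequential(
    nn.Linear(n_features, 8), nn.ReLU(),
    nn.Linear(8, 4), nn.ReLU(),
    nn.Linear(4, 1),  # single linear output unit, no activation afterwards
)

x_demo = torch.randn(5, n_features)  # a random batch of 5 rows
out = model_demo(x_demo)
n_params = sum(p.numel() for p in model_demo.parameters())

print(out.shape)  # torch.Size([5, 1])
print(n_params)   # 121 = (9*8 + 8) + (8*4 + 4) + (4*1 + 1)
```

Because the last layer is a plain `nn.Linear`, its output is an unbounded real number, which is what a regression target such as MPG requires.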

### 3. Training and Evaluation for Regression Metrics 📉

The training process uses specialized loss functions and rigorous evaluation protocols:

* **Loss Function:** The primary loss used is **`nn.MSELoss` (Mean Squared Error)**, the standard objective for regression problems. Because each error is squared, the penalty grows quadratically: an error of 4 MPG costs 16 times as much as an error of 1 MPG, so training is driven mainly by the largest errors.
* **Evaluation Metrics (MAE):** The final performance on the test set is also measured using **`nn.L1Loss` (Mean Absolute Error, MAE)**. This metric provides a more **interpretable** average error (e.g., "The model is off by 2.1 MPG on average"), as it avoids the squaring operation of MSE.
* **Evaluation Best Practices:** The notebook strictly adheres to evaluation best practices for accurate testing:
    * **`model.eval()`:** Sets the model to evaluation mode, which turns off behaviors specific to training (like `Dropout` or `BatchNorm` if they were included).
    * **`with torch.no_grad():`:** Disables gradient tracking by PyTorch's **autograd engine**, which is unnecessary during inference. This saves memory and speeds up evaluation.
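
The difference between the two metrics, and the effect of `torch.no_grad()`, can be checked on a few numbers. This is a minimal sketch with invented predictions and targets; `pred_demo`, `target_demo`, and `layer` are hypothetical names used only here.

```python
import torch
from torch import nn

# Invented predictions and targets (MPG), chosen so that one error is much larger
pred_demo = torch.tensor([20.0, 25.0, 30.0])
target_demo = torch.tensor([21.0, 24.0, 26.0])  # errors: -1, +1, +4

mse = nn.MSELoss()(pred_demo, target_demo)  # (1 + 1 + 16) / 3
mae = nn.L1Loss()(pred_demo, target_demo)   # (1 + 1 + 4) / 3

print(mse.item())  # 6.0
print(mae.item())  # 2.0

# Inside torch.no_grad(), results are not tracked by autograd
layer = nn.Linear(3, 1)
with torch.no_grad():
    y_no_grad = layer(pred_demo)
y_tracked = layer(pred_demo)
print(y_no_grad.requires_grad, y_tracked.requires_grad)  # False True
```

The single 4 MPG error contributes 16 of the 18 squared-error units, which is why MSE reacts much more strongly to it than MAE does.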

This notebook provides a complete blueprint for solving regression tasks on structured data with PyTorch, using `pandas` and `scikit-learn` for preprocessing.

In [74]:
import torch
import sklearn
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from torch import nn, optim
from torch.nn.functional import one_hot
from torch.utils.data import DataLoader, TensorDataset

In [75]:
url = 'http://archive.ics.uci.edu/ml/' \
       'machine-learning-databases/auto-mpg/auto-mpg.data'
column_names = ['MPG', 'Cylenders', 'Displacement', 'Horsepower',
                'Weight', 'Acceleration', 'Model Year', 'Origin']
df = pd.read_csv(url, names= column_names, na_values= '?', 
                 comment= '\t', sep=" ", skipinitialspace= True)

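
The effect of the `read_csv` arguments can be checked offline on a small in-memory sample. This is a sketch under assumptions: the two rows below are illustrative and follow the assumed layout of `auto-mpg.data` (space-separated numbers, then a tab and a quoted car name, with `?` marking a missing value). The names `sample`, `cols`, and `df_demo` are hypothetical.

```python
import io
import pandas as pd

# Two illustrative rows in the assumed file layout
sample = (
    '18.0   8   307.0      130.0      3504.      12.0   70  1\t"car a"\n'
    '25.0   4   98.00      ?          2046.      19.0   71  1\t"car b"\n'
)
cols = ['MPG', 'Cylenders', 'Displacement', 'Horsepower',
        'Weight', 'Acceleration', 'Model Year', 'Origin']

# comment='\t' discards everything from the tab onward (the car name);
# skipinitialspace=True collapses the runs of spaces between numbers
df_demo = pd.read_csv(io.StringIO(sample), names=cols, na_values='?',
                      comment='\t', sep=' ', skipinitialspace=True)

print(df_demo.shape)                       # (2, 8): the car name is dropped
print(df_demo['Horsepower'].isna().sum())  # 1: '?' was read as NaN
print(len(df_demo.dropna()))               # 1: the row with NaN is removed
```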

In [93]:
# Drop NA values
df = df.dropna()
df = df.reset_index(drop= True)

In [94]:
df_train, df_test = train_test_split(df, train_size= 0.8, random_state= 28)

In [95]:
train_stats = df_train.describe().transpose()

In [96]:
numeric_columns = ['Cylenders', 'Displacement', 'Horsepower', 'Weight', 'Acceleration']
scaler = StandardScaler()
scaler.fit(df_train[numeric_columns])
df_train_norm = df_train.copy()
df_test_norm = df_test.copy()
df_train_norm[numeric_columns] = scaler.transform(df_train_norm[numeric_columns])
df_test_norm[numeric_columns] = scaler.transform(df_test_norm[numeric_columns])

In [97]:
df_train_norm.tail()

| | MPG | Cylenders | Displacement | Horsepower | Weight | Acceleration | Model Year | Origin |
|---|---|---|---|---|---|---|---|---|
| 259 | 18.1 | 0.296585 | 0.608152 | 0.403349 | 0.513356 | -0.16052 | 78 | 1 |
| 32 | 19.0 | 0.296585 | 0.357974 | -0.114592 | -0.405942 | -0.924528 | 71 | 1 |
| 278 | 21.5 | 0.296585 | 0.348352 | 0.273863 | 0.317887 | -0.051376 | 79 | 1 |
| 5 | 15.0 | 1.464272 | 2.25355 | 2.423318 | 1.616277 | -2.015968 | 70 | 1 |
| 257 | 20.8 | 0.296585 | 0.050063 | -0.503048 | 0.110571 | 0.421582 | 78 | 1 |


In [98]:
# Bucket the model year into 4 ordinal bins: <73, 73-75, 76-78, >=79
boundaries = torch.tensor([73, 76, 79])
v1 = torch.tensor(df_train_norm['Model Year'].values)
df_train_norm['Model Year Bucketed'] = torch.bucketize(
            v1, boundaries, right= True
)
v2 = torch.tensor(df_test_norm['Model Year'].values)
df_test_norm['Model Year Bucketed'] = torch.bucketize(
            v2, boundaries, right= True
)
# Treat the bucket index as an additional numeric feature
numeric_columns.append('Model Year Bucketed')
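
How `torch.bucketize` assigns values that fall exactly on a boundary depends on `right`. The sketch below uses the same three boundaries as the cell above on a handful of invented years; `boundaries_demo` and `years` are hypothetical names.

```python
import torch

boundaries_demo = torch.tensor([73, 76, 79])
years = torch.tensor([70, 73, 75, 76, 78, 79, 82])

# right=True: a value equal to a boundary goes into the bucket above it
buckets = torch.bucketize(years, boundaries_demo, right=True)
print(buckets.tolist())  # [0, 1, 1, 2, 2, 3, 3]

# right=False (the default): a value equal to a boundary stays in the bucket below
buckets_left = torch.bucketize(years, boundaries_demo)
print(buckets_left.tolist())  # [0, 0, 1, 1, 2, 2, 3]
```

With `right=True`, model year 79 lands in bucket 3 and 78 in bucket 2.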

In [99]:
df_train_norm.tail()

| | MPG | Cylenders | Displacement | Horsepower | Weight | Acceleration | Model Year | Origin | Model Year Bucketed |
|---|---|---|---|---|---|---|---|---|---|
| 259 | 18.1 | 0.296585 | 0.608152 | 0.403349 | 0.513356 | -0.16052 | 78 | 1 | 2 |
| 32 | 19.0 | 0.296585 | 0.357974 | -0.114592 | -0.405942 | -0.924528 | 71 | 1 | 0 |
| 278 | 21.5 | 0.296585 | 0.348352 | 0.273863 | 0.317887 | -0.051376 | 79 | 1 | 3 |
| 5 | 15.0 | 1.464272 | 2.25355 | 2.423318 | 1.616277 | -2.015968 | 70 | 1 | 0 |
| 257 | 20.8 | 0.296585 | 0.050063 | -0.503048 | 0.110571 | 0.421582 | 78 | 1 | 2 |


In [100]:
# One-hot encode Origin; the modulo maps the origin codes into 0..total_origins-1
total_origins = len(set(df_train_norm['Origin']))
# num_classes fixes the encoding width so train and test always match,
# even if one split happens to miss a category
origin_train = one_hot(torch.from_numpy(df_train_norm['Origin'].values) % total_origins,
                       num_classes= total_origins)
X_train_numeric = torch.tensor(df_train_norm[numeric_columns].values)
X_train = torch.cat([X_train_numeric, origin_train], 1).float()
origin_test = one_hot(torch.from_numpy(df_test_norm['Origin'].values) % total_origins,
                      num_classes= total_origins)
X_test_numeric = torch.tensor(df_test_norm[numeric_columns].values)
X_test = torch.cat([X_test_numeric, origin_test], 1).float()
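
The encoding step can be traced on a few values. This is a minimal sketch, assuming the origin codes are 1, 2, and 3; `origin_demo`, `n_origins`, and `numeric_demo` are hypothetical names with invented contents.

```python
import torch
from torch.nn.functional import one_hot

origin_demo = torch.tensor([1, 2, 3, 1])  # assumed Origin codes 1, 2, 3
n_origins = 3

# 1 -> index 1, 2 -> index 2, 3 -> index 0 (because 3 % 3 == 0)
encoded = one_hot(origin_demo % n_origins, num_classes=n_origins)
print(encoded.tolist())
# [[0, 1, 0], [0, 0, 1], [1, 0, 0], [0, 1, 0]]

# Concatenate with one invented numeric feature column, then cast to float32
numeric_demo = torch.tensor([[0.5], [-1.0], [2.0], [0.0]])
features = torch.cat([numeric_demo, encoded], dim=1).float()
print(features.shape)  # torch.Size([4, 4])
```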

In [101]:
y_train = torch.tensor(df_train_norm['MPG'].values).float()
y_test = torch.tensor(df_test_norm['MPG'].values).float()

In [102]:
train_ds = TensorDataset(X_train, y_train)
batch_size = 8
torch.manual_seed(28)
train_dl = DataLoader(train_ds, batch_size, shuffle= True)
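
What the `DataLoader` yields per epoch can be seen on a toy dataset. This is a sketch with invented tensors (`X_demo`, `y_demo`) and a batch size of 4 chosen only to make the uneven last batch visible.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

X_demo = torch.arange(20, dtype=torch.float32).reshape(10, 2)  # 10 rows, 2 features
y_demo = torch.arange(10, dtype=torch.float32)

ds_demo = TensorDataset(X_demo, y_demo)
torch.manual_seed(0)  # makes the shuffling order reproducible
dl_demo = DataLoader(ds_demo, batch_size=4, shuffle=True)

batch_sizes = [len(yb) for xb, yb in dl_demo]
print(len(dl_demo))  # 3 batches per epoch
print(batch_sizes)   # [4, 4, 2]: the last batch is smaller
```

`len(dl_demo)` counts batches, not samples, which is why the training loop divides the accumulated loss by `len(train_dl)` to report a per-batch average.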

In [103]:
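class MLP(nn.Module):
    """Fully connected regressor: Linear + ReLU hidden layers, then one linear output unit."""
    # A tuple default avoids sharing one mutable list between instances
    def __init__(self, input_size, hidden_units= (8, 4)):
        super().__init__()
        layers = []
        for hidden_unit in hidden_units:
            layers += [nn.Linear(input_size, hidden_unit), nn.ReLU()]
            input_size = hidden_unit
        # Output layer: a single unit with no activation (regression)
        layers.append(nn.Linear(input_size, 1))
        self.module_list = nn.ModuleList(layers)
    def forward(self, x):
        for layer in self.module_list:
            x = layer(x)
        return x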

In [104]:
model = MLP(X_train.shape[1])

In [105]:
loss_fn = nn.MSELoss(reduction= 'mean')
optimizer = optim.RMSprop(model.parameters(), lr= 0.001, momentum= 0.9)

In [106]:
num_epochs = 200
log_epochs = 20
torch.manual_seed(28)
for epoch in range(num_epochs):
    model.train()
    loss_hist_train = 0
    for x_batch, y_batch in train_dl:
        pred = model(x_batch)[:, 0]
        loss = loss_fn(pred, y_batch)
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        loss_hist_train += loss.item()
    if epoch % log_epochs == 0:
        print(f'Epoch {epoch}    Loss {loss_hist_train / len(train_dl):.4f}')

Epoch 0    Loss 308.3709
Epoch 20    Loss 8.1058
Epoch 40    Loss 7.3740
Epoch 60    Loss 8.1377
Epoch 80    Loss 6.9075
Epoch 100    Loss 6.7277
Epoch 120    Loss 6.4086
Epoch 140    Loss 6.6444
Epoch 160    Loss 6.7304
Epoch 180    Loss 6.5976
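
The `[:, 0]` indexing in the training loop matters: the model returns shape `(batch, 1)` while the targets have shape `(batch,)`. The sketch below, using invented tensors `out_2d` and `target_1d`, shows what happens when the shapes are left mismatched.

```python
import warnings
import torch
from torch import nn

loss_demo = nn.MSELoss()
out_2d = torch.tensor([[1.0], [2.0], [3.0]])  # model-style output, shape (3, 1)
target_1d = torch.tensor([1.0, 2.0, 3.0])     # targets, shape (3,)

good = loss_demo(out_2d[:, 0], target_1d)     # shapes match: (3,) vs (3,)

with warnings.catch_warnings():
    warnings.simplefilter('ignore')           # PyTorch warns about the size mismatch
    bad = loss_demo(out_2d, target_1d)        # broadcasts to a (3, 3) grid of differences

print(good.item())  # 0.0
print(bad.item())   # about 1.3333, although every prediction is exact
```

Without the indexing, every prediction is compared with every target, so the reported loss no longer measures per-sample error.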


In [107]:
model.eval()
with torch.no_grad():
    pred = model(X_test.float())[:, 0]
    loss = loss_fn(pred, y_test)
    print(f'MSE Loss: {loss.item():.4f}')
    print(f'MAE Loss: {nn.L1Loss()(pred, y_test)}')

MSE Loss: 10.0369
MAE Loss: 2.187140464782715
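
MSE is expressed in squared MPG, so taking its square root (RMSE) puts it on the same scale as the MAE. This is a small worked calculation using the test MSE printed above; `mse_test` and `rmse_test` are hypothetical names.

```python
import math

mse_test = 10.0369  # test MSE reported above, in MPG squared
rmse_test = math.sqrt(mse_test)
print(f'RMSE: {rmse_test:.4f}')  # RMSE: 3.1681
```

An RMSE of about 3.17 MPG against an MAE of about 2.19 MPG suggests that a few test cars have noticeably larger errors than the typical one, since squaring gives those errors extra weight.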
