**Goal**: Predict the price of backpacks from their attributes.

**Agenda**: Experiment with **TabNet** (a neural network architecture designed for tabular data), tuned with **Optuna** (a hyperparameter optimization framework that automates the search for good hyperparameters in machine learning models).

*Note: This project was done as part of the Kaggle competition https://www.kaggle.com/competitions/playground-series-s5e2/overview*

**Import Packages**

In [None]:
!pip install pytorch-tabnet



In [None]:
import pandas as pd
import numpy as np
import torch
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.metrics import mean_absolute_error, mean_squared_error
from pytorch_tabnet.tab_model import TabNetRegressor

**Load train set**

In [None]:
df = pd.read_csv('train.csv')
df

Unnamed: 0,id,Brand,Material,Size,Compartments,Laptop Compartment,Waterproof,Style,Color,Weight Capacity (kg),Price
0,0,Jansport,Leather,Medium,7.0,Yes,No,Tote,Black,11.611723,112.15875
1,1,Jansport,Canvas,Small,10.0,Yes,Yes,Messenger,Green,27.078537,68.88056
2,2,Under Armour,Leather,Small,2.0,Yes,No,Messenger,Red,16.643760,39.17320
3,3,Nike,Nylon,Small,8.0,Yes,No,Messenger,Green,12.937220,80.60793
4,4,Adidas,Canvas,Medium,1.0,Yes,Yes,Messenger,Green,17.749338,86.02312
...,...,...,...,...,...,...,...,...,...,...,...
299995,299995,Adidas,Leather,Small,9.0,No,No,Tote,Blue,12.730812,129.99749
299996,299996,Jansport,Leather,Large,6.0,No,Yes,Tote,Blue,26.633182,19.85819
299997,299997,Puma,Canvas,Large,9.0,Yes,Yes,Backpack,Pink,11.898250,111.41364
299998,299998,Adidas,Nylon,Small,1.0,No,Yes,Tote,Pink,6.175738,115.89080


In [None]:
df.describe(include='all')

Unnamed: 0,id,Brand,Material,Size,Compartments,Laptop Compartment,Waterproof,Style,Color,Weight Capacity (kg),Price
count,300000.0,290295,291653,293405,300000.0,292556,292950,292030,290050,299862.0,300000.0
unique,,5,4,3,,2,2,3,6,,
top,,Adidas,Polyester,Medium,,Yes,Yes,Messenger,Pink,,
freq,,60077,79630,101906,,148342,148077,100031,51690,,
mean,149999.5,,,,5.44359,,,,,18.029994,81.411107
std,86602.684716,,,,2.890766,,,,,6.966914,39.03934
min,0.0,,,,1.0,,,,,5.0,15.0
25%,74999.75,,,,3.0,,,,,12.097867,47.38462
50%,149999.5,,,,5.0,,,,,18.068614,80.95612
75%,224999.25,,,,8.0,,,,,24.002375,115.01816


**The data contains NaN values. Missing numerical values are filled with the column median, and missing categorical values are replaced with a new category, 'Unknown'.**

In [None]:
# Fill missing numerical values with the median
# (assign the result back; calling fillna(..., inplace=True) on a column selection will stop working in pandas 3.0)
df["Weight Capacity (kg)"] = df["Weight Capacity (kg)"].fillna(df["Weight Capacity (kg)"].median())

# Verify that there are no missing values left
print(df["Weight Capacity (kg)"].isnull().sum())  # Should print 0

0


In [None]:
# List of categorical columns
cat_features = ["Brand", "Material", "Size", "Laptop Compartment", "Waterproof", "Style", "Color"]

# Fill missing categorical values with "Unknown"
df[cat_features] = df[cat_features].fillna("Unknown")

# Verify that no missing values remain
print(df[cat_features].isnull().sum())  # Should print all zeros


Brand                 0
Material              0
Size                  0
Laptop Compartment    0
Waterproof            0
Style                 0
Color                 0
dtype: int64
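
A common variant of the imputation above is to compute the fill value once from the training rows and reuse it for any later data, so that train and test are treated identically. The following is a minimal standard-library sketch of that idea, not the notebook's pandas code; `fit_median`, `fill_missing`, and the toy weight lists are hypothetical names and values used only for illustration.

```python
import math
import statistics


def fit_median(values):
    """Median of the non-missing entries (NaN marks a missing value)."""
    return statistics.median(v for v in values if not math.isnan(v))


def fill_missing(values, fill_value):
    """Replace every NaN with fill_value and leave other entries untouched."""
    return [fill_value if math.isnan(v) else v for v in values]


# Toy stand-ins for the "Weight Capacity (kg)" column (illustrative values only)
train_weights = [11.6, float("nan"), 27.1, 16.6]
test_weights = [float("nan"), 20.7]

train_median = fit_median(train_weights)                 # learned from training rows only
train_filled = fill_missing(train_weights, train_median)
test_filled = fill_missing(test_weights, train_median)   # same value reused on test rows
```

With pandas, the same pattern would amount to storing the training median in a variable and passing it to `fillna` for both DataFrames.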


**Label encoding the categorical features**

In [None]:
from sklearn.preprocessing import LabelEncoder

# Apply Label Encoding to categorical features
label_encoders = {}
for col in cat_features:
    le = LabelEncoder()
    df[col] = le.fit_transform(df[col])
    label_encoders[col] = le  # Store encoders for future use (e.g., test data encoding)

# Verify changes (Check unique values in each categorical column after encoding)
encoded_summary = {col: df[col].nunique() for col in cat_features}
print(encoded_summary)  # Display unique values per encoded categorical column


{'Brand': 6, 'Material': 5, 'Size': 4, 'Laptop Compartment': 3, 'Waterproof': 3, 'Style': 4, 'Color': 7}
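
Label encoding assigns integer codes to categories in sorted order, which is why 'Adidas' becomes 0 and 'Unknown' becomes 5 in the Brand column. The sketch below reproduces that convention with plain Python to show why the mapping learned on the training data should be reused on new data rather than refitted. It is an illustration under that assumption, not scikit-learn's implementation; `fit_label_mapping` and `encode` are hypothetical helpers.

```python
def fit_label_mapping(values):
    """Map each distinct category to an integer code in sorted order."""
    return {category: code for code, category in enumerate(sorted(set(values)))}


def encode(values, mapping):
    """Translate categories into their integer codes using a fitted mapping."""
    return [mapping[v] for v in values]


train_brand = ["Jansport", "Nike", "Unknown", "Adidas", "Puma", "Under Armour"]
new_brand = ["Puma", "Nike", "Adidas"]  # a later batch that lacks some categories

train_mapping = fit_label_mapping(train_brand)
reused_codes = encode(new_brand, train_mapping)   # consistent with training codes

refit_mapping = fit_label_mapping(new_brand)
refit_codes = encode(new_brand, refit_mapping)    # codes shift when refitted

print(reused_codes, refit_codes)
```

When the new batch happens to contain exactly the same set of categories as the training data, refitting yields the same codes, but that is a coincidence of the data rather than a guarantee.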


In [None]:
df

Unnamed: 0,id,Brand,Material,Size,Compartments,Laptop Compartment,Waterproof,Style,Color,Weight Capacity (kg),Price
0,0,1,1,1,7.0,2,0,2,0,11.611723,112.15875
1,1,1,0,2,10.0,2,2,1,3,27.078537,68.88056
2,2,4,1,2,2.0,2,0,1,5,16.643760,39.17320
3,3,2,2,2,8.0,2,0,1,3,12.937220,80.60793
4,4,0,0,1,1.0,2,2,1,3,17.749338,86.02312
...,...,...,...,...,...,...,...,...,...,...,...
299995,299995,0,1,2,9.0,0,0,2,1,12.730812,129.99749
299996,299996,1,1,0,6.0,0,2,2,1,26.633182,19.85819
299997,299997,3,0,0,9.0,2,2,0,4,11.898250,111.41364
299998,299998,0,2,2,1.0,0,2,2,4,6.175738,115.89080


In [None]:
# Check if GPU is available
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using device: {device}")

Using device: cpu


**Train test split**

In [None]:
from sklearn.model_selection import train_test_split
from pytorch_tabnet.tab_model import TabNetRegressor
import torch

# Define features and target
X = df.drop(columns=["id", "Price"])  # Drop ID & target column
y = df["Price"]

# Split data into training and validation sets (80% train, 20% validation)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

# Convert to NumPy arrays (TabNet requires NumPy input)
X_train, X_val, y_train, y_val = X_train.values, X_val.values, y_train.values, y_val.values

# Check if GPU is available
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using device: {device}")

# Initialize a TabNetRegressor on the selected device (GPU if available, otherwise CPU)
tabnet_model = TabNetRegressor(verbose=1, seed=42, device_name=device)

# Display dataset shapes
X_train.shape, X_val.shape, y_train.shape, y_val.shape


Using device: cpu




((240000, 9), (60000, 9), (240000,), (60000,))

In [None]:
# Ensure target variable is 2D: TabNetRegressor expects targets of shape (n_samples, n_targets)
y_train = y_train.reshape(-1, 1)
y_val = y_val.reshape(-1, 1)


In [None]:
# Display dataset shapes
X_train.shape, X_val.shape, y_train.shape, y_val.shape

((240000, 9), (60000, 9), (240000, 1), (60000, 1))

In [None]:
!pip install optuna


**Defining the model and training**

In [None]:
import optuna
from pytorch_tabnet.tab_model import TabNetRegressor

# Define Optuna objective function
def objective(trial):
    # Suggest hyperparameters
    learning_rate = trial.suggest_float("learning_rate", 1e-3, 1e-1, log=True)
    batch_size = trial.suggest_categorical("batch_size", [512, 1024, 2048])
    virtual_batch_size = trial.suggest_categorical("virtual_batch_size", [64, 128, 256])
    momentum = trial.suggest_float("momentum", 0.8, 0.98)
    gamma = trial.suggest_float("gamma", 0.90, 0.99)

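    # Initialize TabNet model
    # Note: in pytorch-tabnet, `momentum` is the batch-normalization momentum, and
    # `scheduler_params` is only used when a `scheduler_fn` is also supplied, so
    # `gamma` does not influence training as written.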
    model = TabNetRegressor(
        optimizer_params=dict(lr=learning_rate),
        momentum=momentum,
        scheduler_params={"gamma": gamma},
        verbose=0,
        seed=42,
        device_name=device
    )

    # Train model with suggested batch size
    model.fit(
        X_train, y_train,
        eval_set=[(X_val, y_val)],
        eval_metric=['rmse'],
        max_epochs=50,
        patience=10,
        batch_size=batch_size,  # Pass batch size in fit(), not init
        virtual_batch_size=virtual_batch_size,
        num_workers=0,
        drop_last=False
    )

    # Get best validation RMSE
    best_rmse = min(model.history["val_0_rmse"])
    return best_rmse

# Run Optuna study for 20 trials
study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=20)

# Best hyperparameters found
best_hyperparams = study.best_params
best_hyperparams


[I 2025-02-24 04:32:30,008] A new study created in memory with name: no-name-7dcb9c7b-5feb-4d08-906a-8bf081baa1d4



Early stopping occurred at epoch 12 with best_epoch = 2 and best_val_0_rmse = 38.9368


[I 2025-02-24 04:35:44,062] Trial 0 finished with value: 38.93680187111795 and parameters: {'learning_rate': 0.017829393162074582, 'batch_size': 2048, 'virtual_batch_size': 256, 'momentum': 0.8275892655751844, 'gamma': 0.91709882415862}. Best is trial 0 with value: 38.93680187111795.



Early stopping occurred at epoch 35 with best_epoch = 25 and best_val_0_rmse = 38.93515


[I 2025-02-24 04:48:52,151] Trial 1 finished with value: 38.93515275462701 and parameters: {'learning_rate': 0.020729972100857138, 'batch_size': 512, 'virtual_batch_size': 256, 'momentum': 0.8418065301859748, 'gamma': 0.9335568851046009}. Best is trial 1 with value: 38.93515275462701.



Early stopping occurred at epoch 12 with best_epoch = 2 and best_val_0_rmse = 38.94716


[I 2025-02-24 04:52:01,286] Trial 2 finished with value: 38.94716196794855 and parameters: {'learning_rate': 0.053528059793781335, 'batch_size': 2048, 'virtual_batch_size': 256, 'momentum': 0.8893485258566006, 'gamma': 0.9097756814250093}. Best is trial 1 with value: 38.93515275462701.



Early stopping occurred at epoch 15 with best_epoch = 5 and best_val_0_rmse = 38.94688


[I 2025-02-24 04:57:59,408] Trial 3 finished with value: 38.94688041740964 and parameters: {'learning_rate': 0.08848986743657732, 'batch_size': 512, 'virtual_batch_size': 256, 'momentum': 0.826262358646654, 'gamma': 0.9614889335739389}. Best is trial 1 with value: 38.93515275462701.



Early stopping occurred at epoch 41 with best_epoch = 31 and best_val_0_rmse = 38.93725


[I 2025-02-24 05:13:57,970] Trial 4 finished with value: 38.93724513665973 and parameters: {'learning_rate': 0.0010436665586248833, 'batch_size': 1024, 'virtual_batch_size': 64, 'momentum': 0.8931636880075331, 'gamma': 0.9290488592126784}. Best is trial 1 with value: 38.93515275462701.



Early stopping occurred at epoch 18 with best_epoch = 8 and best_val_0_rmse = 38.94662


[I 2025-02-24 05:19:53,044] Trial 5 finished with value: 38.946623215915075 and parameters: {'learning_rate': 0.057153322341511256, 'batch_size': 1024, 'virtual_batch_size': 128, 'momentum': 0.8933424322716528, 'gamma': 0.9078097621557966}. Best is trial 1 with value: 38.93515275462701.



Early stopping occurred at epoch 36 with best_epoch = 26 and best_val_0_rmse = 38.9269


[I 2025-02-24 05:33:39,266] Trial 6 finished with value: 38.926896249919224 and parameters: {'learning_rate': 0.00395272142308686, 'batch_size': 1024, 'virtual_batch_size': 64, 'momentum': 0.8726176766325433, 'gamma': 0.9833023835492274}. Best is trial 6 with value: 38.926896249919224.



Early stopping occurred at epoch 22 with best_epoch = 12 and best_val_0_rmse = 38.93145


[I 2025-02-24 05:39:21,929] Trial 7 finished with value: 38.93145331072282 and parameters: {'learning_rate': 0.0115813633505811, 'batch_size': 1024, 'virtual_batch_size': 256, 'momentum': 0.921048472706837, 'gamma': 0.955296623126487}. Best is trial 6 with value: 38.926896249919224.



Early stopping occurred at epoch 43 with best_epoch = 33 and best_val_0_rmse = 38.94158


[I 2025-02-24 05:54:16,173] Trial 8 finished with value: 38.941578834831645 and parameters: {'learning_rate': 0.07865049506104238, 'batch_size': 1024, 'virtual_batch_size': 64, 'momentum': 0.9011793343635086, 'gamma': 0.9136768647400269}. Best is trial 6 with value: 38.926896249919224.



Early stopping occurred at epoch 29 with best_epoch = 19 and best_val_0_rmse = 38.94899


[I 2025-02-24 06:07:18,833] Trial 9 finished with value: 38.94898788367777 and parameters: {'learning_rate': 0.043940080197535286, 'batch_size': 512, 'virtual_batch_size': 64, 'momentum': 0.8493558940279331, 'gamma': 0.9499929715401928}. Best is trial 6 with value: 38.926896249919224.


Stop training because you reached max_epochs = 50 with best_epoch = 42 and best_val_0_rmse = 38.93138


[I 2025-02-24 06:21:44,848] Trial 10 finished with value: 38.93138428580165 and parameters: {'learning_rate': 0.0029449421026226733, 'batch_size': 1024, 'virtual_batch_size': 128, 'momentum': 0.971970410851466, 'gamma': 0.985394448234559}. Best is trial 6 with value: 38.926896249919224.



Early stopping occurred at epoch 31 with best_epoch = 21 and best_val_0_rmse = 38.93253


[I 2025-02-24 06:31:19,184] Trial 11 finished with value: 38.93252958056647 and parameters: {'learning_rate': 0.0029894200902748024, 'batch_size': 1024, 'virtual_batch_size': 128, 'momentum': 0.965275151986799, 'gamma': 0.9888259185398024}. Best is trial 6 with value: 38.926896249919224.



Early stopping occurred at epoch 20 with best_epoch = 10 and best_val_0_rmse = 38.93318


[I 2025-02-24 06:37:35,575] Trial 12 finished with value: 38.93318330472819 and parameters: {'learning_rate': 0.0038280957358433876, 'batch_size': 1024, 'virtual_batch_size': 128, 'momentum': 0.9692029175399145, 'gamma': 0.9892820226644387}. Best is trial 6 with value: 38.926896249919224.



Early stopping occurred at epoch 29 with best_epoch = 19 and best_val_0_rmse = 38.93319


[I 2025-02-24 06:46:25,001] Trial 13 finished with value: 38.93319115142846 and parameters: {'learning_rate': 0.00386821492905031, 'batch_size': 1024, 'virtual_batch_size': 128, 'momentum': 0.9403139263857164, 'gamma': 0.9718000038140966}. Best is trial 6 with value: 38.926896249919224.



Early stopping occurred at epoch 23 with best_epoch = 13 and best_val_0_rmse = 38.93866


[I 2025-02-24 06:54:54,918] Trial 14 finished with value: 38.938658118898616 and parameters: {'learning_rate': 0.0013569527047118683, 'batch_size': 1024, 'virtual_batch_size': 64, 'momentum': 0.8704278235703561, 'gamma': 0.9727648900803277}. Best is trial 6 with value: 38.926896249919224.



Early stopping occurred at epoch 31 with best_epoch = 21 and best_val_0_rmse = 38.93123


[I 2025-02-24 07:02:35,718] Trial 15 finished with value: 38.931231994337686 and parameters: {'learning_rate': 0.006678288688013661, 'batch_size': 2048, 'virtual_batch_size': 128, 'momentum': 0.8051190694341224, 'gamma': 0.9761117451780605}. Best is trial 6 with value: 38.926896249919224.



Early stopping occurred at epoch 26 with best_epoch = 16 and best_val_0_rmse = 38.93323


[I 2025-02-24 07:10:49,299] Trial 16 finished with value: 38.93323267597578 and parameters: {'learning_rate': 0.008595553181276713, 'batch_size': 2048, 'virtual_batch_size': 64, 'momentum': 0.8019538623934657, 'gamma': 0.973351253869344}. Best is trial 6 with value: 38.926896249919224.



Early stopping occurred at epoch 20 with best_epoch = 10 and best_val_0_rmse = 38.93755


[I 2025-02-24 07:15:59,877] Trial 17 finished with value: 38.93754724495122 and parameters: {'learning_rate': 0.007063337164475159, 'batch_size': 2048, 'virtual_batch_size': 128, 'momentum': 0.8056701692394475, 'gamma': 0.9625959246914945}. Best is trial 6 with value: 38.926896249919224.



Early stopping occurred at epoch 36 with best_epoch = 26 and best_val_0_rmse = 38.93096


[I 2025-02-24 07:27:06,485] Trial 18 finished with value: 38.93095512217119 and parameters: {'learning_rate': 0.001821804571494995, 'batch_size': 2048, 'virtual_batch_size': 64, 'momentum': 0.8665070791242614, 'gamma': 0.9400222145908504}. Best is trial 6 with value: 38.926896249919224.


Stop training because you reached max_epochs = 50 with best_epoch = 42 and best_val_0_rmse = 38.92959


[I 2025-02-24 07:42:12,928] Trial 19 finished with value: 38.92958514287129 and parameters: {'learning_rate': 0.001724838148879855, 'batch_size': 2048, 'virtual_batch_size': 64, 'momentum': 0.8683255059736483, 'gamma': 0.9388456178115131}. Best is trial 6 with value: 38.926896249919224.


{'learning_rate': 0.00395272142308686,
 'batch_size': 1024,
 'virtual_batch_size': 64,
 'momentum': 0.8726176766325433,
 'gamma': 0.9833023835492274}
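
The learning rate in the study is searched with `log=True`, meaning values are drawn uniformly in log space so that each order of magnitude between 1e-3 and 1e-1 is explored equally. The following is a minimal standard-library sketch of that sampling idea only. It is not Optuna's sampler (Optuna's default is a model-based TPE strategy rather than independent random draws), and `sample_log_uniform` is a hypothetical helper.

```python
import math
import random


def sample_log_uniform(rng, low, high):
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
samples = [sample_log_uniform(rng, 1e-3, 1e-1) for _ in range(10_000)]

# About half of the draws fall below the geometric midpoint 1e-2, whereas
# plain uniform sampling on [1e-3, 1e-1] would put only about 9% there.
share_below_midpoint = sum(s < 1e-2 for s in samples) / len(samples)
print(f"share below 1e-2: {share_below_midpoint:.2f}")
```

This is why small learning rates such as 0.0039 are well represented among the trials even though the upper bound is 0.1.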

**Best validation RMSE**: **38.926896249919224** (trial 6).

**Best parameters** found by the study:
{'learning_rate': 0.00395272142308686,
 'batch_size': 1024,
 'virtual_batch_size': 64,
 'momentum': 0.8726176766325433,
 'gamma': 0.9833023835492274}
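
To put this RMSE in context, it helps to compare it with a naive baseline. A model that always predicts the mean price has an RMSE equal to the population standard deviation of the target, and `df.describe()` reports a Price standard deviation of 39.03934 on the training data. The best validation RMSE of about 38.93 is therefore only slightly below that baseline (a rough comparison, since the validation split has its own standard deviation). The sketch below demonstrates the identity on the first five Price values shown earlier; `rmse` is a hypothetical helper, not a library function.

```python
import math
import statistics


def rmse(y_true, y_pred):
    """Root mean squared error between two equally long sequences."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# First five Price values from the training table (toy sample for illustration)
prices = [112.15875, 68.88056, 39.17320, 80.60793, 86.02312]

mean_price = statistics.fmean(prices)
baseline_rmse = rmse(prices, [mean_price] * len(prices))  # always predict the mean
population_std = statistics.pstdev(prices)

print(baseline_rmse, population_std)  # the two values coincide
```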

In [None]:
# Merge training and validation data
X_full = np.concatenate([X_train, X_val], axis=0)
y_full = np.concatenate([y_train, y_val], axis=0)

print("✅ Full dataset prepared. Shape:", X_full.shape, y_full.shape)


✅ Full dataset prepared. Shape: (300000, 9) (300000, 1)


**Retraining the model on the best parameters on the full dataset**

In [None]:
from pytorch_tabnet.tab_model import TabNetRegressor
import torch

# Use CUDA if available, otherwise fall back to CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

# Initialize the model with the best hyperparameters from the Optuna study
best_model = TabNetRegressor(
    optimizer_params={"lr": 0.00395272142308686},
    momentum=0.8726176766325433,
    scheduler_params={"gamma": 0.9833023835492274},
    verbose=1,
    seed=42,
    device_name=device,
)

# Train the model on full dataset
best_model.fit(
    X_full, y_full,
    eval_set=[(X_full, y_full)],  # Same data as training, so the reported RMSE is a training score
    eval_metric=["rmse"],
    max_epochs=50,
    patience=10,  # Early stopping
    batch_size=1024,
    virtual_batch_size=64,
    num_workers=0,
    drop_last=False
)

print("✅ Model retrained successfully!")




epoch 0  | loss: 5123.22357| val_0_rmse: 42.22392|  0:00:35s
epoch 1  | loss: 1538.48358| val_0_rmse: 39.05486|  0:01:12s
epoch 2  | loss: 1526.04292| val_0_rmse: 39.03616|  0:01:47s
epoch 3  | loss: 1525.56161| val_0_rmse: 39.03932|  0:02:25s
epoch 4  | loss: 1525.30337| val_0_rmse: 39.03886|  0:03:03s
epoch 5  | loss: 1525.16626| val_0_rmse: 39.0514 |  0:03:39s
epoch 6  | loss: 1524.67991| val_0_rmse: 39.06746|  0:04:16s
epoch 7  | loss: 1524.67086| val_0_rmse: 39.03343|  0:04:52s
epoch 8  | loss: 1524.71268| val_0_rmse: 39.03933|  0:05:28s
epoch 9  | loss: 1524.53342| val_0_rmse: 39.03235|  0:06:04s
epoch 10 | loss: 1524.72431| val_0_rmse: 39.04306|  0:06:39s
epoch 11 | loss: 1524.3527| val_0_rmse: 39.02883|  0:07:14s
epoch 12 | loss: 1524.46711| val_0_rmse: 39.03111|  0:07:48s
epoch 13 | loss: 1524.20806| val_0_rmse: 39.03186|  0:08:22s
epoch 14 | loss: 1524.07471| val_0_rmse: 39.03939|  0:08:56s
epoch 15 | loss: 1523.95832| val_0_rmse: 39.02736|  0:09:30s
epoch 16 | loss: 1523.822



✅ Model retrained successfully!


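After retraining, the RMSE is slightly higher, at about 39.03. Note that this value is computed on the same full dataset the model was trained on (it is the `eval_set` passed to `fit`), so it is a training score rather than a held-out validation score.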

**Saving the model**

In [None]:
import torch

# ✅ Save best_model
model_path = "best_tabnet_model.pth"
torch.save(best_model, model_path)

print(f"✅ Model saved at {model_path}")


✅ Model saved at best_tabnet_model.pth


In [None]:
from google.colab import files
files.download("best_tabnet_model.pth")

**Load the saved model**

In [None]:
import torch
from pytorch_tabnet.tab_model import TabNetRegressor

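# ✅ Load best_model
# The file contains the whole pickled model object rather than a state_dict, so
# recent PyTorch versions need weights_only=False; only load files you trust.
loaded_model = torch.load("best_tabnet_model.pth", weights_only=False)
print("✅ Model loaded successfully!")


✅ Model loaded successfully!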


**Loading the unseen test dataset and preprocessing it the same way as the train dataset**

In [None]:
df_test = pd.read_csv('test.csv')
df_test

Unnamed: 0,id,Brand,Material,Size,Compartments,Laptop Compartment,Waterproof,Style,Color,Weight Capacity (kg)
0,300000,Puma,Leather,Small,2.0,No,No,Tote,Green,20.671147
1,300001,Nike,Canvas,Medium,7.0,No,Yes,Backpack,Green,13.564105
2,300002,Adidas,Canvas,Large,9.0,No,Yes,Messenger,Blue,11.809799
3,300003,Adidas,Nylon,Large,1.0,Yes,No,Messenger,Green,18.477036
4,300004,,Nylon,Large,2.0,Yes,Yes,Tote,Black,9.907953
...,...,...,...,...,...,...,...,...,...,...
199995,499995,Adidas,Canvas,Large,2.0,Yes,No,Messenger,Red,7.383498
199996,499996,Nike,Polyester,Small,9.0,No,Yes,Messenger,Pink,6.058394
199997,499997,Jansport,Nylon,Small,9.0,No,Yes,Tote,Green,26.890163
199998,499998,Puma,Nylon,Large,10.0,Yes,No,Tote,Gray,25.769153


In [None]:
df_test.describe(include='all')

Unnamed: 0,id,Brand,Material,Size,Compartments,Laptop Compartment,Waterproof,Style,Color,Weight Capacity (kg)
count,200000.0,193773,194387,195619,200000.0,195038,195189,194847,193215,199923.0
unique,,5,4,3,,2,2,3,6,
top,,Adidas,Polyester,Medium,,Yes,Yes,Messenger,Pink,
freq,,40173,53027,67775,,98659,98594,66387,34761,
mean,399999.5,,,,5.442855,,,,,17.993033
std,57735.171256,,,,2.88874,,,,,6.972079
min,300000.0,,,,1.0,,,,,5.0
25%,349999.75,,,,3.0,,,,,12.068875
50%,399999.5,,,,5.0,,,,,18.05475
75%,449999.25,,,,8.0,,,,,23.9657


In [None]:
# Fill missing numerical values with the median
# (assign the result back; calling fillna(..., inplace=True) on a column selection will stop working in pandas 3.0)
df_test["Weight Capacity (kg)"] = df_test["Weight Capacity (kg)"].fillna(df_test["Weight Capacity (kg)"].median())

# Verify that there are no missing values left
print(df_test["Weight Capacity (kg)"].isnull().sum())  # Should print 0


0


In [None]:
# List of categorical columns
cat_features = ["Brand", "Material", "Size", "Laptop Compartment", "Waterproof", "Style", "Color"]

# Fill missing categorical values with "Unknown"
df_test[cat_features] = df_test[cat_features].fillna("Unknown")

# Verify that no missing values remain
print(df_test[cat_features].isnull().sum())  # Should print all zeros


Brand                 0
Material              0
Size                  0
Laptop Compartment    0
Waterproof            0
Style                 0
Color                 0
dtype: int64


In [None]:
# Apply the label encoders fitted on the training data, so that each
# category receives the same integer code as it did during training
for col in cat_features:
    df_test[col] = label_encoders[col].transform(df_test[col])

# Verify changes (Check unique values in each categorical column after encoding)
encoded_summary = {col: df_test[col].nunique() for col in cat_features}
print(encoded_summary)  # Display unique values per encoded categorical column


{'Brand': 6, 'Material': 5, 'Size': 4, 'Laptop Compartment': 3, 'Waterproof': 3, 'Style': 4, 'Color': 7}


In [None]:
df_test

Unnamed: 0,id,Brand,Material,Size,Compartments,Laptop Compartment,Waterproof,Style,Color,Weight Capacity (kg)
0,300000,3,1,2,2.0,0,0,2,3,20.671147
1,300001,2,0,1,7.0,0,2,0,3,13.564105
2,300002,0,0,0,9.0,0,2,1,1,11.809799
3,300003,0,2,0,1.0,2,0,1,3,18.477036
4,300004,5,2,0,2.0,2,2,2,0,9.907953
...,...,...,...,...,...,...,...,...,...,...
199995,499995,0,0,0,2.0,2,0,1,5,7.383498
199996,499996,2,3,2,9.0,0,2,1,4,6.058394
199997,499997,1,2,2,9.0,0,2,2,3,26.890163
199998,499998,3,2,0,10.0,2,0,2,2,25.769153


In [None]:
test_data = df_test.copy()

if "id" in test_data.columns:
    test_data = test_data.drop(columns=["id"])

# Convert test data to NumPy array
X_test = test_data.to_numpy()


**Making the predictions**

In [None]:
predictions = loaded_model.predict(X_test)

In [None]:
predictions

array([[82.30394 ],
       [81.48038 ],
       [81.57865 ],
       ...,
       [82.10019 ],
       [82.09662 ],
       [82.511856]], dtype=float32)

In [None]:
# Build the submission from the test ids and the model predictions
submission = pd.DataFrame({
    "id": df_test["id"],  # Extract ID column
    "Price": predictions.flatten()  # Convert predictions to a single column
})

# Save the submission file
submission.to_csv("submission.csv", index=False)

print("✅ Submission file 'submission.csv' saved successfully!")

✅ Submission file 'submission.csv' saved successfully!


In [None]:
submission

Unnamed: 0,id,Price
0,300000,82.303940
1,300001,81.480377
2,300002,81.578651
3,300003,82.725151
4,300004,81.127029
...,...,...
199995,499995,80.153778
199996,499996,79.314926
199997,499997,82.100189
199998,499998,82.096619


**Conclusion: Training TabNet with Optuna and early stopping cuts off runs once the validation RMSE stops improving, which helps limit overfitting and saves computational time and resources by avoiding unnecessary training epochs.**
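
The early-stopping behaviour seen in the logs (for example, stopping at epoch 12 when the best epoch was 2 and `patience=10`) follows a simple patience rule: training stops once the monitored metric has gone `patience` epochs without improving. Below is a minimal standard-library sketch of that rule, written to be consistent with those log lines. It is not pytorch-tabnet's actual implementation, and `early_stopping_epoch` and the score list are hypothetical.

```python
def early_stopping_epoch(val_scores, patience):
    """Return (stop_epoch, best_epoch) for a metric that should be minimised.

    stop_epoch is None when the scores run out before patience is exhausted.
    """
    best_score = float("inf")
    best_epoch = -1
    for epoch, score in enumerate(val_scores):
        if score < best_score:
            # New best value: remember it and reset the waiting period
            best_score, best_epoch = score, epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs in a row: stop here
            return epoch, best_epoch
    return None, best_epoch


# Toy validation curve: improves until epoch 2, then plateaus at a worse value
scores = [5.0, 4.0, 3.0] + [3.5] * 20
stop_epoch, best_epoch = early_stopping_epoch(scores, patience=10)
print(stop_epoch, best_epoch)
```

In this toy curve, only 13 of the 23 available epochs are run, which is the source of the time savings noted in the conclusion.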