### 📥 Load the Wine Quality Dataset

We start by loading the **Wine Quality** dataset from the [UCI Machine Learning Repository](https://archive.ics.uci.edu/) using the `fetch_ucirepo` function. This dataset contains physicochemical properties of red and white wines, along with a quality score (typically between 0 and 10) assigned by wine tasters.

- `X` contains the input features (e.g., acidity, sugar, alcohol).

- `y` contains the target variable (wine quality score).


In [1]:
from ucimlrepo import fetch_ucirepo
import sys
import os
sys.path.append(os.path.abspath('..'))
from dropkan.DropKAN import DropKAN
from dropkan.DropKANLayer import DropKANLayer
import torch
import torch.nn as nn
import numpy as np
import random
import pandas as pd 
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import matplotlib.pyplot as plt
from sklearn.metrics import mean_absolute_error

def set_training_mode(module, mode):
    """Set the training mode for a module and all its sub-modules."""
    module.training = mode
    for submodule in module.children():
        set_training_mode(submodule, mode)

In [2]:
wine_quality = fetch_ucirepo(id=186) 
  
X = wine_quality.data.features 
y = wine_quality.data.targets 


### 🧪 Train-Test Split & Feature Scaling

We split the dataset into training and testing sets using an 80/20 split (`test_size=0.2`). A fixed `random_state` is used to ensure reproducibility.
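As a quick sanity check on the split sizes, the sketch below reproduces the rounding that scikit-learn applies when only a fractional `test_size` is given: the test count is rounded up and the remainder goes to training. The helper name `split_sizes` is hypothetical, and the row count of 6,497 is an assumption based on the combined red and white wine data.

```python
import math


def split_sizes(n_samples: int, test_size: float) -> tuple[int, int]:
    """Return (n_train, n_test) for a fractional test_size.

    Mirrors the rounding used when only test_size is given:
    the test count is rounded up, the rest goes to training.
    """
    n_test = math.ceil(test_size * n_samples)
    return n_samples - n_test, n_test


# Assumption: 6,497 rows (red and white wines combined).
n_train, n_test = split_sizes(6497, 0.2)
print(n_train, n_test)  # 5197 1300
```

Comparing these numbers with `X_train.shape[0]` and `X_test.shape[0]` after the split is a cheap way to confirm that the ratio in the code is the one intended.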

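After splitting, we apply **standardization** to the features using `StandardScaler`, which transforms each feature to have zero mean and unit variance. The scaler is fit on the training set only and then applied to the test set, so no test-set statistics leak into training.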
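To make the standardization step concrete, here is a minimal pure-Python sketch for a single feature column. It is not how `StandardScaler` is implemented internally; the helpers `fit_scaler` and `transform` and the sample values are hypothetical. It uses the population standard deviation, which is also what `StandardScaler` uses, and computes the statistics from the training values only.

```python
from statistics import fmean, pstdev


def fit_scaler(column):
    """Return (mean, std) of a training column, using the population std."""
    return fmean(column), pstdev(column)


def transform(column, mean, std):
    """Standardize values with previously computed statistics."""
    return [(v - mean) / std for v in column]


# Hypothetical alcohol values, purely illustrative.
train_alcohol = [9.4, 9.8, 10.5, 11.2, 12.1]
test_alcohol = [9.0, 13.0]

mean, std = fit_scaler(train_alcohol)
train_scaled = transform(train_alcohol, mean, std)
test_scaled = transform(test_alcohol, mean, std)  # reuses the training statistics

# The scaled training column has (numerically) zero mean and unit std.
print(round(fmean(train_scaled), 6), round(pstdev(train_scaled), 6))
```

The scaled test values will generally not have exactly zero mean or unit variance, which is expected because they are scaled with the training statistics.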


We prepare the dataset for training by converting the NumPy arrays (from `scikit-learn`) into PyTorch tensors. This is required for compatibility with PyTorch-based models like KAN and DropKAN.


In [3]:
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42  
)


scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)  
X_test = scaler.transform(X_test)        

dataset = {}
dataset['train_input'] = torch.from_numpy(X_train).float()
dataset['test_input'] = torch.from_numpy(X_test).float()
# Cast labels to float32 so they match the dtype of the model's predictions
dataset['train_label'] = torch.from_numpy(y_train.values).float()
dataset['test_label'] = torch.from_numpy(y_test.values).float()


### Training and Evaluating the DropKAN Model

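The DropKAN model is trained with the following settings:

- Architecture: one hidden layer with twice as many units as input features, and a single output
- Drop rate: 0.3 (`drop_mode='postact'`)
- Optimizer: Adam
- Loss function: L1 loss (Mean Absolute Error)
- Batch size: 32
- Learning rate: 0.01
- Number of epochs: 10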
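The training cell expresses the training length in optimizer steps rather than epochs, so the batch size and epoch count have to be converted into a step count. A minimal sketch of that conversion follows; `total_steps` is a hypothetical helper, and the training set size of 5,197 rows is an assumption (80% of 6,497 samples).

```python
def total_steps(n_train: int, batch: int, epochs: int) -> int:
    """Number of optimizer steps: full batches per epoch times epochs."""
    return (n_train // batch) * epochs


# Assumption: 5,197 training rows.
print(total_steps(5197, batch=32, epochs=10))  # 1620
```

Because of the floor division, a trailing partial batch is not counted toward the steps per epoch.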

After training, the model is evaluated on the test set, and the Mean Absolute Error (MAE) is recorded.
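For reference, MAE is the average absolute difference between the true and predicted scores. The sketch below computes it by hand on made-up values; the function name `mae` and the sample numbers are hypothetical and are not taken from the experiment.

```python
def mae(y_true, y_pred):
    """Mean of the absolute differences between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical quality scores and predictions, purely illustrative.
y_true = [5, 6, 7, 5]
y_pred = [5.5, 6.0, 6.0, 4.5]
print(mae(y_true, y_pred))  # 0.5
```

An MAE of 0.5 means the predictions are off by half a quality point on average, which gives a direct way to read the reported test score.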


In [6]:
drop_rate = 0.3
epochs = 10
batch = 32
steps = int(len(X_train) / batch) * epochs

# Initialize model
model = DropKAN(seed=0, width=[X_train.shape[1], X_train.shape[1]*2, 1], drop_rate=drop_rate, drop_mode='postact')

# Train
model.train(dataset, opt="Adam", steps=steps, batch=batch, lr=0.01, loss_fn=torch.nn.L1Loss())

# Evaluation
set_training_mode(model, False)

y_pred = model(dataset['test_input']).detach().numpy()
mae = mean_absolute_error(y_test, y_pred)
print(f"mode=DropKAN | test={mae:.4f}")



description: 100%|█████████████████████████████████████████████| 1620/1620 [00:15<00:00, 101.49it/s]

mode=DropKAN | test=0.5649



