<a href="https://colab.research.google.com/github/dineshRaja29/Radial-Basis-Function-Neural-Network/blob/main/rbfANN.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
# Shri Radhey

# <font color = 'green'> <b> INTRODUCTION

**GOAL:** Build an alternative to the standard feed-forward neural network for data that have radial symmetry rather than linear symmetry. Networks of this type are called radial basis function (RBF) networks.

**RBF**
- Radial basis function (RBF) network is an artificial neural network that uses radial basis functions as activation functions.
- The output of the network is a linear combination of radial basis functions of the inputs and neuron parameters.
- The RBF network is implemented here from scratch because, to the best of my knowledge, there is no standard implementation in the PyTorch library.
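
Written out, the linear combination described in the list above takes the following form, assuming Gaussian basis functions with a shared width parameter $\beta$ (the choice made by the layer implemented later in this notebook). The symbols are notation introduced here for illustration, not taken from the notebook:

$$
y_k(\mathbf{x}) = b_k + \sum_{j=1}^{H} w_{kj}\,\exp\!\left(-\beta\,\lVert \mathbf{x} - \mathbf{c}_j \rVert^2\right)
$$

Here $\mathbf{c}_j$ is the center of the $j$-th hidden unit, $H$ is the number of hidden units, and $w_{kj}$ and $b_k$ are the weights and bias of the linear output layer for class $k$.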


![RBF Network](https://upload.wikimedia.org/wikipedia/commons/thumb/d/d0/Rbf-network.svg/800px-Rbf-network.svg.png)


# <font color = 'green'> <b> PERFORMANCE METRIC
- Accuracy
- F1 Score
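
F1 score is listed as a metric, but the cells that follow report accuracy only. As a minimal sketch of how binary F1 could be derived from true and predicted labels, consider the following; the function name `binary_f1` and the toy labels are illustrative assumptions, not part of the notebook.

```python
def binary_f1(y_true, y_pred, positive=1):
    """F1 = 2 * precision * recall / (precision + recall) for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: define F1 as 0 to avoid dividing by zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy labels (hypothetical, for illustration only)
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
f1 = binary_f1(y_true, y_pred)
print(f"F1: {f1:.4f}")
```

In practice, `sklearn.metrics.f1_score` computes the same quantity from label arrays.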

# <font color = 'green'> <b> CONSTRAINTS
- Data have partial or full radial symmetry

# <font color = 'green'> <b> DATASET
- The data are generated with `make_circles` from scikit-learn: https://scikit-learn.org/stable/modules/generated/sklearn.datasets.make_circles.html
- This is a toy dataset consisting of a large circle containing a smaller circle in 2D.
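
The geometry can be checked directly. The sketch below generates a small noise-free sample (the variable names and the sample size are illustrative, not the settings used later in this notebook) and measures the distance of each class from the origin.

```python
import numpy as np
from sklearn.datasets import make_circles

# noise=0.0 puts every point exactly on its circle, which makes the geometry easy to check
X_demo, y_demo = make_circles(n_samples=200, noise=0.0, random_state=0)
radii = np.linalg.norm(X_demo, axis=1)

outer_radius = radii[y_demo == 0].mean()  # label 0: outer circle
inner_radius = radii[y_demo == 1].mean()  # label 1: inner circle
print(f"outer: {outer_radius:.2f}, inner: {inner_radius:.2f}")
```

The inner radius is set by the `factor` argument of `make_circles`, which defaults to 0.8. The two classes therefore differ only in their distance from the origin.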

# <font color = 'green'> <b> MODEL BUILDING

In [2]:
# import the necessary libraries
import numpy as np
import torch
import torch.nn as nn
import torch.nn.init as init
import torch.optim as optim
from prettytable import PrettyTable
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from torch.utils.data import DataLoader, TensorDataset

In [3]:
# config variables
INPUT_DIM = 2
OUTPUT_DIM = 2
SPLIT_RATIO = 0.2
L2_PENALTY = 0.001
LEARNING_RATE = 0.01
BATCH_SIZE = 8
NUM_OF_EPOCHS = 50
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(f'NOTE: Most of the computation runs on {device}')

NOTE: Most of the computation runs on cuda:0


In [4]:
# utility function
def calculate_accuracy(loader, model):
    """Return the fraction of samples in `loader` that `model` classifies correctly."""

    model.eval()  # Set the model to evaluation mode

    with torch.no_grad():

        correct = 0
        total = 0

        for batch_X, batch_y in loader:
            batch_X = batch_X.to(device)
            batch_y = batch_y.to(device)
            outputs = model(batch_X)
            _, predicted = torch.max(outputs, 1)
            total += batch_y.size(0)
            correct += (predicted == batch_y).sum().item()

        accuracy = correct / total

    return accuracy


In [5]:
# data loading and converting to PyTorch format
scaler = StandardScaler()

X, y = make_circles(n_samples=10000, noise=0.05, random_state=42)

# Split the dataset into training and testing sets (stratified by class)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = SPLIT_RATIO, stratify = y)

# standardize data without leakage
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Convert NumPy arrays to PyTorch tensors
X_train = torch.FloatTensor(X_train)
y_train = torch.LongTensor(y_train)
X_test = torch.FloatTensor(X_test)
y_test = torch.LongTensor(y_test)

## reference: https://stackoverflow.com/questions/67683406/difference-between-dataset-and-tensordataset-in-pytorch

# Create DataLoader for training
train_dataset = TensorDataset(X_train, y_train)
train_loader = DataLoader(train_dataset, batch_size = BATCH_SIZE, shuffle = True)

# Create DataLoader for testing
test_dataset = TensorDataset(X_test, y_test)
test_loader = DataLoader(test_dataset, batch_size = BATCH_SIZE, shuffle = False)
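
Standardizing "without leakage" means that the mean and standard deviation are estimated on the training split only and then reused unchanged on the test split. The NumPy sketch below reproduces the arithmetic behind `fit_transform` and `transform`; the toy arrays are made up for illustration.

```python
import numpy as np

train = np.array([[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]])
test = np.array([[3.0, 30.0], [7.0, 70.0]])

# "fit": statistics come from the training split only
mean = train.mean(axis=0)
std = train.std(axis=0)  # population std (ddof=0), as StandardScaler uses

# "transform": the same statistics are applied to both splits
train_scaled = (train - mean) / std
test_scaled = (test - mean) / std

print(train_scaled.mean(axis=0))  # approximately zero in every column
print(test_scaled)
```

The test split is not guaranteed to have zero mean or unit variance after scaling, which is expected: it is expressed in the units learned from the training data.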


## <font color = 'green'> <b> BASELINE (FF-DNN)

- A feed-forward deep neural network (FF-DNN) with ReLU activations is our baseline model
- It has the same number of hidden layers and hidden units as the RBF network
- It uses the standard layers from PyTorch
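
To make the size comparison concrete, the trainable parameters of the two models defined in this notebook can be counted by hand from their layer shapes. The helper name `linear_params` is illustrative.

```python
INPUT_DIM, OUTPUT_DIM = 2, 2
HIDDEN = 2 * INPUT_DIM  # both models use 4 hidden units

def linear_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

# Baseline: Linear(2 -> 4) followed by Linear(4 -> 2)
ffdnn_params = linear_params(INPUT_DIM, HIDDEN) + linear_params(HIDDEN, OUTPUT_DIM)

# RBF network: 4 centers of dimension 2 (no bias), then Linear(4 -> 2)
rbf_params = HIDDEN * INPUT_DIM + linear_params(HIDDEN, OUTPUT_DIM)

print(ffdnn_params, rbf_params)  # 22 18
```

The width `beta` of the RBF layer is a fixed hyperparameter rather than a trained parameter, so it is not counted.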


In [6]:
class MyLinearNetwork(nn.Module):
    def __init__(self):
        super(MyLinearNetwork, self).__init__()
        # Sequential container for defining the network architecture
        self.layers = nn.Sequential(
            nn.Linear(INPUT_DIM, 2 * INPUT_DIM),
            nn.ReLU(),
            nn.Linear(2 * INPUT_DIM, OUTPUT_DIM),
            nn.ReLU()
        )

        # Apply Xavier initialization to the linear layers
        for layer in self.layers:
            if isinstance(layer, nn.Linear):
                init.xavier_normal_(layer.weight.data)

    def forward(self, x):
        # Forward pass through the defined layers
        return self.layers(x)

model = MyLinearNetwork()
model.to(device)

MyLinearNetwork(
  (layers): Sequential(
    (0): Linear(in_features=2, out_features=4, bias=True)
    (1): ReLU()
    (2): Linear(in_features=4, out_features=2, bias=True)
    (3): ReLU()
  )
)
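
Note that the baseline applies a ReLU after its last linear layer, so the values passed to `nn.CrossEntropyLoss` are never negative. The standard-library sketch below works through one consequence: whenever both pre-activations are negative, the clamped logits are (0, 0) and the cross-entropy equals ln 2 ≈ 0.693 regardless of the label. Whether this contributes to the baseline's behaviour on this dataset is an assumption, not something this notebook measures; the pre-activation values are hypothetical.

```python
import math

def relu(z):
    return max(0.0, z)

def cross_entropy(logits, label):
    """Softmax cross-entropy for a single example."""
    log_sum_exp = math.log(sum(math.exp(v) for v in logits))
    return log_sum_exp - logits[label]

pre_activations = [-1.3, -0.4]               # hypothetical outputs of the last linear layer
logits = [relu(z) for z in pre_activations]  # both are clamped to 0.0

loss_class0 = cross_entropy(logits, 0)
loss_class1 = cross_entropy(logits, 1)
print(loss_class0, loss_class1, math.log(2))
```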

In [7]:
## TRAINING LOOP ##
# Define your loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr = LEARNING_RATE)

BEST_LOSS = np.inf

# Training loop with DataLoader

for epoch in range(NUM_OF_EPOCHS):

    model.train()  # set the model to training mode

    total_loss = 0

    for batch_X, batch_y in train_loader:

        optimizer.zero_grad()

        batch_X = batch_X.to(device)
        batch_y = batch_y.to(device)

        outputs = model(batch_X)

        loss = criterion(outputs, batch_y)

        total_loss = total_loss + loss.item()

        loss.backward()
        optimizer.step()

    # normalizing with number of batches
    total_loss = total_loss / len(train_loader)

    print(f"Epoch {epoch + 1}/{NUM_OF_EPOCHS}, Loss: {total_loss}")

    if total_loss < BEST_LOSS:
        print('Updating the checkpoint')
        BEST_LOSS = total_loss
        torch.save(model.state_dict(), '/content/linear_model.pt')


Epoch 1/50, Loss: 0.6986931726336479
Updating the checkpoint
Epoch 2/50, Loss: 0.6906988353729248
Updating the checkpoint
Epoch 3/50, Loss: 0.6897462752461433
Updating the checkpoint
Epoch 4/50, Loss: 0.6894357318282127
Updating the checkpoint
Epoch 5/50, Loss: 0.6891173392534256
Updating the checkpoint
Epoch 6/50, Loss: 0.6888076118826866
Updating the checkpoint
Epoch 7/50, Loss: 0.6886007885336876
Updating the checkpoint
Epoch 8/50, Loss: 0.6883324218988418
Updating the checkpoint
Epoch 9/50, Loss: 0.6880211118459701
Updating the checkpoint
Epoch 10/50, Loss: 0.6878049199581147
Updating the checkpoint
Epoch 11/50, Loss: 0.6874026858210563
Updating the checkpoint
Epoch 12/50, Loss: 0.6870764989256859
Updating the checkpoint
Epoch 13/50, Loss: 0.6864045936465263
Updating the checkpoint
Epoch 14/50, Loss: 0.6852859667539597
Updating the checkpoint
Epoch 15/50, Loss: 0.6830416114330292
Updating the checkpoint
Epoch 16/50, Loss: 0.6803556940555573
Updating the checkpoint
Epoch 17/50, Loss

In [8]:
## PERFORMANCE EVALUATION
# loading the best model
model = MyLinearNetwork()
model.load_state_dict(torch.load('/content/linear_model.pt'))
model.to(device)
print(f"\n- Train Accuracy: {calculate_accuracy(train_loader, model) * 100:.2f}%")
print(f"\n- Test Accuracy: {calculate_accuracy(test_loader, model) * 100:.2f}%")


- Train Accuracy: 67.01%

- Test Accuracy: 67.10%


## <font color = 'green'> <b> CANDIDATE (RBF NETWORK)

- RBF Network (https://en.wikipedia.org/wiki/Radial_basis_function_network#/media/File:Rbf-network.svg)

In [9]:
class MyRBFLayer(nn.Module):
    def __init__(self, in_features, out_features, beta=0.1):
        super(MyRBFLayer, self).__init__()
        # Centers have shape (number of RBF units, in_features); the layer uses
        # 2 * in_features units, initialized with Xavier initialization
        self.centers = nn.Parameter(torch.randn(2 * in_features, in_features), requires_grad=True)
        init.xavier_normal_(self.centers.data)  # Xavier initialization for centers
        self.linear_layer = nn.Linear(2 * in_features, out_features)
        init.xavier_normal_(self.linear_layer.weight.data)  # Xavier initialization for linear layer weights
        init.constant_(self.linear_layer.bias.data, 0)  # Set bias to zero
        self.beta = beta

    def forward(self, x):
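        # Squared Euclidean distance from every sample to every center:
        # (batch, 1, in_features) - (units, in_features) -> (batch, units)
        distances = torch.sum((x.unsqueeze(1) - self.centers) ** 2, dim=2)
        # Gaussian radial basis activation
        radial_symmetry = torch.exp(-self.beta * distances)
        # Linear combination of the RBF activations (also safe for a batch of size 1)
        linear_symmetry = self.linear_layer(radial_symmetry)
        return linear_symmetry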

class MyRBFNetwork(nn.Module):
    def __init__(self):
        super(MyRBFNetwork, self).__init__()
        self.hidden_layer1 = MyRBFLayer(INPUT_DIM, OUTPUT_DIM)

    def forward(self, x):
        hidden_output1 = self.hidden_layer1(x)
        return hidden_output1

model = MyRBFNetwork()
model.to(device)

MyRBFNetwork(
  (hidden_layer1): MyRBFLayer(
    (linear_layer): Linear(in_features=4, out_features=2, bias=True)
  )
)
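
The forward pass of the RBF layer relies on broadcasting: a batch of shape (B, D) is expanded to (B, 1, D) and subtracted from the centers of shape (K, D), giving (B, K, D). Summing the squares over the last axis yields one squared distance per sample and center. The sketch below checks this against an explicit loop; the tensor names and sizes are illustrative.

```python
import torch

torch.manual_seed(0)
B, K, D = 5, 4, 2  # batch size, number of centers, input dimension
beta = 0.1         # same default width as MyRBFLayer
x = torch.randn(B, D)
centers = torch.randn(K, D)

# Broadcasting: (B, 1, D) - (K, D) -> (B, K, D), then sum over D -> (B, K)
sq_dist = ((x.unsqueeze(1) - centers) ** 2).sum(dim=2)
activations = torch.exp(-beta * sq_dist)

# Reference computation with explicit loops
expected = torch.empty(B, K)
for i in range(B):
    for j in range(K):
        expected[i, j] = torch.exp(-beta * torch.sum((x[i] - centers[j]) ** 2))

print(activations.shape)  # torch.Size([5, 4])
```

Each activation lies in (0, 1] and reaches 1 only when a sample coincides with a center.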

In [10]:
# class MyLinearLayer(nn.Module):
#     def __init__(self, in_features, out_features):
#         super(MyLinearLayer, self).__init__()
#         # Initialize weights
#         self.weights = nn.Parameter(torch.randn(out_features, in_features))

#     def forward(self, x):
#         print(x.shape, self.weights.shape)
#         return torch.matmul(x, self.weights.t())


In [11]:
## TRAINING LOOP ##
# Define your loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr = LEARNING_RATE)

BEST_LOSS = np.inf

# Training loop with DataLoader

for epoch in range(NUM_OF_EPOCHS):

    model.train()  # set the model to training mode

    total_loss = 0

    for batch_X, batch_y in train_loader:

        optimizer.zero_grad()

        batch_X = batch_X.to(device)
        batch_y = batch_y.to(device)

        outputs = model(batch_X)

        loss = criterion(outputs, batch_y)

        total_loss = total_loss + loss.item()

        loss.backward()
        optimizer.step()

    # normalizing with number of batches
    total_loss = total_loss / len(train_loader)

    print(f"Epoch {epoch + 1}/{NUM_OF_EPOCHS}, Loss: {total_loss}")

    if total_loss < BEST_LOSS:
        print('Updating the checkpoint')
        BEST_LOSS = total_loss
        torch.save(model.state_dict(), '/content/rbf_model.pt')


Epoch 1/50, Loss: 0.6940061441659927
Updating the checkpoint
Epoch 2/50, Loss: 0.6866713054180146
Updating the checkpoint
Epoch 3/50, Loss: 0.6813645353913307
Updating the checkpoint
Epoch 4/50, Loss: 0.6756566749215126
Updating the checkpoint
Epoch 5/50, Loss: 0.6700254489779472
Updating the checkpoint
Epoch 6/50, Loss: 0.6641248695254326
Updating the checkpoint
Epoch 7/50, Loss: 0.6583865784406662
Updating the checkpoint
Epoch 8/50, Loss: 0.6531862029433251
Updating the checkpoint
Epoch 9/50, Loss: 0.6475702694058418
Updating the checkpoint
Epoch 10/50, Loss: 0.6418798626661301
Updating the checkpoint
Epoch 11/50, Loss: 0.6366446309089661
Updating the checkpoint
Epoch 12/50, Loss: 0.6310571937561035
Updating the checkpoint
Epoch 13/50, Loss: 0.6257547313570976
Updating the checkpoint
Epoch 14/50, Loss: 0.6205036726593971
Updating the checkpoint
Epoch 15/50, Loss: 0.6154106248021126
Updating the checkpoint
Epoch 16/50, Loss: 0.6101664971113205
Updating the checkpoint
Epoch 17/50, Loss

In [12]:
## PERFORMANCE EVALUATION
# loading the best model
model = MyRBFNetwork()
model.load_state_dict(torch.load('/content/rbf_model.pt'))
model.to(device)
print(f"\n- Train Accuracy: {calculate_accuracy(train_loader, model) * 100:.2f}%")
print(f"\n- Test Accuracy: {calculate_accuracy(test_loader, model) * 100:.2f}%")



- Train Accuracy: 97.14%

- Test Accuracy: 97.55%


# <font color = 'green'> <b> RESULTS

In [14]:
x = PrettyTable()
x.field_names = ["Sr. No.", "EXPERIMENT", "# of Intermediate Layers", "Training Accuracy (%)", "Test Accuracy (%)"]
x.add_row([1, 'Baseline (FF-DNN)', 2, 67.01, 67.10])
x.add_row([2, 'Candidate (RBF)', 2, 97.14, 97.55])
print(x)

+---------+-------------------+--------------------------+-----------------------+-------------------+
| Sr. No. |     EXPERIMENT    | # of Intermediate Layers | Training Accuracy (%) | Test Accuracy (%) |
+---------+-------------------+--------------------------+-----------------------+-------------------+
|    1    | Baseline (FF-DNN) |            2             |         67.01         |        67.1       |
|    2    |  Candidate (RBF)  |            2             |         97.14         |       97.55       |
+---------+-------------------+--------------------------+-----------------------+-------------------+


# <font color = 'green'> <b> CONCLUSION

- When the data have radial symmetry, the RBF network performs better than the FF-DNN (97.55% vs. 67.10% test accuracy in this experiment)
- The RBF network also generalizes better to the test data than the FF-DNN when the data have radial symmetry
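
One intuition for this result can be sketched with NumPy under simplifying assumptions: noise-free circles of radius 1.0 and 0.8, and a single Gaussian unit centered at the origin with `beta = 0.1`. None of these values are taken from the trained model. The activation depends only on the distance from the center, so a single threshold on it separates the two circles, whereas each raw coordinate overlaps between the classes.

```python
import numpy as np

angles = np.linspace(0.0, 2.0 * np.pi, 100, endpoint=False)
outer = np.column_stack([np.cos(angles), np.sin(angles)])  # radius 1.0
inner = 0.8 * outer                                        # radius 0.8

beta = 0.1
center = np.zeros(2)

def rbf(points):
    """Gaussian RBF activation of each point with respect to `center`."""
    return np.exp(-beta * np.sum((points - center) ** 2, axis=1))

outer_act = rbf(outer)  # all equal to exp(-0.1)
inner_act = rbf(inner)  # all equal to exp(-0.064)

# Every inner-circle activation exceeds every outer-circle activation
separable = bool(inner_act.min() > outer_act.max())
print(separable)  # True
```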