
## Importing Libraries

In [None]:
import pandas as pd
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split


## Loading the Dataset

In [None]:
# loading the Zoo dataset into a pandas DataFrame called zoo_data
column_names = ['animal_name', 'hair', 'feathers', 'eggs', 'milk', 'airborne', 'aquatic',
                'predator', 'toothed','backbone', 'breathes', 'venomous', 'fins', 'legs',
                'tail', 'domestic', 'catsize', 'class_type']
zoo_data = pd.read_csv('zoo.csv',names=column_names)

## Data Preprocessing

##### Removing the first row of the dataset, which holds the CSV's own header line read in as data

In [None]:
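# Drop the first row (the file's header line, read as data because names= was passed),
# copy the result so later column assignments act on an independent DataFrame, then display it
zoo_data = zoo_data.iloc[1:, :].copy()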
print(zoo_data.head())

  animal_name hair feathers eggs milk airborne aquatic predator toothed  \
1    aardvark    1        0    0    1        0       0        1       1   
2    antelope    1        0    0    1        0       0        0       1   
3        bass    0        0    1    0        0       1        1       1   
4        bear    1        0    0    1        0       0        1       1   
5        boar    1        0    0    1        0       0        1       1   

  backbone breathes venomous fins legs tail domestic catsize class_type  
1        1        1        0    0    4    0        0       1          1  
2        1        1        0    0    4    1        0       1          1  
3        1        0        0    1    0    1        0       0          4  
4        1        1        0    0    4    0        0       1          1  
5        1        1        0    0    4    1        0       1          1  


##### Converting the 'legs' column to numeric values because it's essential for mathematical operations later

In [None]:
zoo_data["legs"] = pd.to_numeric(zoo_data["legs"])
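The need for this conversion can be illustrated with a minimal sketch. It assumes a hypothetical three-line CSV (not the real `zoo.csv`) whose own header line is read as data because `names=` is passed without `header=0`; under that assumption every column holds text, and comparisons against integers never match.

```python
import io
import pandas as pd

# Hypothetical miniature CSV standing in for zoo.csv (assumption: the file has a header line)
csv_text = "animal_name,legs,class_type\naardvark,4,1\nbass,0,4\n"

# names= without header=0 makes pandas treat the header line as an ordinary data row
df = pd.read_csv(io.StringIO(csv_text), names=["animal_name", "legs", "class_type"])
df = df.iloc[1:].copy()  # drop the header row that was read as data

# The remaining values are still text such as "4", not numbers
legs_is_numeric_before = pd.api.types.is_numeric_dtype(df["legs"])
matches_before = int((df["class_type"] == 1).sum())  # text never equals the integer 1

# Convert the columns that are needed as numbers
df["legs"] = pd.to_numeric(df["legs"])
df["class_type"] = pd.to_numeric(df["class_type"])
legs_is_numeric_after = pd.api.types.is_numeric_dtype(df["legs"])
matches_after = int((df["class_type"] == 1).sum())

print(legs_is_numeric_before, matches_before)
print(legs_is_numeric_after, matches_after)
```

The same reasoning applies to any other column that is later compared with or treated as a number.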

##### Creating a new binary feature indicating whether an animal has legs or not. This feature will be used to apply custom logic in the neural network later.

In [None]:
# Binarize the 'legs' feature: 1 if legs > 0, else 0
zoo_data['binary_legs'] = (zoo_data['legs'] > 0).astype(int)
print(zoo_data.head())

  animal_name hair feathers eggs milk airborne aquatic predator toothed  \
1    aardvark    1        0    0    1        0       0        1       1   
2    antelope    1        0    0    1        0       0        0       1   
3        bass    0        0    1    0        0       1        1       1   
4        bear    1        0    0    1        0       0        1       1   
5        boar    1        0    0    1        0       0        1       1   

  backbone breathes venomous fins  legs tail domestic catsize class_type  \
1        1        1        0    0     4    0        0       1          1   
2        1        1        0    0     4    1        0       1          1   
3        1        0        0    1     0    1        0       0          4   
4        1        1        0    0     4    0        0       1          1   
5        1        1        0    0     4    1        0       1          1   

   binary_legs  
1            1  
2            1  
3            0  
4            1  
5            1  

This loop creates seven binary indicator columns, one per class (a one-hot encoding of `class_type`): `class_i` is 1 if the animal belongs to class i and 0 otherwise.

In [None]:
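# Convert the target to a binary classification for each class.
# class_type was read as text (like 'legs'), so cast it first; text never equals an integer
class_type = pd.to_numeric(zoo_data['class_type'])
for i in range(1, 8):
    zoo_data[f'class_{i}'] = (class_type == i).astype(int)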

print(zoo_data.head())

  animal_name hair feathers eggs milk airborne aquatic predator toothed  \
1    aardvark    1        0    0    1        0       0        1       1   
2    antelope    1        0    0    1        0       0        0       1   
3        bass    0        0    1    0        0       1        1       1   
4        bear    1        0    0    1        0       0        1       1   
5        boar    1        0    0    1        0       0        1       1   

  backbone  ... catsize class_type binary_legs  class_1 class_2 class_3  \
1        1  ...       1          1           1        0       0       0   
2        1  ...       1          1           1        0       0       0   
3        1  ...       0          4           0        0       0       0   
4        1  ...       1          1           1        0       0       0   
5        1  ...       1          1           1        0       0       0   

  class_4 class_5  class_6  class_7  
1       0       0        0        0  
2       0       0     
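As a rough illustration of the one-hot encoding described above, the sketch below builds the indicator columns for the five class labels visible in the preview (1, 1, 4, 1, 1) in two ways: with the explicit loop, and with `pd.get_dummies`. The variable names are hypothetical, and the `reindex` step is needed only because `get_dummies` creates columns solely for classes that actually occur.

```python
import pandas as pd

# The class labels of the first five animals, already numeric
class_type = pd.Series([1, 1, 4, 1, 1], name="class_type")
all_columns = [f"class_{i}" for i in range(1, 8)]

# Loop-style indicators: one 0/1 column per class
loop_version = pd.DataFrame({f"class_{i}": (class_type == i).astype(int) for i in range(1, 8)})

# get_dummies only produces class_1 and class_4 here, so add the missing classes as zeros
dummies = pd.get_dummies(class_type, prefix="class").astype(int)
dummies = dummies.reindex(columns=all_columns, fill_value=0)

print(loop_version)
print((loop_version.values == dummies.values).all())
```

Each row contains exactly one 1, in the column of the animal's class.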

## Feature Selection

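Separating the dataset into features (X) and targets (y). The features exclude 'animal_name', 'class_type', the raw 'legs' count, and the seven binary class columns, which form y instead. 'binary_legs' stays as the last column of X, which the network's custom logic relies on.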

In [None]:
# Split the data into features and target
X = zoo_data.drop(['animal_name', 'class_type', 'legs'] + [f'class_{i}' for i in range(1, 8)], axis=1)
y = zoo_data[[f'class_{i}' for i in range(1, 8)]]

print(X.head())

  hair feathers eggs milk airborne aquatic predator toothed backbone breathes  \
1    1        0    0    1        0       0        1       1        1        1   
2    1        0    0    1        0       0        0       1        1        1   
3    0        0    1    0        0       1        1       1        1        0   
4    1        0    0    1        0       0        1       1        1        1   
5    1        0    0    1        0       0        1       1        1        1   

  venomous fins tail domestic catsize  binary_legs  
1        0    0    0        0       1            1  
2        0    0    1        0       1            1  
3        0    1    1        0       0            0  
4        0    0    0        0       1            1  
5        0    0    1        0       1            1  


In [None]:
print(y.head())

   class_1  class_2  class_3  class_4  class_5  class_6  class_7
1        0        0        0        0        0        0        0
2        0        0        0        0        0        0        0
3        0        0        0        0        0        0        0
4        0        0        0        0        0        0        0
5        0        0        0        0        0        0        0


## Data Splitting and Normalization

In [None]:
# Split the dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

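Normalizing the features to have a mean of 0 and a standard deviation of 1, which can help the neural network learn more effectively. The scaler is fitted on the training set only and then applied to the test set, so no test-set statistics leak into training.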

In [None]:
# Normalize the features
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
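One consequence of standardization is that binary features stop being 0/1. The sketch below uses a hypothetical four-row column (three animals with legs, one without) to show what happens to such a feature; the numbers are illustrative and are not taken from the Zoo data.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical binary column: three animals with legs, one without
binary_legs = np.array([[1.0], [1.0], [1.0], [0.0]])

scaled = StandardScaler().fit_transform(binary_legs)
print(scaled.ravel())

# After scaling, the former 1s are positive but smaller than 1, and the former 0 is negative
below_one = (scaled < 1).ravel()
below_zero = (scaled < 0).ravel()
print(below_one, below_zero)
```

Any later test on such a feature therefore has to compare against 0 (the sign) rather than against the original values 0 and 1.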

## Conversion to PyTorch Tensors

Converting the NumPy arrays to PyTorch tensors, which are required for training the neural network.

In [None]:
# Convert the data to PyTorch tensors
X_train_tensor = torch.tensor(X_train, dtype=torch.float32)
X_test_tensor = torch.tensor(X_test, dtype=torch.float32)
y_train_tensor = torch.tensor(y_train.values, dtype=torch.float32)
y_test_tensor = torch.tensor(y_test.values, dtype=torch.float32)

## DataLoader Creation

Creating a TensorDataset and DataLoader which batches the data and provides it to the neural network during training.

In [None]:
# Create TensorDataset and DataLoader
train_dataset = TensorDataset(X_train_tensor, y_train_tensor)
train_loader = DataLoader(dataset=train_dataset, batch_size=64, shuffle=True)
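To make the batching concrete, here is a small sketch with random stand-in tensors. It assumes 70 training rows with 16 features and 7 target columns (roughly what a 70% split of a 101-row dataset would give); these shapes are assumptions for illustration, not values read from the notebook.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in data (assumption: 70 rows, 16 features, 7 target columns)
features = torch.randn(70, 16)
targets = torch.zeros(70, 7)

loader = DataLoader(TensorDataset(features, targets), batch_size=64, shuffle=True)

# Each pass over the loader yields one full batch and one smaller remainder batch
batch_sizes = [inputs.shape[0] for inputs, _ in loader]
print(batch_sizes)  # [64, 6]
```

With a batch size of 64, such a training set is consumed in two batches per epoch, and shuffling changes which rows land in each batch but not the batch sizes.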

## Neural Network Architecture

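Defining a custom Multilayer Perceptron (MLP) with two hidden layers, ReLU activations, and an output layer. Custom logic based on the 'binary_legs' feature adds a constant of 10 to the last two units of the first hidden layer for animals without legs, with the aim of biasing the network towards classes 4 and 7. The boost acts on hidden activations rather than on the class outputs themselves, so its effect on those two classes still depends on the weights learned in the later layers.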

In [None]:
# Define the MLP architecture with the custom logic gate
class CustomMLP(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes):
        super().__init__()
        self.hidden1 = nn.Linear(input_size, hidden_size)
        self.hidden2 = nn.Linear(hidden_size, hidden_size)
        self.output = nn.Linear(hidden_size, num_classes)
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.hidden1(x)
        out = self.relu(out)

        # Custom logic intended to bias towards classes 4 (Fish) and 7 (Invertebrates).
        # 'binary_legs' is the last input column. After StandardScaler it is no longer 0/1:
        # legless animals get a negative value (assuming both values occur in the training
        # set), so the test is against 0 rather than 1.
        no_legs = (x[:, -1] < 0).float()
        # Add a constant to the last two hidden units for legless animals.
        # Nothing ties these units to classes 4 and 7; the later layers must learn that link.
        custom_logic_mask = torch.zeros_like(out)
        custom_logic_mask[:, -2] = no_legs * 10.0  # Intended bias for class 4
        custom_logic_mask[:, -1] = no_legs * 10.0  # Intended bias for class 7
        out = out + custom_logic_mask

        out = self.hidden2(out)
        out = self.relu(out)
        out = self.output(out)
        return out
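The masking step can be looked at in isolation. The sketch below is a simplified stand-in, not the model itself: it uses an all-zero tensor in place of real hidden activations and three hypothetical standardized 'binary_legs' values, where a negative value is assumed to mean "no legs".

```python
import torch

# Stand-in hidden activations for 3 samples and 10 hidden units
hidden = torch.zeros(3, 10)
# Hypothetical standardized 'binary_legs' values: negative means the animal has no legs
legs = torch.tensor([0.58, -1.73, 0.58])

no_legs = (legs < 0).float()          # 1.0 for legless samples, 0.0 otherwise
mask = torch.zeros_like(hidden)
mask[:, -2] = no_legs * 10.0          # boost the second-to-last hidden unit
mask[:, -1] = no_legs * 10.0          # boost the last hidden unit
boosted = hidden + mask

print(boosted)
```

Only the second sample receives the boost, and only in its last two hidden units; every other activation is left unchanged.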

## Training Preparation

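Instantiating the neural network model, defining the loss function (CrossEntropyLoss for multi-class classification), and setting up the optimizer (Adam) with a learning rate of 0.001. The targets are float one-hot rows, which `CrossEntropyLoss` interprets as class probabilities (supported since PyTorch 1.10).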

In [None]:
# Initialize the model, loss criterion, and optimizer
model = CustomMLP(input_size=X_train.shape[1], hidden_size=10, num_classes=7)
loss_criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
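How `CrossEntropyLoss` treats different target formats can be checked with a small sketch. The logits and targets below are made-up values for a three-class problem, and passing float rows as targets assumes PyTorch 1.10 or newer.

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
logits = torch.tensor([[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]])  # made-up model outputs

class_indices = torch.tensor([0, 2])                          # targets as class indices
one_hot = torch.tensor([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])    # the same targets, one-hot
all_zero = torch.zeros(2, 3)                                  # rows with no positive label

loss_indices = criterion(logits, class_indices)
loss_one_hot = criterion(logits, one_hot)
loss_all_zero = criterion(logits, all_zero)

print(loss_indices.item(), loss_one_hot.item(), loss_all_zero.item())
```

Class indices and one-hot rows give the same positive loss, whereas all-zero target rows give a loss of zero regardless of the logits, so a loss that is exactly zero from the first batch is a sign that the targets carry no labels.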

## Training Loop

Defining the training loop that iterates over the dataset multiple times (epochs), feeds the data through the model, calculates the loss, performs backpropagation, and updates the model's weights.

In [None]:
# Training Loop
def train(model, data_loader, loss_criterion, optimizer, num_epochs):
    for epoch in range(num_epochs):
        for inputs, targets in data_loader:
            optimizer.zero_grad()
            outputs = model(inputs)
            loss = loss_criterion(outputs, targets)
            loss.backward()
            optimizer.step()
            if epoch % 10 == 0:
                print(f'Epoch {epoch+1}/{num_epochs}, Loss: {loss.item()}')
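Because the loss is printed once per batch, it can be convenient to track one figure per epoch instead. The following is a sketch of that idea, not the notebook's own loop: `train_with_epoch_loss` is a hypothetical variant of `train`, and the linear model and random data exist only to make the example self-contained.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

def train_with_epoch_loss(model, data_loader, loss_criterion, optimizer, num_epochs):
    """Hypothetical variant of train() that returns the mean loss of each epoch."""
    history = []
    for epoch in range(num_epochs):
        model.train()
        running_loss, seen = 0.0, 0
        for inputs, targets in data_loader:
            optimizer.zero_grad()
            loss = loss_criterion(model(inputs), targets)
            loss.backward()
            optimizer.step()
            # Weight each batch loss by its size so small final batches do not dominate
            running_loss += loss.item() * inputs.size(0)
            seen += inputs.size(0)
        history.append(running_loss / seen)
    return history

# Toy setup (assumption): random features and random one-hot labels
torch.manual_seed(0)
toy_X = torch.randn(70, 16)
toy_y = nn.functional.one_hot(torch.randint(0, 7, (70,)), num_classes=7).float()
toy_model = nn.Linear(16, 7)
toy_loader = DataLoader(TensorDataset(toy_X, toy_y), batch_size=64, shuffle=True)
toy_optimizer = optim.Adam(toy_model.parameters(), lr=0.01)

history = train_with_epoch_loss(toy_model, toy_loader, nn.CrossEntropyLoss(), toy_optimizer, 5)
print(history)
```

The returned list has one entry per epoch, which makes it easier to plot or compare runs than per-batch printouts.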


## Model Training

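Starting the training process with 100 epochs. In the recorded log below, each logged epoch appears twice because the loss is printed once per batch and the training set spans two batches. Every batch reports a loss of -0.0, which is the value `CrossEntropyLoss` returns when every target row is all zeros (as in the `y.head()` output above); with one-hot targets that contain a 1 in each row, the loss is strictly positive.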

In [None]:
# Training the model
num_epochs = 100
train(model, train_loader, loss_criterion, optimizer, num_epochs)

Epoch 1/100, Loss: -0.0
Epoch 1/100, Loss: -0.0
Epoch 11/100, Loss: -0.0
Epoch 11/100, Loss: -0.0
Epoch 21/100, Loss: -0.0
Epoch 21/100, Loss: -0.0
Epoch 31/100, Loss: -0.0
Epoch 31/100, Loss: -0.0
Epoch 41/100, Loss: -0.0
Epoch 41/100, Loss: -0.0
Epoch 51/100, Loss: -0.0
Epoch 51/100, Loss: -0.0
Epoch 61/100, Loss: -0.0
Epoch 61/100, Loss: -0.0
Epoch 71/100, Loss: -0.0
Epoch 71/100, Loss: -0.0
Epoch 81/100, Loss: -0.0
Epoch 81/100, Loss: -0.0
Epoch 91/100, Loss: -0.0
Epoch 91/100, Loss: -0.0


## Prediction Function

Defining a function to make predictions with the trained model on new inputs.

In [None]:
# Prediction Function
def predict(model, inputs):
    model.eval()
    with torch.no_grad():
        outputs = model(inputs)
    predicted = outputs.argmax(dim=1)  # index of the largest output per row
    return predicted

## Test Predictions

Generating predictions on the test set and adjusting the predicted class indices (0-6) to match the original 1-7 class range.

In [None]:
# Predict on the test set
test_predictions = predict(model, X_test_tensor)
# Assuming the original classes were encoded from 1 to 7, we need to adjust predictions by 1
test_predictions += 1
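Once the predictions are on the 1-7 scale, they can be compared with the true labels. The sketch below shows one way to compute accuracy from one-hot targets; the tensors are small hypothetical stand-ins for `y_test_tensor` and `test_predictions`, using three classes to keep the example short.

```python
import torch

# Hypothetical stand-ins: one-hot test targets and predictions already shifted to 1-based labels
y_test_one_hot = torch.tensor([[1.0, 0.0, 0.0],
                               [0.0, 0.0, 1.0],
                               [0.0, 1.0, 0.0]])
predicted_labels = torch.tensor([1, 3, 3])

# Recover the 1-based class label from each one-hot row
true_labels = y_test_one_hot.argmax(dim=1) + 1

# Fraction of rows where the prediction matches the true label
accuracy = (predicted_labels == true_labels).float().mean().item()
print(true_labels.tolist(), accuracy)
```

Here two of the three predictions match, giving an accuracy of about 0.67.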

## Conversion to DataFrame

Finally, the predictions are converted into a pandas DataFrame for easier analysis and visualization.

In [None]:
# Convert predictions back to a pandas DataFrame (optional); pandas is already imported as pd
test_predictions_df = pd.DataFrame(test_predictions.numpy(), columns=['Predicted Class'])
test_predictions_df.head()

   Predicted Class
0                7
1                7
2                7
3                7
4                7
