# Binary Classifier for Sonar Readings

In this document I follow the code from https://machinelearningmastery.com/building-a-binary-classification-model-in-pytorch/. The code uses PyTorch to define and train a neural network, and then evaluates the performance of the network using k-fold cross validation.

This dataset describes sonar chirp returns bouncing off different surfaces. There are 60 input variables, each giving the strength of the return at a different angle. The classification problem is to determine whether the chirp bounced off a rock or a metal cylinder.

First we need to import the libraries used throughout the notebook. We then use pandas to read in the dataset, setting X to the independent variables and y to the label (the dependent variable), which in this case is whether the object is a rock or a metal cylinder.

In [11]:
import copy
 
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import torch
import torch.nn as nn
import torch.optim as optim
import tqdm
from sklearn.metrics import roc_curve
from sklearn.model_selection import StratifiedKFold, train_test_split
from sklearn.preprocessing import LabelEncoder

In [12]:
# Read data
data = pd.read_csv("sonar.csv", header=None) #read file in
X = data.iloc[:, 0:60] #all independent variables
y = data.iloc[:, 60] #dependent variable or label

The label in the y variable needs to be converted from a string to a numeric label. As there are only 2 labels, 'M' and 'R', these can be converted to 0 and 1 respectively.

This can be done automatically using the LabelEncoder class from sklearn.

In [13]:
encoder = LabelEncoder()
encoder.fit(y)
y = encoder.transform(y)

To check this has been done correctly we can use encoder.classes_ to check the classes, and we can also print y to see the encoded labels. encoder.classes_ should show us 'M' and 'R'. When printing y, we should see 1's and 0's.

In [14]:
print(encoder.classes_)

['M' 'R']


In [15]:
print(y)

[1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]


As seen, this has worked and we have received our expected output. The 0 represents 'M' and the 1 represents 'R', because LabelEncoder assigns the integers in sorted class order.

Next we need to convert these into PyTorch tensors so that we can feed them into our PyTorch model.
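The direction of the mapping can be checked on a small example. This is only a sketch on made-up labels (`toy_labels`, `toy_encoder` and `toy_encoded` are hypothetical names, not part of the notebook), using the same LabelEncoder class:

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy labels, deliberately starting with 'R'
toy_labels = ["R", "M", "M", "R"]

toy_encoder = LabelEncoder()
toy_encoded = toy_encoder.fit_transform(toy_labels)

# classes_ is sorted, so index 0 is 'M' and index 1 is 'R'
print(toy_encoder.classes_)   # ['M' 'R']
print(toy_encoded)            # [1 0 0 1]

# inverse_transform maps the integers back to the original strings
print(toy_encoder.inverse_transform([0, 1]))
```

The order in which the labels appear in the data does not matter; only their sorted order does.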

In [16]:
X = torch.tensor(X.values, dtype=torch.float32)
y = torch.tensor(y, dtype=torch.float32).reshape(-1, 1)
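The reshape(-1, 1) call turns the flat label vector into a single column, so that the targets have the same shape as the output of a model whose last layer has one unit. A small sketch on made-up arrays (`toy_features`, `toy_targets`, `X_t` and `y_t` are hypothetical names) shows the resulting shapes:

```python
import numpy as np
import torch

# Hypothetical toy data: 3 samples with 2 features each
toy_features = np.array([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]])
toy_targets = np.array([1, 0, 1])

X_t = torch.tensor(toy_features, dtype=torch.float32)
# reshape(-1, 1): one row per sample, one column
y_t = torch.tensor(toy_targets, dtype=torch.float32).reshape(-1, 1)

print(X_t.shape)  # torch.Size([3, 2])
print(y_t.shape)  # torch.Size([3, 1])
```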

We are going to create a 3-layer neural network model with only 1 hidden layer. The model takes 60 input features and predicts one binary variable. As we want a wide model with one hidden layer, we give the hidden layer 180 neurons, which is three times the number of input features.

In [17]:
#creating a new class which creates a model with one hidden layer
class Wide(nn.Module):
    #initialiser
    def __init__(self):
        super().__init__()
        #creates linear transformation from input layer to hidden layer
        self.hidden = nn.Linear(60, 180)
        self.relu = nn.ReLU()
        #creates linear transformation from hidden layer to output layer
        self.output = nn.Linear(180, 1)
        #applies sigmoid function
        self.sigmoid = nn.Sigmoid()

    #forward pass: moves the data from the input layer through to the output layer, takes the data as a parameter
    def forward(self, x):
        x = self.relu(self.hidden(x))
        x = self.sigmoid(self.output(x))
        return x
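To see what the forward pass produces, the same architecture can be run on random input. This is a sketch only: it rebuilds the layers with nn.Sequential so that the snippet is self-contained, and `toy_wide`, `toy_batch` and `toy_probs` are hypothetical names.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Same layer sizes as the wide model: 60 -> 180 -> 1, ending in a sigmoid
toy_wide = nn.Sequential(
    nn.Linear(60, 180),
    nn.ReLU(),
    nn.Linear(180, 1),
    nn.Sigmoid(),
)

# Hypothetical batch of 4 random samples with 60 features each
toy_batch = torch.rand(4, 60)

with torch.no_grad():
    toy_probs = toy_wide(toy_batch)

print(toy_probs.shape)    # torch.Size([4, 1])
# The sigmoid keeps every output between 0 and 1;
# rounding turns it into a 0/1 class prediction
print(toy_probs.round())
```

The weights here are untrained, so the predicted classes carry no meaning; the point is only the output shape and range.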

We are also going to create a second model which uses 3 hidden layers. This is called a deeper model as it has more than one hidden layer. Each of the 3 hidden layers has 60 neurons.

In [18]:
class Deep(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(60, 60)
        self.act1 = nn.ReLU()
        self.layer2 = nn.Linear(60, 60)
        self.act2 = nn.ReLU()
        self.layer3 = nn.Linear(60, 60)
        self.act3 = nn.ReLU()
        self.output = nn.Linear(60, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        x = self.act1(self.layer1(x))
        x = self.act2(self.layer2(x))
        x = self.act3(self.layer3(x))
        x = self.sigmoid(self.output(x))
        return x

We can confirm that the 2 models have a similar number of parameters by running the following code. When this code is run we should get 2 numbers which are close to each other.

In [19]:
# Compare model sizes
model1 = Wide()
model2 = Deep()
print(sum([x.reshape(-1).shape[0] for x in model1.parameters()]))  # 11161
print(sum([x.reshape(-1).shape[0] for x in model2.parameters()]))  # 11041

11161
11041


As seen in the above cell, the two models have a similar number of parameters.
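The two totals can also be worked out by hand. A linear layer with n_in inputs and n_out outputs has n_in * n_out weights plus n_out biases. The helper `linear_params` below is a hypothetical name used only for this check:

```python
def linear_params(n_in, n_out):
    """Parameters in one linear layer: a weight matrix plus one bias per output."""
    return n_in * n_out + n_out

# Wide: 60 -> 180 -> 1
wide_total = linear_params(60, 180) + linear_params(180, 1)

# Deep: 60 -> 60 -> 60 -> 60 -> 1
deep_total = 3 * linear_params(60, 60) + linear_params(60, 1)

print(wide_total)  # 11161
print(deep_total)  # 11041
```

Both values match the counts printed from the PyTorch models.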

Now that both models have been created we need to train them and evaluate them on data held out from training. An evaluation method that we can use is called k-fold cross validation. This splits the dataset into k portions (folds) and takes one portion as the test set while the other k-1 portions form the training set. This is repeated k times so that each portion is used as the test set exactly once. The training set is roughly the same size each time, but is made up of different portions.

Scikit-learn's StratifiedKFold performs stratified k-fold, which means that when separating the data into portions it keeps the proportion of each class in every portion close to that of the full dataset.
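The splitting behaviour can be checked on a toy dataset. The sketch below uses made-up data (`X_toy`, `y_toy`, `kfold_toy` and `fold_summary` are hypothetical names) with 12 samples of class 0 and 8 of class 1, and 4 folds instead of 5 so that the numbers divide evenly:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Hypothetical toy data: 20 samples, 12 of class 0 and 8 of class 1
X_toy = np.arange(40, dtype=float).reshape(20, 2)
y_toy = np.array([0] * 12 + [1] * 8)

kfold_toy = StratifiedKFold(n_splits=4, shuffle=True, random_state=0)

fold_summary = []
for train_idx, test_idx in kfold_toy.split(X_toy, y_toy):
    n_pos = int(y_toy[test_idx].sum())  # class-1 samples in this test fold
    fold_summary.append((len(train_idx), len(test_idx), n_pos))
    print("train=%d test=%d class-1 in test=%d" % (len(train_idx), len(test_idx), n_pos))
```

Every fold trains on 15 samples and tests on 5, and each test fold holds 3 samples of class 0 and 2 of class 1, the same 60/40 ratio as the full toy dataset.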

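The classes needed from sklearn were already imported in the first cell. The cell below defines the test harness. Note that it calls a helper, model_train, which is expected to train a model and return its accuracy on the test portion. This helper has to be defined before the cell is run, otherwise the cell fails with the NameError shown below.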
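The harness in the next cell calls model_train, which this notebook never defines. Below is a minimal sketch of such a helper, not the original one. It assumes binary cross entropy as the loss, the Adam optimizer, and mini-batch training; the values of `n_epochs`, `batch_size` and `lr` are placeholder assumptions, and the best weights are kept with copy.deepcopy. It returns the best accuracy seen on the held-out data.

```python
import copy

import torch
import torch.nn as nn
import torch.optim as optim


def model_train(model, X_train, y_train, X_val, y_val,
                n_epochs=50, batch_size=10, lr=1e-4):
    """Train `model` and return its best accuracy on (X_val, y_val).

    Sketch only: loss, optimizer and hyperparameters are assumptions.
    """
    loss_fn = nn.BCELoss()  # model output is already a probability
    optimizer = optim.Adam(model.parameters(), lr=lr)

    best_acc = -1.0
    best_weights = None

    for epoch in range(n_epochs):
        model.train()
        for start in range(0, len(X_train), batch_size):
            X_batch = X_train[start:start + batch_size]
            y_batch = y_train[start:start + batch_size]
            loss = loss_fn(model(X_batch), y_batch)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

        # accuracy on the held-out data after each epoch
        model.eval()
        with torch.no_grad():
            y_pred = model(X_val)
        acc = float((y_pred.round() == y_val).float().mean())
        if acc > best_acc:
            best_acc = acc
            best_weights = copy.deepcopy(model.state_dict())

    # restore the weights that gave the best accuracy
    if best_weights is not None:
        model.load_state_dict(best_weights)
    return best_acc
```

Because the best epoch is chosen using the same held-out data that the accuracy is reported on, the returned score is somewhat optimistic; a separate validation split would avoid that.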

In [20]:
# define 5-fold cross validation test harness
kfold = StratifiedKFold(n_splits=5, shuffle=True)
cv_scores = []
for train, test in kfold.split(X, y):
    # create model, train, and get accuracy
    model = Wide()
    acc = model_train(model, X[train], y[train], X[test], y[test])
    print("Accuracy (wide): %.2f" % acc)
    cv_scores.append(acc)

# evaluate the model
acc = np.mean(cv_scores)
std = np.std(cv_scores)
print("Model accuracy: %.2f%% (+/- %.2f%%)" % (acc*100, std*100))

NameError: name 'model_train' is not defined