## Implement Neural Network using PyTorch
* Load Data
* Define PyTorch Model
* Define Loss Function and Optimizers
* Run a Training Loop
* Evaluate the Model
* Make Predictions

In [2]:
!pip install torch

In [3]:
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
from tqdm import tqdm

## Load Data
This example uses the Pima Indians onset of diabetes dataset, which has been a standard machine learning dataset since the early days of the field. It describes patient medical record data for Pima Indians and whether they had an onset of diabetes within five years.

**Input Variables**
* Number of times pregnant
* Plasma glucose concentration at 2 hours in an oral glucose tolerance test
* Diastolic blood pressure (mm Hg)
* Triceps skin fold thickness (mm)
* 2-hour serum insulin (μIU/ml)
* Body mass index (weight in kg/(height in m)²)
* Diabetes pedigree function
* Age (years)

**Output Variable**

* Class label (0 or 1)

In [4]:

# load the dataset, split into input (X) and output (y) variables
dataset = np.loadtxt('./data/pima-indians-diabetes.csv', delimiter=',')
X = dataset[:,0:8]
y = dataset[:,8]

In [5]:
X = torch.tensor(X, dtype=torch.float32)
y = torch.tensor(y, dtype=torch.float32).reshape(-1, 1)
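
The `reshape(-1, 1)` call turns the flat label vector into a single column, so that the labels have the same shape as the one-value-per-row output the model produces later. A minimal sketch of the effect, using a small made-up label array (`labels` is a hypothetical stand-in, not the real dataset):

```python
import numpy as np
import torch

# Hypothetical stand-in for the label column: a 1-D array of 0/1 values.
labels = np.array([1.0, 0.0, 1.0, 0.0])

flat = torch.tensor(labels, dtype=torch.float32)  # one dimension
column = flat.reshape(-1, 1)                      # n rows, 1 column

print(flat.shape)    # torch.Size([4])
print(column.shape)  # torch.Size([4, 1])
```

The `-1` asks PyTorch to infer the number of rows from the data, so the same line works for any dataset size.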

## Define Model
A model can be defined as a sequence of layers. One option is a Sequential model with the layers listed out; this notebook instead defines the same stack of layers as a class that inherits from nn.Module. Either way, the first thing to get right is that the first layer has the correct number of input features. In this example, the input dimension is 8, for the eight input variables as one vector.

* The model expects rows of data with 8 variables (the first argument at the first layer set to 8)
* The first hidden layer has 12 neurons, followed by a ReLU activation function
* The second hidden layer has 8 neurons, followed by another ReLU activation function
* The output layer has one neuron, followed by a sigmoid activation function

In this approach, the class defines all of its layers in the constructor, because every component must be prepared when the model is created, before any input is provided. Note that you also need to call the parent class's constructor (the line super().__init__()) to bootstrap your model. You also need to define a forward() function that specifies how the output tensor is produced from an input tensor x.

In [6]:
class PimaClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden1 = nn.Linear(8, 12)
        self.act1 = nn.ReLU()
        self.hidden2 = nn.Linear(12, 8)
        self.act2 = nn.ReLU()
        self.output = nn.Linear(8, 1)
        self.act_output = nn.Sigmoid()
 
    def forward(self, x):
        x = self.hidden1(x)
        x = self.act1(x)
        x = self.act2(self.hidden2(x))
        x = self.act_output(self.output(x))
        return x
 
model = PimaClassifier()
print(model)

PimaClassifier(
  (hidden1): Linear(in_features=8, out_features=12, bias=True)
  (act1): ReLU()
  (hidden2): Linear(in_features=12, out_features=8, bias=True)
  (act2): ReLU()
  (output): Linear(in_features=8, out_features=1, bias=True)
  (act_output): Sigmoid()
)
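
For comparison, the same stack of layers can be sketched with `nn.Sequential`, the other approach mentioned in the description. This is only an illustrative equivalent; the names `sequential_model` and `sample` are assumptions introduced here, and the random input stands in for real patient rows.

```python
import torch
import torch.nn as nn

# Same layer sizes as PimaClassifier: 8 -> 12 -> 8 -> 1
sequential_model = nn.Sequential(
    nn.Linear(8, 12),
    nn.ReLU(),
    nn.Linear(12, 8),
    nn.ReLU(),
    nn.Linear(8, 1),
    nn.Sigmoid(),
)

# Hypothetical batch of 5 rows with 8 features each
sample = torch.rand(5, 8)
output = sequential_model(sample)
print(output.shape)  # torch.Size([5, 1])
```

The class form takes more lines but lets you name each layer and customize `forward()`, which is why the printed model above shows names such as `hidden1` and `act_output`.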


## Define Loss Function and Optimizer

We need the neural network model to predict results as close to the actual values as possible. Training the model means finding the best set of weights to map inputs to outputs in your dataset. The loss (or cost) function is the metric that measures the difference between the predictions and the actual values. This problem uses binary cross entropy because it is a binary classification problem.

After the loss function, we need an optimizer. The optimizer is the algorithm that adjusts the model weights and thereby minimizes the loss function. This example uses Adam with a learning rate of 0.001.

In [5]:
loss_fn = nn.BCELoss()  # binary cross entropy
optimizer = optim.Adam(model.parameters(), lr=0.001)
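
To make the loss concrete, here is a small sketch that computes binary cross entropy by hand and compares it with `nn.BCELoss`. The probabilities and targets are made-up values chosen for illustration, and the names `probs`, `targets`, `manual` and `builtin` are assumptions introduced here.

```python
import torch
import torch.nn as nn

# Made-up predicted probabilities and true labels (illustration only)
probs = torch.tensor([[0.9], [0.2], [0.6]])
targets = torch.tensor([[1.0], [0.0], [1.0]])

# Binary cross entropy: -mean(y * log(p) + (1 - y) * log(1 - p))
manual = -(targets * torch.log(probs)
           + (1 - targets) * torch.log(1 - probs)).mean()

# nn.BCELoss averages over all elements by default
builtin = nn.BCELoss()(probs, targets)

print(manual.item(), builtin.item())
```

Both values should agree up to floating-point error. The loss shrinks as the predicted probability moves toward the true label, which is what the optimizer exploits when it updates the weights.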

## Training the model

Training a neural network model usually proceeds in epochs and batches. These terms describe how data is passed to the model:

* **Epoch**: Passes the entire training dataset to the model once
* **Batch**: One or more samples passed to the model, from which the gradient descent algorithm will be executed for one iteration
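
As a rough illustration of how the two terms relate, the sketch below counts the batches in one epoch when slicing a tensor in steps of `batch_size`. The row count of 768 is an assumption used for the example (the notebook does not print the dataset size), and `data` is a placeholder tensor of zeros.

```python
import math
import torch

n_samples = 768   # assumed number of rows, for illustration
batch_size = 10
data = torch.zeros(n_samples, 8)  # placeholder for the feature tensor

# Slice the data exactly the way a simple training loop would
batch_sizes = [len(data[i:i + batch_size])
               for i in range(0, len(data), batch_size)]

print(len(batch_sizes))                   # 77 batches per epoch
print(math.ceil(n_samples / batch_size))  # 77, the same count by formula
print(batch_sizes[-1])                    # 8, the last batch is smaller
```

Each batch triggers one weight update, so under these assumptions one epoch performs 77 updates.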

In [6]:
n_epochs = 100
batch_size = 10
for epoch in tqdm(range(n_epochs)):
    for i in range(0, len(X), batch_size):
        Xbatch = X[i:i+batch_size]
        y_pred = model(Xbatch)
        ybatch = y[i:i+batch_size]
        loss = loss_fn(y_pred, ybatch)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    if epoch%10 == 0:
        print(f'Finished epoch {epoch}, latest loss {loss}')

Finished epoch 0, latest loss 0.4929962754249573
Finished epoch 10, latest loss 0.4554964303970337
Finished epoch 20, latest loss 0.44788020849227905
Finished epoch 30, latest loss 0.44748398661613464
Finished epoch 40, latest loss 0.4265342652797699
Finished epoch 50, latest loss 0.4158594608306885
Finished epoch 60, latest loss 0.39098966121673584
Finished epoch 70, latest loss 0.3696490526199341
Finished epoch 80, latest loss 0.35486742854118347
Finished epoch 90, latest loss 0.34297993779182434
100%|██████████| 100/100 [00:01<00:00, 70.30it/s]


## Evaluate the Model

You have trained the neural network on the entire dataset, and you can evaluate its performance on the same dataset. This only tells you how well the model fits the data it was trained on (the training accuracy) and gives no indication of how well it might perform on new data. This was done for simplicity; ideally, you would split your data into separate train and test datasets for training and evaluating the model.

You can evaluate your model on your training dataset in the same way you invoked the model in training. This generates a prediction for each input, but you still need to compute a score for the evaluation. This score can be the same as your loss function or something different. Because this is binary classification, you can use accuracy as the evaluation score: convert each output (a floating-point value in the range 0 to 1) to an integer (0 or 1) by rounding, and compare it with the known label.

In [7]:
# compute accuracy (no_grad is optional)
with torch.no_grad():
    y_pred = model(X)
 
accuracy = (y_pred.round() == y).float().mean()
print(f"Accuracy {accuracy}")

Accuracy 0.7864583134651184
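
The train/test separation mentioned above could be sketched as follows. This is one possible approach, not what the notebook does: it shuffles row indices with `torch.randperm` and holds out 20% of the rows. The tensors `X_demo` and `y_demo` are random placeholders standing in for the real data, and the 80/20 ratio is an assumption.

```python
import torch

torch.manual_seed(0)  # make the shuffle repeatable

# Random placeholders standing in for the real features and labels
X_demo = torch.rand(100, 8)
y_demo = (torch.rand(100, 1) > 0.5).float()

# Shuffle the row indices, then cut them 80/20
perm = torch.randperm(len(X_demo))
n_train = len(X_demo) * 4 // 5
train_idx, test_idx = perm[:n_train], perm[n_train:]

X_train, y_train = X_demo[train_idx], y_demo[train_idx]
X_test, y_test = X_demo[test_idx], y_demo[test_idx]

print(X_train.shape, X_test.shape)  # torch.Size([80, 8]) torch.Size([20, 8])
```

With such a split, you would train only on `X_train` and `y_train` and compute accuracy on `X_test` and `y_test`, which gives an estimate of performance on data the model has not seen.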


## Make class predictions with the model

In [9]:

predictions = (model(X) > 0.5).int()
for i in range(15):
    print('%s => %d (expected %d)' % (X[i].tolist(), predictions[i], y[i]))

[6.0, 148.0, 72.0, 35.0, 0.0, 33.599998474121094, 0.6269999742507935, 50.0] => 1 (expected 1)
[1.0, 85.0, 66.0, 29.0, 0.0, 26.600000381469727, 0.35100001096725464, 31.0] => 0 (expected 0)
[8.0, 183.0, 64.0, 0.0, 0.0, 23.299999237060547, 0.671999990940094, 32.0] => 1 (expected 1)
[1.0, 89.0, 66.0, 23.0, 94.0, 28.100000381469727, 0.16699999570846558, 21.0] => 0 (expected 0)
[0.0, 137.0, 40.0, 35.0, 168.0, 43.099998474121094, 2.2880001068115234, 33.0] => 1 (expected 1)
[5.0, 116.0, 74.0, 0.0, 0.0, 25.600000381469727, 0.20100000500679016, 30.0] => 0 (expected 0)
[3.0, 78.0, 50.0, 32.0, 88.0, 31.0, 0.24799999594688416, 26.0] => 0 (expected 1)
[10.0, 115.0, 0.0, 0.0, 0.0, 35.29999923706055, 0.1340000033378601, 29.0] => 1 (expected 0)
[2.0, 197.0, 70.0, 45.0, 543.0, 30.5, 0.15800000727176666, 53.0] => 1 (expected 1)
[8.0, 125.0, 96.0, 0.0, 0.0, 0.0, 0.23199999332427979, 54.0] => 0 (expected 1)
[4.0, 110.0, 92.0, 0.0, 0.0, 37.599998474121094, 0.19099999964237213, 30.0] => 0 (expected 0)
[10.0,