# Diabetes Classification Model With **PyTorch**

## Overview

The steps we will follow:

- Load Data
- Define PyTorch Model
- Define Loss Function and Optimizers
- Run a Training Loop
- Evaluate the Model
- Make Predictions

In [28]:
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim

### Load Data

**Pima Indians Diabetes**:
This dataset describes patient medical record data for Pima Indians and whether they had an onset of diabetes within five years.

It is a binary classification problem: the label is `1` if the patient developed diabetes and `0` otherwise. All the input variables that describe each patient are already numerical, so they can be fed directly to a neural network, which expects numerical input and output values. That makes the dataset a convenient choice for our neural network in PyTorch.

In [29]:
fname = "/data/train.csv"

dataset = np.loadtxt(fname=fname, delimiter=",", dtype=np.float32)

dataset[:5]

array([[6.000e+00, 1.480e+02, 7.200e+01, 3.500e+01, 0.000e+00, 3.360e+01,
        6.270e-01, 5.000e+01, 1.000e+00],
       [1.000e+00, 8.500e+01, 6.600e+01, 2.900e+01, 0.000e+00, 2.660e+01,
        3.510e-01, 3.100e+01, 0.000e+00],
       [8.000e+00, 1.830e+02, 6.400e+01, 0.000e+00, 0.000e+00, 2.330e+01,
        6.720e-01, 3.200e+01, 1.000e+00],
       [1.000e+00, 8.900e+01, 6.600e+01, 2.300e+01, 9.400e+01, 2.810e+01,
        1.670e-01, 2.100e+01, 0.000e+00],
       [0.000e+00, 1.370e+02, 4.000e+01, 3.500e+01, 1.680e+02, 4.310e+01,
        2.288e+00, 3.300e+01, 1.000e+00]], dtype=float32)

We have 9 columns: the first 8 are the input variables, which we will represent as `X`, and the 9th is the output variable, which we will represent as `y`.

**Input Variables (X)**:

1. Number of times pregnant.
2. Plasma glucose concentration at 2 hours in an oral glucose tolerance test.
3. Diastolic blood pressure (mm Hg).
4. Triceps skin fold thickness (mm).
5. 2-hour serum insulin (μIU/ml).
6. Body mass index (weight in kg / (height in m)²).
7. Diabetes pedigree function.
8. Age (years).

**Output Variables (y)**:

9. Class label (0 or 1).

In [30]:
X = dataset[:, :8]
y = dataset[:, 8]

X.shape, y.shape

((768, 8), (768,))

In [31]:
X = torch.from_numpy(X)
y = torch.from_numpy(y).reshape(-1, 1)

X.shape, y.shape

(torch.Size([768, 8]), torch.Size([768, 1]))

### Define The Model

You can piece it all together by adding each layer such that:

- The model expects rows of data with 8 variables (the first argument of the first layer is set to 8).
- The first hidden layer has 200 neurons, followed by a `ReLU` activation function.
- The second hidden layer has 100 neurons, followed by another `ReLU` activation function.
- The output layer has one neuron, followed by a `sigmoid` activation function.

In [32]:
class DiabetesClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden1 = nn.Linear(8, 200)
        self.act1 = nn.ReLU()
        self.hidden2 = nn.Linear(200, 100)
        self.act2 = nn.ReLU()
        self.output = nn.Linear(100, 1)
        self.act_output = nn.Sigmoid()

    def forward(self, x):
        x = self.act1(self.hidden1(x))
        x = self.act2(self.hidden2(x))
        x = self.act_output(self.output(x))
        return x

model = DiabetesClassifier()

print(model)

DiabetesClassifier(
  (hidden1): Linear(in_features=8, out_features=200, bias=True)
  (act1): ReLU()
  (hidden2): Linear(in_features=200, out_features=100, bias=True)
  (act2): ReLU()
  (output): Linear(in_features=100, out_features=1, bias=True)
  (act_output): Sigmoid()
)
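
As a quick sanity check on the printed architecture, the trainable parameters can be counted. This is a minimal sketch, assuming the layer sizes shown in the output above; `sketch_model` is a hypothetical stand-in built with `nn.Sequential`, not the class defined in this notebook.

```python
import torch.nn as nn

# Hypothetical stand-in with the same layer sizes as DiabetesClassifier (8 -> 200 -> 100 -> 1).
sketch_model = nn.Sequential(
    nn.Linear(8, 200), nn.ReLU(),
    nn.Linear(200, 100), nn.ReLU(),
    nn.Linear(100, 1), nn.Sigmoid(),
)

# Each Linear layer holds in_features * out_features weights plus out_features biases.
n_params = sum(p.numel() for p in sketch_model.parameters() if p.requires_grad)
print(n_params)  # (8*200 + 200) + (200*100 + 100) + (100*1 + 1) = 22001
```

With roughly 22,000 parameters and only 768 rows, the network has plenty of capacity to memorise the training data, which is worth keeping in mind when reading the accuracy later on.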


### Define Loss Function and Optimizers (Preparation for Training)

Training a network means finding the best set of weights to map inputs to outputs in our dataset.

The loss function measures how far the model's predictions are from the true labels `y`. The lower the loss, the better the model fits the training data.

The optimizer is the algorithm we use to adjust/update the model weights progressively to produce a better output.


- Loss function used: `BCELoss` (binary cross-entropy loss), because our task is binary classification.
- Optimizer used: `Adam` (adaptive moment estimation).

In [33]:
loss_function = nn.BCELoss()

optimizer = optim.Adam(model.parameters(), lr=0.001)
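
To make the loss concrete, the sketch below computes binary cross-entropy by hand and compares it with `nn.BCELoss`. The probabilities and labels are made up for illustration, and the names `p`, `t`, `manual` and `builtin` are not part of the notebook.

```python
import torch
import torch.nn as nn

# Made-up predicted probabilities and true labels, for illustration only.
p = torch.tensor([[0.9], [0.2], [0.6]])
t = torch.tensor([[1.0], [0.0], [0.0]])

# Binary cross-entropy: the mean of -(t * log(p) + (1 - t) * log(1 - p)).
manual = -(t * torch.log(p) + (1 - t) * torch.log(1 - p)).mean()

# nn.BCELoss uses mean reduction by default, so it should give the same value.
builtin = nn.BCELoss()(p, t)

print(manual.item(), builtin.item())
```

The confident wrong prediction (0.6 for a label of 0) contributes the largest term, which is the behaviour we want from a classification loss.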

### Training The Model

Simply speaking, the entire dataset is split into batches, and we pass the batches one by one into a model using a training loop. Once we have exhausted all the batches, we have finished one epoch. Then we can start over again with the same dataset and start the second epoch, continuing to refine the model. This process repeats until we are satisfied with the model’s output.

- **Device = cuda**: to speed up training, we run it on the GPU (`cuda`) when one is available and fall back to the CPU otherwise.
- **Epoch**: One pass of the entire training dataset through the model.
- **Batch**: One or more samples passed to the model together, from which one iteration of gradient descent is computed.

**The total number of batches over many epochs is how many times you run the gradient descent to refine the model.**
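
As a worked example of that count, the sketch below uses the 768 rows loaded earlier together with the batch size of 10 and the 200 epochs that the training loop in this notebook uses. The variable names are only illustrative.

```python
import math

n_samples = 768   # rows in the dataset, from X.shape
batch_size = 10
epochs = 200

# The last batch of each epoch is smaller (768 = 76 * 10 + 8), so round up.
batches_per_epoch = math.ceil(n_samples / batch_size)
total_steps = batches_per_epoch * epochs

print(batches_per_epoch, total_steps)  # 77 15400
```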

In [34]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

X = X.to(device)
y = y.to(device)
model.to(device)

epochs = 200
batch_size = 10

for epoch in range(epochs):
    for i in range(0, len(X), batch_size):
        X_batch = X[i:i+batch_size]

        y_prediction = model(X_batch)
        y_batch = y[i:i+batch_size]

        loss = loss_function(y_prediction, y_batch)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    print(f"Finished epoch: {epoch}, Latest loss: {loss}")

Finished epoch: 0, Latest loss: 0.3934597074985504
Finished epoch: 1, Latest loss: 0.6608480215072632
Finished epoch: 2, Latest loss: 0.6762295961380005
Finished epoch: 3, Latest loss: 0.6646689176559448
Finished epoch: 4, Latest loss: 0.5518184900283813
Finished epoch: 5, Latest loss: 0.4350510239601135
Finished epoch: 6, Latest loss: 0.4045760929584503
Finished epoch: 7, Latest loss: 0.4129359722137451
Finished epoch: 8, Latest loss: 0.3959192633628845
Finished epoch: 9, Latest loss: 0.4066961407661438
Finished epoch: 10, Latest loss: 0.4124729633331299
Finished epoch: 11, Latest loss: 0.44192013144493103
Finished epoch: 12, Latest loss: 0.43613022565841675
Finished epoch: 13, Latest loss: 0.438364714384079
Finished epoch: 14, Latest loss: 0.43524372577667236
Finished epoch: 15, Latest loss: 0.4504162073135376
Finished epoch: 16, Latest loss: 0.4320225417613983
Finished epoch: 17, Latest loss: 0.44153255224227905
Finished epoch: 18, Latest loss: 0.4431651532649994
Finished epoch: 19,
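
The loss printed above is that of the last batch in each epoch only, so it can jump around from one epoch to the next. One possible alternative, sketched here under stated assumptions, is to shuffle the batches with a `DataLoader` and report the mean loss over the whole epoch. The data below is synthetic and merely has the same shapes as `X` and `y`; `demo_model`, `X_demo` and `y_demo` are hypothetical names, and the small network is a stand-in rather than the model trained in this notebook.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic stand-in data with the same shapes as X and y (not the real dataset).
X_demo = torch.rand(768, 8)
y_demo = (X_demo.sum(dim=1, keepdim=True) > 4).float()

# Small stand-in network ending in a sigmoid, as BCELoss expects probabilities.
demo_model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())
loss_fn = nn.BCELoss()
opt = optim.Adam(demo_model.parameters(), lr=0.001)

# shuffle=True reorders the rows at the start of every epoch.
loader = DataLoader(TensorDataset(X_demo, y_demo), batch_size=10, shuffle=True)

epoch_losses = []
for epoch in range(3):
    running, seen = 0.0, 0
    for X_batch, y_batch in loader:
        loss = loss_fn(demo_model(X_batch), y_batch)
        opt.zero_grad()
        loss.backward()
        opt.step()
        # Weight each batch loss by its size so the short last batch counts fairly.
        running += loss.item() * len(X_batch)
        seen += len(X_batch)
    epoch_losses.append(running / seen)
    print(f"Finished epoch: {epoch}, mean loss: {epoch_losses[-1]:.4f}")
```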

### Evaluate The Model

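We can use `accuracy` as our evaluation score by rounding each output (a floating-point value between 0 and 1) to an integer (0 or 1) and comparing it with the known label. Because the model is scored on the same data it was trained on, this is a training accuracy rather than an estimate of performance on unseen patients.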

In [35]:
with torch.no_grad():
    y_prediction = model(X)

accuracy = (y_prediction.round() == y).float().mean()
print(f"Accuracy: {accuracy}")

Accuracy: 0.88671875
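
The accuracy computation can be traced on a tiny example. The probabilities and labels below are made up for illustration and are not taken from the dataset.

```python
import torch

# Made-up model outputs (probabilities) and true labels, for illustration only.
probs = torch.tensor([[0.9], [0.2], [0.6], [0.4]])
labels = torch.tensor([[1.0], [0.0], [0.0], [0.0]])

predicted = probs.round()          # [[1.], [0.], [1.], [0.]]
correct = (predicted == labels)    # [[True], [True], [False], [True]]
accuracy = correct.float().mean()  # fraction of matching rows

print(accuracy.item())  # 0.75
```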


### Make Predictions

We use a `sigmoid` activation function on the output layer, so each prediction is a probability between **0 and 1**. To predict crisp classes directly, we threshold that probability at 0.5:

In [36]:
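# Gradients are not needed for inference, so disable them as in the evaluation step.
with torch.no_grad():
    predictions = (model(X) > 0.5).int()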

for i in range(10):
    print(f"{X[i].tolist()} => {predictions[i].item()} (expected {int(y[i].item())})")

[6.0, 148.0, 72.0, 35.0, 0.0, 33.599998474121094, 0.6269999742507935, 50.0] => 1 (expected 1)
[1.0, 85.0, 66.0, 29.0, 0.0, 26.600000381469727, 0.35100001096725464, 31.0] => 0 (expected 0)
[8.0, 183.0, 64.0, 0.0, 0.0, 23.299999237060547, 0.671999990940094, 32.0] => 1 (expected 1)
[1.0, 89.0, 66.0, 23.0, 94.0, 28.100000381469727, 0.16699999570846558, 21.0] => 0 (expected 0)
[0.0, 137.0, 40.0, 35.0, 168.0, 43.099998474121094, 2.2880001068115234, 33.0] => 1 (expected 1)
[5.0, 116.0, 74.0, 0.0, 0.0, 25.600000381469727, 0.20100000500679016, 30.0] => 0 (expected 0)
[3.0, 78.0, 50.0, 32.0, 88.0, 31.0, 0.24799999594688416, 26.0] => 0 (expected 1)
[10.0, 115.0, 0.0, 0.0, 0.0, 35.29999923706055, 0.1340000033378601, 29.0] => 0 (expected 0)
[2.0, 197.0, 70.0, 45.0, 543.0, 30.5, 0.15800000727176666, 53.0] => 1 (expected 1)
[8.0, 125.0, 96.0, 0.0, 0.0, 0.0, 0.23199999332427979, 54.0] => 1 (expected 1)
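
The thresholding step can be wrapped in a small helper for scoring new rows. This is only a sketch: `predict_class` is a hypothetical function, and `toy_model` is a stand-in with hand-set weights so that the result is deterministic, not the trained classifier from this notebook.

```python
import torch
import torch.nn as nn

def predict_class(model, features, threshold=0.5):
    """Return 0/1 class labels for a batch of feature rows (hypothetical helper)."""
    model.eval()              # inference mode, relevant for layers such as dropout
    with torch.no_grad():     # no gradients are needed for prediction
        probs = model(features)
    return (probs > threshold).int()

# Stand-in model with fixed weights so the output is deterministic.
toy_model = nn.Sequential(nn.Linear(8, 1), nn.Sigmoid())
with torch.no_grad():
    toy_model[0].weight.zero_()
    toy_model[0].bias.fill_(2.0)   # sigmoid(2.0) is about 0.88

sample = torch.zeros(1, 8)         # one made-up row with 8 features

print(predict_class(toy_model, sample).item())                 # 1
print(predict_class(toy_model, sample, threshold=0.9).item())  # 0
```

Raising the threshold makes the helper more conservative about predicting class 1, which may be useful when false positives are costly.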
