# PyTorch: simple NN for binary classification


For the Pima Indians diabetes dataset (already encountered in the Basic Course), we set up a feedforward network to predict diabetes status.

We follow the usual workflow:

1. Load the data.
2. Define the PyTorch model.
3. Define the loss function and optimizer.
4. Run the training loop.
5. Evaluate the model.
6. Make predictions with the learned model.


In [1]:
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim

## 1. Load the data

Input Variables ( $X$ ):

- Number of times pregnant
- Plasma glucose concentration at 2 hours in an oral glucose tolerance test
- Diastolic blood pressure (mm Hg)
- Triceps skin fold thickness (mm)
- 2-hour serum insulin (μIU/ml)
- Body mass index, BMI (weight in kg / (height in m)$^2$)
- Diabetes pedigree function
- Age (years)

Output Variable  ( $y$ ):

- Class label (0 or 1)



In [3]:
dataset = np.loadtxt('pima-indians-diabetes.csv', delimiter=',')
X = dataset[:, 0:8]
y = dataset[:, 8]

Convert the NumPy arrays to PyTorch tensors. The labels are reshaped to a column vector so that their shape matches the model output.

In [4]:
X = torch.tensor(X, dtype=torch.float32)
y = torch.tensor(y, dtype=torch.float32).reshape(-1, 1)
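
The effect of the `reshape(-1, 1)` call can be checked on a small stand-in array. This is a minimal sketch: the array `dataset_demo` and the `_demo` names are hypothetical and only mimic the 8-features-plus-label layout of the CSV file.

```python
import numpy as np
import torch

# Hypothetical stand-in for the CSV: 4 rows, 8 feature columns + 1 label column.
dataset_demo = np.arange(36, dtype=np.float64).reshape(4, 9)
X_np, y_np = dataset_demo[:, 0:8], dataset_demo[:, 8]

X_demo = torch.tensor(X_np, dtype=torch.float32)
y_flat = torch.tensor(y_np, dtype=torch.float32)  # 1-D tensor, shape (4,)
y_col = y_flat.reshape(-1, 1)                     # column vector, shape (4, 1)

print(tuple(X_demo.shape), tuple(y_flat.shape), tuple(y_col.shape))
```

A model ending in `nn.Linear(8, 1)` returns one column per sample, so the column-shaped labels line up with the predictions element by element.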

## 2. Define the model

We will use

- fully connected (dense) layers, via the `Linear` class in PyTorch;
- ReLU activation functions in the hidden layers;
- a sigmoid activation in the output layer, which yields a probability in (0, 1) that is thresholded at 0.5 for the final 0/1 classification.

Use a "pyramidal" architecture... (see previous Examples).

In [5]:
model = nn.Sequential(
    nn.Linear(8, 12),
    nn.ReLU(),
    nn.Linear(12, 8),
    nn.ReLU(),
    nn.Linear(8, 1),
    nn.Sigmoid() )

print(model)

Sequential(
  (0): Linear(in_features=8, out_features=12, bias=True)
  (1): ReLU()
  (2): Linear(in_features=12, out_features=8, bias=True)
  (3): ReLU()
  (4): Linear(in_features=8, out_features=1, bias=True)
  (5): Sigmoid()
)


An alternative is to define a class that inherits from `nn.Module`:

In [6]:
class PimaClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden1 = nn.Linear(8, 12)
        self.act1 = nn.ReLU()
        self.hidden2 = nn.Linear(12, 8)
        self.act2 = nn.ReLU()
        self.output = nn.Linear(8, 1)
        self.act_output = nn.Sigmoid()

    def forward(self, x):
        x = self.act1(self.hidden1(x))
        x = self.act2(self.hidden2(x))
        x = self.act_output(self.output(x))
        return x

model = PimaClassifier()
print(model)

PimaClassifier(
  (hidden1): Linear(in_features=8, out_features=12, bias=True)
  (act1): ReLU()
  (hidden2): Linear(in_features=12, out_features=8, bias=True)
  (act2): ReLU()
  (output): Linear(in_features=8, out_features=1, bias=True)
  (act_output): Sigmoid()
)
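
To get a feel for the size of this network, the trainable parameters can be counted. The sketch below rebuilds the same layer sizes under the hypothetical name `model_demo`, so that it does not overwrite `model`.

```python
import torch.nn as nn

# Same layer sizes as above, under a separate name.
model_demo = nn.Sequential(
    nn.Linear(8, 12), nn.ReLU(),
    nn.Linear(12, 8), nn.ReLU(),
    nn.Linear(8, 1), nn.Sigmoid(),
)

# Each Linear layer holds in_features * out_features weights plus out_features biases:
# (8*12 + 12) + (12*8 + 8) + (8*1 + 1) = 108 + 104 + 9 = 221
n_params = sum(p.numel() for p in model_demo.parameters() if p.requires_grad)
print(n_params)  # 221
```

The activation layers contribute no parameters, so the count depends only on the three `Linear` layers.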


## 3. Set up loss and optimizer

Since we have a binary classification problem and the model outputs a probability, we use the binary cross-entropy loss function, `BCELoss`.

For the optimizer, we use standard `Adam`.

In [7]:
loss_fn = nn.BCELoss()
optimizer = optim.Adam(model.parameters(), lr=0.0005)
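
As a quick check of what `BCELoss` computes, the mean binary cross-entropy $-\frac{1}{N}\sum_i \left[ y_i \log p_i + (1-y_i)\log(1-p_i) \right]$ can be evaluated by hand and compared with the built-in loss. The probabilities and labels below are made-up illustrative values.

```python
import torch
import torch.nn as nn

# Made-up predicted probabilities and true labels (illustration only).
p = torch.tensor([[0.9], [0.2], [0.6]])
t = torch.tensor([[1.0], [0.0], [1.0]])

# Binary cross-entropy written out term by term, averaged over the samples.
manual = -(t * torch.log(p) + (1 - t) * torch.log(1 - p)).mean()

# Built-in loss with its default 'mean' reduction.
builtin = nn.BCELoss()(p, t)

print(manual.item(), builtin.item())
```

Both values should agree up to floating-point rounding.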

## 4. Train the model

- Loop over the epochs:
  - loop over the batches:
    - compute the predictions (forward pass)
    - compute the loss
    - reset the gradients
    - compute the gradients (backward pass)
    - step the optimizer

In [8]:
n_epochs = 100
batch_size = 10

for epoch in range(n_epochs):
    for i in range(0, len(X), batch_size):
        X_batch = X[i:i+batch_size]
        y_pred  = model(X_batch)
        y_batch = y[i:i+batch_size]
        loss = loss_fn(y_pred, y_batch)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    # note: this is the loss of the last batch only, not the average over the epoch
    print(f'At end of epoch {epoch}, loss = {loss}')

At end of epoch 0, loss = 0.6514326333999634
At end of epoch 1, loss = 0.4459029734134674
At end of epoch 2, loss = 0.4550330638885498
At end of epoch 3, loss = 0.4522790312767029
At end of epoch 4, loss = 0.4477901756763458
At end of epoch 5, loss = 0.44486045837402344
At end of epoch 6, loss = 0.4469968378543854
At end of epoch 7, loss = 0.4473118484020233
At end of epoch 8, loss = 0.44760721921920776
At end of epoch 9, loss = 0.44560447335243225
At end of epoch 10, loss = 0.4458388388156891
At end of epoch 11, loss = 0.4470572769641876
At end of epoch 12, loss = 0.4462440311908722
At end of epoch 13, loss = 0.44738638401031494
At end of epoch 14, loss = 0.448482871055603
At end of epoch 15, loss = 0.4486386179924011
At end of epoch 16, loss = 0.4485207200050354
At end of epoch 17, loss = 0.4487292170524597
At end of epoch 18, loss = 0.44715577363967896
At end of epoch 19, loss = 0.4507614076137543
At end of epoch 20, loss = 0.45241180062294006
At end of epoch 21, loss = 0.4519551694
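
The values printed above are the loss of the final batch of each epoch, and the batches are visited in the same order every time. A possible variant, sketched below, shuffles the samples at each epoch and reports the loss averaged over all samples. The data are synthetic stand-ins (an assumption, not the Pima data), and the `_demo` names and the learning rate are chosen only for this illustration.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Synthetic stand-in data: 100 samples, 8 features, label set by the sign of feature 0.
X_demo = torch.randn(100, 8)
y_demo = (X_demo[:, :1] > 0).float()

model_demo = nn.Sequential(
    nn.Linear(8, 12), nn.ReLU(),
    nn.Linear(12, 8), nn.ReLU(),
    nn.Linear(8, 1), nn.Sigmoid(),
)
loss_fn_demo = nn.BCELoss()
optimizer_demo = optim.Adam(model_demo.parameters(), lr=0.01)

batch_size = 10
epoch_losses = []
for epoch in range(20):
    perm = torch.randperm(len(X_demo))      # new sample order for each epoch
    running = 0.0
    for i in range(0, len(X_demo), batch_size):
        idx = perm[i:i + batch_size]
        y_pred = model_demo(X_demo[idx])
        loss = loss_fn_demo(y_pred, y_demo[idx])
        optimizer_demo.zero_grad()
        loss.backward()
        optimizer_demo.step()
        running += loss.item() * len(idx)   # weight each batch by its size
    epoch_losses.append(running / len(X_demo))

print(f'first epoch: {epoch_losses[0]:.4f}, last epoch: {epoch_losses[-1]:.4f}')
```

The averaged loss is usually a smoother indicator of progress than the loss of a single batch.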

## 5. Evaluate the model accuracy

Here we simply evaluate on all of the training data. This is not very satisfactory, since it says nothing about performance on unseen data.

In [9]:
# compute accuracy (no_grad is optional)
with torch.no_grad():
    y_pred = model(X)

accuracy = (y_pred.round() == y).float().mean()
print(f"Accuracy {accuracy}")

Accuracy 0.7421875
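
Accuracy is only one possible summary. Precision and recall can be obtained from the same thresholded predictions, as in the sketch below. The labels and probabilities are made-up values for illustration, not model output.

```python
import torch

# Made-up labels and predicted probabilities (illustration only).
y_true = torch.tensor([1., 0., 1., 1., 0., 0., 1., 0.]).reshape(-1, 1)
y_prob = torch.tensor([0.9, 0.2, 0.4, 0.7, 0.6, 0.1, 0.8, 0.3]).reshape(-1, 1)
y_hat = (y_prob > 0.5).float()   # threshold at 0.5

tp = ((y_hat == 1) & (y_true == 1)).sum().item()  # true positives
fp = ((y_hat == 1) & (y_true == 0)).sum().item()  # false positives
fn = ((y_hat == 0) & (y_true == 1)).sum().item()  # false negatives
tn = ((y_hat == 0) & (y_true == 0)).sum().item()  # true negatives

accuracy_demo = (tp + tn) / (tp + fp + fn + tn)
precision_demo = tp / (tp + fp)   # fraction of predicted positives that are correct
recall_demo = tp / (tp + fn)      # fraction of actual positives that are found

print(accuracy_demo, precision_demo, recall_demo)  # 0.75 0.75 0.75
```

On imbalanced data the three numbers can differ substantially, which is why accuracy alone can be misleading.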


## 6. Make predictions with the model



In [10]:
# make class predictions with the model
predictions = (model(X) > 0.5).int()
for i in range(5):
    print('%s => %d (expected %d)' % (X[i].tolist(), predictions[i], y[i]))

[6.0, 148.0, 72.0, 35.0, 0.0, 33.599998474121094, 0.6269999742507935, 50.0] => 1 (expected 1)
[1.0, 85.0, 66.0, 29.0, 0.0, 26.600000381469727, 0.35100001096725464, 31.0] => 0 (expected 0)
[8.0, 183.0, 64.0, 0.0, 0.0, 23.299999237060547, 0.671999990940094, 32.0] => 1 (expected 1)
[1.0, 89.0, 66.0, 23.0, 94.0, 28.100000381469727, 0.16699999570846558, 21.0] => 0 (expected 0)
[0.0, 137.0, 40.0, 35.0, 168.0, 43.099998474121094, 2.2880001068115234, 33.0] => 1 (expected 1)


## Conclusions


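1. In spite of a relatively large training loss, the classifier reached about 74% accuracy on the training data and classified the five displayed samples correctly. No held-out test set or cross-validation was used, so these figures do not measure generalization.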
2. The model should be redone with
   - complete EDA (see previous Example)
   - proper train/test split
   - tuning of architecture
   - tuning of optimizer
   - rigorous reporting.
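
As a starting point for the train/test split mentioned above, a random permutation of the row indices can be used. This is a minimal sketch on synthetic stand-in tensors. The 768 rows and the 80/20 ratio are assumptions made for illustration, and the `_demo` names are hypothetical.

```python
import torch

torch.manual_seed(0)

# Synthetic stand-in tensors with the same column layout as X and y.
X_demo = torch.randn(768, 8)
y_demo = torch.randint(0, 2, (768, 1)).float()

n = len(X_demo)
perm = torch.randperm(n)     # random order of the row indices
n_test = int(0.2 * n)        # hold out about 20% of the rows for testing

test_idx, train_idx = perm[:n_test], perm[n_test:]
X_train, y_train = X_demo[train_idx], y_demo[train_idx]
X_test, y_test = X_demo[test_idx], y_demo[test_idx]

print(X_train.shape[0], X_test.shape[0])  # 615 153
```

The model would then be trained on `X_train`, `y_train` only, with accuracy reported on `X_test`, `y_test`.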