# Creating a Neural Network with PyTorch
[source](https://machinelearningmastery.com/develop-your-first-neural-network-with-pytorch-step-by-step/)
1. Get a dataset and split it into an `X` tensor of input features and a `y` tensor of target values, with `y` reshaped into a single column
2. Create a model class that inherits from `nn.Module` and defines the layers and the forward pass
3. Define your loss function and optimizer
4. Choose the number of epochs and the batch size, then train your model (epochs in the outer loop, batches in the inner loop)
   1. predict the output for a batch of inputs `y_pred = model(x_batch)`
   2. compute the loss against that batch's targets `loss = loss_fn(y_pred, y_batch)`
   3. reset the gradients left over from the previous step, since PyTorch accumulates them otherwise `optimizer.zero_grad()`
   4. do a backward pass to compute the gradients `loss.backward()`
   5. update the model parameters `optimizer.step()`
   6. after each epoch, print how the model is doing `print(f'Finished epoch {epoch}, latest loss {loss}')`

Save model:
`torch.save(model.state_dict(), 'model.pth')`

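Load model (a `state_dict` holds only the parameters, so create the model object first; `weights_only=True` restricts unpickling to tensors and other plain data types, which is safer for files you don't fully trust, and the loaded model can still be trained further):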

`model = MyModelClassNameOrConstructor()`

`model.load_state_dict(torch.load('model.pth', weights_only=True))`
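
As a rough check that saving and loading round-trips the parameters, here is a minimal sketch. The small `nn.Sequential` model and the temporary file path are stand-ins chosen for illustration, not part of the tutorial.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Hypothetical stand-in model; any nn.Module is saved the same way.
model = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 1))

path = os.path.join(tempfile.mkdtemp(), "model.pth")
torch.save(model.state_dict(), path)  # writes the parameters only, not the class

# Rebuild the same architecture, then copy the saved parameters into it.
restored = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 1))
restored.load_state_dict(torch.load(path, weights_only=True))

x = torch.randn(3, 8)
with torch.no_grad():
    same = torch.equal(model(x), restored(x))
print("Outputs match after reload:", same)
```

Because only the `state_dict` is stored, the code that defines the model class has to be available wherever the file is loaded.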

In [86]:
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim

In [87]:
# load the dataset, split into input (X) and output (y) variables
dataset = np.loadtxt('pima-indians-diabetes.csv', delimiter=',')
X = dataset[:,0:8]
y = dataset[:,8]

In [88]:
X = torch.Tensor(X)
y = torch.Tensor(y).reshape(-1,1)
# note: -1 in the shape tells PyTorch to infer that dimension's size for you,
# so that the total number of elements stays the same.

In [90]:
X.shape

torch.Size([768, 8])
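
To make the `-1` in `reshape(-1, 1)` concrete, here is a small sketch on a made-up six-element tensor (the values are arbitrary). The column shape matters because the model defined later outputs one value per row, and `nn.BCELoss` expects the target to have the same shape as the prediction.

```python
import torch

y = torch.tensor([0.0, 1.0, 1.0, 0.0, 1.0, 0.0])

col = y.reshape(-1, 1)    # 6 elements, 1 column  -> 6 rows inferred
pairs = y.reshape(-1, 2)  # 6 elements, 2 columns -> 3 rows inferred

print(y.shape, col.shape, pairs.shape)
# torch.Size([6]) torch.Size([6, 1]) torch.Size([3, 2])
```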

In this approach, the class defines all of its layers in the constructor, because the components must be ready when the model is created, before any input is provided. You also need to call the parent class’s constructor (the line `super().__init__()`) to bootstrap your model, and you need to define a `forward()` function that describes how an input tensor `x` is turned into the output tensor.

nn.Module is a fundamental building block in PyTorch, used for defining neural network models. It provides a way to create custom neural network architectures by defining layers and their forward pass. 

To use nn.Module, you need to:
1. create a new class that inherits from nn.Module
2. define the
   - layers
   - forward pass

In [26]:
class PimaClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden1 = nn.Linear(8, 12)
        self.act1 = nn.ReLU()
        self.hidden2 = nn.Linear(12, 8)
        self.act2 = nn.ReLU()
        self.output = nn.Linear(8, 1)
        self.act_output = nn.Sigmoid()

    def forward(self, x):
        x = self.act1(self.hidden1(x))
        x = self.act2(self.hidden2(x))
        x = self.act_output(self.output(x))
        return x

model = PimaClassifier()
print(model)

PimaClassifier(
  (hidden1): Linear(in_features=8, out_features=12, bias=True)
  (act1): ReLU()
  (hidden2): Linear(in_features=12, out_features=8, bias=True)
  (act2): ReLU()
  (output): Linear(in_features=8, out_features=1, bias=True)
  (act_output): Sigmoid()
)
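
To see what `forward()` produces, the sketch below pushes a few made-up rows through the same class and counts its trainable parameters. The random input batch is an assumption for illustration; real rows would come from the dataset.

```python
import torch
import torch.nn as nn

class PimaClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden1 = nn.Linear(8, 12)
        self.act1 = nn.ReLU()
        self.hidden2 = nn.Linear(12, 8)
        self.act2 = nn.ReLU()
        self.output = nn.Linear(8, 1)
        self.act_output = nn.Sigmoid()

    def forward(self, x):
        x = self.act1(self.hidden1(x))
        x = self.act2(self.hidden2(x))
        x = self.act_output(self.output(x))
        return x

model = PimaClassifier()
batch = torch.rand(4, 8)  # 4 made-up rows with 8 features each
with torch.no_grad():
    out = model(batch)    # calling the model runs forward()

print(out.shape)          # torch.Size([4, 1]): one probability per row

# weights + biases: (8*12 + 12) + (12*8 + 8) + (8*1 + 1) = 221
n_params = sum(p.numel() for p in model.parameters())
print(n_params)           # 221
```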


In [27]:
loss_fn = nn.BCELoss()  # binary cross entropy
optimizer = optim.Adam(model.parameters(), lr=0.001)
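
As a sketch of what `nn.BCELoss` computes, the example below compares it with the binary cross entropy formula written out by hand, the mean of `-(y*log(p) + (1-y)*log(1-p))`. The probabilities and labels are made up for illustration.

```python
import torch
import torch.nn as nn

y_pred = torch.tensor([[0.9], [0.2], [0.6]])  # made-up predicted probabilities
y_true = torch.tensor([[1.0], [0.0], [0.0]])  # made-up labels

# Binary cross entropy by hand, averaged over the batch
manual = -(y_true * torch.log(y_pred)
           + (1 - y_true) * torch.log(1 - y_pred)).mean()

builtin = nn.BCELoss()(y_pred, y_true)

print(manual.item(), builtin.item())  # the two values should agree
```

The confident wrong prediction (0.6 for a label of 0) contributes the most to the loss, which is what pushes the optimizer to correct it.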

The goal of training a model is to ensure it learns a good enough mapping of input data to output classification. It will not be perfect, and errors are inevitable. Usually, you will see the amount of error decrease over the epochs and eventually level out. This is called model convergence.

The simplest way to build a training loop is to use two nested for-loops, one for epochs and one for batches:

In [93]:
n_epochs = 100
batch_size = 10

for epoch in range(n_epochs):
    for i in range(0, len(X), batch_size):
        Xbatch = X[i:i+batch_size]
        y_pred = model(Xbatch)
        ybatch = y[i:i+batch_size]
        loss = loss_fn(y_pred, ybatch)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    print(f'Finished epoch {epoch}, latest loss {loss}')

Finished epoch 0, latest loss 0.34976398944854736
Finished epoch 1, latest loss 0.3472548723220825
Finished epoch 2, latest loss 0.3426494002342224
Finished epoch 3, latest loss 0.3417043387889862
Finished epoch 4, latest loss 0.34253567457199097
Finished epoch 5, latest loss 0.34057021141052246
Finished epoch 6, latest loss 0.3380061686038971
Finished epoch 7, latest loss 0.3432692587375641
Finished epoch 8, latest loss 0.3439106345176697
Finished epoch 9, latest loss 0.33861076831817627
Finished epo
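
The loop above prints the loss of the last batch only, which can jump around from epoch to epoch. One possible variation, sketched here, averages the loss over all batches in each epoch. The synthetic data, the smaller stand-in model, and the settings (5 epochs, `lr=0.01`) are assumptions chosen to keep the example short.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)
# Synthetic stand-in data (an assumption): 100 rows, 8 features, binary labels
X = torch.rand(100, 8)
y = (X.sum(dim=1, keepdim=True) > 4).float()

model = nn.Sequential(nn.Linear(8, 12), nn.ReLU(), nn.Linear(12, 1), nn.Sigmoid())
loss_fn = nn.BCELoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)

n_epochs, batch_size = 5, 10
epoch_losses = []
for epoch in range(n_epochs):
    running = 0.0
    for i in range(0, len(X), batch_size):
        Xbatch, ybatch = X[i:i+batch_size], y[i:i+batch_size]
        loss = loss_fn(model(Xbatch), ybatch)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        running += loss.item() * len(Xbatch)  # weight each batch by its size
    epoch_losses.append(running / len(X))     # mean loss per row for this epoch
    print(f'Finished epoch {epoch}, mean loss {epoch_losses[-1]:.4f}')
```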

You have trained your neural network on the entire dataset, and you can evaluate the performance of the network on the same dataset. This will only give you an idea of how well you have modeled the dataset (e.g., train accuracy) but no idea of how well the algorithm might perform on new data. This was done for simplicity, but ideally, you would separate your data into train and test datasets for training and evaluation of your model.
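
A minimal sketch of such a split is shown below, using a random permutation of row indices. The synthetic tensors stand in for the real `X` and `y`, and the 80/20 ratio is an arbitrary choice rather than something the tutorial prescribes.

```python
import torch

torch.manual_seed(0)
# Synthetic stand-in for the dataset (an assumption): 768 rows, 8 features
X = torch.rand(768, 8)
y = torch.randint(0, 2, (768, 1)).float()

perm = torch.randperm(len(X))  # shuffled row indices
n_train = int(0.8 * len(X))    # 80/20 split is an arbitrary choice
train_idx, test_idx = perm[:n_train], perm[n_train:]

X_train, y_train = X[train_idx], y[train_idx]
X_test, y_test = X[test_idx], y[test_idx]

print(X_train.shape, X_test.shape)
# torch.Size([614, 8]) torch.Size([154, 8])
```

You would then train on `X_train`, `y_train` only and report accuracy on `X_test`, `y_test`.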

You can evaluate your model on your training dataset in the same way you invoked the model in training. This will generate predictions for each input, but then you still need to compute a score for the evaluation. This score can be the same as your loss function or something different. Because you are doing binary classification, you can use accuracy as your evaluation score by converting the output (a floating point value in the range of 0 to 1) to an integer (0 or 1) and comparing it to the known label.

In [30]:
with torch.no_grad():
    y_pred = model(X)

# y_pred is the sigmoid output (the predicted probability that the answer is yes). round() turns it into a yes/no
# prediction (1 or 0), == y marks each prediction as right or wrong, and the mean of those marks is the accuracy
accuracy = (y_pred.round() == y).float().mean()
print(f"Accuracy {accuracy}")

Accuracy 0.7734375
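
The accuracy line can be followed step by step on a tiny made-up example (four hypothetical predictions and labels, not taken from the dataset):

```python
import torch

y_pred = torch.tensor([[0.79], [0.12], [0.45], [0.91]])  # made-up probabilities
y = torch.tensor([[1.0], [0.0], [1.0], [1.0]])           # made-up labels

predicted = y_pred.round()         # [[1.], [0.], [0.], [1.]]
correct = (predicted == y)         # [[True], [True], [False], [True]]
accuracy = correct.float().mean()  # 3 of 4 correct

print(f"Accuracy {accuracy}")      # Accuracy 0.75
```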


The model uses a sigmoid activation function on the output layer, so its predictions are probabilities in the range between 0 and 1.

In [52]:
# make probability predictions with the model
probability_predictions = model(X)
# round predictions
rounded = probability_predictions.round()

In [60]:
print("Probabilities:", probability_predictions[0:5,:], "Predictions:", rounded[0:5,:])

Probabilities: tensor([[0.7903],
        [0.1231],
        [0.8872],
        [0.1254],
        [0.7991]], grad_fn=<SliceBackward0>) Predictions: tensor([[1.],
        [0.],
        [1.],
        [0.],
        [1.]], grad_fn=<SliceBackward0>)


In [62]:
# make class predictions with the model
predictions = (model(X) > 0.5).int()

In [64]:
print("Class predictions:", predictions[0:5, :])

Class predictions: tensor([[1],
        [0],
        [1],
        [0],
        [1]], dtype=torch.int32)


In [79]:
if (rounded.int().detach().numpy() == predictions.detach().numpy()).all():
    print("They're equivalent.")
else:
    print("They are not equivalent.")

They're equivalent.
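
The same comparison can be written without `detach()` or NumPy by running the model under `torch.no_grad()` and comparing tensors directly with `torch.equal`. This is only a sketch: the one-layer model and random inputs are hypothetical stand-ins for the trained model and the dataset.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(8, 1), nn.Sigmoid())  # hypothetical stand-in model
X = torch.rand(20, 8)                                 # synthetic inputs

with torch.no_grad():  # no graph is recorded, so the output carries no grad_fn
    probs = model(X)

rounded = probs.round().int()
thresholded = (probs > 0.5).int()

print(torch.equal(rounded, thresholded))  # True
```

The two agree even at exactly 0.5: `round()` rounds halves to the nearest even integer, which gives 0, and `0.5 > 0.5` is false, which also gives 0.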


The complete example below collects all of the steps above into a single script, so it should be functionally the same as before. After the model is trained, predictions are made for all examples in the dataset, and the input rows and predicted class value for the first five examples are printed alongside the expected class value. You can see that most rows are correctly predicted. In fact, you can expect about 77% of the rows to be correctly predicted, based on the accuracy estimated in the previous section.

In [92]:
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim

# load the dataset, split into input (X) and output (y) variables
dataset = np.loadtxt('pima-indians-diabetes.csv', delimiter=',')
X = dataset[:,0:8]
y = dataset[:,8]

X = torch.tensor(X, dtype=torch.float32)
y = torch.tensor(y, dtype=torch.float32).reshape(-1, 1)

# define the model
class PimaClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden1 = nn.Linear(8, 12)
        self.act1 = nn.ReLU()
        self.hidden2 = nn.Linear(12, 8)
        self.act2 = nn.ReLU()
        self.output = nn.Linear(8, 1)
        self.act_output = nn.Sigmoid()

    def forward(self, x):
        x = self.act1(self.hidden1(x))
        x = self.act2(self.hidden2(x))
        x = self.act_output(self.output(x))
        return x

model = PimaClassifier()
print(model)

# train the model
loss_fn   = nn.BCELoss()  # binary cross entropy
optimizer = optim.Adam(model.parameters(), lr=0.001)

n_epochs = 100
batch_size = 10

for epoch in range(n_epochs):
    for i in range(0, len(X), batch_size):
        Xbatch = X[i:i+batch_size]
        y_pred = model(Xbatch)
        ybatch = y[i:i+batch_size]
        loss = loss_fn(y_pred, ybatch)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

# compute accuracy
y_pred = model(X)
accuracy = (y_pred.round() == y).float().mean()
print(f"Accuracy {accuracy}")

# make class predictions with the model
predictions = (model(X) > 0.5).int()
for i in range(5):
    print('8 inputs: %s => %d (expected/correct value %d)' % (X[i].tolist(), predictions[i], y[i]))

PimaClassifier(
  (hidden1): Linear(in_features=8, out_features=12, bias=True)
  (act1): ReLU()
  (hidden2): Linear(in_features=12, out_features=8, bias=True)
  (act2): ReLU()
  (output): Linear(in_features=8, out_features=1, bias=True)
  (act_output): Sigmoid()
)
Accuracy 0.7734375
8 inputs: [6.0, 148.0, 72.0, 35.0, 0.0, 33.599998474121094, 0.6269999742507935, 50.0] => 1 (expected/correct value 1)
8 inputs: [1.0, 85.0, 66.0, 29.0, 0.0, 26.600000381469727, 0.35100001096725464, 31.0] => 0 (expected/correct value 0)
8 inputs: [8.0, 183.0, 64.0, 0.0, 0.0, 23.299999237060547, 0.671999990940094, 32.0] => 1 (expected/correct value 1)
8 inputs: [1.0, 89.0, 66.0, 23.0, 94.0, 28.100000381469727, 0.16699999570846558, 21.0] => 0 (expected/correct value 0)
8 inputs: [0.0, 137.0, 40.0, 35.0, 168.0, 43.099998474121094, 2.2880001068115234, 33.0] => 1 (expected/correct value 1)
