# Creating our PyTorch training script

With our neural network architecture implemented, we can move on to training the model using PyTorch.

To accomplish this task, we'll need to implement a training script which:
    
    1) Creates an instance of our neural network architecture
    2) Builds our dataset
    3) Determines whether or not we are training our model on a GPU
    4) Defines a training loop

In [1]:
# import the necessary packages
from torch.optim import SGD
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_blobs
import torch.nn as nn
import torch
import mlp

#### mlp: Our definition of the multi-layer perceptron architecture, implemented in PyTorch.

#### SGD: The Stochastic Gradient Descent optimizer that we'll be using to train the model.

#### make_blobs: Builds a synthetic dataset of example data.

#### train_test_split: Splits our dataset into a training and testing set.

#### nn: PyTorch's neural network functionality.

#### torch: The base PyTorch library.

When training a neural network, we do so in <i>batches</i> of data. The following function, <b>next_batch</b>, yields such batches to our training loop.

In [2]:
def next_batch(inputs, targets, batchSize):
    # loop over the dataset
    for i in range(0, inputs.shape[0], batchSize):
        # yield a tuple of the current batched data and labels
        yield (inputs[i:i + batchSize], targets[i:i + batchSize])

The <b>next_batch</b> function accepts three arguments:
    
    1) inputs: Our input data to the neural network.
    2) targets: Our target output values (i.e., what we want our neural network to accurately predict).
    3) batchSize: The number of samples in each batch.
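
To see what the generator produces, here is a small sketch that feeds it ten dummy samples with a batch size of four. The tensors `demo_inputs` and `demo_targets` are made up for illustration. Note that the last batch is smaller, because ten is not a multiple of four.

```python
import torch

def next_batch(inputs, targets, batchSize):
    # step through the dataset in strides of batchSize
    for i in range(0, inputs.shape[0], batchSize):
        yield (inputs[i:i + batchSize], targets[i:i + batchSize])

# hypothetical toy data: 10 samples with 4 features each
demo_inputs = torch.arange(40, dtype=torch.float32).reshape(10, 4)
demo_targets = torch.arange(10)

# collect the shape of every batch of inputs the generator yields
batch_shapes = [tuple(bx.shape) for (bx, by) in next_batch(demo_inputs, demo_targets, 4)]
print(batch_shapes)  # [(4, 4), (4, 4), (2, 4)]
```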

We have some important initializations to take care of:

In [3]:
# specify our batch size, number of epochs and learning rate
batch_size = 64
epochs = 10
LR = 1e-2

# determine the device we will be using for training (CPU or GPU)
device = "cuda" if torch.cuda.is_available() else "cpu"
print("[INFO] training using {}...".format(device))

[INFO] training using cpu...


Next, we need an example dataset to train our neural network on. Let's use scikit-learn's make_blobs function to create a synthetic dataset for us:

In [4]:
# generate a 3-class classification problem with 1000 data points
# where each data point is a 4D feature vector
print("[INFO] preparing data...")
(X, y) = make_blobs(n_samples=1000, n_features=4, centers=3, cluster_std=2.5, random_state=69)

# create training and testing splits, and convert them to PyTorch tensors
(trainX, testX, trainY, testY) = train_test_split(X, y, test_size=0.15, random_state=60)
trainX = torch.from_numpy(trainX).float()
testX = torch.from_numpy(testX).float()
trainY = torch.from_numpy(trainY).float()
testY = torch.from_numpy(testY).float()

[INFO] preparing data...


Once our data is generated, we apply the train_test_split function to split it, keeping 85% for training and 15% for evaluation.

From there, the training and testing data are converted from NumPy arrays to PyTorch tensors and then cast to the floating point data type.
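
The `.float()` call matters because NumPy produces 64-bit floats by default, while PyTorch layers use 32-bit floats by default. The following is a minimal sketch with made-up arrays; `features` and `labels` are illustrative names, not part of the script.

```python
import numpy as np
import torch

# hypothetical stand-in data for illustration
features = np.random.rand(5, 4)                      # NumPy float64 by default
labels = np.array([0, 1, 2, 1, 0], dtype=np.int64)   # integer class labels

raw = torch.from_numpy(features)
print(raw.dtype)         # torch.float64

converted = raw.float()
print(converted.dtype)   # torch.float32

# labels stored as floats have to be cast back with .long() before they
# reach nn.CrossEntropyLoss, which expects integer class indices
label_tensor = torch.from_numpy(labels).float()
print(label_tensor.long().dtype)   # torch.int64
```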

Let's now instantiate our PyTorch neural network architecture:

In [5]:
# initialize our model and display its architecture
mlp = mlp.get_training_model().to(device)
print(mlp)

# initialize optimizer and loss function
optimizer = SGD(mlp.parameters(), lr=LR)
lossFunction = nn.CrossEntropyLoss()

Sequential(
  (hidden_layer_1): Linear(in_features=4, out_features=8, bias=True)
  (activation_1): ReLU()
  (output_layer): Linear(in_features=8, out_features=3, bias=True)
)
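
The `mlp` module itself is not shown in this section. As a rough sketch, a function that produces the printed architecture could look like the following. The parameter names `inFeatures`, `hiddenDim`, and `nbClasses` are assumptions; only the layer names and sizes come from the output above.

```python
from collections import OrderedDict
import torch.nn as nn

def get_training_model(inFeatures=4, hiddenDim=8, nbClasses=3):
    # assumed reconstruction: a Sequential whose layer names and sizes
    # match the printed architecture
    return nn.Sequential(OrderedDict([
        ("hidden_layer_1", nn.Linear(inFeatures, hiddenDim)),
        ("activation_1", nn.ReLU()),
        ("output_layer", nn.Linear(hiddenDim, nbClasses)),
    ]))

model = get_training_model()
print(model)
```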


## Training loop

In [6]:
# create a template to summarize current training progress
trainTemplate = "Epoch: {} , Train loss: {:.3f} , Train accuracy: {:.3f}"

# loop through the epochs
for epoch in range(0, epochs):
    # initialize tracker variables and set our model to trainable
    print("[INFO] Epoch: {}...".format(epoch+1))
    trainLoss = 0
    trainAcc = 0
    samples = 0
    mlp.train()
    
    # loop over the training data in batches
    for (batchX, batchY) in next_batch(trainX, trainY, batch_size):
        # send data to the current device, run it through our model
        # and calculate loss
        (batchX, batchY) = (batchX.to(device), batchY.to(device))
        predictions = mlp(batchX)
        loss = lossFunction(predictions, batchY.long())
        
        # zero the gradients accumulated from the previous steps,
        # perform backpropagation and update model parameters
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        
        # update training loss, accuracy and the number of samples visited
        trainLoss += loss.item() * batchY.size(0)
        trainAcc += (predictions.max(1)[1] == batchY).sum().item()
        samples += batchY.size(0)
        
    # Display model progress after the current training epoch
    trainTemplate = "Epoch: {} , Train loss: {:.3f} , Train accuracy: {:.3f}"
    print(trainTemplate.format(epoch + 1, (trainLoss / samples),
                              (trainAcc/samples)))
    
    ###
    # Initialize tracker variables for testing, then set our model to evaluation mode
    testLoss = 0
    testAcc = 0
    samples = 0
    mlp.eval()

    # Initialize a no-gradient context
    with torch.no_grad():
        # Loop over the test data in batches
        for (batchX, batchY) in next_batch(testX, testY, batch_size):
            # Send the data to the current device
            (batchX, batchY) = (batchX.to(device), batchY.to(device))
        
            # Run data through our model and calculate loss
            predictions = mlp(batchX)
            loss = lossFunction(predictions, batchY.long())
        
            # Update test loss, accuracy and the number of samples visited
            testLoss += loss.item() * batchY.size(0)
            testAcc += (predictions.max(1)[1] == batchY).sum().item()
            samples += batchY.size(0)
    
        # Display model progress on the test set for the current epoch
        testTemplate = "epoch: {} test loss: {:.3f} test accuracy: {:.3f}"
        print(testTemplate.format(epoch + 1, (testLoss / samples),
            (testAcc / samples)))
        print("")

[INFO] Epoch: 1...
Epoch: 1 , Train loss: 0.623 , Train accuracy: 0.799
epoch: 1 test loss: 0.436 test accuracy: 0.953

[INFO] Epoch: 2...
Epoch: 2 , Train loss: 0.334 , Train accuracy: 0.973
epoch: 2 test loss: 0.276 test accuracy: 0.960

[INFO] Epoch: 3...
Epoch: 3 , Train loss: 0.220 , Train accuracy: 0.982
epoch: 3 test loss: 0.213 test accuracy: 0.960

[INFO] Epoch: 4...
Epoch: 4 , Train loss: 0.165 , Train accuracy: 0.986
epoch: 4 test loss: 0.181 test accuracy: 0.960

[INFO] Epoch: 5...
Epoch: 5 , Train loss: 0.134 , Train accuracy: 0.987
epoch: 5 test loss: 0.161 test accuracy: 0.960

[INFO] Epoch: 6...
Epoch: 6 , Train loss: 0.113 , Train accuracy: 0.988
epoch: 6 test loss: 0.149 test accuracy: 0.960

[INFO] Epoch: 7...
Epoch: 7 , Train loss: 0.099 , Train accuracy: 0.988
epoch: 7 test loss: 0.140 test accuracy: 0.960

[INFO] Epoch: 8...
Epoch: 8 , Train loss: 0.088 , Train accuracy: 0.988
epoch: 8 test loss: 0.134 test accuracy: 0.960

[INFO] Epoch: 9...
Epoch: 9 , Train loss

We then loop over our number of desired training epochs. Immediately inside this <b>for</b> loop we:
    
    1) Show the epoch number, which is useful for debugging purposes.
    2) Initialize our training loss and accuracy
    3) Initialize the total number of data points used inside the current iteration of the training loop
    4) Put the PyTorch model in training mode
    
Next comes an inner <b>for</b> loop over each of the batches in the training set. Nearly every training procedure you write using PyTorch will consist of an outer loop (over the number of epochs) and an inner loop (over the data batches).
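
The outer/inner loop structure described above can be reduced to a short skeleton. This sketch uses random data and a single linear layer as a stand-in model; both are assumptions for illustration, not the MLP trained in this script.

```python
import torch
import torch.nn as nn
from torch.optim import SGD

torch.manual_seed(0)

# stand-in data and model (hypothetical, for illustration only)
X = torch.randn(20, 4)
y = torch.randint(0, 3, (20,))
model = nn.Linear(4, 3)
optimizer = SGD(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

steps = 0
for epoch in range(2):                   # outer loop: epochs
    model.train()
    for i in range(0, X.shape[0], 8):    # inner loop: batches of up to 8 samples
        batchX, batchY = X[i:i + 8], y[i:i + 8]
        loss = loss_fn(model(batchX), batchY)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        steps += 1

print(steps)  # 2 epochs x 3 batches = 6 parameter updates
```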

Within the inner loop (batch loop), we proceed to:
    
    1) Move the batchX and batchY data to our CPU or GPU.
    2) Pass the batchX data through the neural network and make predictions on it.
    3) Use our loss function to compute our loss by comparing the output <b>predictions</b> to our ground-truth class labels