At this point in the series, we've finished building our model, and technically we could jump right into the training process from here. First, though, let's work to better understand how our network behaves right out of the box.

Forward propagation is the process of transforming an input tensor into an output tensor.

To do this, we pass our sample data to the network's forward() method. This is how the forward() method gets its name: executing forward() is the process of forward propagation.
 
# Predicting with the Network: Forward Pass
 
Before we begin, we are going to turn off PyTorch’s gradient calculation feature. This will stop PyTorch from automatically building a computation graph as our tensor flows through the network.
 
The computation graph keeps track of the network's mapping by tracking each computation that happens. The graph is used during the training process to calculate the derivative (gradient) of the loss function with respect to the network’s weights.

Since we are not training the network yet, we aren’t planning on updating the weights, and so we don’t require gradient calculations. We will turn this back on when training begins.

Turning it off isn’t strictly necessary, but doing so reduces memory consumption because the graph isn't stored in memory.
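As an aside, gradient tracking can also be switched off for a single block of code rather than globally. The sketch below uses PyTorch's torch.no_grad() context manager. The tensor `w` and its values are made up for illustration and are not part of the network in this post.

```python
import torch

# A hypothetical weight tensor that asks autograd to track it.
w = torch.ones(3, requires_grad=True)

# With gradient tracking on, the result records how it was computed.
tracked = (w * 2).sum()
print(tracked.requires_grad)      # True

# Inside no_grad(), no graph is built for the same computation.
with torch.no_grad():
    untracked = (w * 2).sum()
print(untracked.requires_grad)    # False

# Tracking is restored automatically when the block exits.
print(torch.is_grad_enabled())    # True
```

The context manager is handy when only part of a script should skip graph building. The global switch used in this post is simpler when nothing is being trained yet.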

In [9]:
import torch 
import torchvision
import torchvision.transforms as transforms
import torch.nn as nn
import torch.nn.functional as F

In [10]:
train_set = torchvision.datasets.FashionMNIST(
    root='./data'
    ,train=True
    ,download=True
    ,transform=transforms.Compose([
        transforms.ToTensor()
    ])
)
train_loader = torch.utils.data.DataLoader(train_set
    ,batch_size=10
    ,shuffle=True
)

In [11]:
class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5)
        
        self.fc1 = nn.Linear(in_features=12 * 4 * 4, out_features=120)
        self.fc2 = nn.Linear(in_features=120, out_features=60)  # linear, dense, and fully connected all name the same layer type
        self.out = nn.Linear(in_features=60, out_features=10)

    def forward(self, t):
        # (1) input layer
        t = t

        # (2) hidden conv layer
        t = self.conv1(t)
        t = F.relu(t)
        t = F.max_pool2d(t, kernel_size=2, stride=2)

        # (3) hidden conv layer
        t = self.conv2(t)
        t = F.relu(t)
        t = F.max_pool2d(t, kernel_size=2, stride=2)

        # (4) hidden linear layer
        t = t.reshape(-1, 12 * 4 * 4)
        t = self.fc1(t)
        t = F.relu(t)

        # (5) hidden linear layer
        t = self.fc2(t)
        t = F.relu(t)

        # (6) output layer
        t = self.out(t)
        #t = F.softmax(t, dim=1)

        return t
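
The in_features=12 * 4 * 4 value of fc1 follows from how each layer changes the height and width of a 28 x 28 input. Here is a rough sketch of that arithmetic in plain Python, assuming the settings used above (no padding, convolution stride of 1). The helper name out_size is ours, not a PyTorch function.

```python
def out_size(n, kernel_size, stride=1, padding=0):
    """Output length along one spatial axis for a conv or pooling layer."""
    return (n + 2 * padding - kernel_size) // stride + 1

size = 28                                         # input images are 28 x 28
size = out_size(size, kernel_size=5)              # conv1    -> 24
size = out_size(size, kernel_size=2, stride=2)    # max pool -> 12
size = out_size(size, kernel_size=5)              # conv2    -> 8
size = out_size(size, kernel_size=2, stride=2)    # max pool -> 4

channels = 12                                     # out_channels of conv2
flattened = channels * size * size
print(size, flattened)                            # 4 192
```

This is why the reshape in step (4) uses 12 * 4 * 4: each image arrives at fc1 as 12 feature maps of size 4 x 4.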



In [12]:
torch.set_grad_enabled(False)  # turn off gradient tracking so no computation graph is built

<torch.autograd.grad_mode.set_grad_enabled at 0x11ac84588>

In [13]:
network = Network()

In [14]:
# We’ll procure a single sample from our training set, unpack the image and the label,
# and verify the image’s shape.
sample = next(iter(train_set)) 
image, label = sample 
image.shape 

torch.Size([1, 28, 28])

In [15]:
# When we pass a tensor to our network, the network is expecting a batch,
# so even if we want to pass a single image, we still need a batch of size one.
image.unsqueeze(0).shape

torch.Size([1, 1, 28, 28])

In [16]:
pred = network(image.unsqueeze(0))

In [17]:
pred

tensor([[ 0.0089,  0.0141,  0.0555, -0.0137, -0.0053, -0.0550, -0.0654,  0.0326,
         -0.0810, -0.1031]])

In [18]:
pred.shape
# The shape of the prediction tensor is 1 x 10.
# This tells us that the first axis has a length of one and the second axis has a length of ten.
# The interpretation is that we have one image in our batch and ten prediction classes.

torch.Size([1, 10])

In [20]:
# If we wanted these values to be probabilities, we could use the softmax() function from the nn.functional package.
F.softmax(pred, dim=1)

tensor([[0.1029, 0.1035, 0.1078, 0.1006, 0.1015, 0.0966, 0.0956, 0.1054, 0.0941,
         0.0920]])

In [21]:
F.softmax(pred, dim=1).sum()

tensor(1.0000)
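
To see what softmax() is doing, the same calculation can be reproduced by hand. This is a plain-Python sketch using the ten values of pred printed above. It exponentiates each value and divides by the total, which is the standard softmax definition rather than PyTorch's internal implementation.

```python
import math

# The ten output values of `pred`, copied from the output above.
logits = [0.0089, 0.0141, 0.0555, -0.0137, -0.0053,
          -0.0550, -0.0654, 0.0326, -0.0810, -0.1031]

# Softmax: exponentiate each value, then normalize by the sum.
exps = [math.exp(x) for x in logits]
total = sum(exps)
probs = [e / total for e in exps]

print([round(p, 4) for p in probs])
print(sum(probs))                   # 1.0, up to floating point rounding
print(probs.index(max(probs)))      # index of the largest probability
```

Because the raw output values are all close to zero, every probability lands near 1/10.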

In [22]:
pred.argmax(dim=1)
# Using the argmax() function, we can see that the highest value in our prediction tensor
# occurred at the class represented by index 2.

tensor([2])

In [23]:
label
# The label for the first image in our training set is 9.

9

The prediction in this case is incorrect, which is what we expect because the weights in the network were generated randomly.

Most of the probabilities came in close to 10%, and this makes sense because our network is guessing and we have ten prediction classes coming from a balanced dataset.

Another implication of the randomly generated weights is that each time we create a new instance of our network, the weights within the network will be different. This means that the predictions we get will be different if we create different networks. Keep this in mind.

In [24]:
net1 = Network()
net2 = Network()

In [25]:
net1(image.unsqueeze(0))

tensor([[-0.0317,  0.0964,  0.1132, -0.1000, -0.0151,  0.0138, -0.0911,  0.0094,
         -0.0460,  0.0263]])

In [26]:
net2(image.unsqueeze(0))

tensor([[-0.0984, -0.0744,  0.0493,  0.0447, -0.1132, -0.0216, -0.0384,  0.0757,
          0.0098,  0.0091]])
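
If repeatable weights are needed, for example while debugging, one option is to seed PyTorch's random number generator before creating each network. This is a minimal sketch that uses a single nn.Linear layer as a stand-in for the full Network class. The seed value 50 is arbitrary.

```python
import torch
import torch.nn as nn

# Two layers created one after another draw different random weights.
layer_a = nn.Linear(in_features=4, out_features=2)
layer_b = nn.Linear(in_features=4, out_features=2)
print(torch.equal(layer_a.weight, layer_b.weight))   # False

# Re-seeding before each construction reproduces the same weights.
torch.manual_seed(50)
layer_c = nn.Linear(in_features=4, out_features=2)
torch.manual_seed(50)
layer_d = nn.Linear(in_features=4, out_features=2)
print(torch.equal(layer_c.weight, layer_d.weight))   # True
```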

## Passing a Batch of Images to the Network

In [28]:
# The setup is the same as before, up to In [12].
batch = next(iter(train_loader))
images, labels = batch
# This gives us two tensors: a tensor of images and a tensor of corresponding labels.

In [29]:
images.shape

torch.Size([10, 1, 28, 28])

In [30]:
labels.shape
# The labels tensor has a single axis of length ten, which corresponds to the ten images inside our batch.
# There is one label for each image.

torch.Size([10])

In [31]:
preds = network(images)

In [32]:
preds

tensor([[ 0.0056,  0.0123,  0.0581, -0.0122, -0.0029, -0.0602, -0.0621,  0.0408,
         -0.0915, -0.0981],
        [ 0.0055,  0.0134,  0.0584, -0.0113, -0.0033, -0.0588, -0.0619,  0.0407,
         -0.0908, -0.0980],
        [ 0.0088,  0.0125,  0.0551, -0.0113,  0.0007, -0.0508, -0.0608,  0.0436,
         -0.0902, -0.1001],
        [ 0.0083,  0.0116,  0.0576, -0.0136, -0.0010, -0.0535, -0.0602,  0.0427,
         -0.0918, -0.0988],
        [ 0.0107,  0.0133,  0.0575, -0.0104, -0.0073, -0.0568, -0.0639,  0.0350,
         -0.0817, -0.1001],
        [ 0.0046,  0.0108,  0.0598, -0.0149, -0.0028, -0.0587, -0.0610,  0.0426,
         -0.0950, -0.0947],
        [ 0.0047,  0.0109,  0.0536, -0.0131, -0.0002, -0.0547, -0.0633,  0.0380,
         -0.0882, -0.0985],
        [ 0.0055,  0.0099,  0.0548, -0.0093, -0.0024, -0.0576, -0.0602,  0.0420,
         -0.0914, -0.0984],
        [ 0.0116,  0.0098,  0.0576, -0.0115, -0.0025, -0.0541, -0.0636,  0.0335,
         -0.0871, -0.0997],
        [ 0.0071,  

In [33]:
preds.shape
# This reflects the fact that we have ten images, and for each of these ten images we have ten prediction classes.
# The elements of the first dimension are arrays of length ten. Each of these arrays contains
# the ten predictions, one per category, for the corresponding image.

# The elements of the second dimension are numbers. Each number is the value assigned to a specific output class.

torch.Size([10, 10])

In [34]:
preds.argmax(dim=1)

tensor([2, 2, 2, 2, 2, 2, 2, 2, 2, 2])

In [35]:
labels

tensor([6, 2, 1, 1, 9, 2, 3, 2, 9, 9])

In [36]:
# Once we have this tensor of indices of the highest values, we can compare it against the label tensor.
preds.argmax(dim=1).eq(labels)

tensor([False,  True, False, False, False,  True, False,  True, False, False])

In [37]:
# Finally, if we call the sum() function on this result, we reduce the output
# to the number of correct predictions, held in a scalar-valued tensor.
preds.argmax(dim=1).eq(labels).sum()

tensor(3)

We can wrap this last call into a function called get_num_correct() that accepts the predictions and the labels, and uses the item() method to return the number of correct predictions as a plain Python number.

In [38]:
def get_num_correct(preds, labels):
    return preds.argmax(dim=1).eq(labels).sum().item()

In [39]:
get_num_correct(preds, labels)

3
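
For intuition, here is what get_num_correct() computes, written out in plain Python with no tensors. The first row of scores is copied from the preds output above, and the predicted classes and labels are the ones printed in the argmax and labels cells. The helper names argmax and count_correct are ours.

```python
def argmax(row):
    """Index of the largest value in a list of scores."""
    return max(range(len(row)), key=lambda i: row[i])

def count_correct(predicted, labels):
    """Number of positions where the predicted class equals the label."""
    return sum(p == y for p, y in zip(predicted, labels))

# First row of `preds` from the batch output above.
first_row = [0.0056, 0.0123, 0.0581, -0.0122, -0.0029,
             -0.0602, -0.0621, 0.0408, -0.0915, -0.0981]
print(argmax(first_row))                        # 2

# Predicted classes and labels for the batch of ten images.
predicted = [2, 2, 2, 2, 2, 2, 2, 2, 2, 2]
labels = [6, 2, 1, 1, 9, 2, 3, 2, 9, 9]

num_correct = count_correct(predicted, labels)
print(num_correct, num_correct / len(labels))   # 3 0.3
```

Dividing the count by the batch size gives the accuracy for the batch, here 30%, which is roughly what an untrained network that always predicts the same class might score by chance.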