<a href="https://colab.research.google.com/github/Volkner90/School/blob/main/MLIntel.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# A first Neural Network
### *by Rodrigo Camacho*, June, 2022
In this example, we will explore how to implement a simple fully connected, multi-layer neural network using [python](https://www.python.org/) and [pytorch](https://pytorch.org/).

No prior knowledge of python is needed. However, some programming experience in another language is required. We will explain most of the python concepts as we go through the example. A good place to start learning about python is the official [Python tutorial](https://docs.python.org/3/tutorial) or this [W3's python tutorial](https://www.w3schools.com/python/default.asp).

We will use Pytorch, but most of the concepts have very close analogs in other frameworks like TensorFlow. You can learn Pytorch by following their [pytorch tutorials](https://pytorch.org/tutorials/).

**Warning:** *Python is sensitive to indentation... pay attention to indentation when following the examples*

The only prerequisite is that you attended the sessions where we talked about Neural Networks and their training. Here are links to the recorded sessions:

[session 1](https://intel-my.sharepoint.com/:v:/p/jorge_romero_aragon/EYDeQweifodFt2vmn-79JyoBWYMREj8eXX8fkNdHhUJEQQ) - Artificial Neurons

[session 2](https://intel-my.sharepoint.com/:v:/p/rodrigo_camacho/EfmGMbOIM6xOp5bWpzCB6gkBeybNyUHEaK8wFOneRBnynA?e=iNlv1G) - Multi-layer Neural Networks


Also, here is a link to the recorded session where we went through this notebook (although I did some updates) in case you prefer to follow the video instead of reading this text:

[session 3](https://intel-my.sharepoint.com/:v:/p/rodrigo_camacho/EbewGif2B4ZJhvORoefVQ4oBWeGLTTkHdDJykEXHGf2NNg?e=70pWbP) - A first example in pytorch

Finally, here is a link to the [Teams Folder](https://intel.sharepoint.com/:f:/s/ILSSPL/EjNN98Itx5dOrrc-bwca6G4BrM_JjTCa6e5JTtDNCh4MTQ?e=xhNgql) where we keep the slides.

---


# Python Modules
We start by importing the required "modules". A [module](https://docs.python.org/3/tutorial/modules.html) is a kind of collection of python files with some functionality. When you start python, you get only the baseline functionalities in python. If you need more specialized functionality, like neural networks, you need to import a "module" for such tasks.

In [None]:
import torch  # this is the baseline functionality of Pytorch

In python, you can import a full module or a portion of it with the construct:

>**from** *module* **import** *sub_module*

Let us import pytorch's [datasets](https://pytorch.org/vision/stable/datasets.html) module that will let us load images for our example

In [None]:
from torchvision import datasets # a module that lets you load popular datasets 
                                 # for training a neural network

You can also import sections of a module by referring to it as:

>**import** *module.submodule*

However, to use it, you need to refer to it with the full *module.submodule* name, and this can be cumbersome.

Instead, you can rename it in your code using:

>**import** *module.submodule* **as** *nickname*
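
As a small illustration, here are the three import forms side by side. This sketch uses only python's standard library; the module **os.path** and the nickname **osp** are chosen just for this example and are not used elsewhere in this notebook:

```python
# Three ways to reach the same function in the standard library
from os import path      # from module import sub_module
import os.path           # import module.submodule (use the full name)
import os.path as osp    # import module.submodule as nickname

full_name = os.path.join('data', 'mnist')
short_name = osp.join('data', 'mnist')
from_import = path.join('data', 'mnist')

# All three calls run the very same function, so the results match
print(full_name == short_name == from_import)
```

The nickname is only a local name in your code; the module itself is unchanged.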

Let us import the **transforms** submodule that will help us to easily modify the images so we can use them

In [None]:
import torchvision.transforms as transforms  # The data in datasets is not 
                                  # always in the format we need. This module 
                                  # lets you easily modify it

import torchvision.utils as vision_utils  # submodule with useful functions for
                                  # handling images

import torch.optim as optim  # submodule with different optimization (or learning)
                            # algorithms

# Getting data: Datasets
In this example we will try to make a fully-connected, multi-layer neural network to recognize hand-written digits.

Thus, let us load [MNIST](https://en.wikipedia.org/wiki/MNIST_database), a database with 70k images of hand-written digits. This database is split in 60k images for training and 10k images for testing. 

Like other image databases, MNIST is a collection of image files, each labeled with the corresponding digit.

To consume these files, we need to first copy them to our local drive. We could in principle access them online, but that might be too slow. Then, we would need to write python code to load each file and its corresponding label into memory, one at a time.

Fortunately, we already imported Pytorch's **datasets** module so it will do the heavy lifting for us. 
It will let us not only load the MNIST image collection but also the corresponding labels and split them into training and testing datasets. 

We can load the training and testing datasets separately by specifying the option **train=True** for the training portion and **train=False** for the test portion

In [None]:
# Each sample in the dataset is a tuple (image: PIL image of 28x28 pixels, label: int)
train_ds = datasets.MNIST('../data', train=True, download=True)
test_ds = datasets.MNIST('../data', train=False, download=True)

Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to ../data/MNIST/raw/train-images-idx3-ubyte.gz


  0%|          | 0/9912422 [00:00<?, ?it/s]

Extracting ../data/MNIST/raw/train-images-idx3-ubyte.gz to ../data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to ../data/MNIST/raw/train-labels-idx1-ubyte.gz


  0%|          | 0/28881 [00:00<?, ?it/s]

Extracting ../data/MNIST/raw/train-labels-idx1-ubyte.gz to ../data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to ../data/MNIST/raw/t10k-images-idx3-ubyte.gz


  0%|          | 0/1648877 [00:00<?, ?it/s]

Extracting ../data/MNIST/raw/t10k-images-idx3-ubyte.gz to ../data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to ../data/MNIST/raw/t10k-labels-idx1-ubyte.gz


  0%|          | 0/4542 [00:00<?, ?it/s]

Extracting ../data/MNIST/raw/t10k-labels-idx1-ubyte.gz to ../data/MNIST/raw



## Objects and Classes
So now we have training and testing datasets.

Before moving on, we need to know what **train_ds** and **test_ds** actually are.

They are "objects"... *In python, everything is an object*

If you are new to object oriented programming (OOP), just keep in mind that an *object* is a way to encapsulate data (info) and code (functionality). Also, objects are defined according to a kind of blueprint or format called *class*. You can read more about OOP at the [OOP wikipedia entry](https://en.wikipedia.org/wiki/Object-oriented_programming). You can learn more about python's objects and classes in this [objects tutorial](https://www.geeksforgeeks.org/python-classes-and-objects/#:~:text=A%20class%20is%20a%20user,that%20type%20to%20be%20made.)

The one thing to keep in mind about Python Objects is that they have "attributes" (their data) and "functions" (what they can do). You can access both by doing:
>myObject.attribute

>myObject.function()
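
To make this concrete, here is a minimal sketch of a class and an object created from it. The class **Counter** and its names are made up for this illustration only; they are not part of Pytorch or of this notebook:

```python
class Counter:
    """A tiny class: the blueprint for counter objects."""

    def __init__(self, start):
        self.value = start      # attribute: the data held by the object

    def increment(self):        # function: what the object can do
        self.value = self.value + 1
        return self.value


my_counter = Counter(5)         # create an object from the class
print(my_counter.value)         # access an attribute
print(my_counter.increment())   # call a function of the object
```

Note that the attribute is read without parentheses, while the function is called with them.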

For now, let us see what kind of object **train_ds** is. In python, a way to find out the class of any variable is the function
**type()**

Also, the function 
**print(object)**
displays the object on the screen (or a file). 

So let us use both:


In [None]:
print(train_ds)
print(type(train_ds))

Dataset MNIST
    Number of datapoints: 60000
    Root location: ../data
    Split: Train
<class 'torchvision.datasets.mnist.MNIST'>


So the first print statement in the previous cell outputs a description of **train_ds**: we see that it holds 60k data points (images in this case) as expected, stored at the given root location, and that this is the "train" portion of the set. The second print statement shows the class of the object.

## Python Lists and Tuples
Next, what we need to know is that pytorch's **dataset** class behaves much like a class called "list" in python: it supports indexing and the **len()** function, although it is not literally a list.

Two of most fundamental types of objects in python are [list](https://www.w3schools.com/python/python_lists.asp) and [tuple](https://www.w3schools.com/python/python_tuples.asp). So let's spend a few lines describing them. 


A [list](https://www.w3schools.com/python/python_lists.asp) is a collection of python objects of any class. A [tuple](https://www.w3schools.com/python/python_tuples.asp) is a "locked down" list, meaning it can't change: you can't add or remove elements from it.

In both cases, we can retrieve each element in the list by doing: 
>mylist[*index*]

where **index** is an integer that indicates which specific item in the list we want (first one is index=0). This is called *indexing*

To get the number of elements in a list or tuple, use:
> len(myList)
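
Here is a short sketch of these ideas using plain python only (the variable names and values are made up for the illustration):

```python
my_list = [3, 'seven', 2.5]     # a list can mix objects of any class
my_tuple = (3, 'seven', 2.5)    # a tuple holds the same items but is fixed

first_item = my_list[0]         # indexing starts at 0
n_items = len(my_tuple)         # number of elements

my_list.append(10)              # lists can grow

try:
    my_tuple[0] = 99            # tuples cannot be modified
    tuple_changed = True
except TypeError:
    tuple_changed = False

print(first_item, n_items, len(my_list), tuple_changed)
```

Trying to change the tuple raises a TypeError, which is why the last value printed is False.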

So the datasets **train_ds** and **test_ds** are collections (similar to a list) that contain objects related to the images and their labels. Each element in the collection can be accessed by indexing.

Let's see that in action. Let us print the type of the first element in the **train_ds** dataset:

In [None]:
print('The number of images in the set is:')
print(len(train_ds))

print('The type of the first element in train_ds is:')
print(type(train_ds[0]))

The number of images in the set is:
60000
The type of the first element in train_ds is:
<class 'tuple'>


We see from the second output that the first element in **train_ds** is a tuple (a collection).

Since each element in **train_ds** is itself a tuple, it is also a collection and we can access its elements by indexing. 
In python, multi-level indexing is done by piling indexes in "[ ]" as such:
>mylist[index0][index1]...[indexN]

So we can get each element in the tuple of the first element in the dataset by doing:

In [None]:
print('The number of elements in the first tuple of train_ds is:')
print(len(train_ds[0]))

print('\nThe first element in the first tuple of train_ds is of type:')  # "\n" prints a blank line before
print(type(train_ds[0][0])) 

print('\nThe second element in the first tuple of train_ds is of type:') # "\n" prints a blank line before
print(type(train_ds[0][1]))  

The number of elements in the first tuple of train_ds is:
2

The first element in the first tuple of train_ds is of type:
<class 'PIL.Image.Image'>

The second element in the first tuple of train_ds is of type:
<class 'int'>


So, the first element is an image and the second is the integer label, that is, the digit that the image shows.

## Pytorch's Transforms
So far, so good. However, since the MNIST dataset contains PIL images and Pytorch works with objects called "tensors" (more about them later), we need to transform the images into tensors before using them. 
With Pytorch's [transforms](https://pytorch.org/vision/stable/transforms.html) module we can define a transform and then assign it to a dataset: 

In [None]:
transform = transforms.ToTensor()

train_ds.transform = transform
test_ds.transform = transform

print('The new type of the first image in train_ds is')
print(type(train_ds[0][0]))

The new type of the first image in train_ds is
<class 'torch.Tensor'>


# Arrays on steroids: Tensors
So we have now loaded the dataset and transformed the images into "tensors".
Tensors are one of the most important classes in several Machine Learning (ML) frameworks like Pytorch.

In short, "tensors" are multi-dimensional arrays of numbers with additional functionalities that enable efficient computation (especially on GPUs). We don't need to dive deeper for now. You can later take a look at Pytorch's own [Tensors Tutorial](https://pytorch.org/tutorials/beginner/basics/tensorqs_tutorial.html).

For example, a matrix, would be a 2-dimensional tensor because you have row and column dimensions. A 1-dimensional tensor would be an array of only one row (or column).  

An image is typically a 3-dimensional Tensor because you have a Color dimension (typically, three, for RGB) plus Height, and Width as shown in this illustration: 

<img src='https://drive.google.com/uc?export=view&id=1s6Kr6jfzqCaiJB4CoApNEU6FO3XGnXGJ' alt='Image Tensor' width='200' height='200'>

Beware, however, that other frameworks might use a different order for [C,H,W].

So, in general you can have any number of dimensions. As you can imagine, a lot more can be said. However, for now, just keep in mind these three aspects about Tensors:

1. Accessing Tensor elements is done by indexing. Just state the indices separated by commas inside square brackets:
> myTensor[index0,...,indexN]

1. Accessing multiple Tensor elements can be done by passing a range of indices. For example, you can get multiple elements of a 1D Tensor from index "start" up until a last index "end" (but not including it), taking every "step"-th element (the step is optional and defaults to 1):
> myTensor[start:end:step]

1. A useful attribute of Tensors is **shape** that will tell you the size of your tensor in every dimension. 
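
Before looking at a real image, here is a minimal sketch of these three points on a small tensor. The tensor and the variable names are made up for this illustration:

```python
import torch

# A small 2D tensor (a matrix) with 3 rows and 4 columns:
#   [[ 0,  1,  2,  3],
#    [ 4,  5,  6,  7],
#    [ 8,  9, 10, 11]]
my_tensor = torch.arange(12).reshape(3, 4)

one_element = my_tensor[1, 2]       # row 1, column 2
part_of_row = my_tensor[0, 1:3]     # row 0, columns 1 and 2 (3 is excluded)
every_other = my_tensor[2, 0:4:2]   # row 2, columns 0 and 2 (step of 2)

print(my_tensor.shape)
print(one_element.item(), part_of_row.tolist(), every_other.tolist())
```

Here **item()** and **tolist()** simply convert the results into plain python numbers and lists so they print compactly.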

Let's explore an image:

In [None]:
print('The type of the first image in the train dataset is: ')
print(type(train_ds[0][0]))  # first tuple of the dataset, first element of the tuple

print('\nThe shape attribute of the image Tensor of the train dataset is: ')
print(train_ds[0][0].shape)

The type of the first image in the train dataset is: 
<class 'torch.Tensor'>

The shape attribute of the image Tensor of the train dataset is: 
torch.Size([1, 28, 28])


So, you see that the image has three dimensions. The first is the color channel. In this case it is only of size 1 because these are monochromatic images. The next two are height (H) and width (W), both with size 28.

Let's assign a name to the first image and explore its contents:

In [None]:
image0 = train_ds[0][0]  # first tuple in the dataset, first element in the tuple
print(type(image0))
print(image0.shape)

print('The first row of the image is:')
print(image0[0,0,0:28])  # first channel, first row, columns 0 to 27 (because 
                         # the last index is not included in the range construct
                         # begin:end)

print('The first column of the image is:')
print(image0[0,0:28,0])  # first channel, rows 0 to 27, first column


<class 'torch.Tensor'>
torch.Size([1, 28, 28])
The first row of the image is:
tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0.])
The first column of the image is:
tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0.])


# Consuming Data: Pytorch DataLoader and Python Iterables
We know how to access each element in the dataset now. Yet, for training we need to get data in batches, and we often need to shuffle them and use parallel processing. Hence, a more efficient way to access the data during training is to use Pytorch's [DataLoader](https://pytorch.org/tutorials/beginner/basics/data_tutorial.html#preparing-your-data-for-training-with-dataloaders).

A [DataLoader](https://pytorch.org/tutorials/beginner/basics/data_tutorial.html#preparing-your-data-for-training-with-dataloaders) object in Pytorch is an object that abstracts all those functionalities. It is also what is called an [iterable](https://www.w3schools.com/python/python_iterators.asp) in Python.

**Iterables** are objects that can provide an **iterator**, which lets you loop through the iterable's data efficiently.


For now, just keep in mind this about iterators:
1. iterables are objects like lists, tuples and DataLoaders from which we can get an iterator object using
> myIterator = iter(myIterable)
1. iterators are the actual objects that enable looping over the iterable. You can get the next element by:
> next_element = next(myIterator)
1. when you use an iterator, you can't access its elements by indexing, you can only request the next one, and the next one (until data is exhausted)
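
The three points above can be tried out on a plain python list before moving to the DataLoader (the names below are only for this illustration):

```python
my_iterable = ['a', 'b', 'c']      # a list is an iterable
my_iterator = iter(my_iterable)    # ask the iterable for an iterator

first = next(my_iterator)
second = next(my_iterator)
third = next(my_iterator)

# Once the data is exhausted, next() raises StopIteration
try:
    next(my_iterator)
    exhausted = False
except StopIteration:
    exhausted = True

print(first, second, third, exhausted)
```

A python for loop does exactly this behind the scenes: it calls **iter()** once and then **next()** until StopIteration is raised.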

Let us then define a dataloader that will return images and labels in batches of 32 shuffled elements:

In [None]:
import torch.utils as utils  # note that you can import anywhere in your 
                                      # code as long as you do it before you use
                                      # the module

batch_size=32  # number of elements to retrieve at once from the dataset

train_loader = utils.data.DataLoader(train_ds, # dataset to get the data from
                                    batch_size=batch_size,
                                    shuffle=True) 
# note that you can use multi-line statements in python

test_loader = utils.data.DataLoader(test_ds,
                                   batch_size=batch_size,
                                   shuffle=True)


So let's get a batch of images and labels from **train_loader** using its iterator:

In [None]:
print('The number of batches in the loader should be:')
print(60000/32)

print('and it actually is:')
print(len(train_loader))  # the loader class allows using python's len() 

# get one batch of images and labels from the loader
loader_iterator = iter(train_loader)
images, labels = next(loader_iterator) 

print('The shape of the images batch Tensor is')
print(images.shape)

The number of batches in the loader should be:
1875.0
and it actually is:
1875
The shape of the images batch Tensor is
torch.Size([32, 1, 28, 28])


You see that now, we have one additional dimension with size 32. This is the batch dimension.
In Pytorch, it is a convention to always use the first dimension as a batch dimension. This allows us to treat each element of the batch as an independent tensor and to perform the same operation on all of them in parallel. For example, you can multiply every image in the batch by the same number in a single operation.
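
As a rough sketch of what "the same operation on every element of the batch" means, the code below computes the average brightness of each image in a batch in one call and compares it with doing it one image at a time. It uses a random tensor as a stand-in for a real batch, so the numbers themselves are meaningless:

```python
import torch

batch = torch.rand(32, 1, 28, 28)   # stand-in for a batch of 32 MNIST images

# Average brightness per image: reduce over C, H and W, keep the batch dim
mean_per_image = batch.mean(dim=(1, 2, 3))

# The same result computed one image at a time (looping over the batch dim)
mean_one_by_one = torch.stack([image.mean() for image in batch])

print(mean_per_image.shape)
print(torch.allclose(mean_per_image, mean_one_by_one, atol=1e-6))
```

Both approaches give the same 32 numbers; the first one lets Pytorch process the whole batch at once.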

# Pytorch Neural Network Module
We are now ready to work on our neural network.
Pytorch provides a neural networks module called [torch.nn](https://pytorch.org/docs/stable/nn.html#containers) with classes for all layer types, activation functions and loss functions and more. 

For simplicity, we will start with a rudimentary, multi-layer, fully connected network (FCNN) that will only consist of two fully-connected layers. 

## Feeding images to a FCNN - Flattening
However, the input to an FCNN needs to be of a single dimension (apart from the batch of course). That is, if we request the **myTensor.shape** attribute of the input tensor we should see:
> [batch_size, n_inputs]

Where n_inputs is the number of inputs to our FCNN. 

Therefore, we need to change the shape of the input image and collapse the C, H and W dimensions into a single one. We can do this with a Layer called [Flatten](https://pytorch.org/docs/stable/generated/torch.flatten.html).

The figure below shows how a 2D array gets "flattened" by taking each row and piling it in a single-column array:

<img src='https://drive.google.com/uc?export=view&id=1rTFNmQS48am_wpSsnFt9CVMwFSwierVx' alt='Image Tensor' width='200' height='200'>

In [None]:
n_batches = 10
C = 3
H = 28
W = 28
total_pxs = C*H*W
shape = (n_batches, C, H, W)  # shape is a python tuple

rand_image = torch.rand(shape)  # rand returns a tensor of the given shape

print(f'our random image is of shape: {rand_image.shape}')
print(f'the flattened image should be of shape [{n_batches}, {total_pxs}]')

our random image is of shape: torch.Size([10, 3, 28, 28])
the flattened image should be of shape [10, 2352]


In [None]:
a_flatten_layer = torch.nn.Flatten(start_dim=1)
flattened_image = a_flatten_layer(rand_image)
print(f'our flattened image is of shape: {flattened_image.shape}')

our flattened image is of shape: torch.Size([10, 2352])


## Pytorch Linear Layers
In Pytorch, fully-connected layers are called [Linear Layer](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear) because the weighted summation we saw in class is mathematically a matrix product which is itself a linear operation. 

To visualize what is going on, consider the following illustration of a linear layer with N neurons that take M inputs:

<img src='https://drive.google.com/uc?export=view&id=1FHKVJSAVARWTNmSsFsbP9cuRCgykesIi' alt='Image Tensor' width='250' height='250'>

If the weights of the N neurons in the layer are arranged as a matrix with shape [N,M], the inputs as an [M,1] array and the biases as an [N,1] array, then the output can be computed by the linear equation illustrated below (the product is a matrix product):

<img src='https://drive.google.com/uc?export=view&id=1AWUi_d09sh3F4_J9Epr6rUy0G8pGESu0' alt='Image Tensor' width=350>

Hence, for every [Linear Layer](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear), we need to define the number of inputs and the number of neurons. In our case, the number of inputs is the total number of pixels in our images:
> n_inputs = 28 * 28

The number of neurons in this first layer is something we can play around with, so let us leave it variable and give it a name... say N. 

For the second and last layer, the number of inputs is defined by the number of neurons of the previous layer. The number of neurons in this last layer must be equal to the number of classes. That is, 10 in this example.

In [None]:
N = 15  # number of neurons in the layer. Also, the number of outputs of the layer
a_linear_layer = torch.nn.Linear(total_pxs,N)
output = a_linear_layer(flattened_image)

print('The shape of the output tensor is:')
print(output.shape)

print('The shape of the weights tensor is [N, total_pxs]:')
print(a_linear_layer.weight.shape)

The shape of the output tensor is:
torch.Size([10, 15])
The shape of the weights tensor is [N, total_pxs]:
torch.Size([15, 2352])
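
The matrix form described above can be checked numerically. The sketch below uses small, made-up sizes (M=6 inputs, N=4 neurons) and compares the output of a Linear layer with the weighted sum plus bias computed by hand. Since the inputs come in a batch with one sample per row, the product is written with the transposed weight matrix:

```python
import torch

M = 6   # number of inputs
N = 4   # number of neurons (outputs)
layer = torch.nn.Linear(M, N)

x = torch.rand(3, M)   # a batch of 3 input vectors, one per row

# Pytorch stores the weights with shape [N, M] and the biases with shape [N]
by_layer = layer(x)
by_hand = x @ layer.weight.T + layer.bias   # weighted sum plus bias

print(layer.weight.shape, layer.bias.shape)
print(torch.allclose(by_layer, by_hand, atol=1e-6))
```

The weights start with random values, so the outputs differ on every run, but the two computations always agree.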


## Pytorch Sequential Layer Containers 
The easiest way to put these layers together is using a [Sequential](https://pytorch.org/docs/stable/generated/torch.nn.Sequential.html) container. This object will take an input and pass it through all the layers it contains. This container will be our neural network.

In [None]:
C = 1  # monochromatic images
H = 28  # 28 pixels per side
W = 28  # 28 pixels per side

n_inputs = C*H*W  # total number of inputs is the total number of pixels 
N1 = 800  # number of neurons in layer 1, the input layer
n_outputs = 10  # number of neurons at layer 2, the output layer

# let's build our network using a Sequential container
net = torch.nn.Sequential(
            torch.nn.Flatten(start_dim=1),  # start at 2nd dim to preserve the batch dim
            torch.nn.Linear(n_inputs, N1),  # n_inputs connecting to N1 neurons
            torch.nn.Linear(N1, n_outputs), # N1 outputs from last layer are inputs to this layer with n_outputs neurons
        )
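
A quick way to check that the layers fit together is to pass a random batch through the container and look at the output shape. The sketch below builds a smaller network with the same structure (the name **small_net** and the 16 neurons are chosen only for this illustration) and also counts its weights and biases:

```python
import torch

# A smaller network with the same structure (sizes chosen only for this sketch)
small_net = torch.nn.Sequential(
    torch.nn.Flatten(start_dim=1),
    torch.nn.Linear(1 * 28 * 28, 16),
    torch.nn.Linear(16, 10),
)

fake_batch = torch.rand(32, 1, 28, 28)   # random stand-in for 32 images
fake_output = small_net(fake_batch)

print(fake_output.shape)                 # one row of 10 scores per image

# Total number of trainable values: weights and biases of both layers
n_params = sum(p.numel() for p in small_net.parameters())
print(n_params)
```

The count is 784\*16 + 16 for the first layer plus 16\*10 + 10 for the second one.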

# Training
Now that we have our FCNN, we need to train it. For this, we need to define an error or loss criterion. We will also need to define a training algorithm. As we saw in class, training is really an optimization algorithm that adjusts the FCNN weights to minimize the error or loss. Finally, we will need to define a loop to consume the dataset, compute the loss and its gradient, and call the optimizer to update the weights of our network.



## Error or Loss criteria
All of the frameworks provide a wide variety of loss functions to choose from. You can see a full list of Pytorch Loss functions [here](https://pytorch.org/docs/stable/nn.html#loss-functions). In class we only described the Mean Square Error (MSE) and the Cross Entropy losses. Since this is a classification example, let us use the [Cross Entropy](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html#torch.nn.CrossEntropyLoss) loss:

In [None]:
criterion = torch.nn.CrossEntropyLoss()
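
To get a feeling for what this criterion computes, here is a small sketch with made-up numbers. Pytorch's CrossEntropyLoss takes the raw outputs of the network (one score per class) and the index of the correct class, and applies the softmax internally. The code compares it with the same quantity computed by hand:

```python
import torch

loss_fn = torch.nn.CrossEntropyLoss()

# Raw network outputs (scores) for 2 samples and 3 classes
scores = torch.tensor([[2.0, 0.5, 0.1],
                       [0.2, 0.3, 3.0]])
targets = torch.tensor([0, 2])   # index of the correct class for each sample

loss_value = loss_fn(scores, targets)

# The same quantity by hand: softmax, then the mean of -log(p_correct)
probabilities = torch.softmax(scores, dim=1)
p_correct = probabilities[torch.arange(2), targets]
by_hand = -torch.log(p_correct).mean()

print(loss_value.item(), by_hand.item())
```

This is why our network does not need a softmax layer at its output when it is trained with this loss.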

## Pytorch Stochastic Gradient Descent 
Again, several optimization methods are available in most frameworks. Pytorch encapsulates them in the [optim submodule](https://pytorch.org/docs/stable/optim.html)

Let us use the [Stochastic Gradient Descent](https://pytorch.org/docs/stable/generated/torch.optim.SGD.html) optimizer since this is what we saw in class in some detail. 
All we need is to define what variables need to be optimized and a learning rate. 
In our example, we want to optimize our FCNN weights, which we can get from our net sequential object defined above by calling its function parameters():

In [None]:
optimizer = optim.SGD(net.parameters(), lr=0.001)
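
To see what one update of this optimizer does, consider the toy sketch below. It uses a made-up two-element "weight" vector and a made-up loss (the sum of squares) instead of our network, and checks that plain SGD, without momentum or other options, moves each weight by minus the learning rate times its gradient:

```python
import torch
import torch.optim as optim

w = torch.tensor([1.0, -2.0], requires_grad=True)   # a toy "weight" vector
toy_optimizer = optim.SGD([w], lr=0.1)

loss = (w ** 2).sum()    # toy loss; its gradient is 2 * w
loss.backward()          # fills w.grad with the gradient

w_before = w.detach().clone()
grad = w.grad.clone()
toy_optimizer.step()     # plain SGD: w <- w - lr * grad

print(w.detach())
print(torch.allclose(w.detach(), w_before - 0.1 * grad))
```

In the training loop, the very same step is applied to every weight and bias of the network.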

## The Training Loop
So we are ready to train our network.
We will need to loop over the batches from the train loader to get the outputs, compute the loss and the gradients and then update the weights.
Yet, before going there, let's briefly describe python for loops.


## Python for loops
A common way to do loops in python is [for loops](https://www.w3schools.com/python/python_for_loops.asp).
Since the baseline Python only knows about Lists, Tuples and a few other sequence types, you can imagine that [for loops](https://www.w3schools.com/python/python_for_loops.asp) let you go through the elements of a List or Tuple. 
The basic syntax is:
> for element in myListOrTuple:

where element is an item in the list or tuple myListOrTuple and the ":" marks the start of the indented block of code that we run for each element.

A great thing about Python Loops is that if the element is itself a sequence (another list or tuple), then you can unpack its elements:
> for element1,...,elementN in myListOfLists:

If all you need to do is loop a number of times, you can use the [range(x) construct](https://www.w3schools.com/python/ref_func_range.asp). It produces the sequence of integers from zero up to x-1 (x itself is not included):
> for number in range(x):


Moreover, you can readily get an index for the loop iteration step along with the elements by just using the [enumerate construct](https://www.w3schools.com/python/ref_func_enumerate.asp). But this is better explained in the example so let's go ahead with it.
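
Here is a small sketch of the three loop forms on plain python data, before we use them in the training loop (the list **pairs** and the other names are made up for this illustration):

```python
pairs = [('zero', 0), ('one', 1), ('two', 2)]   # a list of tuples

names = []
for name, value in pairs:        # unpack each tuple into two variables
    names.append(name)

squares = []
for number in range(4):          # 0, 1, 2, 3 (4 itself is not included)
    squares.append(number * number)

indexed = []
for index, pair in enumerate(pairs):   # enumerate adds a running index
    indexed.append((index, pair[0]))

print(names)
print(squares)
print(indexed)
```

Note how the indentation tells python which lines belong to each loop.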



In [None]:
n_epochs = 2  # number of epochs we want to train
for epoch in range(n_epochs):  # range(int_x) yields the integers 0, 1, ..., int_x-1
    for batch_idx, data in enumerate(train_loader):  # consume the dataset, one batch at a time
        # get the inputs; data is a tuple of (inputs, labels)
        inputs, labels = data

        # zero the parameter gradients to get the gradient per batch
        optimizer.zero_grad() 

        # do a forward pass
        outputs = net(inputs)

        # compute the loss
        loss = criterion(outputs, labels)

        # do the backpropagation
        loss.backward()  # backward() is a function (method) of every tensor.
                         # Thus, you can compute the gradient of any scalar tensor
                         # that results from differentiable operations. 

        # Update the parameters with the optimizer
        optimizer.step()

        # print statistics
        if batch_idx % 100 == 0:    # print every 100 mini-batches
            print(f'[{epoch}, {batch_idx:5d}] loss: {loss.item():.3f}')

print('Finished Training')

[0,     0] loss: 2.339
[0,   100] loss: 2.262
[0,   200] loss: 2.177
[0,   300] loss: 2.123
[0,   400] loss: 2.071
[0,   500] loss: 1.971
[0,   600] loss: 1.927
[0,   700] loss: 1.900
[0,   800] loss: 1.880
[0,   900] loss: 1.800
[0,  1000] loss: 1.829
[0,  1100] loss: 1.605
[0,  1200] loss: 1.502
[0,  1300] loss: 1.551
[0,  1400] loss: 1.473
[0,  1500] loss: 1.364
[0,  1600] loss: 1.390
[0,  1700] loss: 1.462
[0,  1800] loss: 1.329
[1,     0] loss: 1.452
[1,   100] loss: 1.295
[1,   200] loss: 1.266
[1,   300] loss: 1.020
[1,   400] loss: 1.239
[1,   500] loss: 1.064
[1,   600] loss: 1.040
[1,   700] loss: 1.214
[1,   800] loss: 0.952
[1,   900] loss: 1.007
[1,  1000] loss: 1.186
[1,  1100] loss: 0.813
[1,  1200] loss: 0.804
[1,  1300] loss: 0.855
[1,  1400] loss: 1.126
[1,  1500] loss: 0.693
[1,  1600] loss: 0.893
[1,  1700] loss: 0.808
[1,  1800] loss: 1.048
Finished Training


# Testing the Performance of our Network
So we have trained our network. But what does the loss actually mean? Is the value we got any good? How often will the network predict the digits correctly?

To know the actual performance of our network, we need to use it with our test set and see how many times we got the digit right. 

That sounds straightforward. But before we do it you need to be aware of the fact that in Pytorch, there is a recording process going on in the background all the time. This is part of the [automatic differentiation](https://pytorch.org/tutorials/beginner/basics/autogradqs_tutorial.html) algorithm that requires a record of the operations. So, whenever you use your dataset, all the operations you do on it will be recorded. This takes resources. If you don't really need the gradients, just turn it off as you can see in the code below
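
As a minimal sketch of the difference, the code below runs the same small layer (made up for this illustration) twice, once normally and once inside **torch.no_grad()**, and checks whether the result is still connected to the record of operations:

```python
import torch

layer = torch.nn.Linear(4, 2)
x = torch.rand(3, 4)

recorded = layer(x)            # operations are recorded for autograd
with torch.no_grad():
    not_recorded = layer(x)    # same numbers, but nothing is recorded

print(recorded.requires_grad, not_recorded.requires_grad)
print(torch.allclose(recorded, not_recorded))
```

The numbers are identical in both cases; only the bookkeeping needed for gradients is skipped.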

In [None]:
n_samples = len(test_ds)
n_batches = len(test_loader)
test_loss = 0
n_hits = 0

with torch.no_grad():  # disable the AutoDiff record. Mind the indentation!
  for inputs,labels in test_loader: # python unpacks each tuple into inputs and labels
    # do a forward pass on this batch to get the outputs
    net_output = net(inputs)  # output is 10 values. The largest indicates the
                              # predicted category
    
    # To compute the accuracy, need to compute how many categories we got right
    # first, get the maximum for each prediction
    predicted_digits = torch.argmax(net_output, dim=1)  # dim=1 to do per image

    # compare the prediction with the labels
    current_hits = predicted_digits == labels  # == gives True/False per image (summed as 1's and 0's)

    # add all the 1's because they mean we got those right 
    current_hits = torch.sum(current_hits)

    # accumulate our hits with those of the other batches
    n_hits = n_hits + current_hits

# normalize the accuracy 
Accuracy = n_hits / n_samples
print(f'Accuracy: {(100*Accuracy):.2f}%')



Accuracy: 83.81%
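
If the accuracy computation above feels too compact, the following toy sketch repeats the same steps on four made-up samples with three classes, small enough to verify by eye:

```python
import torch

# Scores for 4 samples and 3 classes (made-up numbers)
toy_output = torch.tensor([[0.1, 2.0, 0.3],
                           [1.5, 0.2, 0.1],
                           [0.3, 0.4, 0.9],
                           [2.2, 0.1, 0.5]])
toy_labels = torch.tensor([1, 0, 1, 0])

# Index of the largest score in each row: classes 1, 0, 2 and 0
toy_predictions = torch.argmax(toy_output, dim=1)

# Compare with the labels and count the matches (3 of the 4 are correct)
toy_hits = torch.sum(toy_predictions == toy_labels)
toy_accuracy = toy_hits.item() / len(toy_labels)

print(f'Accuracy: {100 * toy_accuracy:.2f}%')
```

Only the third sample is misclassified, so the toy accuracy is 75%.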
