# Deep learning with PyTorch

Import the required libraries

In [1]:
import torch as t
import torchvision.datasets as datasets 
import torchvision.transforms as transforms
import torch.nn as nn
import matplotlib.pyplot as plt

Tensors are data structures similar to arrays and matrices (e.g. NumPy arrays).
PyTorch uses tensors to encode the inputs and outputs of a model, as well as the model’s parameters.
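
To make the comparison with NumPy concrete, here is a minimal sketch that builds a tensor from a NumPy array and converts it back. The variable names (`array`, `tensor`, `round_trip`) are only illustrative and are not used elsewhere in this notebook.

```python
import numpy as np
import torch as t

# A small NumPy array (illustrative values) and a tensor built from it.
array = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], dtype=np.float32)
tensor = t.from_numpy(array)

print(tensor.shape)  # torch.Size([2, 3])
print(tensor.dtype)  # torch.float32

# Converting back gives a NumPy array again.
round_trip = tensor.numpy()
print(type(round_trip).__name__)  # ndarray
```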

Ex 1.1:
Create a tensor using the torch.Tensor class. Your tensor is a 2x3 matrix whose first row contains only 0s and whose second row contains only 1s.

In [2]:
tensor_a = t.Tensor([[0,0,0], [1,1,1]])
tensor_a

tensor([[0., 0., 0.],
        [1., 1., 1.]])

Ex 1.2: Then, create a tensor with random data and the previously defined dimensionality with torch.randn().

In [3]:
tensor_b = t.randn(2, 3)
tensor_b

tensor([[ 0.2475,  1.0046,  0.7398],
        [-0.0148,  0.2599, -1.8791]])

Ex 1.3: Perform a basic operation with PyTorch: concatenate the two tensors along the first dimension (dim=0, the rows) using torch.cat.

In [4]:
tensor_c = t.cat([tensor_a, tensor_b])
tensor_c

tensor([[ 0.0000,  0.0000,  0.0000],
        [ 1.0000,  1.0000,  1.0000],
        [ 0.2475,  1.0046,  0.7398],
        [-0.0148,  0.2599, -1.8791]])
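
As a small illustration of the `dim` argument of torch.cat, the sketch below concatenates two 2x3 tensors along each dimension and prints the resulting shapes. The tensors `a` and `b` are stand-ins, not the `tensor_a` and `tensor_b` defined above.

```python
import torch as t

a = t.zeros(2, 3)
b = t.ones(2, 3)

rows = t.cat([a, b], dim=0)  # stack b below a (the default)
cols = t.cat([a, b], dim=1)  # place b to the right of a

print(rows.shape)  # torch.Size([4, 3])
print(cols.shape)  # torch.Size([2, 6])
```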

Ex. 2: Reshape tensor_c to 1 row and 12 columns.

In [5]:
tensor_r = tensor_c.view(1,12)
tensor_r

tensor([[ 0.0000,  0.0000,  0.0000,  1.0000,  1.0000,  1.0000,  0.2475,  1.0046,
          0.7398, -0.0148,  0.2599, -1.8791]])
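
Two details of view are worth a quick sketch: passing -1 lets PyTorch infer that dimension, and the result shares memory with the original tensor. The tensor `c` below is an illustrative stand-in for tensor_c.

```python
import torch as t

c = t.arange(12.0).view(4, 3)  # 4 rows, 3 columns, values 0..11
flat = c.view(1, -1)           # -1 is inferred as 12
print(flat.shape)              # torch.Size([1, 12])

# A view shares memory with the original tensor,
# so writing through the view changes the original as well.
flat[0, 0] = 100.0
print(c[0, 0].item())          # 100.0
```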

Now we need to load the data to feed our deep learning model. But first, we introduce a transform function that applies a transformation to a given input and outputs a new transformed version of the input. This is very useful for any data preprocessing and/or augmentation we need to perform.

In [6]:
transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,)),])

Ex. 3: Can you describe the transform function above? Which output do you expect?
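
As a hint, the sketch below reproduces by hand the arithmetic that a normalization with mean 0.5 and standard deviation 0.5 applies, namely (x - mean) / std. It assumes pixel values already scaled to [0, 1], which is what ToTensor returns for 8-bit images. The sample values in `pixels` are made up for illustration.

```python
import torch as t

# Illustrative pixel intensities in [0, 1].
pixels = t.tensor([0.0, 0.25, 0.5, 1.0])
mean, std = 0.5, 0.5

# Same formula as the normalization step: (x - mean) / std
normalized = (pixels - mean) / std
print(normalized.tolist())  # [-1.0, -0.5, 0.0, 1.0]
```

With these settings the range [0, 1] is mapped to [-1, 1].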

In many cases, we train neural networks on default or well-known datasets like MNIST. MNIST is a dataset of images of handwritten digits that are size-normalized and centered. It has 60,000 training images and 10,000 test images. To load and use the dataset, we can use the syntax below once the torchvision package is installed.
PyTorch has other built-in datasets, which are available in the torchvision.datasets module: https://pytorch.org/vision/0.8/datasets.html

In [7]:
mnist_trainset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
mnist_testset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)

Now, it's time to load the imported dataset via the torch.utils.data.DataLoader class. It represents a Python iterable over a dataset, with support for map-style and iterable-style datasets, customizing data loading order, automatic batching, etc.

Ex. 4: Use the torch.utils.data.DataLoader class to define a train loader and a test loader with a batch size of 10. Use https://pytorch.org/docs/stable/data.html as a guide to the class.

In [8]:
train_loader = t.utils.data.DataLoader(mnist_trainset, batch_size=10, shuffle=True)
test_loader = t.utils.data.DataLoader(mnist_testset, batch_size=10, shuffle=True)
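
To see what a loader yields without downloading anything, here is a minimal sketch on synthetic data. The 25 random "images" and labels are assumptions chosen only to mimic the MNIST shapes.

```python
import torch as t
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for MNIST: 25 single-channel 28x28 images with labels 0..9.
features = t.randn(25, 1, 28, 28)
labels = t.randint(0, 10, (25,))
dataset = TensorDataset(features, labels)

loader = DataLoader(dataset, batch_size=10, shuffle=True)
print(len(loader))  # 3 batches: 10 + 10 + 5 samples

x, y = next(iter(loader))
print(x.shape)  # torch.Size([10, 1, 28, 28])
print(y.shape)  # torch.Size([10])

batch_sizes = [len(targets) for _, targets in loader]
print(batch_sizes)  # [10, 10, 5]
```

The last batch is smaller because 25 is not a multiple of 10.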

##### Let's design our neural network.
The main ingredients are the input size, the hidden layer size, the output size, and the activation function. To build the network, we will subclass nn.Module. We will start by designing a simple feedforward network that has 3 layers: an input layer, a hidden layer, and an output layer.

self.linear1 is the input layer. It takes 28*28 = 784 input features, one per pixel of each image, and its output size should be 100.

Ex. 5: Create self.linear2, the hidden layer. It takes the output of the previous layer as its input and has an output size of 50.
Finally, we define self.final, the output layer. It takes the output of the previous layer as its input and has an output size of 10, since the dataset has 10 classes (the digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9).

Lastly, we introduce the ReLU activation function, which returns its input if it is positive and zero otherwise.
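
A quick sketch of what ReLU does to a few hand-picked values (the tensor `values` is illustrative only):

```python
import torch as t
import torch.nn as nn

relu = nn.ReLU()
values = t.tensor([-2.0, -0.5, 0.0, 1.5])

# Negative entries become zero; positive entries pass through unchanged.
activated = relu(values)
print(activated.tolist())  # [0.0, 0.0, 0.0, 1.5]
```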

In [9]:
class Net(nn.Module):
    def __init__(self):
        super(Net,self).__init__()
        self.linear1 = nn.Linear(28*28, 100) 
        self.linear2 = nn.Linear(100, 50) 
        self.final = nn.Linear(50, 10)
        self.relu = nn.ReLU()
    # the forward function feeds the data through our network.
    def forward(self, img):
        x = img.view(-1, 28*28)  # reshape the images for the model.
        x = self.relu(self.linear1(x))
        x = self.relu(self.linear2(x))
        x = self.final(x)
        return x
net = Net()
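
Before training, it can help to check the output shape and the number of parameters. The sketch below uses an nn.Sequential stand-in with the same layer sizes as Net so that it runs on its own; `model` and `dummy_batch` are hypothetical names, and the random batch only mimics the shape of MNIST images.

```python
import torch as t
import torch.nn as nn

# Stand-in with the same layer sizes as Net: 784 -> 100 -> 50 -> 10.
model = nn.Sequential(
    nn.Flatten(),             # (batch, 1, 28, 28) -> (batch, 784)
    nn.Linear(28 * 28, 100),
    nn.ReLU(),
    nn.Linear(100, 50),
    nn.ReLU(),
    nn.Linear(50, 10),
)

dummy_batch = t.randn(10, 1, 28, 28)  # random data, MNIST-like shape
logits = model(dummy_batch)
print(logits.shape)  # torch.Size([10, 10]): one score per class for each image

# Weights plus biases: (784*100 + 100) + (100*50 + 50) + (50*10 + 10)
n_params = sum(p.numel() for p in model.parameters())
print(n_params)  # 84060
```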

We define the loss function as the cross-entropy, which measures the difference between two probability distributions: here, the class distribution predicted by the model and the true labels of our dataset.

We also define our optimizer, which is Adam.

Lastly, we train for 10 epochs. An epoch is one full pass through the training set.
Too few epochs can lead to poor learning (underfitting), while too many epochs can lead to overfitting.

In [10]:
cross_el = nn.CrossEntropyLoss()
optimizer = t.optim.Adam(net.parameters(), lr=0.001)  # learning rate of 1e-3
epoch = 10
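
To show what nn.CrossEntropyLoss computes, the sketch below compares it with a manual calculation: take the log-softmax of the raw network outputs (logits), pick the log-probability of the correct class, and average the negated values. The logits and targets are invented for illustration.

```python
import torch as t
import torch.nn as nn
import torch.nn.functional as F

# Two samples, three classes (made-up logits and targets).
logits = t.tensor([[2.0, 0.5, -1.0],
                   [0.1, 0.2, 3.0]])
targets = t.tensor([0, 2])

loss_builtin = nn.CrossEntropyLoss()(logits, targets)

# Manual version: negative log-probability of the true class, averaged over the batch.
log_probs = F.log_softmax(logits, dim=1)
loss_manual = -log_probs[t.arange(2), targets].mean()

print(t.allclose(loss_builtin, loss_manual))  # True
```

Note that the loss expects raw logits, which is why the forward function of Net does not apply a softmax to its final layer.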

We now iterate over the range of epochs and run the training loop.
x and y are the batch of features and targets, respectively.

Ex. 6: Given the loss function defined above, calculate the loss value:

In [11]:

for epoch_idx in range(epoch):  # separate loop variable so that `epoch` is not overwritten
    running_loss = 0.0
    for x, y in train_loader:
        optimizer.zero_grad()  # reset the gradients left over from the previous step
        output = net(x.view(-1, 28*28))  # flatten each image to a vector of 784 values
        loss = cross_el(output, y)  # cross-entropy between the outputs and the targets
        loss.backward()  # backpropagate the loss to get the gradients of the parameters
        optimizer.step()  # update the weights using the gradients

        running_loss += loss.item()
    # report the mean loss per batch for this epoch
    print(f"Training loss: {running_loss/len(train_loader)}")
print('Finished training')


Training loss: 0.3304098732235143
Training loss: 0.17216977784124415
Training loss: 0.13776018655820857
Training loss: 0.11705696993091443
Training loss: 0.10521760855721368
Training loss: 0.09694274298407898
Training loss: 0.08953916756451978
Training loss: 0.08370368471924086
Training loss: 0.07856094846446787
Training loss: 0.07353259251663243
Finished training
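
The same four calls (zero_grad, forward pass and loss, backward, step) can be tried on a toy problem that trains in a fraction of a second. This is only a sketch under stated assumptions: the synthetic data, the single linear layer `model`, and the learning rate of 0.05 are made up for illustration and are not the MNIST setup above.

```python
import torch as t
import torch.nn as nn

t.manual_seed(0)

# Toy data: 64 samples with 4 features; the label depends on the sign of the first feature.
x = t.randn(64, 4)
y = (x[:, 0] > 0).long()

model = nn.Linear(4, 2)
loss_fn = nn.CrossEntropyLoss()
optimizer = t.optim.Adam(model.parameters(), lr=0.05)

losses = []
for step in range(50):
    optimizer.zero_grad()        # clear old gradients
    loss = loss_fn(model(x), y)  # forward pass and loss
    loss.backward()              # compute gradients
    optimizer.step()             # update parameters
    losses.append(loss.item())

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```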


Inside t.no_grad(), PyTorch does not track gradients, so we can run the evaluation on our test set without that overhead.
We iterate over the test set and measure the accuracy of the model on unseen data.

In [12]:
with t.no_grad():
    correct = 0
    total = 0
    for data in test_loader:
        x, y = data
        output = net(x.view(-1, 784))
        for idx, i in enumerate(output):
            if t.argmax(i) == y[idx]:
                correct +=1
            total +=1
print(f'accuracy: {round(correct/total, 3)}')

accuracy: 0.964
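
The per-sample loop above can also be written in vectorized form by taking the argmax over the class dimension for the whole batch at once. The sketch below shows the idea on made-up logits and labels; it is an alternative formulation, not the code that produced the accuracy reported above.

```python
import torch as t

# Made-up outputs for 4 samples and 3 classes, with their true labels.
logits = t.tensor([[0.1, 2.0, -1.0],
                   [1.5, 0.2, 0.3],
                   [0.0, 0.1, 0.9],
                   [2.0, 1.0, 0.0]])
labels = t.tensor([1, 0, 1, 0])

predictions = logits.argmax(dim=1)              # predicted class per sample
correct = (predictions == labels).sum().item()  # number of matches
accuracy = correct / labels.size(0)

print(predictions.tolist())  # [1, 0, 2, 0]
print(accuracy)              # 0.75
```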


Now it's time for some visual result interpretation. Is the digit shown in the picture the same as the one predicted by the model?

In [None]:
plt.imshow(x[6].view(28, 28))
plt.show()
print(t.argmax(net(x[6].view(-1, 784))[0]))

Optional exercises:
    
Ex. 7 - Tune the learning rate, change the optimizer, and change the sizes of the linear layers (Net). Do you find differences in how the model learns and in the accuracy achieved? Can you discuss and explain the results you observe?

Ex. 8 - Are you already a master of deep neural networks? Can you replace our model with a 2D convolutional neural network? It is not necessary to re-design and re-train the model, but can you give some insight into how you would proceed?
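
As one possible starting point for Ex. 8, here is a minimal sketch of a small convolutional network for 28x28 single-channel images. The class name `ConvNet`, the channel counts (8 and 16), and the kernel sizes are assumptions chosen for illustration; the network is neither trained nor tuned.

```python
import torch as t
import torch.nn as nn

class ConvNet(nn.Module):  # hypothetical name, not part of the notebook above
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1),   # 1x28x28 -> 8x28x28
            nn.ReLU(),
            nn.MaxPool2d(2),                             # -> 8x14x14
            nn.Conv2d(8, 16, kernel_size=3, padding=1),  # -> 16x14x14
            nn.ReLU(),
            nn.MaxPool2d(2),                             # -> 16x7x7
        )
        self.classifier = nn.Linear(16 * 7 * 7, 10)      # 10 digit classes

    def forward(self, img):
        x = self.features(img)      # keep the 2D image structure for the convolutions
        x = x.flatten(start_dim=1)  # (batch, 16, 7, 7) -> (batch, 784)
        return self.classifier(x)

conv_net = ConvNet()
out = conv_net(t.randn(10, 1, 28, 28))  # random batch with MNIST-like shape
print(out.shape)  # torch.Size([10, 10])
```

Unlike the feedforward network, a model like this expects image-shaped input, so the batch would be passed as it comes from the loader rather than flattened with x.view(-1, 28*28).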