In [0]:
# Importing NumPy and PyTorch related packages
import numpy as np
import torch
import torchvision.datasets
import torch.nn as nn
import torch.optim as optim
import matplotlib.pyplot as plt

# PyTorch Basics

Before we go over the basics of creating a neural network, we need to get to know the tool we are going to use to describe it. We are using the PyTorch package to create our NN in Python. What is PyTorch, you may ask? It’s a Python-based scientific computing package targeted at two audiences:

*   A replacement for NumPy that can use the power of GPUs
*   A deep learning research platform that provides maximum flexibility and speed

Since most of a neural network's computation consists of large matrix multiplications (in both the forward and backward passes), we can parallelize it heavily, and thus speed things up significantly, by using the power of the GPU. PyTorch makes this very simple! We don't need to write CUDA (a parallel computing platform) kernels in C/C++ ourselves; we can just write simple commands in Python and get roughly the same speed-up benefits. Before we get started, let's make sure that you have enabled Colab's GPU setting. **It is simple to switch the default hardware (CPU to GPU or vice versa): go to Edit > Notebook settings or Runtime > Change runtime type and select GPU as the hardware accelerator.**


In [0]:
# Now let's define our GPU device (we use this throughout the code):

device = torch.device('cuda:0')
print(device)

cuda:0
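
If the runtime has no GPU (for example, because the hardware accelerator is still set to CPU), moving a tensor or model to `cuda:0` raises an error. A minimal sketch of a fallback, assuming you are happy to run on the CPU in that case:

```python
import torch

# Use the first GPU when CUDA is available; otherwise fall back to the CPU.
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
print(device)
```

With this definition the rest of the notebook runs unchanged on either kind of hardware, only more slowly on the CPU.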


Tensor is the data type that we work with in PyTorch. Tensors are similar to NumPy’s ndarrays, and you can create a randomly initialized one as follows:

In [0]:
x = torch.rand(5, 3)
print(x)
print(x.shape)

tensor([[0.0785, 0.6236, 0.6687],
        [0.9932, 0.5677, 0.1199],
        [0.5778, 0.9978, 0.1031],
        [0.0055, 0.3063, 0.5250],
        [0.4225, 0.0433, 0.6711]])
torch.Size([5, 3])


The tensor that you just created lives on the CPU. To operate on it using the GPU, you first have to move it to GPU memory with the following command:

In [0]:
x = x.to(device)
print(x)

tensor([[0.0785, 0.6236, 0.6687],
        [0.9932, 0.5677, 0.1199],
        [0.5778, 0.9978, 0.1031],
        [0.0055, 0.3063, 0.5250],
        [0.4225, 0.0433, 0.6711]], device='cuda:0')
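
Once tensors live on the same device, operations on them run on that device and the result stays there. The sketch below multiplies two small matrices; the shapes are arbitrary, and the CPU fallback is an assumption added so the cell also runs without a GPU:

```python
import torch

# Fall back to the CPU so the sketch also runs without a GPU (an assumption).
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

a = torch.rand(5, 3).to(device)   # 5 x 3 matrix
b = torch.rand(3, 4).to(device)   # 3 x 4 matrix

c = a @ b                         # matrix product, computed on `device`
print(c.shape)                    # torch.Size([5, 4])
print(c.device)
```

Mixing a CPU tensor with a GPU tensor in one operation raises an error, so both operands are moved to `device` first.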


We can convert a Torch Tensor to a NumPy array and vice versa easily.

In [0]:
x = torch.rand(5, 3) # Tensor on the CPU
x_numpy = x.numpy()
print('Numpy array: ', x_numpy)
x_tensor= torch.from_numpy(x_numpy)
print('Torch Tensor: ', x_tensor)

Numpy array:  [[0.45583892 0.42274487 0.2455194 ]
 [0.9713313  0.08420366 0.35444242]
 [0.40629393 0.70475537 0.03286195]
 [0.9826768  0.5062513  0.05177331]
 [0.9118254  0.4811377  0.38271707]]
Torch Tensor:  tensor([[0.4558, 0.4227, 0.2455],
        [0.9713, 0.0842, 0.3544],
        [0.4063, 0.7048, 0.0329],
        [0.9827, 0.5063, 0.0518],
        [0.9118, 0.4811, 0.3827]])
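
Two details are worth knowing about this conversion: a CPU tensor and the NumPy array obtained from it share the same memory, and `.numpy()` only works on CPU tensors. A small sketch of both points (the CPU fallback is an assumption so the cell runs without a GPU):

```python
import numpy as np
import torch

x = torch.zeros(2, 3)            # tensor on the CPU
x_numpy = x.numpy()              # shares memory with x
x.add_(1)                        # in-place update of the tensor
print(x_numpy)                   # the NumPy array sees the change too

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
y = torch.rand(2, 3).to(device)
y_numpy = y.cpu().numpy()        # bring the tensor back to the CPU before converting
print(y_numpy.shape)
```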


To learn more about the basics of PyTorch please look at the tutorials in the link below:

https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html

# Create a Neural Net in PyTorch

First things first: to create a network, we need to define a class that inherits from nn.Module and uses nn.Linear to create the linear layers of our NN:

In [1]:
class Network(nn.Module):

    def __init__(self):
        super(Network, self).__init__()

        self.LLayer1 = nn.Linear(24, 12, bias = True) ## Linear layer that goes from 24 features to 12 features
        self.LLayer2 = nn.Linear(12, 5, bias = True) ## Linear layer that goes from 12 features to 5 features
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        ## Forward pass of the NN. The backward pass is taken care of
        ## automatically, so you don't need to worry about that!
        Layer1_out = self.LLayer1(x)
        Layer2_out = self.LLayer2(Layer1_out)
        output = self.sigmoid(Layer2_out)
        
        return output



We just created a neural network structure that looks like the following:




![alt text](https://i.imgur.com/KfjibwT.png)


We initialize our model in PyTorch (with random weights) by creating an instance of the model class:

In [0]:
model = Network() # Initializes the model on the CPU

model = model.to(device) # We can move the model (weights) to the GPU memory by calling this
model

Network(
  (LLayer1): Linear(in_features=24, out_features=12, bias=True)
  (LLayer2): Linear(in_features=12, out_features=5, bias=True)
  (sigmoid): Sigmoid()
)
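
To check that the model works, we can push a dummy batch through it. The sketch below repeats the class definition so that it runs on its own; the batch of 4 random samples is a hypothetical input, and the CPU fallback is an assumption for runtimes without a GPU:

```python
import torch
import torch.nn as nn

class Network(nn.Module):
    # Same architecture as above, repeated so this cell runs on its own.
    def __init__(self):
        super().__init__()
        self.LLayer1 = nn.Linear(24, 12, bias=True)
        self.LLayer2 = nn.Linear(12, 5, bias=True)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        return self.sigmoid(self.LLayer2(self.LLayer1(x)))

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
model = Network().to(device)

# Hypothetical input: a batch of 4 samples with 24 features each.
batch = torch.rand(4, 24).to(device)
output = model(batch)            # calling the model runs forward()
print(output.shape)              # torch.Size([4, 5])

# Learnable parameters: (24*12 + 12) + (12*5 + 5) = 365
n_params = sum(p.numel() for p in model.parameters())
print(n_params)                  # 365
```

Each row of `output` holds five values between 0 and 1, one per output unit, because of the final sigmoid.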

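We also need to define a loss function to use during learning. Let's say we go with cross entropy (note that nn.CrossEntropyLoss applies log-softmax internally, so it expects raw scores rather than sigmoid outputs):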

In [0]:
loss_func = nn.CrossEntropyLoss() 
loss_func = loss_func.to(device)
loss_func

CrossEntropyLoss()
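
As a rough illustration of how the loss function is called, the sketch below evaluates it on made-up data. The scores and targets are hypothetical: `nn.CrossEntropyLoss` takes one row of raw class scores per sample and the integer index of the correct class for each sample.

```python
import torch
import torch.nn as nn

loss_func = nn.CrossEntropyLoss()

# Hypothetical data: raw scores (logits) for 4 samples and 5 classes,
# plus the index of the correct class for each sample.
logits = torch.randn(4, 5)
targets = torch.tensor([0, 3, 1, 4])

loss = loss_func(logits, targets)  # scalar tensor: mean loss over the batch
print(loss.item())
```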

And we need to define an optimizer that updates the model's weights using the gradients that backpropagation computes from the loss. Let's go with the familiar SGD that you have worked with in previous assignments and worksheets:

In [0]:
optimizer = optim.SGD(model.parameters(), lr=0.01)
optimizer

SGD (
Parameter Group 0
    dampening: 0
    lr: 0.01
    momentum: 0
    nesterov: False
    weight_decay: 0
)
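
To see how the model, loss function, and optimizer fit together, here is a minimal sketch of a single update step. It is not the training loop of this worksheet: the stand-in model (one linear layer) and the random batch are assumptions chosen to keep the example short.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Stand-in model (an assumption): one linear layer from 24 features to 5 class scores.
model = nn.Linear(24, 5)
loss_func = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

# Hypothetical batch: 8 samples with 24 features and a class index in [0, 5) each.
inputs = torch.rand(8, 24)
targets = torch.randint(0, 5, (8,))

weights_before = model.weight.detach().clone()

optimizer.zero_grad()                 # clear gradients from any previous step
outputs = model(inputs)               # forward pass
loss = loss_func(outputs, targets)    # compare predictions with targets
loss.backward()                       # backward pass: compute gradients
optimizer.step()                      # SGD update: w <- w - lr * grad

print(loss.item())
```

Repeating these five lines over many batches is what training amounts to; `zero_grad()` is needed because PyTorch accumulates gradients across calls to `backward()`.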

Now we have successfully initialized our NN using PyTorch in Python! In the next worksheet, we will go through an image classification example that shows you how to use all these elements together to make your very own image classifier! Stay tuned! In the meantime, I suggest you go over the slides and the PyTorch documentation (especially the link provided above) to understand the theory behind NNs and to learn how to apply your NN knowledge in building useful applications using PyTorch.