<a href="https://colab.research.google.com/github/jfogarty/machine-learning-intro-workshop/blob/master/misc/pytorch_example.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# PyTorch with GPU in Colab

- From [Getting Started With Pytorch In Google Collab With Free GPU](https://www.marktechpost.com/2019/06/09/getting-started-with-pytorch-in-google-collab-with-free-gpu/) by Niranjan Kumar in [www.marktechpost.com](https://www.marktechpost.com).

Updated by [John Fogarty](https://github.com/jfogarty) for Python 3.6 and [Base2 MLI](https://github.com/base2solutions/mli) and [colab](https://colab.research.google.com) standalone evaluation.

**NOTE** This is currently a **Colab only** notebook. It will need significant changes to work locally.

## Colab has pytorch support built-in for Python 3 kernels.

You don't need to install any extra stuff.  Very nice.

## Pytorch – Tensors

Numpy-based operations are not optimized to use GPUs to accelerate their numerical computations. For modern deep neural networks, GPUs often provide speedups of 50x or greater. So, unfortunately, numpy alone won’t be enough for modern deep learning.

This is where PyTorch introduces the concept of a **Tensor**. A PyTorch Tensor is conceptually identical to an n-dimensional numpy array. Unlike numpy arrays, **PyTorch Tensors can use GPUs to accelerate their numeric computations**.

Let’s see how you can create a Pytorch Tensor. First, we will import the required libraries. Remember that torch, numpy and matplotlib are pre-installed in Colab’s virtual machine.

In [0]:
import torch
import numpy
import matplotlib.pyplot as plt

The default tensor type in PyTorch is a float tensor defined as **torch.FloatTensor**. We can create tensors by using the inbuilt functions present inside the torch package.
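
As a quick check of the default type, the sketch below inspects the `dtype` of a freshly created tensor. It assumes the global default has not been changed (for example with `torch.set_default_dtype`); the variable names `x_default` and `x_int` are only illustrative.

```python
import torch

# Factory functions such as torch.ones produce 32-bit floats by default
# (assumption: the global default dtype has not been changed).
x_default = torch.ones(3, 2)
print(x_default.dtype)    # torch.float32
print(x_default.type())   # torch.FloatTensor

# A different element type can be requested explicitly.
x_int = torch.ones(3, 2, dtype=torch.int64)
print(x_int.dtype)        # torch.int64
```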

In [6]:
## creating a tensor of 3 rows and 2 columns consisting of ones
x = torch.ones(3,2)
print(x)

tensor([[1., 1.],
        [1., 1.],
        [1., 1.]])


In [7]:
## creating a tensor of 3 rows and 2 columns consisting of zeros
x = torch.zeros(3,2)
print(x)

tensor([[0., 0.],
        [0., 0.],
        [0., 0.]])


### Creating tensors by random initialization:

In [9]:
# To increase the reproducibility, we often set the random seed to a specific value first.
torch.manual_seed(2)

x = torch.rand(3, 2) 
print(x)

tensor([[0.6147, 0.3810],
        [0.6371, 0.4745],
        [0.7136, 0.6190]])
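
To see what the seed buys us, here is a minimal sketch that re-seeds the generator and draws again. The names `first` and `second` are illustrative, and the check only covers the CPU generator within a single session.

```python
import torch

# Seeding, drawing, then re-seeding with the same value should replay
# the same sequence of random numbers.
torch.manual_seed(2)
first = torch.rand(3, 2)

torch.manual_seed(2)
second = torch.rand(3, 2)

print(torch.equal(first, second))  # True
```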


In [10]:
#generating tensor randomly from normal distribution
x = torch.randn(3,3)
print(x)

tensor([[-2.1409, -0.5534, -0.5000],
        [-0.0815, -0.1633,  1.5277],
        [-0.4023,  0.0972, -0.5682]])


## Simple Tensor Operations

### Slicing of Tensors
You can slice PyTorch tensors the same way you slice numpy ndarrays.

In [18]:
#create a tensor
x = torch.tensor([[1, 2], 
                 [3, 4], 
                 [5, 6]])
print(x)
print(f"- Every row, only the last column : {x[:, 1]}")
print(f"-       Every column in first row : {x[0, :]}") 

y = x[1, 1] # take the element in the second row and second column and create another tensor
print(y)

tensor([[1, 2],
        [3, 4],
        [5, 6]])
- Every row, only the last column : tensor([2, 4, 6])
-       Every column in first row : tensor([1, 2])
tensor(4)


### Reshape Tensor

Reshape a tensor to a different shape with `view`.

In [23]:
x = torch.tensor([[1, 2],
                 [3, 4], 
                 [5, 6]]) #(3 rows and 2 columns)
print(x)

print("\n- reshaping to 2 rows and 3 columns")
y = x.view(2, 3) #reshaping to 2 rows and 3 columns
y

tensor([[1, 2],
        [3, 4],
        [5, 6]])

- reshaping to 2 rows and 3 columns


tensor([[1, 2, 3],
        [4, 5, 6]])

Use of -1 to reshape tensors.

A -1 indicates that the size of that dimension will be inferred from the other dimensions and the total number of elements.

In the code snippet below, `x.view(6,-1)` results in a tensor of shape 6x1. Because we fixed the number of rows at 6, PyTorch infers the number of columns needed to hold all the values in the tensor: 6 elements / 6 rows = 1 column.

In [25]:
x = torch.tensor([[1, 2], 
                 [3, 4], 
                 [5, 6]]) #(3 rows and 2 columns)
print(x)

print("- y shape will be 6x1")
y = x.view(6,-1)
y

tensor([[1, 2],
        [3, 4],
        [5, 6]])
- y shape will be 6x1


tensor([[1],
        [2],
        [3],
        [4],
        [5],
        [6]])
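
The same inference works for any position of the -1, as long as the remaining sizes divide the element count evenly. The following is a small sketch of that rule; `torch.arange` is used here only as a convenient way to get six elements.

```python
import torch

x = torch.arange(1, 7)        # six elements: 1, 2, 3, 4, 5, 6

# The -1 dimension is computed as (total elements) / (product of the other sizes).
print(x.view(2, -1).shape)    # torch.Size([2, 3])
print(x.view(-1, 2).shape)    # torch.Size([3, 2])
print(x.view(-1).shape)       # torch.Size([6])

# A request that does not divide evenly cannot be satisfied and raises an error.
try:
    x.view(4, -1)
except RuntimeError as err:
    print(f"view(4, -1) failed: {err}")
```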

## Mathematical Operations

In [35]:
#Create two tensors
x = torch.ones([3, 2])
y = torch.ones([3, 2])

#adding two tensors
z = x + y #method 1
z = torch.add(x,y) #method 2
print(f"X\n{x}\n+\nY\n{y}\nis\n{z}")

#subtracting two tensors
z = x - y #method 1

print("\nElement-wise subtraction:")
torch.sub(x,y) #method 2

X
tensor([[1., 1.],
        [1., 1.],
        [1., 1.]])
+
Y
tensor([[1., 1.],
        [1., 1.],
        [1., 1.]])
is
tensor([[2., 2.],
        [2., 2.],
        [2., 2.]])

Element-wise subtraction:


tensor([[0., 0.],
        [0., 0.],
        [0., 0.]])

In [39]:
# Scalar element-wise division
x / 2

tensor([[0.5000, 0.5000],
        [0.5000, 0.5000],
        [0.5000, 0.5000]])

### Inplace Operations

In PyTorch, every tensor operation that works in-place has an **`_` suffix**. For example, **`add`** is the out-of-place version, and **`add_`** is the in-place version.

In [40]:
y.add_(x) #tensor y added with x and result will be stored in y

tensor([[2., 2.],
        [2., 2.],
        [2., 2.]])
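
The difference between the two versions is easiest to see side by side. This is a minimal sketch with fresh tensors; the names `out` and `same` are illustrative.

```python
import torch

x = torch.ones(3, 2)
y = torch.ones(3, 2)

out = y.add(x)       # out-of-place: returns a new tensor, y is unchanged
print(y[0, 0].item(), out[0, 0].item())      # 1.0 2.0

same = y.add_(x)     # in-place: y itself is modified
print(y[0, 0].item())                        # 2.0

# The returned tensor refers to the same storage as y, not a copy.
print(same.data_ptr() == y.data_ptr())       # True
```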

## Pytorch to Numpy Bridge

Converting a **PyTorch tensor** to a **numpy ndarray** is sometimes very useful. By calling `.numpy()` on a tensor, we can easily convert it to an ndarray.

In [43]:
x = torch.linspace(0 , 1, steps = 5) #creating a tensor using linspace
x_np = x.numpy() #convert tensor to numpy
print(type(x), type(x_np)) #check the types 

<class 'torch.Tensor'> <class 'numpy.ndarray'>


To convert a numpy ndarray to a PyTorch tensor, we can use `torch.from_numpy()`.

In [45]:
import numpy as np
a = np.random.randn(5) #generate a random numpy array
a_pt = torch.from_numpy(a) #convert numpy array to a tensor
print(type(a), type(a_pt)) 

<class 'numpy.ndarray'> <class 'torch.Tensor'>


After the conversion, the PyTorch tensor and the numpy ndarray share the same underlying memory (this applies to tensors on the CPU), so changing one will change the other.
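
A short sketch of this shared memory follows. It uses a small array of zeros rather than the random array above so the effect is easy to read, and `a_copy` is an illustrative name showing one way (`.clone()`) to get an independent copy when sharing is not wanted.

```python
import numpy as np
import torch

a = np.zeros(3)
a_pt = torch.from_numpy(a)   # shares memory with a

a[0] = 7.0                   # modify the array ...
print(a_pt)                  # ... and the tensor sees the change

a_pt[1] = 9.0                # modify the tensor ...
print(a)                     # ... and the array sees the change

# .clone() makes an independent copy that later changes do not affect.
a_copy = torch.from_numpy(a).clone()
a[2] = -1.0
print(a_copy[2].item())      # 0.0
```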

## CUDA Support

### IMPORTANT!

You **must** enable GPU support with **Runtime** | **Change Runtime Type** | **GPU** from the Colab menu before this will work.

To check how many CUDA-capable GPUs are connected to the machine, use the code snippet below. If you are running the code in Colab you will get 1, which means the Colab virtual machine is connected to one GPU. The `torch.cuda` package is used to set up and run CUDA operations, and it keeps track of the currently selected GPU.

In [5]:
import torch
n = torch.cuda.device_count()
print(f"The number of CUDA devices available to Torch is {n}.")
if n == 0:
    print("*** ERROR! You need to enable GPU support first using Runtime | Change Runtime Type | GPU")

The number of CUDA devices available to Torch is 1


In [6]:
!nvidia-smi

Fri Aug 16 23:33:19 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.67       Driver Version: 410.79       CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   48C    P8    16W /  70W |     10MiB / 15079MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|  No ru

In [7]:
print(torch.cuda.get_device_name(0))

Tesla T4


The important thing to note is that we can assign a reference to this CUDA-capable GPU to a variable and then use that variable in any PyTorch operation.

All CUDA tensors you allocate will be created on that device. The selected GPU device can be changed with a [torch.cuda.device](https://pytorch.org/docs/stable/cuda.html#torch.cuda.device) context manager.

In [8]:
#Assign cuda GPU located at location '0' to a variable
cuda0 = torch.device('cuda:0')

#Performing the addition on GPU
a = torch.ones(3, 2, device=cuda0) #creating a tensor 'a' on GPU
b = torch.ones(3, 2, device=cuda0) #creating a tensor 'b' on GPU
c = a + b
print(c)


tensor([[2., 2.],
        [2., 2.],
        [2., 2.]], device='cuda:0')
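
If the same code should also run where no GPU is enabled, a common pattern is to pick the device at runtime. The sketch below is one way to do that, not part of the original notebook; the fallback to `"cpu"` and the names `device` and `c_np` are assumptions made for illustration.

```python
import torch

# Fall back to the CPU when no CUDA device is visible.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

a = torch.ones(3, 2, device=device)   # created directly on the chosen device
b = torch.ones(3, 2).to(device)       # created on the CPU, then moved

c = a + b                             # runs on whichever device holds a and b
print(c.device)

# Move the result back to the CPU before converting it to numpy.
c_np = c.cpu().numpy()
print(c_np.sum())                     # 12.0
```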


## Automatic Differentiation

In this section, we discuss an important PyTorch package for automatic differentiation, `autograd`. The `autograd` package gives us the ability to perform automatic differentiation, or automatic gradient computation, for all operations on tensors. It is a define-by-run framework, which means that your back-propagation is defined by how your code is run.

Let’s see how to perform automatic differentiation by using a simple example. First, we create a tensor with the `requires_grad` parameter set to `True` because we want to track all the operations performed on that tensor.

In [10]:
#create a tensor with requires_grad = True
x = torch.ones([3,2], requires_grad = True)
print(x)

tensor([[1., 1.],
        [1., 1.],
        [1., 1.]], requires_grad=True)


Perform a simple tensor addition operation.

In [11]:
y = x + 5 #tensor addition
print(y) #check the result

tensor([[6., 6.],
        [6., 6.],
        [6., 6.]], grad_fn=<AddBackward0>)


Because $y$ was created as the result of an operation on $x$, it has a `grad_fn`.
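
The contrast with $x$ can be checked directly. In this minimal sketch, `is_leaf` is an extra attribute not used elsewhere in this notebook; it reports whether a tensor was created by the user rather than by an operation.

```python
import torch

x = torch.ones(3, 2, requires_grad=True)
y = x + 5

# x was created directly by the user, so it is a leaf with no grad_fn.
print(x.is_leaf, x.grad_fn)               # True None

# y is the result of an operation, so it records the function that made it.
print(y.is_leaf, y.grad_fn is not None)   # False True
print(y.requires_grad)                    # True
```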

Perform more operations on $y$ and create a new tensor $z$.

In [12]:
z = y*y + 1
print(z)

print("- adding all the values in z")
t = torch.sum(z) 
print(t)

tensor([[37., 37.],
        [37., 37.],
        [37., 37.]], grad_fn=<AddBackward0>)
- adding all the values in z
tensor(222., grad_fn=<SumBackward0>)


## Back-Propagation

To perform back-propagation, you can just call `t.backward()`.

In [0]:
t.backward() # perform backpropagation; PyTorch will not print any output.

In [14]:
t

tensor(222., grad_fn=<SumBackward0>)

In [15]:
x

tensor([[1., 1.],
        [1., 1.],
        [1., 1.]], requires_grad=True)

Print gradients: $$\frac{d(t)}{dx}$$

In [16]:
print(x.grad)

tensor([[12., 12.],
        [12., 12.],
        [12., 12.]])


`x.grad` will give you the **partial derivative of t with respect to x** : $\partial t / \partial x$

If you can figure out how we got a tensor with all values equal to 12, then you have understood automatic differentiation. If not, don’t worry, just follow along: when we execute `t.backward()`, we are calculating the partial derivative of $t$ with respect to $x$.

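Remember that $t$ is the sum of the elements of $z$, where $z = y^2 + 1$ and $y = x + 5$. Applying the chain rule to each element gives:

$$
\frac{\partial t}{\partial x} = \frac{\partial z}{\partial y} \cdot \frac{\partial y}{\partial x} = 2y \cdot 1 = 2(x + 5) = 12\ \text{at}\ x = 1\ \text{and}\ y = 6
$$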

The important point to note is that the value of the derivative is calculated at the point where we initialized the tensor $x$.

Since we initialized $x$ at a value equal to one, we get an output tensor with all the values equal to 12.
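
As a sanity check, the gradient can be compared against a hand-derived value. Since $t$ sums $y^2 + 1$ over all elements and $y = x + 5$, the chain rule gives $2(x + 5)$ for each element. The sketch below rebuilds the same computation and compares; the name `expected` is illustrative.

```python
import torch

x = torch.ones(3, 2, requires_grad=True)
y = x + 5
z = y * y + 1
t = z.sum()
t.backward()

# Analytic gradient from the chain rule: dt/dx = 2 * (x + 5).
# detach() keeps this check out of the autograd graph.
expected = 2 * (x.detach() + 5)

print(torch.allclose(x.grad, expected))   # True
print(x.grad[0, 0].item())                # 12.0
```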

## Conclusion

In this post, we briefly looked at PyTorch and Google Colab, and saw how to enable the GPU hardware accelerator in Colab. We then created tensors in PyTorch and performed some basic operations on them, including on a CUDA-capable GPU. Finally, we used a simple example to discuss the PyTorch autograd package, which gives us the ability to perform automatic gradient computation on tensors. If you have any issues or doubts while implementing the above code, feel free to ask them in the comment section below or send me a message on LinkedIn citing this article.

### End of notebook.