# Lecture 8 - PyTorch

A brief introduction to deep learning and the basics of using PyTorch.

## Deep Learning Libraries

There are many deep learning libraries available. The most common ones for Python are

- TensorFlow, Keras
- PyTorch

Working with TensorFlow directly requires going into a lot of detail about the construction of the computation graph, whereas Keras is a higher-level interface to TensorFlow. TensorFlow is very popular in industry and well suited to production code.

PyTorch can be used as a low-level interface, yet it is much more user-friendly than TensorFlow, and it also offers a higher-level interface. PyTorch is more popular in the research community.

## Main features that any deep learning library should provide

No matter what library or language you use, a deep learning library should provide three main features:

1. Use of the GPU to speed up computation
2. Automatic differentiation
3. Library functions for common architectures and optimization algorithms

### PyTorch
We will look at all of the above in PyTorch.
The best way to think about PyTorch is that it is NumPy + GPU + autograd.
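
To make the "autograd" part of that summary concrete, here is a minimal sketch of automatic differentiation on a single scalar. The function `y = x^2 + 2x` and the variable names are chosen only for illustration and do not come from the lecture.

```python
import torch

# A scalar tensor that autograd should track.
x = torch.tensor(3.0, requires_grad=True)

# Build y = x^2 + 2x with ordinary arithmetic, just as you would in NumPy.
y = x**2 + 2 * x

# Backpropagate: this fills x.grad with dy/dx = 2x + 2 evaluated at x = 3.
y.backward()

grad_value = x.grad.item()
print(grad_value)  # 8.0
```

The arithmetic looks like plain NumPy code; the only additions are `requires_grad=True` and the call to `backward()`.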

You can install it with

```
conda install pytorch
```

Alternatively (and recommended), run this notebook in Google Colab, which provides an environment with PyTorch and its dependencies preinstalled, plus a GPU free of charge.

In [1]:
import torch
import numpy as np

The PyTorch equivalent of a NumPy array is called a tensor, which is just a multidimensional array.

In [2]:
torch.tensor([2,3,4,5])

tensor([2, 3, 4, 5])

In [3]:
torch.zeros((5,5))

tensor([[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]])

In [4]:
x = torch.ones((5,5))
x

tensor([[1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.]])

In [5]:
2*x + 5

tensor([[7., 7., 7., 7., 7.],
        [7., 7., 7., 7., 7.],
        [7., 7., 7., 7., 7.],
        [7., 7., 7., 7., 7.],
        [7., 7., 7., 7., 7.]])

In [6]:
torch.randn(5,5)

tensor([[ 1.1433, -1.4909,  0.5293,  1.1733,  0.4054],
        [-1.5029, -1.1425,  0.3311,  0.5070,  0.7782],
        [ 2.4627,  0.4597,  1.1158, -1.4993,  0.2556],
        [ 0.2922,  0.2013, -0.9016, -0.3207, -0.9856],
        [ 0.9131, -0.3142,  0.8002, -0.2564, -0.1577]])

In [7]:
x = torch.rand(25)
x

tensor([0.6106, 0.9569, 0.8199, 0.7865, 0.1891, 0.1220, 0.8010, 0.9313, 0.6128,
        0.6827, 0.3849, 0.4764, 0.0950, 0.9901, 0.5919, 0.6326, 0.9625, 0.7334,
        0.8762, 0.6439, 0.4115, 0.4274, 0.7796, 0.4602, 0.0866])

In [8]:
x = x.reshape(-1,5)
x

tensor([[0.6106, 0.9569, 0.8199, 0.7865, 0.1891],
        [0.1220, 0.8010, 0.9313, 0.6128, 0.6827],
        [0.3849, 0.4764, 0.0950, 0.9901, 0.5919],
        [0.6326, 0.9625, 0.7334, 0.8762, 0.6439],
        [0.4115, 0.4274, 0.7796, 0.4602, 0.0866]])

In [9]:
x.shape

torch.Size([5, 5])

In [10]:
print(torch.arange(10))
print(torch.eye(5))
print(torch.linspace(0,1,11))

tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
tensor([[1., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0.],
        [0., 0., 1., 0., 0.],
        [0., 0., 0., 1., 0.],
        [0., 0., 0., 0., 1.]])
tensor([0.0000, 0.1000, 0.2000, 0.3000, 0.4000, 0.5000, 0.6000, 0.7000, 0.8000,
        0.9000, 1.0000])


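Some functions differ slightly from their NumPy counterparts. For example, `torch.rand` and `torch.ones` accept the shape either as separate arguments or as a tuple, whereas `np.random.rand` takes separate arguments and `np.ones` requires a tuple.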

In [11]:
A = torch.rand(5,5)
# or A = torch.rand((5,5))
x = torch.ones(5,1)
A @ x

tensor([[2.1638],
        [3.7201],
        [2.3053],
        [3.7552],
        [3.5588]])

In [12]:
A = np.random.rand(5,5)
x = np.ones((5,1))
A @ x

array([[2.37844034],
       [2.55915363],
       [1.58531866],
       [3.20170007],
       [2.02110453]])
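
As one more illustration of how the two libraries differ, the sketch below compares shape arguments and reduction keywords side by side. The variable names are illustrative, and the `dim` versus `axis` comparison is an addition that the lecture itself does not cover.

```python
import numpy as np
import torch

# PyTorch factory functions accept the shape as separate ints or as a tuple.
a = torch.ones(5, 1)
b = torch.ones((5, 1))
same_shape = a.shape == b.shape  # both are torch.Size([5, 1])

# NumPy's np.ones needs a tuple, while np.random.rand takes separate ints.
c = np.ones((5, 1))
d = np.random.rand(5, 5)

# Reductions name the dimension `dim` in PyTorch and `axis` in NumPy.
t = torch.arange(6.0).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
row_sums_torch = t.sum(dim=1)
row_sums_numpy = t.numpy().sum(axis=1)

print(row_sums_torch)  # tensor([ 3., 12.])
print(row_sums_numpy)  # [ 3. 12.]
```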

You can convert a (CPU) tensor to a NumPy array that shares its memory with the PyTorch tensor, so a change made through one is visible through the other.

In [13]:
x = torch.ones(5,5)
x

tensor([[1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.]])

In [14]:
xn = x.numpy()
xn

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]], dtype=float32)

In [15]:
xn[4,2] = 10
xn

array([[ 1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.],
       [ 1.,  1., 10.,  1.,  1.]], dtype=float32)

In [16]:
x

tensor([[ 1.,  1.,  1.,  1.,  1.],
        [ 1.,  1.,  1.,  1.,  1.],
        [ 1.,  1.,  1.,  1.,  1.],
        [ 1.,  1.,  1.,  1.,  1.],
        [ 1.,  1., 10.,  1.,  1.]])
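
The memory sharing shown above also works in the other direction. The sketch below uses `torch.from_numpy` for the reverse conversion and `clone()` to get an independent copy when sharing is not wanted. The small 2x2 arrays and the variable names are illustrative only.

```python
import numpy as np
import torch

# torch.from_numpy wraps an existing array and shares its memory.
arr = np.ones((2, 2), dtype=np.float32)
t = torch.from_numpy(arr)
arr[0, 0] = 10                    # the change is visible through t as well
shared_value = t[0, 0].item()

# clone() makes an independent copy before converting to NumPy.
x = torch.ones(2, 2)
xn_copy = x.clone().numpy()
xn_copy[0, 0] = 10                # x is unaffected
original_value = x[0, 0].item()

print(shared_value)    # 10.0
print(original_value)  # 1.0
```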

### Using the GPU

The GPU (Graphics Processing Unit) is a separate processor specialized for the bulk computations required to render high-quality graphics. It consists mainly of a large number of cores that are individually much slower than a CPU core, but because of their sheer number (on the order of thousands) they can churn through highly parallel computations very quickly.

In [17]:
torch.cuda.is_available()

False
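
A result of `False` means PyTorch cannot see a CUDA-capable GPU on this machine. A common pattern, sketched here with illustrative variable names, is to pick the device once and fall back to the CPU so the same code runs in either case.

```python
import torch

# Pick the GPU when one is visible to PyTorch, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create a tensor directly on the chosen device and compute with it there.
x = torch.ones(5, 5, device=device)
y = 2 * x + 5

# A GPU tensor must be moved back to the CPU before converting to NumPy.
y_np = y.cpu().numpy()

print(device.type)
print(y_np[0, 0])  # 7.0
```

Calling `.cpu()` on a tensor that is already on the CPU does nothing, so the last step is safe on both kinds of machine.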