![](https://discuss.pytorch.org/uploads/default/original/2X/3/35226d9fbc661ced1c5d17e374638389178c3176.png)

## References and other resources
- [PyTorch Tutorials](https://pytorch.org/tutorials/)
- [Torchvision](https://pytorch.org/docs/stable/torchvision/index.html)

## Alternatives

- [TensorFlow](https://www.tensorflow.org/)
- [Keras](https://keras.io/)
- [Theano](http://deeplearning.net/software/theano/)
- [Caffe](http://caffe.berkeleyvision.org/)
- [Caffe2](https://caffe2.ai/)
- [MXNet](https://mxnet.apache.org/)
- [many more...](https://www.google.com/search?q=deep+learning+frameworks&oq=deep+learning+frame&aqs=chrome.0.0j69i57j69i61l2j0l2.2284j0j1&sourceid=chrome&ie=UTF-8)

## So why PyTorch?

- Simple Python
- Easy to use + debug
- Supported/developed by Facebook
- Nice and extensible interface (modules, etc.)
- A lot of research code is published as PyTorch projects

____

## Google Colab only!

In [None]:
# execute only if you're using Google Colab
!wget -q https://raw.githubusercontent.com/ahug/amld-pytorch-workshop/master/binder/requirements.txt -O requirements.txt
!pip install -qr requirements.txt

___

In [1]:
import torch

In [2]:
print("PyTorch Version:", torch.__version__)

PyTorch Version: 1.1.0


In [3]:
import numpy as np

PyTorch's tensor API is very similar to NumPy's (if that helps!)

## Tensor Creation 

## First of all, what is a tensor?

A **matrix** is a grid of numbers, say of shape (3 x 5). In simple terms, a **tensor** can be seen as a generalization of a matrix to higher dimensions. It can have an arbitrary shape, e.g. (3 x 6 x 2 x 10).

To start, you can think of tensors as multidimensional arrays.
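
As a small illustration of the shape mentioned above, the following sketch builds a tensor of shape (3 x 6 x 2 x 10) and inspects it. The variable name `T` is only used for this example.

```python
import torch

# A 4-dimensional tensor filled with zeros; the shape matches the example above.
T = torch.zeros(3, 6, 2, 10)

print(T.shape)    # size along each dimension
print(T.dim())    # number of dimensions
print(T.numel())  # total number of elements (3 * 6 * 2 * 10)
```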

In [4]:
X = torch.tensor([1, 2, 3, 4, 5])
X

tensor([1, 2, 3, 4, 5])

In [5]:
X.shape

torch.Size([5])

In [6]:
X = torch.tensor([[1, 2, 3], [4, 5, 6]])
X

tensor([[1, 2, 3],
        [4, 5, 6]])

In [7]:
X.shape

torch.Size([2, 3])

In [8]:
# numpy
np.eye(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [9]:
# torch
torch.eye(3)

tensor([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]])

In [10]:
# numpy
5 * np.eye(3)

array([[5., 0., 0.],
       [0., 5., 0.],
       [0., 0., 5.]])

In [11]:
# torch
5 * torch.eye(3)

tensor([[5., 0., 0.],
        [0., 5., 0.],
        [0., 0., 5.]])

In [12]:
# numpy
np.ones(5)

array([1., 1., 1., 1., 1.])

In [13]:
# torch
torch.ones(5)

tensor([1., 1., 1., 1., 1.])

In [14]:
# numpy
np.zeros(5)

array([0., 0., 0., 0., 0.])

In [15]:
# torch
torch.zeros(5)

tensor([0., 0., 0., 0., 0.])

In [16]:
# numpy
np.empty((3, 5))

array([[ 1.23516411e-322,  0.00000000e+000,  0.00000000e+000,
         0.00000000e+000,  0.00000000e+000],
       [ 0.00000000e+000,  0.00000000e+000,  0.00000000e+000,
         0.00000000e+000,  0.00000000e+000],
       [ 0.00000000e+000,  0.00000000e+000,  0.00000000e+000,
         1.46169306e+185, -0.00000000e+000]])

In [17]:
# torch
torch.empty((3, 5))

tensor([[1.2612e-44, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00],
        [0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00],
        [0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00]])

In [18]:
# numpy
X = np.random.random((5, 3))
X

array([[0.65553685, 0.22910497, 0.60174279],
       [0.25057626, 0.69336628, 0.85799112],
       [0.91885326, 0.49840521, 0.09241094],
       [0.38190843, 0.24701278, 0.95074729],
       [0.44192588, 0.10756499, 0.02348137]])

In [19]:
# torch
Y = torch.rand((5, 3))
Y

tensor([[0.3065, 0.0274, 0.1972],
        [0.0753, 0.8295, 0.5502],
        [0.8574, 0.5886, 0.3671],
        [0.8552, 0.3179, 0.5491],
        [0.0363, 0.5239, 0.3272]])

In [20]:
# numpy
X.shape

(5, 3)

In [21]:
# torch
Y.shape

torch.Size([5, 3])

___

## But wait: Why do we even need tensors if we can do exactly the same with numpy arrays?

`torch.Tensor` objects behave like NumPy arrays under mathematical operations. In addition, a `torch.Tensor` can keep track of gradients (see next notebook) and can live on a GPU.
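
As a brief preview of the gradient tracking (the details belong to the next notebook), here is a minimal sketch that differentiates y = x² at x = 3. The names `x` and `y` are chosen for this example only.

```python
import torch

# requires_grad=True asks PyTorch to record operations on x for differentiation.
x = torch.tensor(3.0, requires_grad=True)
y = x ** 2      # y = x^2
y.backward()    # computes dy/dx and stores it in x.grad

print(x.grad)   # dy/dx = 2 * x, evaluated at x = 3
```

A plain NumPy array has no equivalent of `x.grad`; this bookkeeping is what tensors add on top of array arithmetic.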

____

## Linear Algebra Operations

In [23]:
X = np.random.rand(3, 5)
Y = torch.rand(3, 5)

In [24]:
# numpy (matrix multiplication)
X.T @ X

array([[0.37058679, 0.53266296, 0.21484676, 0.73606487, 0.73262589],
       [0.53266296, 1.5680143 , 0.70455751, 1.42089855, 1.22020871],
       [0.21484676, 0.70455751, 0.31975013, 0.60479045, 0.50709553],
       [0.73606487, 1.42089855, 0.60479045, 1.75932536, 1.54384748],
       [0.73262589, 1.22020871, 0.50709553, 1.54384748, 1.48446686]])

In [25]:
Y.shape

torch.Size([3, 5])

In [26]:
# torch (matrix multiplication)
Y.t() @ Y

tensor([[0.9166, 1.1402, 0.1833, 0.3286, 0.8926],
        [1.1402, 1.4414, 0.2885, 0.4404, 1.1418],
        [0.1833, 0.2885, 0.1957, 0.1505, 0.2623],
        [0.3286, 0.4404, 0.1505, 0.2101, 0.3924],
        [0.8926, 1.1418, 0.2623, 0.3924, 0.9296]])

In [27]:
Y.t().matmul(Y)

tensor([[0.9166, 1.1402, 0.1833, 0.3286, 0.8926],
        [1.1402, 1.4414, 0.2885, 0.4404, 1.1418],
        [0.1833, 0.2885, 0.1957, 0.1505, 0.2623],
        [0.3286, 0.4404, 0.1505, 0.2101, 0.3924],
        [0.8926, 1.1418, 0.2623, 0.3924, 0.9296]])

In [None]:
# CAUTION: Operator '*' does element-wise multiplication, just like in numpy!
# Y.t() * Y  # error, dimensions do not match for element-wise multiplication
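
To make the difference between `*` and `@` concrete, this sketch applies both to two small square matrices, where both operations are defined. The matrices `P` and `Q` are made up for the example.

```python
import torch

P = torch.tensor([[1., 2.], [3., 4.]])
Q = torch.tensor([[0., 1.], [1., 0.]])

elementwise = P * Q   # multiplies entries at the same position
matmul = P @ Q        # rows of P times columns of Q

print(elementwise)
print(matmul)
```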

In [28]:
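# CAUTION: X is 3 x 5, so X.T @ X (5 x 5) has rank at most 3 and is singular;
# the huge entries below are numerical noise, not a meaningful inverse
np.linalg.inv(X.T @ X)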

array([[ 8.70601793e+16,  3.24159628e+16, -4.55472655e+16,
         5.55865775e+15, -5.98340649e+16],
       [ 6.14903406e+15, -2.41244077e+15,  6.01916543e+15,
         3.87141733e+14, -3.51051284e+15],
       [ 6.04927892e+15,  1.14885418e+16, -2.13075263e+16,
         3.96971257e+14, -5.56306717e+15],
       [ 5.52813062e+15,  2.05287803e+15, -2.88141740e+15,
         3.52956236e+14, -3.79850041e+15],
       [-5.58367460e+16, -2.00746893e+16,  2.78065407e+16,
        -3.56425824e+15,  3.82661659e+16]])

In [29]:
torch.inverse(Y.t() @ Y)

tensor([[  8612287.0000,   9114375.0000,   2182488.5000,  15866261.0000,
         -26776936.0000],
        [ 12663859.0000, -25396984.0000,   4793162.5000, -14312655.0000,
          23724132.0000],
        [  2037559.7500,   3740335.0000,    451696.3750,   5290539.0000,
          -8911086.0000],
        [ 19309922.0000, -17207260.0000,   6430168.0000,   -946952.3750,
           1180182.2500],
        [-32549268.0000,  28650974.0000, -10824410.0000,   1252799.7500,
          -1413660.0000]])
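
Before inverting a matrix it can help to check its rank. The sketch below uses a hand-picked 3 x 5 matrix `M` (an assumption for this example, not the random matrices above): the 5 x 5 product `M.t() @ M` is rank-deficient, while the 3 x 3 product `M @ M.t()` can be inverted safely.

```python
import numpy as np
import torch

# Hand-picked 3 x 5 matrix, used only for this illustration.
M = torch.tensor([[1., 2., 0., 1., 0.],
                  [0., 1., 1., 0., 2.],
                  [1., 0., 1., 1., 0.]], dtype=torch.float64)

gram_big = M.t() @ M    # 5 x 5, but its rank cannot exceed 3
gram_small = M @ M.t()  # 3 x 3, full rank for this M

rank_big = np.linalg.matrix_rank(gram_big.numpy())
rank_small = np.linalg.matrix_rank(gram_small.numpy())
print(rank_big, rank_small)

# Inverting the full-rank product is well defined: inverse @ matrix gives the identity.
identity_check = torch.inverse(gram_small) @ gram_small
print(torch.allclose(identity_check, torch.eye(3, dtype=torch.float64)))
```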

In [31]:
np.arange(2, 10, 2)

array([2, 4, 6, 8])

In [32]:
torch.arange(2, 10, 2)

tensor([2, 4, 6, 8])

In [33]:
np.linspace(0, 1, 10)

array([0.        , 0.11111111, 0.22222222, 0.33333333, 0.44444444,
       0.55555556, 0.66666667, 0.77777778, 0.88888889, 1.        ])

In [34]:
torch.linspace(0, 1, 10)

tensor([0.0000, 0.1111, 0.2222, 0.3333, 0.4444, 0.5556, 0.6667, 0.7778, 0.8889,
        1.0000])

## Your turn

**_Create the tensor:_**

$ \begin{bmatrix}
5 & 7 & 9 & 11 & 13 & 15 & 17 & 19
\end{bmatrix}  $

In [36]:
# YOUR TURN

torch.arange(5, 21, 2)

tensor([ 5,  7,  9, 11, 13, 15, 17, 19])

## More on PyTorch Tensors

Each operation is also available as a function.

In [38]:
X = torch.rand(3, 2)

In [39]:
torch.exp(X)

tensor([[1.6097, 1.9651],
        [1.5852, 2.0654],
        [1.3335, 1.2028]])

In [40]:
X.exp()

tensor([[1.6097, 1.9651],
        [1.5852, 2.0654],
        [1.3335, 1.2028]])

In [41]:
X.sqrt()

tensor([[0.6900, 0.8219],
        [0.6787, 0.8516],
        [0.5365, 0.4297]])

In [42]:
(X.exp() + 2).sqrt() - 2 * X.log().sigmoid()  # be creative :-)

tensor([[1.2549, 1.1849],
        [1.2627, 1.1755],
        [1.3788, 1.4779]])

Many more functions available: sin, cos, tanh, log, etc.

In [43]:
A = torch.eye(3)
A

tensor([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]])

In [44]:
A.add(5)

tensor([[6., 5., 5.],
        [5., 6., 5.],
        [5., 5., 6.]])

In [46]:
A

tensor([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]])

Methods that modify the tensor in place end with an underscore, e.g. *add_*, *div_*, etc.

In [47]:
A.add_(5)

tensor([[6., 5., 5.],
        [5., 6., 5.],
        [5., 5., 6.]])

In [48]:
A

tensor([[6., 5., 5.],
        [5., 6., 5.],
        [5., 5., 6.]])

In [49]:
A.div_(3)

tensor([[2.0000, 1.6667, 1.6667],
        [1.6667, 2.0000, 1.6667],
        [1.6667, 1.6667, 2.0000]])

In [50]:
A

tensor([[2.0000, 1.6667, 1.6667],
        [1.6667, 2.0000, 1.6667],
        [1.6667, 1.6667, 2.0000]])

In [51]:
A.uniform_()  # fills the tensor with random uniform numbers in [0, 1)

tensor([[0.1741, 0.3524, 0.2812],
        [0.9146, 0.0966, 0.3531],
        [0.6598, 0.8906, 0.7436]])

In [52]:
A

tensor([[0.1741, 0.3524, 0.2812],
        [0.9146, 0.0966, 0.3531],
        [0.6598, 0.8906, 0.7436]])

## Indexing

Again, it works just like in numpy.

In [53]:
A = torch.randint(100, (3, 3))
A

tensor([[13, 66, 36],
        [96, 80, 50],
        [ 5, 61,  6]])

In [54]:
A[0, 0]

tensor(13)

In [55]:
A[2, 1]

tensor(61)

In [56]:
A[1]

tensor([96, 80, 50])

In [57]:
A[:, 1]

tensor([66, 80, 61])

In [58]:
A[1:2, :], A[1:2, :].shape

(tensor([[96, 80, 50]]), torch.Size([1, 3]))

In [59]:
A[1:, 1:]

tensor([[80, 50],
        [61,  6]])

In [60]:
A[:2, :2]

tensor([[13, 66],
        [96, 80]])
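
One more similarity with NumPy is worth keeping in mind: basic indexing and slicing return views that share memory with the original tensor. The sketch below illustrates this with a small tensor `B` introduced just for the example.

```python
import torch

B = torch.arange(9).view(3, 3)
row = B[1]        # basic indexing returns a view, not a copy
row[0] = -1       # writing through the view...

print(B)          # ...also changes the original tensor

safe = B[1].clone()  # clone() makes an independent copy
safe[1] = 100
print(B[1, 1])       # the original entry is unchanged
```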

_____

## Reshaping & Expanding

In [61]:
X = torch.tensor([1, 2, 3, 4])
X

tensor([1, 2, 3, 4])

In [62]:
X = X.repeat(3, 1)  # repeat it 3 times along dimension 0 and once along dimension 1
X, X.shape

(tensor([[1, 2, 3, 4],
         [1, 2, 3, 4],
         [1, 2, 3, 4]]), torch.Size([3, 4]))

In [64]:
# equivalent of 'reshape' in numpy (view does not allocate new memory!)
Y = X.view(2, 6)
Y

tensor([[1, 2, 3, 4, 1, 2],
        [3, 4, 1, 2, 3, 4]])
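
The following sketch shows what "does not allocate new memory" means in practice, and one situation where `view` cannot be used. The tensor names are chosen for this example; `reshape` is shown as a fallback that copies when needed.

```python
import torch

base = torch.arange(12).view(3, 4)
flat = base.view(-1)

# A view shares storage with its source: same starting address, writes are visible in both.
print(flat.data_ptr() == base.data_ptr())
flat[0] = 99
print(base[0, 0])

# After a transpose the memory layout is no longer contiguous, so view() refuses to work.
transposed = base.t()
try:
    transposed.view(-1)
    view_failed = False
except RuntimeError:
    view_failed = True
print(view_failed)

# reshape() falls back to copying when a view is not possible.
copied = transposed.reshape(-1)
print(copied.shape)
```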

In [65]:
Y = X.view(-1)  # -1 tells PyTorch to infer the number of elements along that dimension
Y, Y.shape

(tensor([1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4]), torch.Size([12]))

In [67]:
Y = X.view(-1, 2)
Y, Y.shape

(tensor([[1, 2],
         [3, 4],
         [1, 2],
         [3, 4],
         [1, 2],
         [3, 4]]), torch.Size([6, 2]))

In [68]:
Y = X.view(-1, 4)
Y, Y.shape

(tensor([[1, 2, 3, 4],
         [1, 2, 3, 4],
         [1, 2, 3, 4]]), torch.Size([3, 4]))

In [69]:
Y = torch.ones(5)
Y, Y.shape

(tensor([1., 1., 1., 1., 1.]), torch.Size([5]))

In [70]:
Y = Y.view(-1, 1)
Y, Y.shape

(tensor([[1.],
         [1.],
         [1.],
         [1.],
         [1.]]), torch.Size([5, 1]))

In [71]:
Y.expand(5, 5)  # similar to repeat but does not actually allocate new memory

tensor([[1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.]])
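
The difference between `expand` and `repeat` becomes visible in the strides of the result. This is a minimal sketch with names chosen for the example:

```python
import torch

col = torch.ones(5, 1)
expanded = col.expand(5, 5)   # no copy: the size-1 dimension is read repeatedly
repeated = col.repeat(1, 5)   # real copy with 25 stored elements

print(expanded.stride())  # a stride of 0 means "stay on the same element"
print(repeated.stride())
print(expanded.data_ptr() == col.data_ptr())  # expanded still points at col's memory
```

Because several positions of an expanded tensor alias the same element, in-place writes to it are best avoided; call `.clone()` first if you need to modify the result.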

In [73]:
X = torch.eye(4)
Y = X[3:, :]
Y, Y.shape

(tensor([[0., 0., 0., 1.]]), torch.Size([1, 4]))

In [74]:
Y = Y.squeeze() # removes all dimensions of size '1'
Y, Y.shape

(tensor([0., 0., 0., 1.]), torch.Size([4]))

In [75]:
Y = Y.unsqueeze(1)
Y, Y.shape

(tensor([[0.],
         [0.],
         [0.],
         [1.]]), torch.Size([4, 1]))

## Your turn!

**_Create the tensor:_**

$ \begin{bmatrix}
7 & 5 & 5 & 5 & 5 \\
5 & 7 & 5 & 5 & 5 \\
5 & 5 & 7 & 5 & 5 \\
5 & 5 & 5 & 7 & 5 \\
5 & 5 & 5 & 5 & 7 
\end{bmatrix}  $

Hint: You can use matrix addition and scalar multiplication.

In [80]:
# YOUR TURN
2 * torch.eye(5) + 5 * torch.ones(5)  # the length-5 vector of ones broadcasts over every row

tensor([[7., 5., 5., 5., 5.],
        [5., 7., 5., 5., 5.],
        [5., 5., 7., 5., 5.],
        [5., 5., 5., 7., 5.],
        [5., 5., 5., 5., 7.]])

**_Create the tensor:_**

$ \begin{bmatrix}
4 & 6 & 8 & 10 & 12 \\
14 & 16 & 18 & 20 & 22 \\
24 & 26 & 28 & 30 & 32
\end{bmatrix}$

In [84]:
# YOUR TURN
torch.arange(4, 34, 2).view(3, 5)

tensor([[ 4,  6,  8, 10, 12],
        [14, 16, 18, 20, 22],
        [24, 26, 28, 30, 32]])

**_Create the tensor:_**

$ \begin{bmatrix}
2 & 2 & 2 & 2 & 2 \\
4 & 4 & 4 & 4 & 4 \\
6 & 6 & 6 & 6 & 6 \\
8 & 8 & 8 & 8 & 8
\end{bmatrix}  $

In [None]:
# YOUR TURN
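
If you get stuck, here is one possible solution among several, sketched with `arange`, `view` and `expand` from the cells above. The names `rows` and `result` are only used here.

```python
import torch

# Column vector [2, 4, 6, 8], then stretch it across 5 columns.
rows = torch.arange(2, 10, 2).view(-1, 1)
result = rows.expand(4, 5)
print(result)
```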

_____

## Reductions

In [85]:
X = torch.randint(10, (3, 4)).float()
X

tensor([[8., 6., 4., 0.],
        [5., 9., 6., 9.],
        [5., 8., 7., 3.]])

In [86]:
X.sum()

tensor(70.)

In [87]:
X.sum().item()

70.0

In [88]:
X.sum(0)  # column-wise sum

tensor([18., 23., 17., 12.])

In [89]:
X.sum(dim=1)  # row-wise sum

tensor([18., 29., 23.])

In [90]:
X.mean()

tensor(5.8333)

In [91]:
X.mean(dim=1)

tensor([4.5000, 7.2500, 5.7500])

In [92]:
X.norm(dim=0)

tensor([10.6771, 13.4536, 10.0499,  9.4868])

## Your turn!

Compute the norms of the row-vectors in matrix **X** without using _torch.norm()_.

Remember: $$||\vec{v}||_2 = \sqrt{v_1^2 + v_2^2 + \dots + v_n^2}$$

Hint: _X\*\*2_ computes the element-wise square.

In [None]:
X = torch.eye(4) + torch.arange(4).repeat(4, 1).float()

# YOUR TURN

# SOLUTION: tensor([3.8730, 4.1231, 4.3589, 4.5826])
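
One way to arrive at these values, following the formula and the hint above, is sketched here. The name `row_norms` is introduced for the example.

```python
import torch

X = torch.eye(4) + torch.arange(4).repeat(4, 1).float()

# Square every entry, sum along each row (dim=1), then take the square root.
row_norms = (X ** 2).sum(dim=1).sqrt()
print(row_norms)
```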

## Masking

In [None]:
X = torch.randint(100, (5, 3))
X

In [93]:
mask = (X > 25) & (X < 75)
mask

tensor([[0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]], dtype=torch.uint8)

In [94]:
X[mask]  # returns all elements matching the criteria in a 1D-tensor

tensor([])

In [95]:
mask.sum()  # number of elements that fulfill the condition

tensor(0)

In [96]:
(X == 25) | (X > 60)

tensor([[0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]], dtype=torch.uint8)
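
Masks are easier to read on a fixed tensor. The sketch below repeats the steps above on a small hand-written tensor `V` (made up for this example), so some entries actually satisfy the condition.

```python
import torch

V = torch.tensor([[10, 30, 50],
                  [70, 90, 25]])

mask = (V > 25) & (V < 75)   # marks entries with 25 < value < 75
selected = V[mask]           # matching elements as a 1-D tensor
count = mask.sum().item()    # number of matching elements

print(mask)
print(selected)
print(count)
```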

## Your turn!

Get the number of non-zeros in **X**

In [98]:
X = torch.tensor([[1, 0, 2], [0, 6, 0]])
# YOUR TURN

Compute the sum of all entries in X that are larger than the mean of all values in X.

In [None]:
# YOUR TURN
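
For reference, here is one possible way to solve the two exercises above. Note that `mean()` is not defined for integer tensors, so the sketch converts `X` to float first; the names `num_nonzero`, `Xf` and `sum_above_mean` are only used here.

```python
import torch

X = torch.tensor([[1, 0, 2], [0, 6, 0]])

# Number of non-zero entries: count where the comparison holds.
num_nonzero = (X != 0).sum().item()

# mean() needs a floating-point tensor, so convert first.
Xf = X.float()
sum_above_mean = Xf[Xf > Xf.mean()].sum().item()

print(num_nonzero, sum_above_mean)
```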

______

## Some useful properties of tensors

In [99]:
x = torch.Tensor([[0,1,2], [3,4,5]])

print("x.shape: \n%s\n" % (x.shape,))
print("x.size(): \n%s\n" % (x.size(),))
print("x.size(1): \n%s\n" % x.size(1))
print("x.dim(): \n%s\n" % x.dim())

print("x.dtype: \n%s\n" % x.dtype)
print("x.device: \n%s\n" % x.device)

x.shape: 
torch.Size([2, 3])

x.size(): 
torch.Size([2, 3])

x.size(1): 
3

x.dim(): 
2

x.dtype: 
torch.float32

x.device: 
cpu



The `nonzero` function returns the indices of the non-zero elements.

In [100]:
x = torch.Tensor([[0,1,2], [3,4,5]])

print("x.nonzero(): \n%s\n" % x.nonzero())

x.nonzero(): 
tensor([[0, 1],
        [0, 2],
        [1, 0],
        [1, 1],
        [1, 2]])



In [None]:
# press tab to autocomplete
# x.

___

## Converting between PyTorch and numpy

In [None]:
X = np.random.random((5,3))
X

In [None]:
# numpy ---> torch
Y = torch.from_numpy(X)  # Y is actually a DoubleTensor (i.e. 64-bit representation)
Y

In [None]:
Y = torch.rand((2,4))
Y

In [None]:
# torch ---> numpy
X = Y.numpy()
X
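
A detail that is easy to miss: both conversions share memory with the original object instead of copying it. The sketch below demonstrates this with small arrays whose names are chosen for the example, and uses `torch.tensor()` when an independent copy is wanted.

```python
import numpy as np
import torch

arr = np.zeros(3)
t = torch.from_numpy(arr)   # shares memory with arr, dtype float64
arr[0] = 5.0                # change the NumPy array...
print(t)                    # ...and the tensor sees it

back = t.numpy()            # also shares memory (CPU tensors only)
t[1] = 7.0
print(back)

independent = torch.tensor(arr)  # torch.tensor() always copies
arr[2] = 9.0
print(independent)
```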

____

## Using GPUs 

Using a **GPU** in PyTorch is as simple as calling **`.cuda()`** on your tensor.

But first, you may want to check:
 - that CUDA can actually be used: `torch.cuda.is_available()`
 - how many GPUs are available: `torch.cuda.device_count()`

In [None]:
torch.cuda.is_available()

In [None]:
torch.cuda.device_count()

In [None]:
x = torch.Tensor([[1,2,3], [4,5,6]])
print(x)

### tensor.cuda

_Note: If CUDA is not available on your machine, the following examples won't work._

In [101]:
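x.cuda(0)         # returns a copy on GPU 0; x itself is not moved
print(x.device)   # still cpu
x = x.cuda(0)     # reassign to actually keep the GPU copy
print(x.device)
x = x.cuda(1)     # requires a second GPU
print(x.device)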

AssertionError: Torch not compiled with CUDA enabled

In [102]:
x = torch.Tensor([[1,2,3], [4,5,6]])

# This raises an error because operations on tensors that live on different devices are not allowed
x + x.cuda()

AssertionError: Torch not compiled with CUDA enabled

#### Write an if statement that moves x to the GPU if CUDA is available

In [None]:
# YOUR TURN
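
A minimal sketch of such an if statement, in case you want to compare with your own version:

```python
import torch

x = torch.Tensor([[1, 2, 3], [4, 5, 6]])

# Move x to the default GPU only when CUDA is usable; otherwise leave it on the CPU.
if torch.cuda.is_available():
    x = x.cuda()

print(x.device)
```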

These kinds of if statements used to be all over the place in people's PyTorch code. Since PyTorch 0.4, a more flexible way is available:

### torch.device

A **`torch.device`** is an object representing the device on which a `torch.Tensor` is or will be allocated.

You can easily move a tensor from one device to another with the **`tensor.to()`** method.

In [None]:
cpu = torch.device('cpu')
cuda_0 = torch.device('cuda:0')

x = x.to(cpu)
print(x.device)
x = x.to(cuda_0)
print(x.device)

This is more flexible, since you only need to check once in your code whether CUDA is available:

In [None]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
x = x.to(device)  # We don't need to care anymore about whether cuda is available or not
print(x.device)

#### Timing GPU

How much faster is the GPU? See for yourself...

In [None]:
A = torch.rand(100, 1000, 1000)
B = A.cuda(1)  # copy of A on the GPU with index 1 (requires at least two GPUs)
A.size()

In [None]:
%timeit -n 3 torch.bmm(A, A)

In [None]:
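# CUDA calls are asynchronous: synchronize so the timing covers the actual computation
%timeit -n 30 torch.bmm(B, B); torch.cuda.synchronize()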
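
Outside of a notebook, where `%timeit` is not available, the same comparison can be sketched with `time.perf_counter`. The helper `time_bmm` below is hypothetical and written for this example; it synchronizes with the GPU because CUDA operations run asynchronously, and it falls back to the CPU when CUDA is not available. The batch is smaller than the one above so the sketch also runs quickly on a CPU.

```python
import time
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
# Smaller than the 100 x 1000 x 1000 batch above so this also finishes quickly on a CPU.
A = torch.rand(10, 200, 200, device=device)

def time_bmm(t, repeats=3):
    """Average wall-clock seconds for torch.bmm(t, t) over `repeats` runs."""
    if t.is_cuda:
        torch.cuda.synchronize()   # finish pending GPU work before starting the clock
    start = time.perf_counter()
    for _ in range(repeats):
        torch.bmm(t, t)
    if t.is_cuda:
        torch.cuda.synchronize()   # wait for the GPU before stopping the clock
    return (time.perf_counter() - start) / repeats

elapsed = time_bmm(A)
print(device, elapsed)
```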

___

## Don't forget to download the notebook, otherwise your changes will be lost!

![Download the notebook](figures/notebook-download.png)