# An introduction to PyTorch

PyTorch is a platform for deep learning in Python.

It provides tools for efficiently creating, training, testing and analyzing neural networks:

* Different types of layers (embedding, linear, convolutional, recurrent)
* Activation functions (tanh, relu, sigmoid, etc.)
* Gradient computation
* Optimizers (Adam, Adagrad, RMSprop, SGD, etc.)
* GPU implementations for faster computation
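
To give a feel for how these pieces fit together, here is a minimal sketch of a single training step. The layer sizes, the random data and the learning rate are arbitrary choices made only for illustration; this is not a recipe for a real model.

```python
import torch

torch.manual_seed(0)  # make the random initialization repeatable

# A linear layer; the sizes are arbitrary, chosen for illustration.
layer = torch.nn.Linear(in_features=3, out_features=1)
inputs = torch.rand(8, 3)    # a batch of 8 examples with 3 features each
targets = torch.rand(8, 1)   # made-up regression targets

optimizer = torch.optim.SGD(layer.parameters(), lr=0.1)

# One training step: forward pass, loss, gradients, parameter update.
predictions = torch.tanh(layer(inputs))        # layer + activation function
loss = ((predictions - targets) ** 2).mean()   # mean squared error
optimizer.zero_grad()
loss.backward()                                # gradient computation
optimizer.step()                               # optimizer update

print(loss.item())
print(layer.weight.grad.shape)
```

Each of these building blocks is covered step by step; the rest of this introduction focuses on the tensors they operate on.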

## Alternatives

Other platforms for deep learning in Python exist, with different focuses: TensorFlow, Caffe, MXNet, ...

* PyTorch is comparatively simple to use
* ... and also the only one besides TensorFlow I have experience with 🙂
* Feel free to try the others!

## Tensors

Let's start with some basics: tensors are similar to NumPy arrays.

In [None]:
import torch

In [None]:
v1 = torch.arange(10)
v2 = torch.arange(10, 20)

print("v1: %s\n" % v1)
print("v2: %s\n" % v2)
print("Dot product: %d" % v1.dot(v2))

#### Setting values manually or randomly:

In [None]:
v3 = torch.tensor([2, 4, 6, 8])
v4 = torch.rand(10)

print("v3: %s\n" % v3)
print("v4: %s\n" % v4)

#### You can also change a value inside the array manually

In [None]:
v4[1] = 0.1
print(v4)

#### Accessing values (indexing)

Individual tensor positions are scalars, that is, 0-dimensional tensors:

In [None]:
print(v1[0])
print(v1[0].shape)

`.item()` returns a Python number:

In [None]:
number = v1[0].item()
print(number)
print(isinstance(number, int))

#### Elementwise operations

In [None]:
v1 + v2

In [None]:
v1 * v2

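Take care when dividing integer tensors! Older PyTorch releases performed integer division here, while recent releases perform true division and return a float tensor: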

In [None]:
v1 / v2 

In [None]:
# Convert explicitly to float; torch.tensor(v1, ...) on an existing tensor triggers a warning
x = v1.float()
y = v2.float()
x / y
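
When the intent matters, it can help to spell out which kind of division is wanted. The sketch below assumes a recent PyTorch release (the `rounding_mode` argument of `torch.div` is not available in old versions); the tensors `a` and `b` are made up for the example.

```python
import torch

a = torch.tensor([7, 8, 9])
b = torch.tensor([2, 2, 2])

# True division: recent PyTorch releases return a float tensor here.
true_div = a / b

# Floor division, requested explicitly; the result keeps the integer dtype.
floor_div = torch.div(a, b, rounding_mode="floor")

print(true_div, true_div.dtype)
print(floor_div, floor_div.dtype)
```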

#### Operations with constants

In [None]:
x + 1

In [None]:
x ** 2

#### Matrices

In [None]:
m1 = torch.rand(5, 4)
m2 = torch.rand(4, 5)

print("m1: %s\n" % m1)
print("m2: %s\n" % m2)
print(m1.dot(m2))

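Oops... that can be misleading if you are used to NumPy: in PyTorch, `dot` only accepts 1-dimensional tensors, so the call above raises an error. For matrix multiplication, call `matmul` instead: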

In [None]:
print(m1.matmul(m2))
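
The `@` operator is a shorthand for the same operation. A small sketch with freshly created matrices (the names `a` and `b` are only for this example) shows that both spellings agree and that the inner dimensions must match:

```python
import torch

a = torch.rand(5, 4)
b = torch.rand(4, 5)

product = a @ b          # same operation as a.matmul(b) or torch.matmul(a, b)

print(product.shape)     # [5, 4] times [4, 5] gives [5, 5]
print(torch.allclose(product, a.matmul(b)))
```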

#### Higher order tensors

In [None]:
t = torch.rand(3, 4, 5)
t

## Broadcasting

Broadcasting means applying an arithmetic operation to tensors of different ranks as if the smaller one were expanded, or broadcast, to match the shape of the larger one.

Let's experiment with a matrix (rank 2 tensor) and a vector (rank 1).

In [None]:
m = torch.rand(5, 4)
v = torch.rand(4)

In [None]:
print("m:", m)
print()
print("v:", v)
print()

m_plus_v = m + v
print("m + v:", m_plus_v)

Let's see row by row

In [None]:
print("m[0] = %s\n" % m[0])
print("v = %s\n" % v)

row_sum = m[0] + v
print("m[0] + v = %s\n" % row_sum)
print("(m + v)[0] = %s" % m_plus_v[0])
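
The "as if it were expanded" idea can be made explicit with `expand`, which builds a view in which the vector is repeated along a new dimension. This is only a sketch to illustrate the equivalence, using new tensors named `matrix` and `vector`; in practice you would simply write `matrix + vector`.

```python
import torch

matrix = torch.rand(5, 4)
vector = torch.rand(4)

# A [5, 4] view in which every row is `vector` (no data is copied).
expanded = vector.expand(5, 4)

# Adding the expanded view gives the same result as broadcasting.
print(torch.equal(matrix + vector, matrix + expanded))
```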

We can also reshape tensors

In [None]:
v.shape

In [None]:
v = v.view([2, 2])
v

In [None]:
v = v.view([4, 1])
v

Note that shape `[4, 1]` is not broadcastable to match `[5, 4]`!

In [None]:
m + v

... but `[1, 4]` is!

In [None]:
v = v.view([1, 4])
m + v
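
Besides `view` with explicit sizes, there are a few other ways to get these shapes. The sketch below uses a new vector `w` and shows `unsqueeze`, which inserts a dimension of size 1, and `-1`, which asks PyTorch to infer one size from the others.

```python
import torch

w = torch.rand(4)

row = w.unsqueeze(0)       # shape [1, 4]
column = w.unsqueeze(1)    # shape [4, 1]
inferred = w.view(-1, 2)   # -1 is inferred from the number of elements: shape [2, 2]

print(row.shape, column.shape, inferred.shape)
```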

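Broadcasting can be tricky sometimes. Here a `[4, 1]` tensor plus a `[1, 4]` tensor raises no error and silently produces a `[4, 4]` result: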

In [None]:
u = torch.rand(4, 1)
u + v

Always take care with tensor shapes! It is good practice to check in the interpreter how an expression is evaluated before putting it into your model code.
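
One way to do such a check without creating any data is `torch.broadcast_shapes`, available in recent PyTorch releases. Shapes are compared from the last dimension backwards, and each pair of sizes must either be equal or contain a 1. The sketch below replays the shape combinations seen above.

```python
import torch

same_rank = torch.broadcast_shapes((5, 4), (1, 4))
lower_rank = torch.broadcast_shapes((5, 4), (4,))
outer = torch.broadcast_shapes((4, 1), (1, 4))
print(same_rank, lower_rank, outer)

# Incompatible shapes raise a RuntimeError instead of returning a shape.
try:
    torch.broadcast_shapes((5, 4), (4, 1))
    compatible = True
except RuntimeError as error:
    compatible = False
    print("not broadcastable:", error)
```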

## Useful Functions

PyTorch (and other libraries) have many functions that operate on tensors. Let's try some of them and plot the results.

In [None]:
%matplotlib inline

import matplotlib
import numpy as np
import matplotlib.pyplot as pl

Create a vector x with values from -10 up to (but not including) 10, in steps of 0.1.

In [None]:
x = torch.arange(-10, 10, 0.1, dtype=torch.float)

The `.numpy()` method converts PyTorch tensors to NumPy arrays, which is the format matplotlib expects for plotting.

In [None]:
y = x.sin()
pl.plot(x.numpy(), y.numpy())
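
One detail worth knowing: for tensors on the CPU, `.numpy()` does not copy the data, so the tensor and the array share memory. A short sketch, with variable names chosen only for this example:

```python
import torch

t = torch.zeros(3)
array = t.numpy()                 # no copy: shares memory with the tensor

t[0] = 1.0                        # an in-place change on the tensor...
print(array)                      # ...is visible through the NumPy array

back = torch.from_numpy(array)    # the reverse conversion, also without copying
independent = t.clone().numpy()   # clone first when an independent copy is needed

t[1] = 5.0
print(array[1], back[1].item(), independent[1])
```

This does not matter for plotting, but it can cause surprises if you modify either object afterwards.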

Hyperbolic tangent

In [None]:
y = x.tanh()
pl.plot(x.numpy(), y.numpy())

$e^x$ 

In [None]:
y = torch.exp(x)
pl.plot(x.numpy(), y.numpy())

We can implement softmax ourselves. Since it is a function of the whole array, the plot has a slightly different meaning (notice that the y-axis only goes up to about 0.1).

In [None]:
exps = torch.exp(x)
z = exps.sum()
softmax = exps / z
pl.plot(x.numpy(), softmax.numpy())

Anyway, torch also provides an implementation of softmax:

In [None]:
y = torch.softmax(x, dim=0)
pl.plot(x.numpy(), y.numpy())
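
As a sanity check, the manual version can be compared with the built-in one. The sketch below also shows one reason to prefer the built-in: exponentiating large values directly overflows, whereas subtracting the maximum first (a standard trick, which does not change the result mathematically) avoids that. The variable names are chosen for this example only.

```python
import torch

values = torch.arange(-10, 10, 0.1, dtype=torch.float)

# Manual softmax: exponentiate and normalize.
exps = torch.exp(values)
manual = exps / exps.sum()

# Shifted variant: subtract the maximum before exponentiating.
shifted = torch.exp(values - values.max())
stable = shifted / shifted.sum()

builtin = torch.softmax(values, dim=0)
print(torch.allclose(manual, builtin))
print(torch.allclose(stable, builtin))

# With large inputs, the naive formula overflows to inf and yields nan.
large = torch.tensor([1000.0, 1001.0])
naive = torch.exp(large) / torch.exp(large).sum()
print(naive)
print(torch.softmax(large, dim=0))
```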

Let's see the softmax with another x:

In [None]:
x = torch.randn([100])
y = torch.softmax(x, dim=0)
pl.plot(x.numpy(), y.numpy(), '.')