Set up the environment.

In [1]:
from __future__ import print_function

import torch as th
import numpy as np

## 1. Tensors

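Tensors are similar to matrices (more generally, multi-dimensional arrays), and they can be used in GPU-accelerated computations.

The following cell creates an uninitialized `5x3` tensor. Its contents are whatever happened to be in the allocated memory, which is why the printed values look arbitrary: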

In [2]:
x = th.Tensor(5, 3)
s = x.size() # returns a torch.Size, which is a tuple subclass
print(x)
print(s)


 0.0000e+00  8.5899e+09  0.0000e+00
 8.5899e+09  7.0065e-45  0.0000e+00
 0.0000e+00  0.0000e+00  0.0000e+00
 0.0000e+00  0.0000e+00  0.0000e+00
 0.0000e+00  0.0000e+00  2.0243e+24
[torch.FloatTensor of size 5x3]

torch.Size([5, 3])


There are many other ways to create tensors:

In [3]:
x1 = th.rand(5, 3)
x2 = th.randn(5, 3)
x3 = th.eye(5, 3)
x4 = th.zeros(5, 3)
x5 = th.ones(5, 3)
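
To make the difference between these constructors concrete, here is a minimal sketch that prints the shape and value range of each tensor. The comments describe what each function is generally documented to produce; the loop and the `name`/`t` variables are illustrative additions, not part of the original notebook.

```python
import torch as th

x1 = th.rand(5, 3)    # uniform samples from [0, 1)
x2 = th.randn(5, 3)   # samples from the standard normal distribution
x3 = th.eye(5, 3)     # ones on the main diagonal, zeros elsewhere
x4 = th.zeros(5, 3)   # all zeros
x5 = th.ones(5, 3)    # all ones

# Print the shape and the smallest/largest value of each tensor.
for name, t in [("rand", x1), ("randn", x2), ("eye", x3),
                ("zeros", x4), ("ones", x5)]:
    print(name, tuple(t.size()), float(t.min()), float(t.max()))
```

Note that `th.eye(5, 3)` is not square: it has three ones (one per column) and zeros everywhere else.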

### Operations

Operations can be expressed in several ways. Take addition as an example. The simplest form uses the `+` operator:

In [4]:
x = th.rand(2, 2)
y = th.rand(2, 2)
print(x + y)


 0.7764  1.3517
 0.3095  0.5741
[torch.FloatTensor of size 2x2]



There is an analogous version that uses the `add` function:

In [5]:
print(th.add(x, y))


 0.7764  1.3517
 0.3095  0.5741
[torch.FloatTensor of size 2x2]



The same `add` function can write its result into an existing tensor passed as the `out` argument:

In [6]:
result = th.Tensor(2, 2)
th.add(x, y, out=result)
print(result)


 0.7764  1.3517
 0.3095  0.5741
[torch.FloatTensor of size 2x2]



`add` can also be used as a method on tensor instances:

In [7]:
print(x.add(y))


 0.7764  1.3517
 0.3095  0.5741
[torch.FloatTensor of size 2x2]



There is one more instance method, `add_`, which modifies its operand in place:

In [8]:
x.add_(y) # this will modify x
print(x)


 0.7764  1.3517
 0.3095  0.5741
[torch.FloatTensor of size 2x2]



By convention, PyTorch methods ending with `_` mutate the tensor in place.
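
The difference between `add` and `add_` is easier to see with fixed values than with random ones. The following is a small sketch; the variable names `z`, `x_unchanged` and `x_mutated` are illustrative and do not appear in the cells above.

```python
import torch as th

x = th.ones(2, 2)
y = th.ones(2, 2) * 2

z = x.add(y)                              # returns a new tensor; x is left alone
x_unchanged = th.equal(x, th.ones(2, 2))  # x still holds ones

x.add_(y)                                 # trailing underscore: x is overwritten
x_mutated = th.equal(x, z)                # x now holds the sum

print(x_unchanged, x_mutated)             # prints: True True
```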

### Indexing

Indexing PyTorch tensors works much like indexing NumPy arrays.

In [9]:
print(x[:1])
print(x[:,:1])
# etc.


 0.7764  1.3517
[torch.FloatTensor of size 1x2]


 0.7764
 0.3095
[torch.FloatTensor of size 2x1]
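
As a quick check of the NumPy analogy, the sketch below applies the same slices to a NumPy array and to a tensor built from it, then compares shapes. The array `a` and the names `first_row`, `first_col` and `element` are illustrative assumptions.

```python
import numpy as np
import torch as th

a = np.arange(6, dtype=np.float32).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]
t = th.from_numpy(a.copy())                       # tensor with the same values

first_row = t[:1]      # slicing keeps the leading dimension: 1x3
first_col = t[:, :1]   # slicing keeps the trailing dimension: 2x1
element = t[1, 2]      # integer indices select a single element

# The tensor shapes match the shapes NumPy reports for the same slices.
print(tuple(first_row.size()), a[:1].shape)
print(tuple(first_col.size()), a[:, :1].shape)
print(float(element), float(a[1, 2]))
```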



### CUDA Tensors

If you have CUDA installed, you can move tensors to the GPU:

In [10]:
if th.cuda.is_available():
    x = x.cuda()
    y = y.cuda()
    print(x + y)
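
A result computed on the GPU can be copied back with `.cpu()`. The sketch below is one way to write code that runs with or without a GPU and checks that both paths agree; `expected`, `total` and `max_diff` are illustrative names.

```python
import torch as th

x = th.rand(2, 2)
y = th.rand(2, 2)
expected = x + y                          # reference result computed on the CPU

if th.cuda.is_available():
    total = (x.cuda() + y.cuda()).cpu()   # add on the GPU, copy the result back
else:
    total = x + y                         # no GPU: stay on the CPU

# Largest element-wise difference between the two results.
max_diff = float((total - expected).abs().max())
print(max_diff)
```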

## 2. Autograd

The PyTorch `autograd` package automatically calculates derivatives of a computation graph with respect to the "leaves" of that graph.

In [11]:
from torch.autograd import Variable

Create a variable:

In [12]:
x = Variable(th.ones(2, 2), requires_grad=True)
print(x)
print("Is leaf?", x.is_leaf)
print("Grad function:", x.grad_fn)
print("Grad:", x.grad)

Variable containing:
 1  1
 1  1
[torch.FloatTensor of size 2x2]

Is leaf? True
Grad function: None
Grad: None


As you can see, it's a "leaf" node, but both the `grad_fn` and `grad` properties are empty. For a leaf created directly by the user, `grad_fn` always stays `None`, while `grad` is populated once `backward` is called on a computation graph that includes the leaf.
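
The following sketch contrasts a leaf with a value derived from it, and shows `grad` being filled in by `backward`. It assumes a PyTorch version in which `Variable` can still be imported (in 0.4 and later it is a deprecated wrapper that returns a plain tensor); `grad_before` is an illustrative name.

```python
import torch as th
from torch.autograd import Variable

x = Variable(th.ones(2, 2), requires_grad=True)  # created directly: a leaf
y = x + 2                                        # result of an operation: not a leaf

print(x.is_leaf, x.grad_fn is None)   # prints: True True
print(y.is_leaf, y.grad_fn is None)   # prints: False False

grad_before = x.grad                  # nothing has been backpropagated yet
y.sum().backward()                    # d(sum(x + 2))/dx_i = 1 for every element
print(grad_before is None)            # prints: True
print(x.grad)
```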

Let's do some more calculations:

In [13]:
y = x + 2
z = y * y * 3
out = z.mean()
print(z)

Variable containing:
 27  27
 27  27
[torch.FloatTensor of size 2x2]



The results of these calculations are themselves (non-leaf) `Variable`s. We can "backpropagate" through the whole calculation:

In [14]:
out.backward()

In [15]:
x.grad

Variable containing:
 4.5000  4.5000
 4.5000  4.5000
[torch.FloatTensor of size 2x2]

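Let's make sure the result is correct. Since `out` is the mean of the four elements of `z`, we have $o = \frac{1}{4}\sum_i 3(x_i + 2)^2$, and with $x_i = 1$:

$
\frac{\partial o}{\partial x_i}=
\frac{1}{4}\frac{\partial 3(x_i + 2)^2}{\partial x_i}=
\frac{3}{2}(x_i + 2)=4.5
$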
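
The same value can also be checked numerically. The sketch below compares the autograd result with a central finite difference, using double precision to keep rounding error small. The helper `f`, the step `eps` and the other names are illustrative assumptions, and the code again assumes that `Variable` is importable.

```python
import torch as th
from torch.autograd import Variable

def f(t):
    # Same computation as in the cells above: mean of 3 * (t + 2)^2.
    y = t + 2
    return (y * y * 3).mean()

# Gradient computed by autograd at x_i = 1.
x = Variable(th.ones(2, 2).double(), requires_grad=True)
f(x).backward()
analytic = x.grad.data.clone()

# Central finite difference: perturb one element at a time.
eps = 1e-6
numeric = th.zeros(2, 2).double()
for i in range(2):
    for j in range(2):
        plus = th.ones(2, 2).double()
        minus = th.ones(2, 2).double()
        plus[i, j] += eps
        minus[i, j] -= eps
        numeric[i, j] = (float(f(plus)) - float(f(minus))) / (2 * eps)

max_error = float((analytic - numeric).abs().max())
print(analytic)
print(max_error)
```

Both approaches should agree on 4.5 for every element, up to a small rounding error.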

We can do much more complex things with Autograd:

In [16]:
x = th.randn(3)
x = Variable(x, requires_grad=True)
y = x * 2
while y.data.norm() < 1000:
    y = y * 2
    
print(y)

Variable containing:
 -294.5496
 -364.7347
 1358.2596
[torch.FloatTensor of size 3]



Calling `backward()` without arguments only works when the output is a scalar. Because `y` is a tensor with more than one element, we have to pass a tensor of initial gradient values with the same shape as `y`:

In [17]:
gradients = th.FloatTensor([0.1, 1.0, 0.0001])
y.backward(gradients)
print(x.grad)

Variable containing:
  102.4000
 1024.0000
    0.1024
[torch.FloatTensor of size 3]
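
The printed gradient is the `gradients` tensor scaled by 1024: in that run `y` ended up as `x` doubled ten times, so each `dy_i/dx_i` is `2**10`. The sketch below reproduces this relationship with a fixed input instead of `th.randn(3)` so that the result is deterministic. The input values and the names `doublings`, `scale` and `expected` are illustrative assumptions.

```python
import torch as th
from torch.autograd import Variable

# Fixed input (an assumption for reproducibility; the cell above uses th.randn).
x = Variable(th.FloatTensor([0.5, -1.0, 2.0]), requires_grad=True)

y = x * 2
doublings = 1
while y.data.norm() < 1000:
    y = y * 2
    doublings += 1

gradients = th.FloatTensor([0.1, 1.0, 0.0001])
y.backward(gradients)

# y = scale * x element-wise, so x.grad should equal scale * gradients.
scale = 2.0 ** doublings
expected = gradients * scale
max_diff = float((x.grad.data - expected).abs().max())
print(doublings, max_diff)
print(x.grad)
```

With this input the loop stops after nine doublings in total, so the gradient is `gradients` multiplied by 512.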

