![](http://pytorch.org/tutorials/_images/pytorch-logo-flat.png)

<h1 id="tocheading">Table of Contents</h1>
<div id="toc"></div>

# What is PyTorch

It's a Python-based scientific computing package aimed at two audiences:
- users who want a replacement for numpy that can use the power of GPUs
- researchers who want a deep learning platform that provides maximum flexibility and speed


# Tensors

Tensors are similar to numpy's ndarrays, with the addition that Tensors can also be used on a GPU to accelerate computing.

In [4]:
from __future__ import print_function
import torch

Construct a 5x3 matrix, uninitialized

In [5]:
x = torch.Tensor(5, 3)
print(x)


 3.5766e-38  0.0000e+00  3.0018e-27
 4.5720e-41  3.0020e-27  4.5720e-41
 1.9372e-38  0.0000e+00  1.9372e-38
 0.0000e+00  1.9372e-38  0.0000e+00
 1.9373e-38  0.0000e+00 -4.0834e+31
[torch.FloatTensor of size 5x3]



Construct a randomly initialized matrix

In [6]:
x = torch.rand(5, 3)
print(x)


 0.9360  0.7240  0.4903
 0.6586  0.2276  0.0010
 0.7021  0.9414  0.4748
 0.4946  0.3443  0.0150
 0.0135  0.7855  0.9218
[torch.FloatTensor of size 5x3]



Get its size

In [7]:
x.size()

torch.Size([5, 3])

## Operations
There are multiple syntaxes for operations. The cells in this section use addition as the running example.

In [8]:
y = torch.rand(5,3)
y


 0.8861  0.2896  0.8096
 0.1767  0.4769  0.1052
 0.6643  0.1559  0.8690
 0.1832  0.3198  0.9253
 0.4387  0.3724  0.4029
[torch.FloatTensor of size 5x3]

In [9]:
# syntax 1: the + operator
x+y


 1.8220  1.0136  1.2999
 0.8353  0.7046  0.1062
 1.3664  1.0973  1.3438
 0.6778  0.6641  0.9404
 0.4522  1.1579  1.3247
[torch.FloatTensor of size 5x3]

In [10]:
# syntax 2: the torch.add function
torch.add(x, y)


 1.8220  1.0136  1.2999
 0.8353  0.7046  0.1062
 1.3664  1.0973  1.3438
 0.6778  0.6641  0.9404
 0.4522  1.1579  1.3247
[torch.FloatTensor of size 5x3]

In [11]:
# syntax 3: write the result into a preallocated output tensor
result = torch.Tensor(5, 3)
torch.add(x, y, out=result)
print(result)


 1.8220  1.0136  1.2999
 0.8353  0.7046  0.1062
 1.3664  1.0973  1.3438
 0.6778  0.6641  0.9404
 0.4522  1.1579  1.3247
[torch.FloatTensor of size 5x3]



In [12]:
# Addition: in-place
y.add_(x)
print(y)


 1.8220  1.0136  1.2999
 0.8353  0.7046  0.1062
 1.3664  1.0973  1.3438
 0.6778  0.6641  0.9404
 0.4522  1.1579  1.3247
[torch.FloatTensor of size 5x3]



<p style="color:blue;">Any operation that mutates a tensor in-place is post-fixed with an underscore (_). For example, x.copy_(y) and x.t_() will change x.</p>

You can use standard numpy-like indexing with all bells and whistles!

In [17]:
print(x[:,0:2])
print(x[:,1])


 0.9360  0.7240
 0.6586  0.2276
 0.7021  0.9414
 0.4946  0.3443
 0.0135  0.7855
[torch.FloatTensor of size 5x2]


 0.7240
 0.2276
 0.9414
 0.3443
 0.7855
[torch.FloatTensor of size 5]



## Numpy Bridge
Converting a torch Tensor to a numpy array and vice versa is a breeze.

The torch Tensor and numpy array will share their underlying memory locations, and changing one will change the other.
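The difference between sharing memory and copying can be sketched with the standard library alone. This is only an analogy, not the bridge itself: `array` stands in for the Tensor, `memoryview` for the numpy array that shares its buffer, and a second `array` for an independent copy. The variable names are illustrative.

```python
from array import array

owner = array('f', [1.0] * 5)   # owns the buffer (stand-in for the Tensor)
shared = memoryview(owner)      # second view of the same memory
copied = array('f', owner)      # independent copy with its own buffer

# Mutate the owner in place, element by element.
for i in range(len(owner)):
    owner[i] += 1.0

print(shared.tolist())  # reflects the in-place change
print(copied.tolist())  # still holds the original values
```

A view sees every in-place change because no data was duplicated, while the copy stays as it was.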

### Converting torch Tensor to numpy Array

In [19]:
a = torch.ones(5)
a


 1
 1
 1
 1
 1
[torch.FloatTensor of size 5]

In [20]:
b = a.numpy()
b

array([ 1.,  1.,  1.,  1.,  1.], dtype=float32)

In [21]:
# see how the numpy array changed in value
a.add_(1)
print(a)
print(b)


 2
 2
 2
 2
 2
[torch.FloatTensor of size 5]

[ 2.  2.  2.  2.  2.]


### Converting numpy Array to torch Tensor


In [22]:
import numpy as np

In [23]:
a = np.ones(5)
a

array([ 1.,  1.,  1.,  1.,  1.])

In [24]:
b = torch.from_numpy(a)
b


 1
 1
 1
 1
 1
[torch.DoubleTensor of size 5]

In [25]:
np.add(a, 1, out=a)
print(a)
print(b)

[ 2.  2.  2.  2.  2.]

 2
 2
 2
 2
 2
[torch.DoubleTensor of size 5]



<p style="color:blue;"> All the Tensors on the CPU except a CharTensor support converting to NumPy and back.</p>

# Autograd: automatic differentiation

Central to all neural networks in PyTorch is the **```autograd```** package.

The **```autograd```** package provides automatic differentiation for all operations on Tensors. It is a define-by-run framework, which means that our backprop is defined by how our code is run, and that every single iteration can be different.


##  Variable
 

**```autograd.Variable```** is the central class of the package. It wraps a Tensor, and supports nearly all of the operations defined on it. Once we finish our computation, we can call ```.backward()``` and have all the gradients computed automatically.

We can access the raw tensor through the ```.data``` attribute, while the gradient with respect to this variable is accumulated into ```.grad```.

![Variable](http://pytorch.org/tutorials/_images/Variable.png)

There's one more class which is very important for the autograd implementation: ```Function```.

```Variable``` and ```Function``` are interconnected and build up an acyclic graph that encodes a complete history of computation. Each Variable has a ```.creator``` attribute that references the ```Function``` that created it (except for Variables created by the user, whose ```creator``` is ```None```).

If we want to compute the derivatives, we can call ```.backward()``` on a ```Variable```. If the ```Variable``` is a scalar (i.e. it holds a single element), we don't need to specify any arguments to ```backward()```. If it has more elements, we need to specify a ```grad_output``` argument that is a tensor of matching shape.
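To make the graph idea concrete, here is a toy reverse-mode sketch on plain Python floats. It is not PyTorch's implementation: the classes `Var`, `Add` and `Mul` are hypothetical stand-ins for `Variable` and `Function`, and only the attribute names `data`, `grad` and `creator` mirror the ones described above.

```python
class Var:
    """Toy scalar variable that records which operation created it."""

    def __init__(self, data, creator=None):
        self.data = data
        self.grad = 0.0
        self.creator = creator  # None for variables created by the user

    def __add__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.data + other.data, creator=Add(self, other))

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.data * other.data, creator=Mul(self, other))

    def backward(self, grad_output=1.0):
        # Accumulate the incoming gradient, then pass it to the creator.
        self.grad += grad_output
        if self.creator is not None:
            self.creator.backward(grad_output)


class Add:
    def __init__(self, a, b):
        self.a, self.b = a, b

    def backward(self, g):
        # d(a + b)/da = 1 and d(a + b)/db = 1
        self.a.backward(g)
        self.b.backward(g)


class Mul:
    def __init__(self, a, b):
        self.a, self.b = a, b

    def backward(self, g):
        # d(a * b)/da = b and d(a * b)/db = a
        self.a.backward(g * self.b.data)
        self.b.backward(g * self.a.data)


x = Var(1.0)       # user-created leaf, so x.creator is None
y = x + 2          # y.creator is an Add node
z = y * y * 3      # z = 3 * (x + 2) ** 2
z.backward()       # scalar output, so no grad_output is needed
print(x.grad)      # dz/dx = 6 * (x + 2)
```

The sketch walks every path from the output back to the leaves and sums the contributions, which keeps it short. A real engine would visit each node of the graph once instead.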

## Create a Variable

In [33]:
from torch.autograd import Variable

In [34]:
x = Variable(torch.ones(2, 2), requires_grad=True)
x

Variable containing:
 1  1
 1  1
[torch.FloatTensor of size 2x2]

## Do an Operation on a Variable

In [35]:
y = x + 2
y

Variable containing:
 3  3
 3  3
[torch.FloatTensor of size 2x2]

y was created as a result of an operation, so it has a creator.

In [36]:
y.creator

<torch.autograd._functions.basic_ops.AddConstant at 0x7f72d2179138>

In [37]:
# Do more operations on y
z = y * y * 3
z

Variable containing:
 27  27
 27  27
[torch.FloatTensor of size 2x2]

In [38]:
out = z.mean()
out

Variable containing:
 27
[torch.FloatTensor of size 1]

## Gradients
Let's backprop now. ```out.backward()``` is equivalent to ```out.backward(torch.Tensor([1.0]))```.

In [40]:
out.backward()

In [41]:
x

Variable containing:
 1  1
 1  1
[torch.FloatTensor of size 2x2]

In [42]:
print(x.grad)

Variable containing:
 4.5000  4.5000
 4.5000  4.5000
[torch.FloatTensor of size 2x2]



Let's call the ```out``` Variable $o$. We have that

$$o = \frac{1}{4}\sum _i z_i$$

$$z_i = 3(y_i)^2 = 3(x_i+2)^2$$

$$z_i |_{x_i=1}=27$$

Therefore, $\frac{\partial o }{\partial x_i} = \frac{1}{4} \cdot 6(x_i + 2) = \frac{3}{2}(x_i + 2)$,

hence $$ \frac{\partial o }{\partial x_i} \mid _{x_i=1}=\frac{9}{2}=4.5$$
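As a sanity check on this derivation, the same gradient can be estimated numerically without PyTorch. The sketch below uses a central finite difference; the helper names `o` and `numeric_grad` and the step size `eps` are choices made for this illustration.

```python
def o(x):
    """Mean of z_i = 3 * (x_i + 2) ** 2 over the flattened input."""
    z = [3 * (xi + 2) ** 2 for xi in x]
    return sum(z) / len(z)


def numeric_grad(f, x, eps=1e-6):
    """Central-difference estimate of df/dx_i for each i."""
    grads = []
    for i in range(len(x)):
        hi = x[:i] + [x[i] + eps] + x[i + 1:]
        lo = x[:i] + [x[i] - eps] + x[i + 1:]
        grads.append((f(hi) - f(lo)) / (2 * eps))
    return grads


x = [1.0, 1.0, 1.0, 1.0]                 # the 2x2 matrix of ones, flattened
estimated = numeric_grad(o, x)
analytic = [1.5 * (xi + 2) for xi in x]  # 3/2 * (x_i + 2)
print(estimated)
print(analytic)
```

Each estimated entry should agree with the analytic value of 4.5 up to floating-point error, matching the ```x.grad``` printed earlier.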

## Many crazy things with autograd!


In [44]:
x = torch.randn(3)
x


-0.0701
 0.5492
 0.8409
[torch.FloatTensor of size 3]

In [45]:
x = Variable(x, requires_grad=True)
x

Variable containing:
-0.0701
 0.5492
 0.8409
[torch.FloatTensor of size 3]

In [46]:
y = x * 2
y

Variable containing:
-0.1402
 1.0983
 1.6819
[torch.FloatTensor of size 3]

In [47]:
print(y*2)

Variable containing:
-0.2803
 2.1966
 3.3638
[torch.FloatTensor of size 3]



In [48]:
while y.data.norm() < 1000:
    y = y * 2
    

In [49]:
print(y)

Variable containing:
 -71.7625
 562.3411
 861.1279
[torch.FloatTensor of size 3]



In [50]:
gradients = torch.FloatTensor([0.1, 1.0, 0.0001])
y.backward(gradients)

print(x.grad)

Variable containing:
  102.4000
 1024.0000
    0.1024
[torch.FloatTensor of size 3]
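Where do these numbers come from? Every pass through the loop doubles `y`, so in the end `y = scale * x` for some power of two, and backpropagating `gradients` through that multiplication simply scales each entry. The plain-Python sketch below replays the arithmetic using the rounded values of `x` displayed above; the names `scale` and `x_grad` are introduced here for illustration.

```python
import math

x = [-0.0701, 0.5492, 0.8409]   # rounded values shown in the output above

# Replay y = x * 2 followed by repeated doubling until the norm reaches 1000.
scale = 2.0
y = [scale * xi for xi in x]
while math.sqrt(sum(v * v for v in y)) < 1000:
    scale *= 2.0
    y = [scale * xi for xi in x]

# y_i depends only on x_i, so the Jacobian is scale * identity and the
# backward pass multiplies each supplied gradient by scale.
gradients = [0.1, 1.0, 0.0001]
x_grad = [g * scale for g in gradients]
print(scale)
print(x_grad)
```

With these inputs the loop stops at a scale of 1024, which is why each entry of ```x.grad``` is 1024 times the corresponding entry of `gradients`.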



In [43]:
%%javascript
$.getScript('https://kmahelona.github.io/ipython_notebook_goodies/ipython_notebook_toc.js')

<IPython.core.display.Javascript object>