In [1]:
%matplotlib inline


What is PyTorch?
================

It’s a Python-based scientific computing package targeted at two sets of
audiences:

-  A replacement for NumPy to use the power of GPUs
-  A deep learning research platform that provides maximum flexibility
   and speed

## Getting Started

### Tensors


Tensors are similar to NumPy’s ndarrays, with the addition that
Tensors can also be used on a GPU to accelerate computing.



In [2]:
import torch

Construct a 5x3 matrix, uninitialized:



In [3]:
x = torch.empty(5, 3)
print x

tensor([[     0.0000,      0.0000,      0.0000],
        [     0.0000,  21238.8438,      0.0000],
        [ 20856.1562,      0.0000,  29012.9375],
        [     0.0000,  20856.7031,      0.0000],
        [     0.0000,      0.0000,      0.0000]])


Construct a randomly initialized matrix:



In [6]:
x = torch.rand(5, 3)
print x
print x.dtype

tensor([[ 0.9728,  0.9349,  0.4987],
        [ 0.8644,  0.0221,  0.5359],
        [ 0.2540,  0.5607,  0.9338],
        [ 0.0177,  0.7512,  0.5048],
        [ 0.7924,  0.7063,  0.3772]])
torch.float32


Construct a matrix filled with zeros and of dtype long:



In [7]:
x = torch.zeros(5, 3, dtype=torch.long)
print x
print x.dtype

tensor([[ 0,  0,  0],
        [ 0,  0,  0],
        [ 0,  0,  0],
        [ 0,  0,  0],
        [ 0,  0,  0]])
torch.int64


Construct a tensor directly from data:



In [9]:
x = torch.tensor([5.5, 3])
print x
x.dtype

tensor([ 5.5000,  3.0000])


torch.float32

Or create a tensor based on an existing tensor. These methods
reuse properties of the input tensor, e.g. dtype, unless
new values are provided by the user:



In [20]:
x = x.new_ones(5, 3, dtype=torch.double)      # new_* methods take in sizes
print x

x = torch.randn_like(x, dtype=torch.float)    # override dtype!
print x  # result has the same size
x.dtype
print x.numel()
print x.sum()/x.numel()
print x.mean()

tensor([[ 1.,  1.,  1.],
        [ 1.,  1.,  1.],
        [ 1.,  1.,  1.],
        [ 1.,  1.,  1.],
        [ 1.,  1.,  1.]], dtype=torch.float64)
tensor([[ 0.2639,  0.5860, -1.6559],
        [ 0.9287, -0.7821,  1.4907],
        [-0.5993,  2.5149, -1.4660],
        [ 0.9951,  0.6530,  1.4907],
        [ 0.5419,  0.4566, -0.5260]])
15
tensor(0.3261)
tensor(0.3261)


Get its size:



In [12]:
print x.size()

torch.Size([5, 3])


<div class="alert alert-info"><h4>Note</h4><p>``torch.Size`` is in fact a tuple, so it supports all tuple operations.</p></div>
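
As a small illustration of the note above, the sketch below treats the result of ``size()`` like any other tuple. The variable names are chosen for this example only.

```python
import torch

x = torch.zeros(5, 3)
size = x.size()

# torch.Size is a subclass of tuple, so unpacking, indexing and len() all work
rows, cols = size
is_tuple = isinstance(size, tuple)
n_dims = len(size)
last_dim = size[-1]

print(rows)
print(cols)
print(is_tuple)
```

Unpacking is convenient when a later step needs the individual dimensions, for example to allocate a second tensor with a matching number of rows.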

## Operations

There are multiple syntaxes for operations. In the following
example, we will take a look at the addition operation.

Addition: syntax 1



In [36]:
y = torch.rand(5, 3)
print y

tensor([[ 0.1627,  0.7346,  0.8221],
        [ 0.0905,  0.1984,  0.1749],
        [ 0.5911,  0.6713,  0.5217],
        [ 0.3082,  0.8749,  0.8446],
        [ 0.5158,  0.2335,  0.9977]])


In [30]:
%%timeit
result = x + y
#print result

The slowest run took 30.67 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 1.8 µs per loop


Addition: syntax 2



In [31]:
%%timeit
result = torch.add(x, y)
#print result

The slowest run took 25.10 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 1.75 µs per loop


Addition: providing an output tensor as argument



In [32]:
%%timeit
result = torch.empty(5, 3)
torch.add(x, y, out=result)
#print result

The slowest run took 17.79 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 2.71 µs per loop
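
The timing cells above leave their ``print`` calls commented out, so the results are never shown. The following sketch, using freshly created tensors rather than the ``x`` and ``y`` from the cells above, checks that the three syntaxes produce the same values.

```python
import torch

x = torch.rand(5, 3)
y = torch.rand(5, 3)

r1 = x + y                   # syntax 1: operator
r2 = torch.add(x, y)         # syntax 2: function call
r3 = torch.empty(5, 3)
torch.add(x, y, out=r3)      # result written into a pre-allocated tensor

# torch.equal is True only if sizes and all elements match
all_equal = torch.equal(r1, r2) and torch.equal(r2, r3)
print(all_equal)
```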


Addition: in-place



In [33]:
# adds x to y
y.add_(x)
print y

tensor([[ 0.8802,  0.8180, -0.7560],
        [ 1.7553, -0.4484,  1.6749],
        [-0.1496,  2.8744, -1.2717],
        [ 1.8128,  0.9243,  1.5242],
        [ 1.2921,  1.0975,  0.3919]])


<div class="alert alert-info"><h4>Note</h4><p>Any operation that mutates a tensor in-place is post-fixed with an ``_``.
    For example: ``x.copy_(y)``, ``x.t_()``, will change ``x``.</p></div>
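
To make the naming convention concrete, here is a minimal sketch that contrasts an out-of-place method with its in-place counterpart. The tensors and variable names are illustrative only.

```python
import torch

x = torch.ones(2, 3)

# Out-of-place: mul returns a new tensor and leaves x untouched
doubled = x.mul(2)
x_unchanged = torch.equal(x, torch.ones(2, 3))

# In-place: mul_ overwrites the contents of x
x.mul_(2)
x_changed = torch.equal(x, doubled)

# t_ transposes in place, so the size of m itself changes
m = torch.zeros(2, 3)
m.t_()
m_size = tuple(m.size())

print(x_unchanged)
print(x_changed)
print(m_size)
```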

You can use standard NumPy-like indexing with all bells and whistles!



In [34]:
print x[:, 1]

tensor([ 0.5860, -0.7821,  2.5149,  0.6530,  0.4566])
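
A few more indexing forms, sketched on a small tensor with known values so the selections are easy to follow. The tensor here is created for the example and is not the ``x`` used above.

```python
import torch

# 5x3 tensor holding 0..14, row by row
x = torch.arange(15, dtype=torch.float).view(5, 3)

col = x[:, 1]         # every row, second column
block = x[1:3, :2]    # rows 1-2, columns 0-1
mask = x > 10         # boolean mask with the same shape as x
selected = x[mask]    # 1-D tensor of the elements where the mask is set

print(col)
print(block)
print(selected)
```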


Resizing: If you want to resize/reshape a tensor, you can use ``torch.Tensor.view``:



In [35]:
x = torch.randn(4, 4)
y = x.view(16)
z = x.view(-1, 8)  # the size -1 is inferred from other dimensions
print x.size(), y.size(), z.size()

torch.Size([4, 4]) torch.Size([16]) torch.Size([2, 8])
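
One point worth knowing is that ``view`` does not copy data: the returned tensor shares storage with the original. The sketch below, with illustrative variable names, shows a write through the view becoming visible in the original tensor.

```python
import torch

x = torch.zeros(4, 4)
y = x.view(16)

# Writing through the view also changes x, because no data was copied
y[0] = 1.0
first_element = x[0, 0].item()

# Both tensors start at the same memory address
same_storage = x.data_ptr() == y.data_ptr()

print(first_element)
print(same_storage)
```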


If you have a one-element tensor, use ``.item()`` to get the value as a
Python number:



In [37]:
x = torch.randn(1)
print x
print x.item()

tensor([-0.5042])
-0.504151940346


**Read later:**


  100+ Tensor operations, including transposing, indexing, slicing,
  mathematical operations, linear algebra, random numbers, etc.,
  are described
  `here <http://pytorch.org/docs/torch>`_.

NumPy Bridge
------------

Converting a Torch Tensor to a NumPy array and vice versa is a breeze.

The Torch Tensor and NumPy array will share their underlying memory
locations (as long as the Tensor is on the CPU), and changing one will
change the other.

## Converting a Torch Tensor to a NumPy Array



In [38]:
a = torch.ones(5)
print a

tensor([ 1.,  1.,  1.,  1.,  1.])


In [39]:
b = a.numpy()
print b
type(b)

[1. 1. 1. 1. 1.]


numpy.ndarray

See how the NumPy array changed in value.



In [40]:
a.add_(1)
print a
print b

tensor([ 2.,  2.,  2.,  2.,  2.])
[2. 2. 2. 2. 2.]


## Converting NumPy Array to Torch Tensor

See how changing the NumPy array changed the Torch Tensor automatically.



In [41]:
import numpy as np
a = np.ones(5)
b = torch.from_numpy(a)
np.add(a, 1, out=a)
print a
print b

[2. 2. 2. 2. 2.]
tensor([ 2.,  2.,  2.,  2.,  2.], dtype=torch.float64)


All the Tensors on the CPU except a CharTensor support converting to
NumPy and back.

CUDA Tensors
------------

Tensors can be moved onto any device using the ``.to`` method.



In [44]:
# let us run this cell only if CUDA is available
# We will use ``torch.device`` objects to move tensors in and out of GPU
print torch.cuda.is_available()
if torch.cuda.is_available():
    device = torch.device("cuda")          # a CUDA device object
    y = torch.ones_like(x, device=device)  # directly create a tensor on GPU
    x = x.to(device)                       # or just use strings ``.to("cuda")``
    z = x + y
    print z
    print z.to("cpu", torch.double)        # ``.to`` can also change dtype together!

False


### If you want to write device independent code

In [None]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
x = x.to(device)  # moves x to the GPU if one is available; otherwise x stays on the CPU
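
As a slightly longer sketch of the same idea, the block below picks the device once and then passes it around, so the same code runs unchanged with or without a GPU. The tensor names are illustrative.

```python
import torch

# Choose the device once, then reuse it everywhere
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

a = torch.ones(2, 3, device=device)   # created directly on the chosen device
b = torch.rand(2, 3).to(device)       # created on the CPU, then moved
c = a + b                             # both operands live on the same device

# Bring the result back to the CPU, e.g. before converting it to NumPy
result = c.to("cpu")

print(c.device.type)
print(result.device.type)
```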

See the following link for details on working with multiple GPUs:

https://pytorch.org/docs/stable/notes/cuda.html#cuda-semantics

## Explicit cpu/cuda syntax

In [45]:
# Send a PyTorch tensor to GPU
if torch.cuda.is_available():

    # How many GPUs do we have?
    num_gpus    = torch.cuda.device_count()
    current_gpu = torch.cuda.current_device()
    print "Current device index: {}. Total number of devices: {}".format(current_gpu,num_gpus)
    torch.cuda.set_device(0)
    z = z.cuda() 
    
    print "\nOn GPU: z = \n{}".format(z)
    print "Type ",z.type() # differs from type(z): Python's type() does not distinguish CPU from GPU tensors

    print "Notice the type has now become torch.cuda.FloatTensor."
    if num_gpus>=2: 
        # Now let's send it to a different GPU if available
        print "Switching to a different device..."
        torch.cuda.set_device(1)
        print "Current device index: {}".format(torch.cuda.current_device())
        z = z.cuda()
        print "On a different GPU: z = {}".format(z)
        print "Type ", z.type()
else:
    print "No GPU available"
# Send Tensor back to CPU
z = z.cpu()
print("\nOn CPU: z = {}".format(z))
print "Type ",z.type()

No GPU available

On CPU: z = tensor([[ 0.2623, -0.0198,  1.1783,  0.2415,  0.2092,  0.9426,  1.5065,
         -1.1077],
        [ 0.9881,  0.0359, -1.0039,  0.7913,  0.5045, -1.2458, -0.6967,
         -0.0208]])
Type  torch.FloatTensor


In [46]:
# Convert Tensor to Numpy array and vice versa
import numpy as np
z = x + y
z = z.numpy()
print "Converted to numpy, z = {}\n{}".format(type(z), z)

z = torch.from_numpy(z)
print "Converted to Tensor, z = {}".format(z)

# Note that the conversion is not possible when the Tensor is on GPU
z = z.cuda()
z = z.numpy()  # <<<< create error on purpose!!
#try:
#    z = z.numpy()
#except RuntimeError as e:
#    print "Error: {}".format(e)

Converted to numpy, z = <type 'numpy.ndarray'>
[[-0.34143215  0.2304405   0.31792617]
 [-0.4136486  -0.30574924 -0.3292904 ]
 [ 0.08693099  0.16712463  0.01755226]
 [-0.19599134  0.3707618   0.34043664]
 [ 0.01163363 -0.2706508   0.49351948]]
Converted to Tensor, z = tensor([[-0.3414,  0.2304,  0.3179],
        [-0.4136, -0.3057, -0.3293],
        [ 0.0869,  0.1671,  0.0176],
        [-0.1960,  0.3708,  0.3404],
        [ 0.0116, -0.2707,  0.4935]])


RuntimeError: cuda runtime error (30) : unknown error at /opt/conda/conda-bld/pytorch_1524577523076/work/aten/src/THC/THCGeneral.cpp:70
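
Since ``.numpy()`` only works for tensors in CPU memory, a common pattern is to move the tensor back to the CPU before converting it. The sketch below wraps this in a hypothetical helper, ``to_numpy``, which is not part of PyTorch. For a tensor that is already on the CPU, ``.cpu()`` returns the tensor itself without copying, so the helper can be used in device-independent code.

```python
import numpy as np
import torch


def to_numpy(t):
    """Hypothetical helper: convert a tensor on any device to a NumPy array."""
    # .cpu() copies GPU tensors to host memory and is a no-op for CPU tensors
    return t.cpu().numpy()


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
z = torch.ones(2, 3, device=device)

arr = to_numpy(z)
print(type(arr))
print(arr.shape)
```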

### Timing
The speed improvement on GPU is the subject of the next tutorial, tensor_tutorial_timing.ipynb.