PyTorch is an open-source machine learning framework that allows you to write your own neural networks and optimize them efficiently.

We start by importing a set of standard libraries that are commonly used in machine learning projects.

In [1]:
import time

import matplotlib.pyplot as plt

%matplotlib inline
import matplotlib_inline.backend_inline
import numpy as np
import torch
import torch.nn as nn
import torch.utils.data as data
from matplotlib.colors import to_rgba
from torch import Tensor
from tqdm.notebook import tqdm  # Progress bar


Tensors are the PyTorch equivalent of NumPy arrays, with added support for GPU acceleration (more on that later). The term "tensor" generalizes concepts you already know: a vector is a 1-D tensor, and a matrix is a 2-D tensor. When working with neural networks, we will use tensors of various shapes and numbers of dimensions.

Most common functions you know from NumPy can be used on tensors as well. Because NumPy arrays are so similar to tensors, we can convert most tensors to NumPy arrays (and back), although we rarely need to.

#### Initialization

Let's start by looking at different ways of creating a tensor. There are many options; the simplest is to call `Tensor` with the desired shape as input arguments:

In [3]:
x = Tensor(2, 3, 4)
print(x)

tensor([[[ 1.5638e-42,  0.0000e+00, -2.1231e-09,  3.1142e-41],
         [ 9.8091e-45,  3.1142e-41, -2.1231e-09,  3.1142e-41],
         [ 1.5624e-42,  0.0000e+00, -2.1231e-09,  3.1142e-41]],

        [[ 9.8091e-45,  4.5307e-41, -2.1231e-09,  3.1142e-41],
         [ 1.5456e-42,  0.0000e+00, -2.1231e-09,  3.1142e-41],
         [ 9.8091e-45,  4.5307e-41, -2.1231e-09,  3.1142e-41]]])


The function `torch.Tensor` allocates memory for the desired tensor but does not initialize it, so the tensor contains whatever values were already in that memory.
To directly assign values to the tensor during initialization, there are many alternatives, including:

* `torch.zeros`: Creates a tensor filled with zeros
* `torch.ones`: Creates a tensor filled with ones
* `torch.rand`: Creates a tensor with random values uniformly sampled between 0 and 1
* `torch.randn`: Creates a tensor with random values sampled from a normal distribution with mean 0 and variance 1
* `torch.arange`: Creates a tensor containing the values $N,N+1,N+2,...,M-1$ when called as `torch.arange(N, M)` (the end value $M$ is excluded)
* `torch.Tensor` (input list): Creates a tensor from the list elements you provide
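
The creation functions listed above can be tried side by side. The following is a minimal sketch; the variable names are chosen only for illustration, and the random values will differ on every run.

```python
import torch

# Each call takes the desired shape (or range) and returns an initialized tensor.
zeros = torch.zeros(2, 3)    # all entries are 0.0
ones = torch.ones(2, 3)      # all entries are 1.0
uniform = torch.rand(2, 3)   # samples from the uniform distribution on [0, 1)
normal = torch.randn(2, 3)   # samples from a standard normal distribution
steps = torch.arange(2, 6)   # start is included, end is excluded

print(zeros.shape, ones.shape, uniform.shape, normal.shape)
print(steps)  # tensor([2, 3, 4, 5])
```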

In [4]:
# Create a tensor from a (nested) list
x = Tensor([[1, 2], [3, 4]])
print(x)

tensor([[1., 2.],
        [3., 4.]])


In [5]:
# Create a tensor with random values between 0 and 1 with the shape [2, 3, 4]
x = torch.rand(2, 3, 4)
print(x)

tensor([[[0.6763, 0.3808, 0.1976, 0.6443],
         [0.3350, 0.4124, 0.3835, 0.3416],
         [0.2083, 0.6784, 0.2926, 0.1096]],

        [[0.8918, 0.3804, 0.4394, 0.4114],
         [0.5700, 0.4252, 0.3843, 0.4608],
         [0.7291, 0.8797, 0.8366, 0.6992]]])


You can obtain the shape of a tensor in the same way as in numpy (`x.shape`), or using the `.size` method:

In [6]:
shape = x.shape
print("Shape:", x.shape)

size = x.size()
print("Size:", size)

dim1, dim2, dim3 = x.size()
print("Size:", dim1, dim2, dim3)

Shape: torch.Size([2, 3, 4])
Size: torch.Size([2, 3, 4])
Size: 2 3 4
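
Besides the full shape, it is often handy to query a single dimension or the total number of elements. A short sketch of a few related accessors, using the same shape as above:

```python
import torch

x = torch.rand(2, 3, 4)

print(x.size(1))    # size of dimension 1 -> 3
print(x.shape[-1])  # torch.Size supports tuple-style indexing -> 4
print(x.ndim)       # number of dimensions -> 3
print(x.numel())    # total number of elements, 2 * 3 * 4 -> 24
```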


#### Tensor to Numpy, and Numpy to Tensor

Tensors can be converted to numpy arrays, and numpy arrays back to tensors.
To transform a numpy array into a tensor, we can use the function `torch.from_numpy`:

In [7]:
np_arr = np.array([[1, 2], [3, 4]])
tensor = torch.from_numpy(np_arr)

print("Numpy array:", np_arr)
print("PyTorch tensor:", tensor)

Numpy array: [[1 2]
 [3 4]]
PyTorch tensor: tensor([[1, 2],
        [3, 4]])
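
One detail worth knowing is that `torch.from_numpy` does not copy the data: the tensor and the array share the same memory. The sketch below contrasts this with `torch.tensor`, which copies; the names `shared` and `copied` are only illustrative.

```python
import numpy as np
import torch

np_arr = np.array([[1, 2], [3, 4]])
shared = torch.from_numpy(np_arr)  # shares memory with np_arr
copied = torch.tensor(np_arr)      # torch.tensor always copies the data

np_arr[0, 0] = 100  # modify the NumPy array in place

print(shared[0, 0].item())  # 100, the change is visible through the shared tensor
print(copied[0, 0].item())  # 1, the copy is unaffected
```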


To transform a PyTorch tensor back to a NumPy array, we can call the method `.numpy()` on the tensor:

In [8]:
tensor = torch.arange(4)
np_arr = tensor.numpy()

print("PyTorch tensor:", tensor)
print("Numpy array:", np_arr)

PyTorch tensor: tensor([0, 1, 2, 3])
Numpy array: [0 1 2 3]
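
The conversion in this direction also shares memory for CPU tensors, and it keeps the data type. A small sketch, assuming the tensor may live on another device and is therefore moved with `.cpu()` first:

```python
import torch

tensor = torch.arange(4)
np_arr = tensor.cpu().numpy()  # .cpu() returns the tensor itself if it is already on the CPU

print(tensor.dtype, np_arr.dtype)  # torch.int64 int64

tensor[0] = 7     # in-place write on the tensor
print(np_arr[0])  # 7, the array shares memory with the CPU tensor
```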


#### Operations

Most operations that exist in NumPy also exist in PyTorch.
A full list of operations can be found in the [PyTorch documentation](https://pytorch.org/docs/stable/tensors.html#), but we will review the most important ones here.

The simplest operation is to add two tensors:

In [9]:
x1 = torch.rand(2, 3)
x2 = torch.rand(2, 3)
y = x1 + x2

print("X1", x1)
print("X2", x2)
print("Y", y)

X1 tensor([[0.8191, 0.9490, 0.3723],
        [0.9975, 0.4540, 0.8013]])
X2 tensor([[0.5628, 0.5738, 0.5600],
        [0.7522, 0.9454, 0.1220]])
Y tensor([[1.3819, 1.5228, 0.9323],
        [1.7497, 1.3994, 0.9233]])


Calling `x1 + x2` creates a new tensor containing the sum of the two inputs.
However, we can also use in-place operations, which write their result directly into the memory of an existing tensor.
In the example below, this overwrites the values of `x2`, so its previous values can no longer be accessed after the operation:

In [10]:
x1 = torch.rand(2, 3)
x2 = torch.rand(2, 3)
print("X1 (before)", x1)
print("X2 (before)", x2)

x2.add_(x1)
print("X1 (after)", x1)
print("X2 (after)", x2)

X1 (before) tensor([[0.1022, 0.9202, 0.6550],
        [0.9406, 0.1478, 0.0271]])
X2 (before) tensor([[0.8510, 0.9075, 0.8578],
        [0.5802, 0.5429, 0.4748]])
X1 (after) tensor([[0.1022, 0.9202, 0.6550],
        [0.9406, 0.1478, 0.0271]])
X2 (after) tensor([[0.9531, 1.8278, 1.5128],
        [1.5208, 0.6908, 0.5019]])


In-place operations are usually marked with an underscore suffix (for example `x2.add_(x1)` instead of `x2.add(x1)`).
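
To see that an in-place operation really reuses the existing memory, we can compare memory addresses before and after. This is a rough sketch using `data_ptr()`, which returns the address of a tensor's first element:

```python
import torch

x1 = torch.ones(2, 3)
x2 = torch.ones(2, 3)

ptr_before = x2.data_ptr()  # address of the first element of x2
result = x2.add_(x1)        # in-place: writes x1 + x2 into x2's memory

print(x2.data_ptr() == ptr_before)         # True, no new memory was allocated
print(result.data_ptr() == x2.data_ptr())  # True, add_ returns the modified tensor

y = x2 + x1                                # out-of-place: allocates a new tensor
print(y.data_ptr() == x2.data_ptr())       # False
```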

Another common operation aims at changing the shape of a tensor.
A tensor of size (2,3) can be re-organized to any other shape with the same number of elements (e.g. a tensor of size (6), or (3,2), ...).
In PyTorch, this operation is called `view`. A related operation, `permute`, reorders the dimensions of a tensor:

In [11]:
x = torch.arange(6)
print("X", x)

X tensor([0, 1, 2, 3, 4, 5])


In [12]:
x = x.view(2, 3)
print("X", x)

X tensor([[0, 1, 2],
        [3, 4, 5]])


In [13]:
x = x.permute(1, 0)  # Swapping dimension 0 and 1
print("X", x)

X tensor([[0, 3],
        [1, 4],
        [2, 5]])
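
Note that `view` and `permute` do not move any data; they only change how the same memory is interpreted. As a consequence, `view` can fail on a permuted tensor. The sketch below illustrates this and uses `reshape` as a fallback, which copies the data when needed:

```python
import torch

x = torch.arange(6).view(2, 3)
xt = x.permute(1, 0)        # shape [3, 2]; same memory, different strides

print(xt.is_contiguous())   # False

try:
    xt.view(6)              # view needs a memory layout compatible with the new shape
except RuntimeError as err:
    print("view failed:", err)

flat = xt.reshape(6)        # reshape copies the data when a view is not possible
print(flat)                 # tensor([0, 3, 1, 4, 2, 5])
```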


Other commonly used operations include matrix multiplications, which are essential for neural networks.
Quite often, we have an input vector $\mathbf{x}$, which is transformed using a learned weight matrix $\mathbf{W}$.
There are multiple ways and functions to perform matrix multiplication, some of which we list below:

* `torch.matmul`: Performs the matrix product of two tensors, where the specific behavior depends on the dimensions. If both inputs are matrices (2-dimensional tensors), it performs the standard matrix product. It can also be written as `a @ b`, similar to NumPy.
* `torch.mm`: Performs the matrix product of two matrices. Unlike `torch.matmul`, it does not support broadcasting.
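
The difference between these functions shows up once a batch dimension is involved. A minimal sketch with illustrative shapes (the tensor names are not part of the tutorial):

```python
import torch

a = torch.arange(6, dtype=torch.float32).view(2, 3)
b = torch.arange(12, dtype=torch.float32).view(3, 4)

# For 2-D inputs, all three calls compute the same matrix product.
print(torch.equal(torch.matmul(a, b), a @ b))           # True
print(torch.equal(torch.matmul(a, b), torch.mm(a, b)))  # True

# torch.matmul also broadcasts over leading (batch) dimensions.
batch = torch.rand(5, 2, 3)
batched = torch.matmul(batch, b)
print(batched.shape)  # torch.Size([5, 2, 4])

# torch.mm only accepts 2-D tensors.
try:
    torch.mm(batch, b)
except RuntimeError as err:
    print("mm failed:", err)
```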

In [16]:
x = torch.arange(6)
x = x.view(2, 3)
print("X", x)

X tensor([[0, 1, 2],
        [3, 4, 5]])


In [17]:
W = torch.arange(9).view(3, 3)  # We can also stack multiple operations in a single line
print("W", W)

W tensor([[0, 1, 2],
        [3, 4, 5],
        [6, 7, 8]])


In [18]:
h = torch.matmul(x, W)  # Verify the result by calculating it by hand too!
print("h", h)

h tensor([[15, 18, 21],
        [42, 54, 66]])
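
The "by hand" check can itself be written as code. The following sketch spells out the definition $h_{ij} = \sum_k x_{ik} W_{kj}$ with explicit loops and compares the result against `torch.matmul`; it is meant for illustration only, since loops like this are far slower than the built-in operation.

```python
import torch

x = torch.arange(6).view(2, 3)
W = torch.arange(9).view(3, 3)

# h[i, j] = sum over k of x[i, k] * W[k, j]
h_manual = torch.zeros(2, 3, dtype=x.dtype)
for i in range(2):
    for j in range(3):
        h_manual[i, j] = sum(x[i, k] * W[k, j] for k in range(3))

print(h_manual)
print(torch.equal(h_manual, torch.matmul(x, W)))  # True
```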


#### Indexing

We often need to select only a part of a tensor.
Indexing works just like in NumPy, so let's try it:

In [19]:
x = torch.arange(12).view(3, 4)
print("X", x)

X tensor([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]])


In [20]:
print(x[:, 1])  # Second column

tensor([1, 5, 9])


In [21]:
print(x[0])  # First row

tensor([0, 1, 2, 3])


In [22]:
print(x[:2, -1])  # First two rows, last column

tensor([3, 7])


In [23]:
print(x[1:3, :])  # Last two rows

tensor([[ 4,  5,  6,  7],
        [ 8,  9, 10, 11]])


Assignment: Study other Python libraries such as Matplotlib and Seaborn.