# PyTorch 101

## Introduction
PyTorch has caused quite a stir in the machine learning community since Facebook open-sourced it in early 2017. Much of the attention comes from its Pythonic design and its dynamic computation graph.

PyTorch's main competitor is TensorFlow, which was released well over a year earlier, is backed by Google, and had already established itself as the gold standard for a variety of neural network tools.

At its core, just like Google's TensorFlow, PyTorch is a library for processing tensors. PyTorch may still lack TensorFlow's widespread adoption, but it has become extremely popular, especially in the research community.

We begin by importing PyTorch:

In [1]:
import torch

## Tensors

A tensor is a number, vector, matrix or any n-dimensional array. Let's start with creating a tensor with a single number:

In [2]:
# create a number, rank 0
t0 = torch.tensor(4.)
t0

tensor(4.)

Here `4.` is shorthand for `4.0`. It tells Python (and PyTorch) that we want a floating point number. We can verify this by checking the `dtype` attribute of our tensor. In addition, tensors can have any number of dimensions, and a different length along each dimension. We can inspect the length along each dimension using the `.shape` property of a tensor, and get the number of dimensions with its `dim()` method.

In [3]:
t0.dtype, t0.shape, t0.dim()

(torch.float32, torch.Size([]), 0)
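As a small sketch of how the `dtype` is chosen, the snippet below compares an integer literal with a float literal and shows an explicit cast. It assumes PyTorch's default dtype settings have not been changed; the variable names are only illustrative.

```python
import torch

a = torch.tensor(4)        # Python int   -> torch.int64 by default
b = torch.tensor(4.)       # Python float -> torch.float32 by default
c = a.to(torch.float32)    # explicit cast; returns a new tensor, `a` is unchanged
d = b.item()               # pull the value out of a one-element tensor as a Python float

print(a.dtype, b.dtype, c.dtype, type(d))
```

Being explicit about the `dtype` is useful because many operations expect their operands to have matching types.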

A rank `1` tensor can be created from a Python list:

In [4]:
# create a vector, rank 1
t1 = torch.tensor([1., 2, 3, 4])
t1, t1.shape, t1.dim()

(tensor([1., 2., 3., 4.]), torch.Size([4]), 1)

A matrix, which is a rank `2` tensor, can be created with a nested list:

In [5]:
# create a matrix, rank 2
t2 = torch.tensor([[5, 6], [7, 8], [9, 10]], dtype=torch.int32)
t2, t2.shape, t2.dim(), t2.dtype

(tensor([[ 5,  6],
         [ 7,  8],
         [ 9, 10]], dtype=torch.int32),
 torch.Size([3, 2]),
 2,
 torch.int32)

A rank `3` tensor is created the same way, with one more level of nesting:

In [6]:
# create 3-dimensional array, rank 3
t3 = torch.tensor([
    [[11, 12, 13], 
     [13, 14, 15]], 
    [[15, 16, 17], 
     [17, 18, 19.]]])
t3, t3.shape, t3.dim()

(tensor([[[11., 12., 13.],
          [13., 14., 15.]],
 
         [[15., 16., 17.],
          [17., 18., 19.]]]),
 torch.Size([2, 2, 3]),
 3)

Tensors can also be created with PyTorch functions, for example a tensor filled with zeros or ones. For comparison, the NumPy equivalent `np.zeros` is shown first:

In [7]:
import numpy as np

In [8]:
tt = np.zeros((2, 3))
tt

array([[0., 0., 0.],
       [0., 0., 0.]])

In [9]:
# fill tensor with zeros
t = torch.zeros(2, 3)
t

tensor([[0., 0., 0.],
        [0., 0., 0.]])

In [10]:
# fill tensor with ones
t = torch.ones(3, 2)
t

tensor([[1., 1.],
        [1., 1.],
        [1., 1.]])
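A few related creation functions follow the same pattern. This is a minimal sketch rather than a complete list; the variable names are illustrative.

```python
import torch

base = torch.zeros(2, 3)
ones_like_base = torch.ones_like(base)             # same shape and dtype as `base`
sevens = torch.full((2, 3), 7.0)                   # every element set to a constant
steps = torch.arange(0, 10, 2)                     # evenly spaced integers: 0, 2, 4, 6, 8
int_zeros = torch.zeros(2, 3, dtype=torch.int64)   # dtype can be set explicitly

print(ones_like_base.shape, sevens, steps, int_zeros.dtype)
```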

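We can also create a tensor filled with random numbers. Note that NumPy's `np.random.random` draws from a uniform distribution over `[0, 1)`, while `torch.randn` draws from a standard normal distribution: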

In [11]:
tt = np.random.random((4, 3))
tt

array([[0.79985573, 0.21326207, 0.85330254],
       [0.46372578, 0.07566513, 0.56195847],
       [0.40796397, 0.90580995, 0.77255632],
       [0.94736895, 0.05747822, 0.49638903]])

In [12]:
# create a tensor of samples from the standard normal distribution
t = torch.randn(4, 3)
t

tensor([[-0.5623, -1.1501,  0.8111],
        [ 0.1832, -0.7619,  0.2897],
        [ 0.7406,  1.4099,  0.2377],
        [-0.2244, -0.1181, -1.1494]])
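Random values differ on every run, which can make experiments hard to repeat. The sketch below seeds the generator with `torch.manual_seed` so that two draws match, and uses `torch.rand` for uniform samples (the closer counterpart of `np.random.random`). The seed value `0` is an arbitrary choice.

```python
import torch

torch.manual_seed(0)
first = torch.randn(4, 3)      # standard normal samples
torch.manual_seed(0)
second = torch.randn(4, 3)     # same seed, so the same samples
same = torch.equal(first, second)

uniform = torch.rand(4, 3)     # uniform samples over [0, 1)
in_range = bool(((uniform >= 0) & (uniform < 1)).all())

print(same, in_range)
```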

## Tensor view
A tensor can be reshaped with `view`. For example, given a `2x3` matrix:

In [13]:
t = torch.tensor([[1., 2, 3], [4, 5, 6]])  # the float literal makes the tensor float32
print(t, t.shape, t.dtype)

tensor([[1., 2., 3.],
        [4., 5., 6.]]) torch.Size([2, 3]) torch.float32


It can be reshaped into a `3x2` matrix with either `view` or `reshape`:

In [14]:
s = t.view(3, 2)
s.shape

torch.Size([3, 2])

In [15]:
t.reshape(3, 2)

tensor([[1., 2.],
        [3., 4.],
        [5., 6.]])

or viewed as a `6x1` column:

In [16]:
t.view(6, 1)

tensor([[1.],
        [2.],
        [3.],
        [4.],
        [5.],
        [6.]])
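Two details of `view` are worth a quick sketch: a size of `-1` asks PyTorch to infer that dimension, and the result shares memory with the original tensor, so a write through the view is visible in the original. The variable names here are illustrative.

```python
import torch

t = torch.tensor([[1., 2, 3], [4, 5, 6]])

v = t.view(3, -1)            # -1 is inferred from the element count: 6 / 3 = 2
v[0, 0] = 100.               # write through the view...
shared = t[0, 0].item()      # ...and the original sees the change

print(v.shape, shared)
```

If an independent copy is needed, calling `.clone()` on the tensor first avoids this aliasing.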

## Tensor Operations

We can transpose a tensor in two ways, with `t()` or with `permute()`, and neither changes the original tensor:

In [17]:
# transpose a tensor
t2.t()

tensor([[ 5,  7,  9],
        [ 6,  8, 10]], dtype=torch.int32)

In [18]:
# or permute
t2.permute(1, 0)

tensor([[ 5,  7,  9],
        [ 6,  8, 10]], dtype=torch.int32)
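Although the original tensor keeps its shape, the transposed result is a view over the same memory and is usually not contiguous. The sketch below, which rebuilds the `3x2` matrix under the hypothetical name `m`, shows that a write through the transpose reaches the original, and that `view` refuses such a tensor while `reshape` copies when it has to.

```python
import torch

m = torch.tensor([[5, 6], [7, 8], [9, 10]], dtype=torch.int32)
mt = m.t()                       # shape (2, 3); m itself stays (3, 2)

mt[0, 1] = -7                    # writes through to m[1, 0]
contiguous = mt.is_contiguous()  # the transposed view is not contiguous

try:
    mt.view(6)                   # view needs a compatible memory layout
    view_failed = False
except RuntimeError:
    view_failed = True

flat = mt.reshape(6)             # reshape falls back to a copy when needed

print(m.shape, mt.shape, contiguous, view_failed, flat)
```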

Here are a few tensor operations. The first is elementwise multiplication:

In [19]:
# Elementwise multiplication
t1 = torch.tensor([[1., 2], [3, 4]])
t2 = torch.tensor([[5., 6], [7, 8]])
t1*t2
# equivalent: t1.mul(t2)

tensor([[ 5., 12.],
        [21., 32.]])

And here is a matrix product:

In [20]:
# Compute matrix product
m1 = torch.tensor([[1., 2], [3, 4]])
m2 = torch.tensor([[5.], [7]])
m = m1.mm(m2)
m

tensor([[19.],
        [43.]])
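The same product can be written in other ways. This sketch compares `mm` with the `@` operator and `torch.matmul`, and shows that mismatched inner dimensions raise an error; the names `a`, `b` and `c` are illustrative.

```python
import torch

m1 = torch.tensor([[1., 2], [3, 4]])   # shape (2, 2)
m2 = torch.tensor([[5.], [7]])         # shape (2, 1)

a = m1.mm(m2)
b = m1 @ m2                  # operator form of matrix multiplication
c = torch.matmul(m1, m2)     # function form
all_equal = torch.equal(a, b) and torch.equal(a, c)

try:
    m2 @ m1                  # (2, 1) @ (2, 2): inner dimensions 1 and 2 differ
    mismatch_raised = False
except RuntimeError:
    mismatch_raised = True

print(all_equal, mismatch_raised)
```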

## Tensor slicing
The similarities between NumPy ndarray and PyTorch tensor operations should be obvious by now. PyTorch tensors can be sliced just like NumPy ndarrays, with syntax that should be familiar to anyone who has sliced Python lists:

In [21]:
# Slicing
t = torch.tensor([[1., 2, 3], [4, 5, 6], [7, 8, 9]])

# Every row, only the last column
print(t[:, -1])

# First 2 rows, all columns
print(t[:2, :])

# Bottom-right corner, kept as a 1x1 tensor
print(t[-1:, -1:])

tensor([3., 6., 9.])
tensor([[1., 2., 3.],
        [4., 5., 6.]])
tensor([[9.]])
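As with NumPy, a basic slice is a view of the original data, while indexing with a boolean mask produces a copy. The following sketch illustrates the difference; the variable names are illustrative.

```python
import torch

t = torch.tensor([[1., 2, 3], [4, 5, 6], [7, 8, 9]])

last_col = t[:, -1]      # basic slice: a view into t
last_col[0] = 30.        # also changes t[0, 2]

mask = t > 5             # boolean tensor with the same shape as t
selected = t[mask]       # 1-D tensor holding a copy of the matching elements
selected[0] = -1.        # does not touch t

print(t[0, 2], selected)
```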


## Tensor gradients

We can combine tensors with the usual arithmetic operations. Let's look at an example:

In [22]:
# Create tensors.
x = torch.tensor(3.)
w = torch.tensor(4., requires_grad=True)
b = torch.tensor(5., requires_grad=True)

We've created 3 tensors `x`, `w` and `b`, all numbers. `w` and `b` have an additional parameter `requires_grad` set to `True`. We'll see what it does in just a moment. 

Let's create a new tensor `y` by combining these tensors:

In [23]:
# Arithmetic operations
y = w * x + b
y

tensor(17., grad_fn=<AddBackward0>)

As expected, `y` is a tensor with the value `3 * 4 + 5 = 17`. What makes PyTorch special is that we can automatically compute the derivative of `y` w.r.t. the tensors that have `requires_grad` set to `True`, i.e. `w` and `b`. To compute the derivatives, we call the `.backward()` method on our result `y`.

In [24]:
# Compute derivatives
y.backward()

The derivatives of `y` w.r.t. the input tensors are stored in the `.grad` property of the respective tensors.

In [25]:
# Display gradients
print('dy/dx:', x.grad)   # Note dy/dx is None because x's requires_grad is set to False
print('dy/dw:', w.grad)
print('dy/db:', b.grad)

dy/dx: None
dy/dw: tensor(3.)
dy/db: tensor(1.)


As expected, `dy/dw` has the same value as `x` i.e. `3`, and `dy/db` has the value `1`. Note that `x.grad` is `None`, because `x` doesn't have `requires_grad` set to `True`. 
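One behaviour that often surprises newcomers is that `.grad` accumulates across calls to `.backward()` rather than being overwritten. The sketch below shows the accumulation, resets the gradient with `zero_()`, and uses `torch.no_grad()` to compute a result without tracking. It reuses `x` and `w` from above with the same values.

```python
import torch

x = torch.tensor(3.)
w = torch.tensor(4., requires_grad=True)

(w * x).backward()
first = w.grad.item()        # dy/dw = x = 3

(w * x).backward()
second = w.grad.item()       # gradients add up: 3 + 3 = 6

w.grad.zero_()               # reset in place before the next backward pass

with torch.no_grad():        # operations in this block are not tracked
    y = w * x
tracked = y.requires_grad

print(first, second, w.grad, tracked)
```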

The "grad" in `w.grad` stands for gradient, the generalization of the derivative to functions of several variables: it collects the partial derivatives with respect to every element of a vector or matrix.
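To see a gradient with more than one component, here is a minimal sketch using vectors. The values are made up for illustration; since `y` is the sum of `w_i * x_i`, each partial derivative `dy/dw_i` equals `x_i`.

```python
import torch

x = torch.tensor([1., 2., 3.])
w = torch.tensor([0.5, -1., 2.], requires_grad=True)

y = (w * x).sum()    # scalar result: 0.5*1 + (-1)*2 + 2*3 = 4.5
y.backward()         # backward() without arguments needs a scalar output

print(y.item(), w.grad)   # w.grad has the same shape as w
```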

## PyTorch Tensor to and from NumPy ndarray
NumPy is the core library for scientific computing in Python. It provides a high-performance multidimensional array object and tools for working with these arrays.

Instead of reinventing the wheel, PyTorch interoperates well with NumPy to leverage its existing ecosystem of tools and libraries. We can easily create a tensor from an `ndarray` and vice versa. With `torch.from_numpy` and `.numpy()` on CPU tensors, these conversions are fast because both objects share the same memory, so no copy is made.

Here's a NumPy array:

In [26]:
import numpy as np

x = np.array([[1, 2, 3], [3, 4., 5]])
x

array([[1., 2., 3.],
       [3., 4., 5.]])

We can convert a NumPy array to a PyTorch tensor using `torch.from_numpy`.

In [27]:
# Convert the numpy array to a torch tensor.
y = torch.from_numpy(x)
y

tensor([[1., 2., 3.],
        [3., 4., 5.]], dtype=torch.float64)

Let's verify that the NumPy array and the torch tensor have the same data type.

In [28]:
x.dtype, y.dtype

(dtype('float64'), torch.float64)

We can convert a PyTorch tensor to a NumPy array using the `.numpy()` method of a tensor.

In [29]:
# Convert a torch tensor to a numpy array
z = y.numpy()
z, z.dtype

(array([[1., 2., 3.],
        [3., 4., 5.]]),
 dtype('float64'))
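The memory sharing can be checked directly. In this sketch, a write to the NumPy array shows up in the tensor created by `torch.from_numpy`, whereas `torch.tensor` makes an independent copy. It also shows that a tensor with `requires_grad=True` should be detached before conversion. The names `copy`, `w` and `arr` are illustrative.

```python
import numpy as np
import torch

x = np.array([[1, 2, 3], [3, 4., 5]])
y = torch.from_numpy(x)      # shares memory with x
copy = torch.tensor(x)       # copies the data

x[0, 0] = 10.                # modify the NumPy array in place
seen_by_tensor = y[0, 0].item()   # the shared tensor sees the new value
seen_by_copy = copy[0, 0].item()  # the copy keeps the old value

# A tensor that tracks gradients is detached from the graph before conversion.
w = torch.tensor([1., 2.], requires_grad=True)
arr = w.detach().numpy()

print(seen_by_tensor, seen_by_copy, arr)
```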

The interoperability between PyTorch and NumPy is really important because most datasets you'll work with will likely be read and preprocessed as NumPy arrays.

PyTorch supports a variety of tensor operations, and we have only scratched the surface. You can learn more about tensors and tensor operations [here](https://pytorch.org/docs/stable/tensors.html). You can take advantage of the interactive Jupyter environment to experiment with tensors and try different combinations of the operations discussed above.