Introduction to PyTorch Tensors
===============================

PyTorch is an optimised `tensor` manipulation library that offers an array of packages for deep learning. Unlike static frameworks such as Theano, Caffe and TensorFlow, PyTorch belongs to the family of dynamic frameworks, which do not require pre-defined computational graphs. This allows a more flexible, imperative style of development, because the computational graph does not have to be declared, compiled, and then executed. The flexibility potentially comes at the cost of computational efficiency, which makes PyTorch less advantageous for production and mobile settings but extremely useful during research and development.

```{image} ../images/nlp_pytorch_book.jpg
:alt: Pytorch for NLP Book
:class: bg-primary mb-1
:width: 200px
:align: left
```
```{image} ../images/logo_pytorch.jpeg
:alt: Pytorch Logo
:class: bg-primary mb-1
:width: 100px
:align: right
```

Reference: *Natural Language Processing with PyTorch* - Building intelligent language applications using deep learning, by Delip Rao and Brian McMahan (copyright O'REILLY Feb 2019)


## Tensors

```{admonition} Tensor
A tensor is a mathematical object holding some multidimensional data. 
```

```{image} ../images/tensor.png
:alt: Tensors
:class: bg-primary mb-1
:width: 80%
:align: center
```
* A tensor of order zero is just a number, or a `scalar`.
* A tensor of order one (1st-order tensor) is an array of numbers, or a `vector`.
* A tensor of order two (2nd-order tensor) is an array of vectors, or a `matrix`.
* A tensor of order n (nth-order tensor) is a generalised n-dimensional array of scalars.
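
The orders listed above can be checked directly in PyTorch. The snippet below is a minimal sketch: the variable names are illustrative, and `Tensor.dim()` is used to report the order of each tensor.

```python
import torch

scalar = torch.tensor(3.0)                        # order 0: a single number
vector = torch.tensor([1.0, 2.0, 3.0])            # order 1: an array of numbers
matrix = torch.tensor([[1.0, 2.0], [3.0, 4.0]])   # order 2: an array of vectors
cube = torch.zeros(2, 3, 4)                       # order 3: a 3-dimensional array

for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("cube", cube)]:
    # dim() gives the order, shape gives the size along each dimension
    print(name, t.dim(), tuple(t.shape))
```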

## Creating Tensors

### A helper function `describe(x)`
`x` is a torch tensor

NOTE: `tensor.shape` is a property, not a callable function

In [1]:
def describe(x):
    print("Type:{}".format(x.type()))
    print("Shape/size:{}".format(x.shape))
    print("Values: \n{}".format(x))


### Creating a tensor with `torch.Tensor()`

In [2]:
import torch

describe(torch.Tensor(2,3))  # memory is allocated but not initialised, so the values are arbitrary

Type:torch.FloatTensor
Shape/size:torch.Size([2, 3])
Values: 
tensor([[1.0862e+21, 3.5241e-40, 1.0879e+21],
        [3.5241e-40, 1.0863e+21, 3.5241e-40]])
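
The odd-looking values above appear because `torch.Tensor(2,3)` allocates memory without initialising it. As a minimal sketch, `torch.empty()` makes the same intent explicit, while `torch.zeros()` is the safer choice when defined values are needed. The variable names here are illustrative.

```python
import torch

# Allocated but not initialised: the contents are whatever was in memory.
uninit = torch.empty(2, 3)

# Allocated and set to a known value.
zeros = torch.zeros(2, 3)

print(uninit.shape, uninit.dtype)
print(zeros)
```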


### Creating a randomly initialized tensor

In [3]:
import torch

describe(torch.rand(2,3))   # uniform random
describe(torch.randn(2,3))  # normal random

Type:torch.FloatTensor
Shape/size:torch.Size([2, 3])
Values: 
tensor([[0.7992, 0.9784, 0.0448],
        [0.9578, 0.0454, 0.0026]])
Type:torch.FloatTensor
Shape/size:torch.Size([2, 3])
Values: 
tensor([[ 0.3601, -0.7138, -1.8004],
        [-0.2314, -0.3745,  0.3433]])
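
Random tensors differ on every run unless the generator is seeded. The sketch below assumes reproducibility is wanted and uses `torch.manual_seed()`. The seed value 0 is arbitrary.

```python
import torch

torch.manual_seed(0)          # fix the random number generator state
first = torch.rand(2, 3)      # uniform samples in [0, 1)

torch.manual_seed(0)          # reset to the same state
second = torch.rand(2, 3)     # reproduces the same samples

print(torch.equal(first, second))
```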


### Creating a filled tensor

In [4]:
import torch

describe(torch.zeros(2,3))

x = torch.ones(2,3)
describe(x)

x.fill_(5)
describe(x)

Type:torch.FloatTensor
Shape/size:torch.Size([2, 3])
Values: 
tensor([[0., 0., 0.],
        [0., 0., 0.]])
Type:torch.FloatTensor
Shape/size:torch.Size([2, 3])
Values: 
tensor([[1., 1., 1.],
        [1., 1., 1.]])
Type:torch.FloatTensor
Shape/size:torch.Size([2, 3])
Values: 
tensor([[5., 5., 5.],
        [5., 5., 5.]])
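
The trailing underscore in `fill_()` marks an in-place operation: it modifies `x` rather than creating a new tensor. As a small sketch, the same filled tensor can also be built in one step with `torch.full()`, and `torch.zeros_like()` copies the shape and dtype of an existing tensor. The variable names are illustrative.

```python
import torch

x = torch.ones(2, 3)
same = x.fill_(5)                   # in-place: x itself now holds 5s

full = torch.full((2, 3), 5.0)      # create a tensor already filled with 5s
zeros_like = torch.zeros_like(x)    # zeros with the shape and dtype of x

print(torch.equal(x, full))
print(zeros_like)
```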


### Creating and initialising a tensor from lists

In [5]:
x = torch.Tensor([[1, 2, 3],
                  [4, 5, 6]])
describe(x)                  

Type:torch.FloatTensor
Shape/size:torch.Size([2, 3])
Values: 
tensor([[1., 2., 3.],
        [4., 5., 6.]])
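
Note that `torch.Tensor()` converts the integer list to floats. A minimal sketch of the alternative, `torch.tensor()` (lowercase), which infers the dtype from the data unless one is given explicitly:

```python
import torch

data = [[1, 2, 3],
        [4, 5, 6]]

as_float = torch.Tensor(data)                        # default float dtype
inferred = torch.tensor(data)                        # dtype inferred from the data
explicit = torch.tensor(data, dtype=torch.float32)   # dtype stated explicitly

print(as_float.dtype, inferred.dtype, explicit.dtype)
```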


### Creating and initialising a tensor from Numpy

In [6]:
import torch
import numpy as np

npy = np.random.rand(2,3)
describe(torch.from_numpy(npy))

Type:torch.DoubleTensor
Shape/size:torch.Size([2, 3])
Values: 
tensor([[0.6475, 0.9183, 0.0888],
        [0.1407, 0.9231, 0.8320]], dtype=torch.float64)
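
The result is a DoubleTensor because NumPy defaults to `float64`. In addition, `torch.from_numpy()` shares memory with the source array rather than copying it. The sketch below illustrates both points with illustrative variable names.

```python
import numpy as np
import torch

npy = np.zeros((2, 3))            # float64 by default
shared = torch.from_numpy(npy)    # no copy: the tensor views the same memory

npy[0, 0] = 1.0                   # a change to the array shows up in the tensor
print(shared)

as_float32 = shared.float()       # converted copy with dtype float32
print(as_float32.dtype)
```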


## Tensor Slicing, Indexing and Joining

In [7]:
import torch
from functions import describe

x = torch.arange(6).view(2,3)
describe(x)

Type:torch.LongTensor
Shape/size:torch.Size([2, 3])
Values: 
tensor([[0, 1, 2],
        [3, 4, 5]])


### Contiguous Indexing using `[:a, :b]`

The code below accesses the rows up to but not including row 1, and the columns up to but not including column 2.

In [8]:
describe(x[:1, :2])

Type:torch.LongTensor
Shape/size:torch.Size([1, 2])
Values: 
tensor([[0, 1]])
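
Slices follow the usual Python convention of an inclusive start and an exclusive stop. A few more slices of the same 2x3 tensor, as a small sketch with illustrative names:

```python
import torch

x = torch.arange(6).view(2, 3)   # [[0, 1, 2], [3, 4, 5]]

top_left = x[:1, :2]             # row 0, columns 0 and 1
last_col = x[:, -1]              # every row, last column
bottom_right = x[1:, 1:]         # row 1 onwards, column 1 onwards

print(top_left)
print(last_col)
print(bottom_right)
```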


### Noncontiguous Indexing

Using the function `torch.index_select()`, the code below accesses the columns (`dim=1`) indexed by 0 and 2.

In [9]:
indices = torch.LongTensor([0, 2])
describe(torch.index_select(x, dim=1, index=indices))

Type:torch.LongTensor
Shape/size:torch.Size([2, 2])
Values: 
tensor([[0, 2],
        [3, 5]])


You can duplicate the same row or column multiple times, by specifying the same index multiple times. 

In [10]:
indices = torch.LongTensor([0, 0, 0])
describe(torch.index_select(x, dim=0, index=indices))

Type:torch.LongTensor
Shape/size:torch.Size([3, 3])
Values: 
tensor([[0, 1, 2],
        [0, 1, 2],
        [0, 1, 2]])


Indexing directly with `[indices_list, indices_list]` pairs the row and column indices element by element, so it selects individual elements rather than whole rows or columns.

In [11]:
row_indices = torch.arange(2).long()
col_indices = torch.LongTensor([0,2])
describe(x[row_indices, col_indices])

Type:torch.LongTensor
Shape/size:torch.Size([2])
Values: 
tensor([0, 5])


In [12]:
describe(x[[0,1], [0,2]])

Type:torch.LongTensor
Shape/size:torch.Size([2])
Values: 
tensor([0, 5])
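
The difference between pairwise indexing and selecting a block of rows and columns can be sketched as follows. The variable names are illustrative, and chaining two indexing steps is only one of several ways to obtain the block.

```python
import torch

x = torch.arange(6).view(2, 3)   # [[0, 1, 2], [3, 4, 5]]
rows = torch.tensor([0, 1])
cols = torch.tensor([0, 2])

# Pairwise: picks the elements at (0, 0) and (1, 2).
pairwise = x[rows, cols]

# Block: picks rows 0 and 1 first, then columns 0 and 2 of the result.
block = x[rows][:, cols]

print(pairwise)
print(block)
```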


### Concatenating Tensors

In [13]:
x = torch.arange(6).view(2,3)
describe(x)

Type:torch.LongTensor
Shape/size:torch.Size([2, 3])
Values: 
tensor([[0, 1, 2],
        [3, 4, 5]])


In [14]:
describe(torch.cat([x, x], dim=0))

Type:torch.LongTensor
Shape/size:torch.Size([4, 3])
Values: 
tensor([[0, 1, 2],
        [3, 4, 5],
        [0, 1, 2],
        [3, 4, 5]])


In [15]:
describe(torch.cat([x, x], dim=1))

Type:torch.LongTensor
Shape/size:torch.Size([2, 6])
Values: 
tensor([[0, 1, 2, 0, 1, 2],
        [3, 4, 5, 3, 4, 5]])


In [16]:
describe(torch.stack([x, x], dim=1))

Type:torch.LongTensor
Shape/size:torch.Size([2, 2, 3])
Values: 
tensor([[[0, 1, 2],
         [0, 1, 2]],

        [[3, 4, 5],
         [3, 4, 5]]])
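
`torch.cat()` joins tensors along an existing dimension, whereas `torch.stack()` inserts a new one. The sketch below compares the resulting shapes and shows that stacking is equivalent to concatenating after adding the new dimension with `unsqueeze()`.

```python
import torch

x = torch.arange(6).view(2, 3)

cat0 = torch.cat([x, x], dim=0)         # existing dim 0 grows: (4, 3)
stack0 = torch.stack([x, x], dim=0)     # new leading dim: (2, 2, 3)
stack1 = torch.stack([x, x], dim=1)     # new middle dim: (2, 2, 3)

# Equivalent to stack1: add the dimension first, then concatenate along it.
manual = torch.cat([x.unsqueeze(1), x.unsqueeze(1)], dim=1)

print(cat0.shape, stack0.shape, stack1.shape)
print(torch.equal(stack1, manual))
```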


### Linear Algebra on tensors: multiplication

In [17]:
x1 = torch.arange(6).view(2,3).float()
describe(x1)

Type:torch.FloatTensor
Shape/size:torch.Size([2, 3])
Values: 
tensor([[0., 1., 2.],
        [3., 4., 5.]])


In [18]:
x2 = torch.ones(3,2)
x2[:, 1] += 1
describe(x2)

Type:torch.FloatTensor
Shape/size:torch.Size([3, 2])
Values: 
tensor([[1., 2.],
        [1., 2.],
        [1., 2.]])


In [19]:
describe(torch.mm(x1, x2))

Type:torch.FloatTensor
Shape/size:torch.Size([2, 2])
Values: 
tensor([[ 3.,  6.],
        [12., 24.]])
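
Matrix multiplication requires the inner dimensions to agree: a (2, 3) matrix times a (3, 2) matrix gives a (2, 2) result. As a small sketch, the `@` operator can be used in place of `torch.mm()` for 2-D tensors:

```python
import torch

x1 = torch.arange(6).view(2, 3).float()   # shape (2, 3)
x2 = torch.ones(3, 2)
x2[:, 1] += 1                             # second column becomes 2s, shape (3, 2)

product = x1 @ x2                         # same result as torch.mm(x1, x2)
print(product.shape)
print(torch.equal(product, torch.mm(x1, x2)))
```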


## CUDA tensors

In [20]:
import torch
from functions import describe

print(torch.cuda.is_available())

True


In [21]:
# preferred method: device-agnostic tensor instantiation

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)


cuda


In [22]:
x = torch.rand(3,2).to(device)
describe(x)

Type:torch.cuda.FloatTensor
Shape/size:torch.Size([3, 2])
Values: 
tensor([[0.5107, 0.8196],
        [0.9484, 0.2701],
        [0.1767, 0.9292]], device='cuda:0')


```{warning}
Mixing CUDA tensors with CPU-bound tensors raises an error, because an operation requires all of its tensors to be on the same device.
```

In [23]:
y = torch.rand(3,2)
x + y

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

In [None]:
cpu_device = torch.device("cpu")
x = x.to(cpu_device)
y = y.to(cpu_device)
x + y

tensor([[0.6276, 0.9583],
        [0.7592, 1.2605],
        [1.0946, 0.9480]])
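
An alternative to moving tensors after the fact is to create them on the target device from the start. The sketch below assumes the device-agnostic pattern shown earlier and passes `device=` to the factory function. It runs on the CPU when no GPU is available.

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Both operands are created on the same device, so they can be combined directly.
x = torch.rand(3, 2, device=device)
y = torch.rand(3, 2, device=device)
z = x + y

print(z.device)
```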

```{note}
It is expensive to move data back and forth between the CPU and the GPU. Best practice is to carry out as much computation as possible on the GPU and transfer only the final results to the CPU.
```
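
A minimal sketch of that practice: the intermediate tensors stay on the device and only a single scalar is brought back at the end. The tensor sizes and variable names are illustrative.

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

a = torch.rand(100, 100, device=device)
b = torch.rand(100, 100, device=device)

# All intermediate results stay on the device.
total = (a @ b).sum()

# One transfer at the end: item() returns a plain Python number.
result = total.item()
print(result)
```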

```{warning}
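`torch.arange()` creates a LongTensor when given integer arguments. `torch.mm()` requires both operands to have the same dtype, so convert the LongTensor to a FloatTensor with `x.float()` before multiplying it by a FloatTensor.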
```
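
The dtype mismatch can be sketched as follows. The `try`/`except` is there only to show the failure without stopping the script, and the exact error message may differ between PyTorch versions.

```python
import torch

x1 = torch.arange(6).view(2, 3)   # integer arguments: dtype torch.int64
x2 = torch.ones(3, 2)             # default dtype: torch.float32

try:
    torch.mm(x1, x2)              # operands have different dtypes
except RuntimeError as err:
    print("mm failed:", err)

product = torch.mm(x1.float(), x2)   # convert first, then multiply
print(product)
```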