# PyTorch Playground

In this code tutorial we will learn the basics of tensor manipulation in PyTorch. The code in this tutorial is borrowed from the great Valerio Maggio. Follow him here: https://github.com/leriomaggio



## What is PyTorch

-  A library for Tensor manipulation;
-  A deep learning framework that provides maximum flexibility and speed.

### PyTorch Basics

One of the **striking features** of PyTorch is its *natural* integration with **NumPy**, which greatly helped its adoption among researchers.

Indeed, learning `PyTorch APIs` is by far easier than learning the API of other frameworks 
(e.g. `TensorFlow`)[$^1$](#fn1)

`NumPy` integration + **Dynamic Graph** computation make `torch` a great tool for researchers and practitioners.


<span id="fn1">Actually, some people spent so much time and effort learning the `TF 1.x` API that they are very reluctant even to update to `TF 2.x`.</span>

In [1]:
import torch
import numpy as np

np.random.seed(42)

torch.manual_seed(7)

<torch._C.Generator at 0x7ff778088270>

### Tensors

**Tensors** are the **main** data structure supported by PyTorch (*everything is built around `torch.Tensor`*)

* Scalar is a single number.
* Vector is an array of numbers.
* Matrix is a 2-D array of numbers.
* Tensors are N-D arrays of numbers.

##### Creating Tensors

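You can create tensors by specifying the shape as arguments. Note that `torch.Tensor(rows, cols)` allocates memory without initializing it, so the values it contains are arbitrary until you fill them.

Here is a tensor with `6` rows and `4` columns: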

In [2]:
t = torch.Tensor(6, 4)

In [3]:
t.shape  # shape of tensor

torch.Size([6, 4])

In [4]:
t.type()  # dtype of tensor

'torch.FloatTensor'

In [5]:
t[2:4, :]  #slicing

tensor([[5.0252e-37, 0.0000e+00, 5.0249e-37, 0.0000e+00],
        [0.0000e+00, 0.0000e+00, 1.8077e-43, 0.0000e+00]])
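
The tiny values printed above are leftovers from uninitialized memory. As a minimal sketch (assuming a reasonably recent PyTorch release and the default `float32` dtype), `torch.empty` is the explicit way to request an uninitialized tensor, while `torch.zeros` gives the same shape with well-defined contents:

```python
import torch

# torch.empty allocates without initializing, like torch.Tensor(6, 4):
# the contents are whatever happened to be in memory.
uninit = torch.empty(6, 4)

# torch.zeros gives the same shape and dtype with well-defined contents.
init = torch.zeros(6, 4)

print(uninit.shape, uninit.dtype)
print(init.shape, init.dtype)
print(init[2:4, :])  # slicing works the same way; values are all zero
```

Never rely on the values of an uninitialized tensor; fill it first or use an initializing constructor.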

It's common in prototyping to create a tensor with random numbers of a specific shape.

In [6]:
x = torch.rand(2, 3)
x.shape

torch.Size([2, 3])

You can also initialize tensors of ones or zeros.

In [7]:
zeros = torch.zeros(4, 4)
ones = torch.ones(4, 4)

Tensors can be initialized and then filled in place. 

Note: operations that **end in an underscore** (`_`) are in-place operations by convention.

In [8]:
x = torch.Tensor(3,4).fill_(5)
print(x.type())
print(x.shape)
print(x)

torch.FloatTensor
torch.Size([3, 4])
tensor([[5., 5., 5., 5.],
        [5., 5., 5., 5.],
        [5., 5., 5., 5.]])


###### `torch.Tensor` from `list` and `np.ndarray`

Tensors can be initialized from a list of lists (for a 2D tensor):

In [9]:
x = torch.Tensor([[1, 2,],  
                  [3, 4,],
                  [5, 6,]
                 ])
print(x.type())
x

torch.FloatTensor


tensor([[1., 2.],
        [3., 4.],
        [5., 6.]])

... or `np.array`

In [10]:
npy = np.random.rand(4, 4)
torch.from_numpy(npy)

tensor([[0.3745, 0.9507, 0.7320, 0.5987],
        [0.1560, 0.1560, 0.0581, 0.8662],
        [0.6011, 0.7081, 0.0206, 0.9699],
        [0.8324, 0.2123, 0.1818, 0.1834]], dtype=torch.float64)
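
Note the `dtype=torch.float64` in the output: `torch.from_numpy` keeps the NumPy dtype and shares memory with the source array. The following sketch (variable names are illustrative) contrasts it with `torch.tensor`, which copies the data:

```python
import numpy as np
import torch

arr = np.zeros((2, 2))          # float64 by default
shared = torch.from_numpy(arr)  # shares memory with arr, keeps float64
copied = torch.tensor(arr)      # always copies the data

arr[0, 0] = 1.0                 # modify the NumPy array in place

print(shared[0, 0].item())      # the shared tensor sees the change
print(copied[0, 0].item())      # the copy does not
print(shared.dtype)
```

Sharing memory avoids a copy, but it also means that in-place changes on either side are visible on the other.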

#### Tensor Types

The `FloatTensor` is the **default** tensor type.

However, we can control the type of **tensors** by explicit casting, or specialised constructors.

In [11]:
x = torch.FloatTensor([[1, 2, 3],  
                       [4, 5, 6]])
# converts to long
x = x.long()
x.type()

'torch.LongTensor'

In [12]:
# Specify `dtype` in the constructor
x = torch.tensor([[1, 2, 3], 
                  [4, 5, 6]], dtype=torch.int64)
x.type()

'torch.LongTensor'

In [13]:
x.float().type()

'torch.FloatTensor'

In [14]:
x.type()

'torch.LongTensor'

**What happened??** 

Methods like `.float()` or `.long()` return a converted tensor and leave the original unchanged. To keep the result, assign it back (e.g. `x = x.float()`).
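
A short sketch of this behaviour, assuming the default `float32` dtype; `.to(dtype)` is shown as a more general alternative to the dedicated cast methods:

```python
import torch

x = torch.tensor([[1, 2, 3], [4, 5, 6]], dtype=torch.int64)

y = x.float()            # converted tensor; x is left untouched
print(x.dtype, y.dtype)

x = x.float()            # rebind the name to keep the converted tensor
print(x.dtype)

z = x.to(torch.float64)  # .to(dtype) is an equivalent, more general cast
print(z.dtype)
```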

###### Exercise

Try to create a long tensor from a `random` NumPy array and then cast it as `float`

In [15]:
# Your code here



### Element-wise Operations

In [16]:
x = torch.rand(4, 5)

In [17]:
x

tensor([[0.2071, 0.6297, 0.3653, 0.8513, 0.8549],
        [0.5509, 0.2868, 0.2063, 0.4451, 0.3593],
        [0.7204, 0.0731, 0.9699, 0.1078, 0.8829],
        [0.4132, 0.7572, 0.6948, 0.5209, 0.5932]])

In [18]:
x + x

tensor([[0.4142, 1.2595, 0.7306, 1.7025, 1.7099],
        [1.1019, 0.5737, 0.4126, 0.8902, 0.7186],
        [1.4408, 0.1461, 1.9398, 0.2156, 1.7658],
        [0.8263, 1.5144, 1.3897, 1.0419, 1.1865]])

In [19]:
x * 2

tensor([[0.4142, 1.2595, 0.7306, 1.7025, 1.7099],
        [1.1019, 0.5737, 0.4126, 0.8902, 0.7186],
        [1.4408, 0.1461, 1.9398, 0.2156, 1.7658],
        [0.8263, 1.5144, 1.3897, 1.0419, 1.1865]])

The convention of `_` indicating in-place operations continues:

In [20]:
x = torch.arange(12).reshape(3, 4)
print(x)
print(x.add_(x))

tensor([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]])
tensor([[ 0,  2,  4,  6],
        [ 8, 10, 12, 14],
        [16, 18, 20, 22]])
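
To make the difference between the two flavours concrete, here is a small sketch that compares `add` with `add_` by looking at the memory address of the data (`data_ptr()`); the variable names are illustrative:

```python
import torch

x = torch.arange(6).reshape(2, 3)
ptr_before = x.data_ptr()          # address of the first element

y = x.add(10)                      # out-of-place: allocates a new tensor
x.add_(10)                         # in-place: overwrites x's own memory

print(torch.equal(x, y))           # same values either way
print(x.data_ptr() == ptr_before)  # x still uses the same memory
print(y.data_ptr() == ptr_before)  # y lives elsewhere
```

In-place operations save memory, but they overwrite values that other code (or autograd) may still need, so use them deliberately.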


##### `arange` and `reshape`

In [21]:
x = torch.arange(6)
x

tensor([0, 1, 2, 3, 4, 5])

In [22]:
x = x.view(2, 3)
x

tensor([[0, 1, 2],
        [3, 4, 5]])

**Q**: What is the equivalent of `Tensor.view` in `numpy`?

In [23]:
x = torch.arange(6).view(2, 3)

x

tensor([[0, 1, 2],
        [3, 4, 5]])

In [24]:
torch.sum(x, dim=0)  # over axis = 0

tensor([3, 5, 7])

In [25]:
torch.sum(x, dim=1)  # over axis = 1 (one sum per row)

tensor([ 3, 12])

In [26]:
# np.swapaxes

torch.transpose(x, 0, 1)

tensor([[0, 3],
        [1, 4],
        [2, 5]])

Long Tensors are used for indexing operations and mirror the `int64` numpy type

In [32]:
x = torch.LongTensor([[1, 2, 3],  
                      [4, 5, 6],
                      [7, 8, 9]])
print(x.dtype)
print(x.numpy().dtype)

torch.int64
int64


You can convert a FloatTensor to a LongTensor

In [31]:
x = torch.FloatTensor([[1, 2, 3],  
                       [4, 5, 6],
                       [7, 8, 9]])
print(x.dtype)
print(x.numpy().dtype)

x = x.long()
print(x.dtype)
print(x.numpy().dtype)

torch.float32
float32
torch.int64
int64


### Special Tensor initializations

We can create a vector of incremental numbers

In [33]:
x = torch.arange(0, 10)
print(x)

tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])


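Sometimes it's useful to have an explicitly integer-typed (`long`) arange for indexing. With integer arguments `arange` already returns `int64`, so `.long()` here simply makes the intent explicit: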

In [34]:
x = torch.arange(0, 10).long()
print(x)

tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
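
As a quick check of how `arange` picks its dtype, the sketch below (assuming the default float dtype has not been changed) compares integer and float arguments and shows the optional step:

```python
import torch

a = torch.arange(0, 10)            # integer arguments -> int64
b = torch.arange(0, 10).long()     # already int64, so the cast changes nothing
c = torch.arange(0.0, 1.0, 0.25)   # float arguments -> default float dtype
d = torch.arange(0, 10, 2)         # optional third argument is the step

print(a.dtype, b.dtype, c.dtype)
print(c)
print(d)
```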


#### Operations

Using tensors to do linear algebra is a foundation of modern Deep Learning practices.

**Reshaping** allows you to move the numbers in a tensor around.  

In PyTorch, reshaping is called `view`:

(*from the documentation:*)
> Returns a new tensor with the same data as the self tensor but of a different shape.
>  The returned tensor **shares the same** data and must have the same number of elements, but may have a different size. For a tensor to be viewed, the new view size must be compatible with its original size and stride

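We can use `view` to add size-1 dimensions, which can be useful for combining with other tensors.  

Combining tensors of different but compatible shapes **is called broadcasting**: size-1 dimensions are stretched to match the other operand.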

In [35]:
x = torch.arange(0, 20)

print(x.view(1, 20))
print(x.view(2, 10))
print(x.view(4, 5))
print(x.view(5, 4))
print(x.view(10, 2))
print(x.view(20, 1))

tensor([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
         18, 19]])
tensor([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]])
tensor([[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14],
        [15, 16, 17, 18, 19]])
tensor([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11],
        [12, 13, 14, 15],
        [16, 17, 18, 19]])
tensor([[ 0,  1],
        [ 2,  3],
        [ 4,  5],
        [ 6,  7],
        [ 8,  9],
        [10, 11],
        [12, 13],
        [14, 15],
        [16, 17],
        [18, 19]])
tensor([[ 0],
        [ 1],
        [ 2],
        [ 3],
        [ 4],
        [ 5],
        [ 6],
        [ 7],
        [ 8],
        [ 9],
        [10],
        [11],
        [12],
        [13],
        [14],
        [15],
        [16],
        [17],
        [18],
        [19]])
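
The "shares the same data" part of the documentation is worth seeing in action. A minimal sketch, with illustrative variable names, showing that writing through a view changes the original tensor and that `-1` lets PyTorch infer one dimension:

```python
import torch

x = torch.arange(0, 20)
v = x.view(4, 5)          # a different shape over the same memory

v[0, 0] = 100             # writing through the view...
print(x[0].item())        # ...changes the original tensor too

w = x.view(-1, 10)        # -1 asks PyTorch to infer that dimension
print(w.shape)
```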


In [36]:
x = torch.arange(12).view(3, 4)
y = torch.arange(4).view(1, 4)
z = torch.arange(3).view(3, 1)

print(x)
print(y)
print(z)
print(x + y)
print(x + z)

tensor([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]])
tensor([[0, 1, 2, 3]])
tensor([[0],
        [1],
        [2]])
tensor([[ 0,  2,  4,  6],
        [ 4,  6,  8, 10],
        [ 8, 10, 12, 14]])
tensor([[ 0,  1,  2,  3],
        [ 5,  6,  7,  8],
        [10, 11, 12, 13]])
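
The same rule applies when both operands have a size-1 dimension. The sketch below adds a `(3, 1)` column to a `(1, 4)` row and checks the result against an explicit `expand`, which performs the stretching without copying data:

```python
import torch

row = torch.arange(4).view(1, 4)   # shape (1, 4)
col = torch.arange(3).view(3, 1)   # shape (3, 1)

# Size-1 dimensions are stretched to match the other operand,
# so (3, 1) + (1, 4) produces a (3, 4) result.
table = col + row
print(table.shape)
print(table)

# expand makes the stretching explicit without copying data
print(torch.equal(table, col.expand(3, 4) + row.expand(3, 4)))
```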


`unsqueeze` and `squeeze` add and remove size-1 dimensions, respectively.

In [37]:
x = torch.arange(12).view(3, 4)
print(x.shape)

x = x.unsqueeze(dim=1)
print(x.shape)

x = x.squeeze()
print(x.shape)

torch.Size([3, 4])
torch.Size([3, 1, 4])
torch.Size([3, 4])
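
Calling `squeeze()` without arguments removes every size-1 dimension, which can be surprising (for example with a batch of size 1). A small sketch of the difference when `dim` is given:

```python
import torch

x = torch.zeros(1, 3, 1, 4)

print(x.squeeze().shape)          # every size-1 dimension is removed
print(x.squeeze(dim=0).shape)     # only dimension 0 is removed
print(x.squeeze(dim=1).shape)     # dimension 1 has size 3, so nothing changes
print(x.unsqueeze(dim=-1).shape)  # -1 appends a new last dimension
```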


##### Swapping Axes

Another very common operation when analysing images is `swapping` axes, because `numpy` and `torch` conventionally use different internal representations.

For instance, for a 3D tensor representing a simple `RGB` (three channels) image:

`torch` $\mapsto$ (`channel`, `row`, `col`);
`numpy` $\mapsto$ (`row`, `col`, `channel`).

Transposing lets you swap two dimensions of a tensor, so that, for example, all the rows become columns and vice versa. 

In [38]:
x = torch.arange(2352).view(28, 28, 3)
print("x: \n", x.shape) 
print("transpose: \n", x.transpose(2, 0).shape)
print("x - again: \n", x.shape)

x: 
 torch.Size([28, 28, 3])
transpose: 
 torch.Size([3, 28, 28])
x - again: 
 torch.Size([28, 28, 3])
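
With a square `28 x 28` image the shapes hide a subtlety: `transpose(2, 0)` swaps the channel axis with the row axis, so rows and columns end up exchanged as well. The sketch below uses a hypothetical non-square image to show the difference from `permute(2, 0, 1)`, which yields the (`channel`, `row`, `col`) layout:

```python
import torch

height, width, channels = 4, 6, 3   # hypothetical non-square image
img = torch.arange(height * width * channels).view(height, width, channels)

swapped = img.transpose(2, 0)    # swaps axes 0 and 2 -> (channel, col, row)
chw = img.permute(2, 0, 1)       # reorders all axes  -> (channel, row, col)

print(swapped.shape)
print(chw.shape)

# the same pixel can be read from both layouts
print(img[1, 2, 0].item() == chw[0, 1, 2].item())
```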


###### Understanding Dimensions in `Tensor`

A `3D` tensor can represent a `batch` of sequences, where each sequence item has a feature vector.  

It is common to switch the batch and sequence dimensions so that we can more easily index the sequence in a sequence model.

Note: `transpose` will only let you swap `2` axes.  `permute` allows for multiple axes:

In [39]:
batch_size = 3
seq_size = 28
feature_size = 28

x = torch.arange(batch_size * seq_size * feature_size).view(batch_size, seq_size, feature_size)

`permute` is a more general version of `transpose`:

In [40]:
print("x.permute(1, 0, 2).shape: \n", x.permute(1, 0, 2).shape)

x.permute(1, 0, 2).shape: 
 torch.Size([28, 3, 28])
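
`permute` and `transpose` only change how the existing memory is indexed, so the result is usually no longer contiguous and `view` may refuse to work on it. A minimal sketch with a small illustrative tensor; `contiguous()` and `reshape` are two ways around this:

```python
import torch

x = torch.arange(24).view(2, 3, 4)
p = x.permute(1, 0, 2)            # shape (3, 2, 4), same memory, new strides

print(p.is_contiguous())          # the permuted tensor is not contiguous

try:
    p.view(6, 4)                  # view needs a compatible memory layout
except RuntimeError as err:
    print("view failed:", type(err).__name__)

flat_a = p.contiguous().view(6, 4)   # copy into contiguous memory first
flat_b = p.reshape(6, 4)             # reshape copies only when it has to
print(torch.equal(flat_a, flat_b))
```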


In [41]:
x = torch.arange(0, 12).view(3,4).float()
print(x)

x2 = torch.ones(4, 2)
x2[:, 1] += 1
print(x2)

print(x.mm(x2))

tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.]])
tensor([[1., 2.],
        [1., 2.],
        [1., 2.],
        [1., 2.]])
tensor([[ 6., 12.],
        [22., 44.],
        [38., 76.]])
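
The cell above uses `mm`, which computes a 2-D matrix product. As a sketch of the alternatives, `torch.matmul` and the `@` operator give the same result here, while `*` is element-wise multiplication:

```python
import torch

a = torch.arange(0, 12).view(3, 4).float()
b = torch.ones(4, 2)
b[:, 1] += 1

via_mm = a.mm(b)                 # 2-D matrix product only
via_matmul = torch.matmul(a, b)  # also handles batched and 1-D inputs
via_operator = a @ b             # Python operator for matmul

print(via_operator)
print(torch.equal(via_mm, via_matmul), torch.equal(via_mm, via_operator))

# element-wise multiplication is a different operation and needs matching
# (or broadcastable) shapes
print((a * a).shape)
```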


See the [PyTorch Math Operations Documentation](https://pytorch.org/docs/stable/torch.html#math-operations) for more!

### Computing Gradients

Central to Tensors is the *free* computation of **gradients** - **when required**!

To specify that a `Tensor` requires `gradient`, the option `requires_grad` has to be provided.

In [42]:
x = torch.tensor([[2.0, 3.0]], requires_grad=True)
z = 3 * x
print(z)

tensor([[6., 9.]], grad_fn=<MulBackward0>)


In the **next** example, we make the **graph** of operations slightly more complex:

1. We create a tensor and multiply it by `3`.  
2. We create a scalar output using `sum()`. A scalar output is needed as the loss variable.  
3. We call `backward` on the loss, which computes its rate of change with respect to the inputs.  

Since the scalar was created with `sum`, each position in `z` and `x` is independent with respect to the loss scalar. 

The rate of change of the loss with respect to `x` is just the constant `3` that we multiplied `x` by.

In [43]:
x = torch.tensor([[2.0, 3.0]], requires_grad=True)
print("x: \n", x)
print("---")
z = 3 * x
print("z = 3*x: \n", z)
print("---")

loss = z.sum()
print("loss = z.sum(): \n", loss)
print("---")

loss.backward()

print("after loss.backward(), x.grad: \n", x.grad)


x: 
 tensor([[2., 3.]], requires_grad=True)
---
z = 3*x: 
 tensor([[6., 9.]], grad_fn=<MulBackward0>)
---
loss = z.sum(): 
 tensor(15., grad_fn=<SumBackward0>)
---
after loss.backward(), x.grad: 
 tensor([[3., 3.]])
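
The same mechanism works for non-linear functions. The sketch below uses a squared loss, whose gradient is `2 * x`, and also shows that gradients accumulate across `backward` calls until they are reset (the names `g1` and `g2` are only used to keep copies for inspection):

```python
import torch

x = torch.tensor([[2.0, 3.0]], requires_grad=True)

loss = (x ** 2).sum()     # d(loss)/dx = 2 * x
loss.backward()
g1 = x.grad.clone()
print(g1)                 # gradient after the first backward pass

# Gradients accumulate across backward calls unless they are reset.
loss = (x ** 2).sum()
loss.backward()
g2 = x.grad.clone()
print(g2)                 # twice the previous value

x.grad.zero_()            # reset in place before the next computation
print(x.grad)
```

This accumulation is why training loops typically zero the gradients before each backward pass.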


### CUDA Tensors

PyTorch's operations can seamlessly be used on the GPU or on the CPU.  

There are a couple of basic operations for interacting in this way.

In [2]:
print(torch.cuda.is_available())  # should be True on Colab with a GPU runtime

True


In [3]:
x = torch.rand(3,3)

In [4]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)

cuda


In [5]:
x = torch.rand(3, 3).to(device)
print(x.device)

cuda:0


In [6]:
cpu_device = torch.device("cpu")

In [7]:
# this will break if x is on the GPU!
y = torch.rand(3, 3)
x + y

RuntimeError: expected device cuda:0 but got device cpu

In [9]:
y = y.to(device)
x = x.to(device)
x + y

tensor([[1.8212, 0.9627, 1.4338],
        [0.7000, 0.9635, 1.1399],
        [0.8802, 1.3136, 0.9528]], device='cuda:0')

In [10]:
if torch.cuda.is_available(): # only if a GPU is available
    a = torch.rand(3,3).to(device='cuda:0') #  CUDA Tensor
    print(a)
    
    b = torch.rand(3,3).cuda()
    print(b)

    print(a + b)

    a = a.cpu() # a is now on the CPU, b is still on the GPU
    print(a + b) # Error expected: the operands are on different devices

tensor([[0.6286, 0.7653, 0.1132],
        [0.8559, 0.6721, 0.6267],
        [0.5691, 0.7437, 0.9592]], device='cuda:0')
tensor([[0.3887, 0.2214, 0.3742],
        [0.1953, 0.7405, 0.2529],
        [0.2332, 0.9314, 0.9575]], device='cuda:0')
tensor([[1.0173, 0.9867, 0.4874],
        [1.0512, 1.4126, 0.8795],
        [0.8022, 1.6751, 1.9168]], device='cuda:0')


RuntimeError: expected device cpu but got device cuda:0
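
To avoid device-mismatch errors like the one above, a common pattern is to write device-agnostic code. The following is a minimal sketch that runs on both CPU-only and GPU machines; the variable names are illustrative:

```python
import torch

# Pick the GPU when present, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.rand(3, 3, device=device)   # allocate directly on the target device
y = torch.rand(3, 3)                  # created on the CPU by default

y = y.to(x.device)                    # align devices before combining
total = x + y
print(total.device)

# Move results back to the CPU before converting to NumPy.
as_numpy = total.cpu().numpy()
print(as_numpy.shape)
```

Creating the tensor with `device=device` avoids an extra host-to-device copy compared with creating it on the CPU and calling `.to(device)` afterwards.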

#### Exercise 1

Create a 2D tensor and then add a dimension of size 1 inserted at the 0th axis.

In [52]:
a = torch.rand(3,3)
a = a.unsqueeze(0)
print(a)
print(a.shape)

tensor([[[0.4703, 0.1049, 0.5137],
         [0.2674, 0.4990, 0.7447],
         [0.7213, 0.4414, 0.5550]]])
torch.Size([1, 3, 3])


#### Exercise 2 

Remove the extra dimension we just added to the previous tensor.

#### Exercise 3

Create a random tensor of shape 5x3 and move it to the GPU - if available

#### Exercise 4

Create a random tensor of size (3,1) and then horizontally stack 4 copies together.

You can use either `stack` or `expand`.