<a href="https://colab.research.google.com/github/pluflou/nlp_pycon/blob/master/Part_1_Primer_on_Tensors.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Part 1 - Primer on Tensors for Deep Learning using PyTorch

**Required Time: 20 minutes**

In this notebook, we will cover the basic tensor operations which are useful when implementing and training deep learning models. This guide should provide you with enough background on some of the most important tensor operations and computations used in deep learning and machine learning. This will form the foundation for implementing your NLP models. The focus of this notebook will be on use of PyTorch's high-level APIs, but it will provide insights and concise details where necessary to help the reader. 


After completing this guide, and with enough practice, the learner should feel comfortable with basic tensor creation and manipulation, and with composing a pipeline of tensor operations, which will become useful when implementing deep learning models with PyTorch.

---

### Journey
- What is a tensor?
- Building Tensors
- Special Tensors
- Tensor Properties
- Tensor Slicing
- Tensor Operations and Transformations
- CUDA Tensors
- Pipeline of Tensor Operations

### What is a Tensor?
The word *tensor* means different things in different fields. In deep learning, a tensor is simply a multi-dimensional array of numbers that supports efficient storage and computation. Why are tensors important? Whatever kind of data we work with, such as text or images, we first need to *numericalize* it, that is, convert it into numbers using some conversion technique. The details of that conversion are not important here. What matters is that tensors store our data, so we need to learn some of their properties and characteristics before putting them to use.
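
As a rough illustration of what numericalizing text can look like, the sketch below maps each word of a sentence to an integer id and stores the ids in a tensor. The toy sentence and the `vocab` dictionary are assumptions made up for this example; real pipelines typically rely on proper tokenizers and vocabularies.

```python
import torch

# hypothetical toy sentence, made up for illustration
sentence = "the cat saw the dog".split()

# hypothetical vocabulary: each new word gets the next free integer id
vocab = {}
for word in sentence:
    vocab.setdefault(word, len(vocab))

# the numericalized sentence as a 1D tensor of word ids
ids = torch.tensor([vocab[word] for word in sentence])
print(vocab)
print(ids)
```

Repeated words share the same id, so the tensor keeps the order of the sentence while storing only numbers.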

In [0]:
# Let's import torch -- the main PyTorch library;
# already installed for us on the Colab environment
import torch

### Building Tensors
In deep learning, we are mostly dealing with Tensors. Therefore, it's important to understand different concepts about tensors, such as types, size, special tensors, etc. Let's briefly review a few concepts related to building tensors. Pay close attention to the comments.  

In [0]:
# a scalar
A = torch.tensor(1)
print(A)

tensor(1)


In [0]:
# a vector (1D tensor) with 3 elements
A = torch.tensor([1,2,3])
print(A)

tensor([1, 2, 3])


In [0]:
# a matrix (2D tensor)
A = torch.tensor([[1, 2, 3, 4],
                  [5, 6, 7, 8]])
print(A)

tensor([[1, 2, 3, 4],
        [5, 6, 7, 8]])


In [0]:
# a 3D tensor (a stack of two 2x4 matrices)
A = torch.tensor([[[1, 2, 3, 4],
                   [5, 6, 7, 8]],
                  [[1, 2, 3, 4],
                   [5, 6, 7, 8]]])
print(A)
print(A.size())

tensor([[[1, 2, 3, 4],
         [5, 6, 7, 8]],

        [[1, 2, 3, 4],
         [5, 6, 7, 8]]])
torch.Size([2, 2, 4])


In [0]:
# build tensor from a numpy array; useful for data transformations
import numpy as np
A = torch.tensor(np.random.rand(1, 3))
print(A)

tensor([[0.2874, 0.0514, 0.2712]], dtype=torch.float64)
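
The `dtype=torch.float64` in the output above comes from NumPy, whose default floating-point type is 64-bit. As a small sketch of a related detail, `torch.tensor(...)` copies the NumPy data, whereas `torch.from_numpy(...)` shares memory with the array. The variable names below are chosen only for this illustration.

```python
import numpy as np
import torch

arr = np.zeros(3)

copied = torch.tensor(arr)      # copies the data out of the NumPy array
shared = torch.from_numpy(arr)  # shares memory with the NumPy array

arr[0] = 7.0  # modify the NumPy array after both tensors were created

print(copied)  # still all zeros
print(shared)  # first element reflects the change
```

Sharing memory avoids a copy, but it also means that changes on either side are visible on the other.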


----

### Exercise - Constructing Tensor

Build a PyTorch tensor using a list of the numbers from 1 to 10. Hint: use `range(...)`.

----

In [0]:
### YOUR CODE HERE


### YOUR CODE HERE

### Special Tensors

PyTorch also offers functions for building special tensors, such as all-zeros and all-ones matrices.

In [0]:
# building a tensor filled with zeros
A = torch.zeros(5, 4, dtype=torch.long)
print(A)

tensor([[0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]])


In [0]:
# All-ones matrix
A = torch.ones(5, 4)
print(A)

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])


In [0]:
# identity matrix (ones on the diagonal, zeros elsewhere)
A = torch.eye(5, 5)
print(A)

tensor([[1., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0.],
        [0., 0., 1., 0., 0.],
        [0., 0., 0., 1., 0.],
        [0., 0., 0., 0., 1.]])


In [0]:
# building a tensor with random values drawn uniformly from [0, 1)
A = torch.rand(5, 4)
print(A)

tensor([[0.2259, 0.0250, 0.0473, 0.5494],
        [0.7806, 0.1233, 0.4988, 0.1047],
        [0.9699, 0.4446, 0.8159, 0.6791],
        [0.9325, 0.4231, 0.2701, 0.3718],
        [0.5849, 0.3372, 0.8695, 0.1236]])


In [0]:
# build a tensor with the same size and dtype as another tensor
A = torch.rand(5, 4)
B = torch.ones_like(A)
print(B)

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])


In [0]:
# 1D tensor of evenly spaced values
torch.linspace(3, 10, steps=5)

tensor([ 3.0000,  4.7500,  6.5000,  8.2500, 10.0000])

----

### Exercise - Constructing Tensor

It's important to familiarize yourself with the [PyTorch documentation](https://pytorch.org/docs/stable/torch.html). To practice, let's try to construct a tensor containing the numbers from 1 to 10. There is an easy way: use `linspace(...)`, shown above.

----

In [0]:
### YOUR CODE HERE


### YOUR CODE HERE

### Tensor Properties
We can check different properties of the tensors we are building. These special functions are helpful to observe the properties of the tensor data structure we are operating on.

In [0]:
# type
A = torch.randn(size = (2,3), dtype = torch.float32)
print(A.type())

torch.FloatTensor


In [0]:
# size of the tensor (also available as A.shape)
A = torch.eye(4,5)
A.size()

torch.Size([4, 5])

The best way to learn and memorize all these properties is to go into the documentation and practice as much as you can. 
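
To make a few more of these properties concrete, here is a minimal sketch that inspects a tensor and contrasts `size()` with `view(...)`: the former only reports the shape, while the latter returns a tensor with a new shape that shares the same underlying data. The tensor values are arbitrary and used only for illustration.

```python
import torch

A = torch.eye(4, 5)

print(A.dtype)    # element type
print(A.shape)    # same information as A.size()
print(A.dim())    # number of dimensions
print(A.numel())  # total number of elements

# view returns a tensor with a new shape that shares the same data
B = A.view(5, 4)
C = A.view(-1)    # -1 lets PyTorch infer the size (here 20)
print(B.size(), C.size())

# because the data is shared, writing through the view changes A as well
C[0] = 9.0
print(A[0, 0])
```

Since a view does not copy data, it is cheap, but in-place changes made through it are visible in the original tensor.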

### Tensor Slicing
This section shows how to query and slice tensors using indexing.

In [0]:
A =  torch.eye(5, 5)

print(A[3:]) # from the fourth row to the last row, all columns
print("\n")
print(A[:3]) # first three rows, all columns
print("\n")
print(A[:-1]) # from the first row to the second-to-last row

tensor([[0., 0., 0., 1., 0.],
        [0., 0., 0., 0., 1.]])


tensor([[1., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0.],
        [0., 0., 1., 0., 0.]])


tensor([[1., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0.],
        [0., 0., 1., 0., 0.],
        [0., 0., 0., 1., 0.]])
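
The slices above select whole rows only. As a short sketch of other common indexing patterns, the example below also selects columns, single elements, sub-blocks, and elements matching a condition. The 3x4 tensor is an arbitrary choice for illustration.

```python
import torch

A = torch.arange(12).view(3, 4)  # 3x4 tensor holding 0..11

print(A[1])         # second row
print(A[:, 0])      # first column, all rows
print(A[1, 2])      # single element at row 1, column 2
print(A[0:2, 1:3])  # rows 0-1 and columns 1-2
print(A[A > 8])     # boolean mask: elements greater than 8
```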


### Tensor Operations and Transformations

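In this section we modify tensor contents, transpose tensors, apply in-place operations, and perform simple math such as addition, powers, sums, and element-wise multiplication.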

In [0]:
# modify tensor content
A = torch.ones(5, 4)
A[:2] = 0
print(A)

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])


In [0]:
# transpose
A.transpose(1,0)

tensor([[0., 0., 1., 1., 1.],
        [0., 0., 1., 1., 1.],
        [0., 0., 1., 1., 1.],
        [0., 0., 1., 1., 1.]])

What happens if we print A again? Would it be transposed?

In [0]:
print(A)

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])


Apparently not! `transpose` returns a transposed view and leaves `A` itself unchanged. To perform a transformation or operation in place, append "`_`" to the method name.

In [0]:
A.transpose_(1, 0)
print(A)

tensor([[0., 0., 1., 1., 1.],
        [0., 0., 1., 1., 1.],
        [0., 0., 1., 1., 1.],
        [0., 0., 1., 1., 1.]])


Let's do some simple math...

In [0]:
# addition
A = torch.ones(5, 3)
B = torch.ones(5, 3)
print(A + B)

# same addition using torch.add; returns a new tensor, A is unchanged
print(torch.add(A, B))

# addition in place; uses the "_" suffix, so A itself is modified
A.add_(B)

tensor([[2., 2., 2.],
        [2., 2., 2.],
        [2., 2., 2.],
        [2., 2., 2.],
        [2., 2., 2.]])
tensor([[2., 2., 2.],
        [2., 2., 2.],
        [2., 2., 2.],
        [2., 2., 2.],
        [2., 2., 2.]])


tensor([[2., 2., 2.],
        [2., 2., 2.],
        [2., 2., 2.],
        [2., 2., 2.],
        [2., 2., 2.]])
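
All three outputs above look identical, so the difference between the out-of-place and in-place variants is easy to miss. The minimal sketch below, using small tensors chosen only for illustration, checks the contents of `A` after each call.

```python
import torch

A = torch.ones(2, 2)
B = torch.ones(2, 2)

C = torch.add(A, B)                      # returns a new tensor
print(torch.equal(A, torch.ones(2, 2)))  # A was not modified

A.add_(B)                                # modifies A in place
print(torch.equal(A, C))                 # A now holds the same values as C
```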

In [0]:
# power and sum
A =  torch.tensor([1.0, 2.0])
out = A.pow(2).sum()
print(out)

tensor(5.)
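
Reductions such as `sum` and `mean` can also be applied along a single dimension instead of over the whole tensor. The sketch below uses a small 2x2 tensor, chosen only for illustration, to show the effect of the `dim` argument.

```python
import torch

A = torch.tensor([[1.0, 2.0],
                  [3.0, 4.0]])

print(A.sum())        # sum over all elements
print(A.sum(dim=0))   # collapse the rows: one value per column
print(A.sum(dim=1))   # collapse the columns: one value per row
print(A.mean(dim=1))  # mean of each row
```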


In [0]:
# element-wise multiplication

A = torch.eye(5, 5)
B = torch.ones(5, 5)

print(A * B)

tensor([[1., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0.],
        [0., 0., 1., 0., 0.],
        [0., 0., 0., 1., 0.],
        [0., 0., 0., 0., 1.]])
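
Element-wise multiplication is easy to confuse with matrix multiplication. As a small sketch with two arbitrary 2x2 tensors, `*` multiplies matching entries, while `@` (equivalently `matmul`) computes the matrix product.

```python
import torch

A = torch.tensor([[1, 2],
                  [3, 4]])
B = torch.tensor([[5, 6],
                  [7, 8]])

elementwise = A * B  # multiplies entries at the same position
matrix_prod = A @ B  # rows of A times columns of B, same as A.matmul(B)

print(elementwise)
print(matrix_prod)
```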


---

### Exercise - Tensor Operations and Transformations

Generate this matrix (Hint: `torch.rot90()`):

```python

tensor([[0., 0., 0., 0., 1.],
        [0., 0., 0., 1., 0.],
        [0., 0., 1., 0., 0.],
        [0., 1., 0., 0., 0.],
        [1., 0., 0., 0., 0.]])
```

----




In [0]:
A = torch.eye(5, 5)
B = torch.ones(5, 5)
C = A * B

### YOUR CODE HERE


### YOUR CODE HERE

### CUDA Tensors
PyTorch makes it easy to move tensors between the CPU and the GPU. This lets the programmer control which parts of a computation are committed to the GPU and which stay on the CPU, and gives the flexibility to design models for scenarios with different compute resources.

In [0]:
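# check whether CUDA is available on your machine
print("---------------------------------------------------")
print("Cuda status:", torch.cuda.is_available())

# fall back to the CPU when no GPU is present, so the cell still runs
device_0 = torch.device("cuda" if torch.cuda.is_available() else "cpu")
device_1 = torch.device("cpu")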

print("---------------------------------------------------")

x = torch.rand(2, 3)
y = torch.ones_like(x, device=device_0)
z = torch.zeros_like(x, device=device_1)
print("Tensor y is stored in: ", y.device)
print(y)
print("---------------------------------------------------")
print("Tensor x is stored in: ", x.device)
print(x)

---------------------------------------------------
Cuda status: True
---------------------------------------------------
Tensor y is stored in:  cuda:0
tensor([[1., 1., 1.],
        [1., 1., 1.]], device='cuda:0')
---------------------------------------------------
Tensor x is stored in:  cpu
tensor([[0.5155, 0.3415, 0.0959],
        [0.2833, 0.8981, 0.5401]])


In [0]:
# check if tensor stored in GPU

print(y.is_cuda)
print(x.is_cuda)

True
False
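
A common way to keep code runnable on machines with or without a GPU is to select the device once and then create or move tensors accordingly. The following is a minimal sketch of that pattern; the variable names are chosen only for this illustration.

```python
import torch

# pick the GPU when one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.rand(2, 3)                 # created on the CPU
x_dev = x.to(device)                 # moved to the selected device
y = torch.ones(2, 3, device=device)  # created directly on the device

z = x_dev + y  # both operands must live on the same device
print(z.device)

z_cpu = z.cpu()  # bring the result back to the CPU
print(z_cpu.device)
```

Operations between tensors on different devices raise an error, which is why both operands are placed on `device` before the addition.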


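Execution time can differ significantly between the CPU and the GPU when working with large tensors. Note that the timings below also include tensor creation (and, for the GPU, the copy to the device), and that CUDA operations run asynchronously, so treat them as a rough comparison only.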

In [0]:
%%time
a = torch.rand(1000, 1000)
b = torch.rand(1000, 1000)

a.matmul(b)

CPU times: user 52 ms, sys: 6.41 ms, total: 58.4 ms
Wall time: 63.7 ms


In [0]:
%%time
a = torch.rand(1000, 1000).cuda()
b = torch.rand(1000, 1000).cuda()

a.matmul(b)

CPU times: user 24.3 ms, sys: 822 µs, total: 25.2 ms
Wall time: 26.8 ms
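
Because CUDA kernels are launched asynchronously, a timer can stop before the GPU has actually finished its work. One way to get a somewhat fairer measurement is to call `torch.cuda.synchronize()` around the timed operation. The sketch below does this with a hypothetical helper named `time_matmul`, which is not part of PyTorch; a single run is still noisy, so the numbers should be read as indicative only.

```python
import time
import torch

def time_matmul(device, n=1000):
    """Return the seconds taken by one n x n matmul on the given device."""
    a = torch.rand(n, n, device=device)
    b = torch.rand(n, n, device=device)
    if device.type == "cuda":
        torch.cuda.synchronize()  # finish tensor creation before timing
    start = time.perf_counter()
    a.matmul(b)
    if device.type == "cuda":
        torch.cuda.synchronize()  # wait for the matmul kernel to finish
    return time.perf_counter() - start

print("cpu :", time_matmul(torch.device("cpu")))
if torch.cuda.is_available():
    print("cuda:", time_matmul(torch.device("cuda")))
```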


### Pipeline of Tensor Operations
In PyTorch, a model is essentially a chain of operations, and the high-level APIs offer all the functionality needed to build one. Below, we explore a simple approach to chaining tensor operations. In other words, we design a graph, or flow, of operations that are chained together to produce an output. This is essential for training neural networks. It is also why neural networks are often described as **computation graphs**: they are simply a chain of operations.

In [0]:
import torch.nn as nn

class Chain(nn.Module):
  """A toy module that prints its input and its square, then returns the mean."""

  def __init__(self):
    super().__init__()

  def forward(self, x):
    print("---------------------------------------------------")
    print("X:")
    print(x)
    print("---------------------------------------------------")
    print("X powered by 2:")
    print(x.pow(2))  # printed only; the returned value is the mean of x itself
    print("mean(X): ")
    out = x.mean()
    return out

In [0]:
x =  torch.rand((4,5), dtype=torch.float64)
chain = Chain()
mean_x = chain(x)
print(mean_x)

---------------------------------------------------
X:
tensor([[4.7131e-01, 4.2154e-02, 7.6620e-04, 1.4177e-01, 2.1020e-02],
        [5.3419e-02, 2.2027e-01, 9.2426e-01, 2.5027e-01, 8.3245e-01],
        [5.2839e-01, 9.2095e-01, 9.5120e-01, 9.4880e-01, 7.9236e-01],
        [6.1429e-01, 8.8064e-01, 4.6491e-01, 5.8703e-01, 3.9816e-01]],
       dtype=torch.float64)
---------------------------------------------------
X powered by 2:
tensor([[2.2214e-01, 1.7769e-03, 5.8706e-07, 2.0098e-02, 4.4184e-04],
        [2.8536e-03, 4.8519e-02, 8.5426e-01, 6.2637e-02, 6.9297e-01],
        [2.7919e-01, 8.4816e-01, 9.0478e-01, 9.0022e-01, 6.2783e-01],
        [3.7735e-01, 7.7552e-01, 2.1614e-01, 3.4460e-01, 1.5853e-01]],
       dtype=torch.float64)
mean(X): 
tensor(0.5022, dtype=torch.float64)


The benefit of the chain, or computation graph, is that it can be reused and different inputs can be fed into it.

In [0]:
y = torch.eye(5,5)
print(torch.mean(y))
# reuse the same chain instance with a different input
mean_y = chain(y)
print(mean_y)

tensor(0.2000)
---------------------------------------------------
X:
tensor([[1., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0.],
        [0., 0., 1., 0., 0.],
        [0., 0., 0., 1., 0.],
        [0., 0., 0., 0., 1.]])
---------------------------------------------------
X powered by 2:
tensor([[1., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0.],
        [0., 0., 1., 0., 0.],
        [0., 0., 0., 1., 0.],
        [0., 0., 0., 0., 1.]])
mean(X): 
tensor(0.2000)
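
One reason chains of operations matter for training is that PyTorch can record them and compute gradients through them. As a minimal sketch, assuming a small input vector chosen only for illustration, the example below chains `pow` and `sum` and then asks autograd for the gradient of the result with respect to the input.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# chain of operations: square each element, then sum
out = x.pow(2).sum()
print(out)

# walk the recorded graph backwards to get d(out)/dx = 2 * x
out.backward()
print(x.grad)
```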


---

### References
- [GitHub repo](https://github.com/omarsar/nlp_pycon)
- [PyTorch official documentation](https://pytorch.org/docs/stable/tensors.html)
- [A Simple Neural Network from Scratch with PyTorch and Google Colab](https://medium.com/dair-ai/a-simple-neural-network-from-scratch-with-pytorch-and-google-colab-c7f3830618e0)
- [Deep Learning Emotion Recognition with PyTorch](https://github.com/omarsar/appworks_meetup_2018/blob/master/Deep%20Learning%20Emotion%20Recognition%20PyTorch.ipynb)
- [Hacking Neural Networks](https://colab.research.google.com/drive/1Loc882hPQwhq212TS4bpUYsTF7KZUD9x#scrollTo=xeRo4GLruC72)