#  PyTorch Tensors

## Tensors - the atoms of machine learning

### Representation of input data

- 2D matrix $\mathbf{X}$  of $\mathbb{R}^{m \times n}$ 
    - Rows = m samples
    - Columns = n features

$$\mathbf{X} = \begin{bmatrix}
    x_{1}^{(1)} & x_{2}^{(1)} & x_{3}^{(1)}  & \dots & x_{n}^{(1)} \\
    x_{1}^{(2)} & x_{2}^{(2)} & x_{3}^{(2)}  & \dots & x_{n}^{(2)} \\
    \vdots & \vdots & \vdots & \ddots & \vdots \\
    x_{1}^{(m)} & x_{2}^{(m)} & x_{3}^{(m)}  & \dots & x_{n}^{(m)}
\end{bmatrix}.$$

### Representation of model parameters

- Model: $$\mathbf{\hat{Y}} = \mathbf{X}\mathbf{W}$$

- 2D matrix $\mathbf{W}$ of $\mathbb{R}^{n \times k}$ 
    - Rows = n parameters for the n features
    - Columns = k predictions

$$\mathbf{W} = \begin{bmatrix}
    w_{1}^{(1)} & w_{2}^{(1)} & w_{3}^{(1)}  & \dots & w_{k}^{(1)} \\
    w_{1}^{(2)} & w_{2}^{(2)} & w_{3}^{(2)}  & \dots & w_{k}^{(2)} \\
    \vdots & \vdots & \vdots & \ddots & \vdots \\
    w_{1}^{(n)} & w_{2}^{(n)} & w_{3}^{(n)}  & \dots & w_{k}^{(n)}
\end{bmatrix}$$

### Representation of output predictions

- 2D matrix $\mathbf{\hat{Y}}$ of $\mathbb{R}^{m \times k}$ 
    - Rows = m samples (one row of predictions per sample)
    - Columns = k predictions

$$\mathbf{\hat{Y}} = \begin{bmatrix}
    y_{1}^{(1)} & y_{2}^{(1)} & y_{3}^{(1)} & \dots & y_{k}^{(1)} \\
    y_{1}^{(2)} & y_{2}^{(2)} & y_{3}^{(2)} & \dots & y_{k}^{(2)} \\
    \vdots & \vdots & \vdots & \ddots & \vdots \\
    y_{1}^{(m)} & y_{2}^{(m)} & y_{3}^{(m)} & \dots & y_{k}^{(m)}
\end{bmatrix}$$
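
The shapes above can be checked directly in PyTorch. This is a minimal sketch with hypothetical sizes (`m = 5`, `n = 3`, `k = 2`) and random values, meant only to show how the dimensions of $\mathbf{X}$, $\mathbf{W}$ and $\mathbf{\hat{Y}}$ fit together:

```python
import torch

# Hypothetical sizes, chosen only for illustration:
# 5 samples, 3 features, 2 predictions per sample.
m, n, k = 5, 3, 2

X = torch.rand(m, n)   # input data: one row per sample, one column per feature
W = torch.rand(n, k)   # parameters: one row per feature, one column per prediction
Y_hat = X @ W          # predictions: one row per sample, one column per prediction

print(X.shape, W.shape, Y_hat.shape)
```

The inner dimensions (`n`) must agree for the product to be defined, which is why $\mathbf{W}$ has n rows.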

### PyTorch tensors

- Represent data and model parameters

Conceptually similar to a NumPy array, but:
- Can run on GPUs to accelerate numeric computation
- Keep track of the computational graph and the gradients of the loss (next section)

## PyTorch tensors and NumPy arrays

In [35]:
import numpy as np
from numpy.linalg import inv
from numpy.linalg import multi_dot as mdot
import torch

In [36]:
# numpy
np.eye(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [37]:
# pytorch
torch.eye(3)

tensor([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]])

In [38]:
# numpy
A = np.random.random((5, 3))
A

array([[0.05171524, 0.16619127, 0.85743527],
       [0.39318348, 0.67743555, 0.22306779],
       [0.43898895, 0.27952786, 0.19438591],
       [0.10396969, 0.0915456 , 0.65658477],
       [0.57974962, 0.8014285 , 0.63356562]])

In [39]:
# pytorch
B = torch.rand((5, 3))
B

tensor([[0.4562, 0.3881, 0.6899],
        [0.4360, 0.4203, 0.3144],
        [0.6204, 0.0700, 0.3056],
        [0.9660, 0.7672, 0.2256],
        [0.5234, 0.6531, 0.9459]])

In [40]:
A.shape

(5, 3)

In [41]:
B.shape

torch.Size([5, 3])

In [42]:
# numpy
A.T @ A # same as np.dot(A.T, A)

array([[0.69689833, 0.87180656, 0.65295664],
       [0.87180656, 1.21534252, 0.91581357],
       [0.65295664, 0.91581357, 1.6552493 ]])

In [43]:
# torch
B.t() @ B # same as torch.mm(B.t(), B)

tensor([[1.9902, 1.4866, 1.3545],
        [1.4866, 1.3472, 1.2121],
        [1.3545, 1.2121, 1.6139]])

In [44]:
# numpy
inv(A.T @ A)

array([[ 13.98497154, -10.07543637,   0.05778065],
       [-10.07543637,   8.66997028,  -0.82238573],
       [  0.05778065,  -0.82238573,   1.03635371]])

In [45]:
# torch
torch.inverse(B.t() @ B)

tensor([[ 2.8639, -3.0768, -0.0928],
        [-3.0768,  5.5946, -1.6195],
        [-0.0928, -1.6195,  1.9138]])
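
One way to sanity-check an inverse is to multiply it back with the original matrix and compare the result to the identity. The sketch below assumes a fresh random matrix (so the numbers differ from the cell above) and uses `float64` to keep rounding error small:

```python
import torch

torch.manual_seed(0)                         # make the random draw repeatable
B = torch.rand(5, 3, dtype=torch.float64)    # float64 keeps rounding error small

G = B.t() @ B                # 3 x 3 Gram matrix, as in the cell above
G_inv = torch.inverse(G)     # its inverse

# G_inv @ G should be the identity up to floating-point error
identity = torch.eye(3, dtype=torch.float64)
print(torch.allclose(G_inv @ G, identity, atol=1e-6))
```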

## A little more on PyTorch tensors

### Operations on tensors

In [46]:
x = torch.eye(3)
x.add(1) # Operations are also available as methods. Same as x + 1

tensor([[2., 1., 1.],
        [1., 2., 1.],
        [1., 1., 2.]])

In [47]:
x # x is not modified

tensor([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]])

In [48]:
x.add_(1) 
# Any operation that mutates a tensor in-place is post-fixed with an _. 
# For example: x.copy_(y), x.t_(), will change x.
x

tensor([[2., 1., 1.],
        [1., 2., 1.],
        [1., 1., 2.]])

In [51]:
x = torch.eye(3)
y = torch.eye(3) * 10
# copy_ writes the values of y into x in place
x.copy_(y)
x

tensor([[10.,  0.,  0.],
        [ 0., 10.,  0.],
        [ 0.,  0., 10.]])

### Concatenate

In [52]:
# By default, it concatenates along the first axis (concatenates rows)
x = torch.randn(1, 3)
y = torch.randn(2, 3)
print(x)
print(y)
z = torch.cat([x, y])
z

tensor([[ 1.2126, -1.1050, -0.4007]])
tensor([[-0.0779, -0.7310,  0.0293],
        [ 0.7298,  1.2243, -0.3148]])


tensor([[ 1.2126, -1.1050, -0.4007],
        [-0.0779, -0.7310,  0.0293],
        [ 0.7298,  1.2243, -0.3148]])

In [53]:
# Concatenate columns:
x = torch.randn(2, 1)
y = torch.randn(2, 2)
print(x)
print(y)
# second arg specifies which axis to concat along
z = torch.cat([x, y], 1)
print(z)

tensor([[-0.6222],
        [ 2.8865]])
tensor([[ 2.3767, -0.5537],
        [ 0.4160,  1.1959]])
tensor([[-0.6222,  2.3767, -0.5537],
        [ 2.8865,  0.4160,  1.1959]])
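Concatenation only works when the tensors agree on every dimension except the one being joined. A small sketch, with hypothetical shapes picked to make the rule visible:

```python
import torch

x = torch.zeros(2, 3)
y = torch.ones(4, 3)

# Joining along dim 0 works: both tensors have 3 columns.
rows = torch.cat([x, y], dim=0)
print(rows.shape)

# Joining along dim 1 fails: 2 rows vs. 4 rows.
try:
    torch.cat([x, y], dim=1)
    cat_failed = False
except RuntimeError as err:
    cat_failed = True
    print("cat along dim 1 failed:", err)
```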


### Indexing and broadcasting

In [54]:
x = torch.rand(4,3)
x

tensor([[0.0816, 0.2577, 0.7760],
        [0.3333, 0.2221, 0.5775],
        [0.1140, 0.5851, 0.2798],
        [0.0575, 0.4447, 0.7491]])

In [55]:
x[0, 0] # Access element (0, 0)

tensor(0.0816)

In [56]:
x[0] # Access the first row

tensor([0.0816, 0.2577, 0.7760])

In [57]:
x[0:2] # Access the first two rows

tensor([[0.0816, 0.2577, 0.7760],
        [0.3333, 0.2221, 0.5775]])

In [None]:
x[:, 0:2] # Access all the rows and the first two columns
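
Broadcasting lets tensors of different but compatible shapes be combined without copying data by hand, following the same rules as NumPy. A minimal sketch, using a small tensor with known values rather than the random `x` above:

```python
import torch

x = torch.arange(12, dtype=torch.float32).view(4, 3)   # 4 rows, 3 columns

# Shape (3,): one mean per column.
col_mean = x.mean(dim=0)
# (4, 3) - (3,): the column means are broadcast across all 4 rows.
centered = x - col_mean

# Shape (4, 1): one offset per row.
row_offset = torch.tensor([[0.0], [10.0], [20.0], [30.0]])
# (4, 3) + (4, 1): each offset is broadcast across the 3 columns of its row.
shifted = x + row_offset

print(centered)
print(shifted)
```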

### Reshaping

Use the .view() method to reshape a tensor. The total number of elements must stay the same.

This method is very useful because many neural network components 
expect their inputs to have a certain shape.

In [58]:
x = torch.randn(2, 3, 4)
x

tensor([[[-1.2028, -0.6602, -0.7507, -0.5043],
         [ 2.0428,  0.0962,  0.3422, -0.0451],
         [ 1.4442,  0.4766,  0.6738,  2.0606]],

        [[-0.2790,  1.0957,  2.2131, -1.0098],
         [ 0.6771, -1.0460,  0.7044,  0.7722],
         [-0.2685,  0.4074, -0.7081, -0.6265]]])

In [59]:
x.view(2, 12) # Reshape to 2 rows, 12 columns

tensor([[-1.2028, -0.6602, -0.7507, -0.5043,  2.0428,  0.0962,  0.3422, -0.0451,
          1.4442,  0.4766,  0.6738,  2.0606],
        [-0.2790,  1.0957,  2.2131, -1.0098,  0.6771, -1.0460,  0.7044,  0.7722,
         -0.2685,  0.4074, -0.7081, -0.6265]])

In [60]:
# Same as above.  If one of the dimensions is -1, its size can be inferred
x.view(2, -1)

tensor([[-1.2028, -0.6602, -0.7507, -0.5043,  2.0428,  0.0962,  0.3422, -0.0451,
          1.4442,  0.4766,  0.6738,  2.0606],
        [-0.2790,  1.0957,  2.2131, -1.0098,  0.6771, -1.0460,  0.7044,  0.7722,
         -0.2685,  0.4074, -0.7081, -0.6265]])
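
One caveat worth knowing: `.view()` never copies data, so it only works when the tensor's memory layout allows it. The sketch below uses a transposed tensor as an example of a layout where `.view()` fails, and shows two common ways around it:

```python
import torch

x = torch.arange(6).view(2, 3)

# A view shares memory with the original tensor.
flat_view = x.view(6)
flat_view[0] = 100
print(x[0, 0].item())          # x sees the change made through flat_view

# A transpose shares memory too, but is no longer contiguous.
xt = x.t()
print(xt.is_contiguous())

# .view() cannot flatten a non-contiguous tensor.
try:
    xt.view(6)
    view_failed = False
except RuntimeError as err:
    view_failed = True
    print("view failed:", err)

# .reshape() copies when needed; .contiguous().view() makes the copy explicit.
flat = xt.reshape(6)
flat2 = xt.contiguous().view(6)
print(flat)
```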

### Converting

In [61]:
B = torch.eye(3)
B

tensor([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]])

In [62]:
# torch --> numpy
A = B.numpy() 
A

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]], dtype=float32)

In [63]:
A[0,0] = 0. # A and B share the same memory, so modifying A also modifies B
B

tensor([[0., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]])

In [64]:
# numpy --> torch
A = np.eye(3)
B = torch.from_numpy(A)
B

tensor([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]], dtype=torch.float64)
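
`torch.from_numpy` shares memory with the array, just like `.numpy()` in the other direction. When an independent copy is wanted, `torch.tensor` can be used instead. A short sketch of the difference, which also converts the `float64` result to PyTorch's default `float32`:

```python
import numpy as np
import torch

A = np.eye(3)

shared = torch.from_numpy(A)   # shares memory with A
copied = torch.tensor(A)       # independent copy of A's data

A[0, 0] = 5.0                  # modify the NumPy array in place

print(shared[0, 0].item())     # follows the change made to A
print(copied[0, 0].item())     # unaffected by the change

# NumPy defaults to float64; .float() returns a float32 copy.
as_float32 = shared.float()
print(as_float32.dtype)
```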

## Move tensors to GPU with cuda package

If you have a GPU, make sure that a CUDA-enabled build of PyTorch is installed, for example:

```
conda install pytorch torchvision cuda91 -c pytorch
```
The exact command depends on your platform and CUDA version; check https://pytorch.org/ for details.

The `torch.cuda` package adds support for CUDA tensor types, which implement the same functions as CPU tensors but use GPUs for computation.

See : https://pytorch.org/docs/stable/cuda.html?highlight=cuda#module-torch.cuda

In [65]:
torch.cuda.current_device()

AssertionError: Torch not compiled with CUDA enabled

In [66]:
torch.cuda.is_available()

False

### Select device

In [None]:
device = torch.device("cuda:0" if torch.cuda.is_available() 
                      else "cpu")
device

### Move the tensor to device

In [None]:
w = torch.rand(5,3)
w = w.to(device) # .to() returns a tensor on `device`; reassign, since w is not moved in place
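
Putting the two steps together, a device-agnostic sketch might look like the following. It falls back to the CPU when no GPU is available, so it should run on either kind of machine; the tensor names and sizes are illustrative only:

```python
import torch

# Use the first GPU if one is available, otherwise the CPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

w = torch.rand(5, 3)
w_dev = w.to(device)                   # .to() returns a tensor on the chosen device
x = torch.rand(3, 2, device=device)    # or create the tensor directly on the device

# Operands of an operation must live on the same device.
y = w_dev @ x
print(y.device)

# Bring the result back to the CPU before converting to NumPy.
y_cpu = y.cpu().numpy()
print(y_cpu.shape)
```

Keeping the device in a single variable means the same code runs unchanged with or without a GPU.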

## Recap - what we learned so far

- Tensors represent data, model parameters and predictions
- Tensors are pretty much like NumPy arrays
- Tensor operations can be computed on GPUs to accelerate computation