# Tensor operations in PyTorch

A basic look at tensor operations in PyTorch.

- [Manipulating tensors](#manipulating-tensors)
  - [Matrix multiplication](#matrix-multiplication)
    - [Dot product rules](#dot-product-rules)
- [Tensor Aggregation](#tensor-aggregation)


In [14]:
# Importing packages
import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

In [15]:
# Set the default device to the Mac GPU (MPS backend)
torch.set_default_device('mps')

## Manipulating tensors

Tensor operations include:

- Addition
- Subtraction
- Division
- Scalar multiplication
- Element-wise multiplication
- Dot product

These operations are the core of deep learning: it's all a bunch of linear algebra.


In [16]:
# Creating tensor
tensor = torch.tensor([1.0, 2.0, 3.0])
tensor

tensor([1., 2., 3.], device='mps:0')

In [17]:
# Addition
tensor + 10

tensor([11., 12., 13.], device='mps:0')

In [18]:
# Same operation
tensor.add(10)

tensor([11., 12., 13.], device='mps:0')

In [19]:
# Scalar multiplication by 10
tensor * 10

tensor([10., 20., 30.], device='mps:0')

In [20]:
# Same operation
tensor.mul(10)

tensor([10., 20., 30.], device='mps:0')

In [21]:
# Subtracting
tensor - 10

tensor([-9., -8., -7.], device='mps:0')

In [22]:
# Same operation
tensor.sub(10)

tensor([-9., -8., -7.], device='mps:0')

As shown above, each basic operation has a corresponding method that performs the same computation. The Python operators are usually preferable because they are easier to read.
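
Division is in the list above but is not shown in the cells; it follows the same pattern. A minimal sketch, recreating the example tensor so the snippet runs on its own (the names `quotient_op` and `quotient_fn` are only illustrative):

```python
import torch

# Recreate the example tensor so this snippet is self-contained
tensor = torch.tensor([1.0, 2.0, 3.0])

# Division by a scalar: operator form and method form
quotient_op = tensor / 10
quotient_fn = tensor.div(10)

# Both forms should produce the same values
print(torch.equal(quotient_op, quotient_fn))
```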


### Matrix multiplication

There are two main ways of multiplying tensors in linear algebra: element-wise multiplication and the dot product.

- **_Element-wise multiplication_** (or Hadamard product) is applied between two tensors, and the multiplication happens element by element. If two tensors are of the same shape, each element in the first tensor is multiplied by the corresponding element in the second tensor.

$$\begin{pmatrix}1 & 2 & 3\\4 & 5 & 6\end{pmatrix}_{2\times3} \circ \begin{pmatrix}7 & 8 & 9\\10 & 11 & 12\end{pmatrix}_{2\times3} =$$
$$\begin{pmatrix}1\times7 & 2\times8 & 3\times9\\4\times10 & 5\times11 & 6\times12\end{pmatrix}_{2\times3} = $$
$$\begin{pmatrix}7 & 16 & 27\\40 & 55 & 72\end{pmatrix}_{2\times3}$$

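- **_Dot product_** (or matrix multiplication) is the most common way of multiplying tensors. Each entry of the result is obtained by multiplying a row of the first matrix with a column of the second, element by element, and summing the products.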

$$\begin{pmatrix}1 & 2 & 3\\4 & 5 & 6\end{pmatrix}_{2\times3} \cdot \begin{pmatrix}7 & 8 \\9 & 10 \\ 11 & 12\end{pmatrix}_{3\times2} =$$
$$\begin{pmatrix}1\times7 + 2\times9 + 3\times11 & 1\times8 + 2\times10 + 3\times12 \\4\times7 + 5\times9 + 6\times11 & 4\times8 + 5\times10 + 6\times12\end{pmatrix}_{2\times2} = $$
$$\begin{pmatrix}58 & 64 \\ 139 & 154\end{pmatrix}_{2\times2}$$
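
The two worked examples above can be checked directly in PyTorch. This is a small sketch using the same numbers; the variable names (`a`, `b_same`, `b_inner`) are made up for illustration:

```python
import torch

a = torch.tensor([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])           # shape (2, 3)
b_same = torch.tensor([[7.0, 8.0, 9.0],
                       [10.0, 11.0, 12.0]])   # shape (2, 3)
b_inner = torch.tensor([[7.0, 8.0],
                        [9.0, 10.0],
                        [11.0, 12.0]])        # shape (3, 2)

# Element-wise (Hadamard) product: shapes match, result is (2, 3)
hadamard = a * b_same

# Matrix product: inner dimensions match (3 and 3), result is (2, 2)
product = a @ b_inner

print(hadamard.tolist())  # [[7.0, 16.0, 27.0], [40.0, 55.0, 72.0]]
print(product.tolist())   # [[58.0, 64.0], [139.0, 154.0]]
```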


In [23]:
# Element-wise multiplication
tensor * tensor

tensor([1., 4., 9.], device='mps:0')

In [24]:
# Matrix multiplication
tensor.matmul(tensor)

tensor(14., device='mps:0')

For our tensor the matrix multiplication with itself is:

$$\begin{pmatrix} 1 & 2 & 3\end{pmatrix} \cdot \begin{pmatrix} 1 & 2 & 3\end{pmatrix}^T$$

- Written as matrices, the second operand has to be transposed for the shapes to line up. Since both inputs here are 1D, `matmul` computes the dot product directly and returns a 0-dimensional (scalar) tensor, so no explicit transpose is needed.

$$\begin{pmatrix} 1 \times 1 + 2 \times 2 + 3 \times3 \end{pmatrix} = $$

$$(14)$$
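
To make the shapes explicit, the sketch below compares the 1D case with the same product written as a $1\times3$ row times a $3\times1$ column. The names `row` and `col` are illustrative, and `unsqueeze` is used here only to add the extra dimension:

```python
import torch

v = torch.tensor([1.0, 2.0, 3.0])   # shape (3,)

# 1D @ 1D: the dot product, returned as a 0-dimensional tensor
dot = v @ v

# The same product with explicit 2D shapes
row = v.unsqueeze(0)                # shape (1, 3)
col = v.unsqueeze(1)                # shape (3, 1)
as_matrix = row @ col               # shape (1, 1)

print(dot.item(), as_matrix.item())
print(dot.shape, as_matrix.shape)
```

Both results hold the same value; only the number of dimensions differs.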


In [27]:
# Matrix multiplication with operator '@'
tensor @ tensor

tensor(14., device='mps:0')

Timing the operation done by hand against the built-in torch function:


In [25]:
%%time
value = 0
for i in range(len(tensor)):
    value += tensor[i] * tensor[i]
value

CPU times: user 8.01 ms, sys: 1.49 ms, total: 9.5 ms
Wall time: 8.33 ms


tensor(14., device='mps:0')

In [26]:
%%time
tensor.matmul(tensor)

CPU times: user 293 μs, sys: 212 μs, total: 505 μs
Wall time: 275 μs


tensor(14., device='mps:0')
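
A single `%%time` run can be noisy, especially for an operation this small. As a rough alternative, the sketch below repeats each version many times with the standard-library `timeit` module. The helper `dot_by_hand` and the repetition count are assumptions made for this example, and the numbers will vary by machine and device:

```python
import timeit
import torch

tensor = torch.tensor([1.0, 2.0, 3.0])

def dot_by_hand(t):
    # Accumulate the products one element at a time, as in the loop above
    value = 0
    for i in range(len(t)):
        value += t[i] * t[i]
    return value

# Total time in seconds for 1000 calls of each version
loop_seconds = timeit.timeit(lambda: dot_by_hand(tensor), number=1000)
matmul_seconds = timeit.timeit(lambda: tensor.matmul(tensor), number=1000)

print(f"loop:   {loop_seconds:.4f} s for 1000 runs")
print(f"matmul: {matmul_seconds:.4f} s for 1000 runs")
```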

#### Dot product rules

The dot product only works when the shapes of the two operands follow these rules:

1. The **_inner_** dimensions must match
   - $2\times3$ and $3\times2$ match.
   - $2\times3$ and $2\times3$ don't.
2. The resulting matrix has the shape of the **_outer_** dimensions.
   - $2\times3$ and $3\times2$ output $2\times2$.
   - $3\times2$ and $2\times3$ output $3\times3$.
   - $1\times3$ and $3\times1$ output $1\times1$ (a single value).
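
Both rules can be checked by inspecting `.shape` and by trying a multiplication whose inner dimensions do not match. A minimal sketch with random matrices (the flag `mismatch_raised` exists only for this illustration):

```python
import torch

a = torch.rand(2, 3)
b = torch.rand(3, 2)

# Inner dimensions match, so the result takes the outer dimensions
print((a @ b).shape)   # torch.Size([2, 2])
print((b @ a).shape)   # torch.Size([3, 3])

# 2x3 times 2x3: inner dimensions (3 and 2) differ, so PyTorch raises an error
try:
    a @ a
    mismatch_raised = False
except RuntimeError as error:
    mismatch_raised = True
    print("shape mismatch:", error)
```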


In [None]:
# Multiplying matrices
torch.matmul(torch.rand(3, 2), torch.rand(2, 3))

tensor([[0.2605, 0.1980, 0.2973],
        [0.4970, 0.5551, 0.7661],
        [0.3733, 0.5208, 0.6919]], device='mps:0')

You can use `matmul` and `mm` interchangeably for 2D tensors. The difference is that `mm` only accepts 2D tensors and doesn't broadcast, while `matmul` also handles 1D and batched inputs.
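
The difference shows up once one operand has a batch dimension. In this sketch the shapes and the names `batch` and `weights` are chosen arbitrarily for illustration:

```python
import torch

m1 = torch.rand(2, 3)
m2 = torch.rand(3, 4)

# For 2D inputs the two functions agree
same_for_2d = torch.allclose(torch.mm(m1, m2), torch.matmul(m1, m2))

batch = torch.rand(5, 2, 3)    # five 2x3 matrices
weights = torch.rand(3, 4)     # one 3x4 matrix

# matmul broadcasts the 3x4 matrix across the batch dimension
batched = torch.matmul(batch, weights)   # shape (5, 2, 4)

# mm only accepts 2D tensors, so the batched input is rejected
try:
    torch.mm(batch, weights)
    mm_rejected = False
except RuntimeError:
    mm_rejected = True

print(same_for_2d, batched.shape, mm_rejected)
```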


In [None]:
# Multiplying matrices using torch.mm
torch.mm(torch.rand(2, 42), torch.rand(42, 7))

tensor([[10.4685, 12.1434, 11.1493,  8.2612, 11.0989,  9.1639, 10.0399],
        [ 9.1047, 11.0180,  8.9983,  7.8798,  9.8034,  8.2894,  9.9635]],
       device='mps:0')

To multiply two non-square matrices of the same shape, you need to transpose one of them so that the inner dimensions match. A transpose swaps the two axes of a matrix, turning a $3\times4$ matrix into a $4\times3$ one.


In [47]:
# Transposing matrix
torch.matmul(torch.rand(3, 4), torch.rand(3, 4).T)

tensor([[0.5316, 1.1483, 1.1191],
        [0.4536, 1.1741, 0.8876],
        [0.4417, 0.7525, 0.6905]], device='mps:0')

In [None]:
# Creating matrices
tensor_a = torch.tensor(
    [
        [1.0, 2.0],
        [3.0, 4.0],
        [5.0, 6.0],
    ],
)

tensor_b = torch.tensor(
    [
        [7.0, 8.0],
        [9.0, 10.0],
        [11.0, 12.0],
    ],
)

In [None]:
# Transposing tensor b to multiply with tensor a
tensor_a.matmul(tensor_b.T)

tensor([[ 23.,  29.,  35.],
        [ 53.,  67.,  81.],
        [ 83., 105., 127.]], device='mps:0')

In [None]:
# Multiplying transposed tensor b with tensor a
tensor_b.T.matmul(tensor_a)

tensor([[ 89., 116.],
        [ 98., 128.]], device='mps:0')

## Tensor Aggregation

Aggregation operations reduce a tensor to a single value: min, max, mean, and sum, plus argmin and argmax, which return the index of the minimum or maximum element.

Most aggregations can be called in two ways: as a function, `torch.<aggregation function>(<tensor>)`, or as a method, `<tensor>.<aggregation function>()`.


In [77]:
# Create tensor
x = torch.arange(0, 100, 10, dtype=torch.float32)
x

tensor([ 0., 10., 20., 30., 40., 50., 60., 70., 80., 90.], device='mps:0')

In [51]:
# Finding the min
torch.min(x), x.min()

(tensor(0., device='mps:0'), tensor(0., device='mps:0'))

In [52]:
# Finding the max
torch.max(x), x.max()

(tensor(90., device='mps:0'), tensor(90., device='mps:0'))

In [53]:
# Finding mean
torch.mean(x), x.mean()

(tensor(45., device='mps:0'), tensor(45., device='mps:0'))
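
The explicit `dtype=torch.float32` in the tensor above matters for `mean`: without it, `torch.arange` with integer arguments produces an integer tensor, and `mean` needs a floating-point (or complex) input. A short sketch of that behaviour, with illustrative names (`ints`, `mean_failed`, `float_mean`):

```python
import torch

# Without an explicit dtype, integer arguments give an int64 tensor
ints = torch.arange(0, 100, 10)

# mean() is not defined for integer tensors and raises an error
try:
    ints.mean()
    mean_failed = False
except RuntimeError as error:
    mean_failed = True
    print("mean failed:", error)

# Converting to a floating-point dtype first makes it work
float_mean = ints.to(torch.float32).mean()
print(float_mean.item())  # 45.0
```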

In [54]:
# Getting sum
torch.sum(x), x.sum()

(tensor(450., device='mps:0'), tensor(450., device='mps:0'))

In [55]:
# Getting index of min
torch.argmin(x), x.argmin()

(tensor(0, device='mps:0'), tensor(0, device='mps:0'))

In [56]:
# Getting index of max
torch.argmax(x), x.argmax()

(tensor(9, device='mps:0'), tensor(9, device='mps:0'))