In [1]:
import torch

# Mathematical Operations with PyTorch

PyTorch, like other frameworks that work with arrays and tensors, is highly efficient at matrix operations. These operations run in parallel and can be moved to the GPU if you have a CUDA-compatible graphics card.
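
As a minimal sketch of how a tensor might be moved to the GPU, the snippet below picks a device with `torch.cuda.is_available()` and falls back to the CPU otherwise. The names `device`, `x`, `x_dev` and `y_dev` are illustrative and not part of this notebook.

```python
import torch

# Use the GPU when one is available, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.ones(size=(2, 2), dtype=torch.float32)
x_dev = x.to(device)  # tensor on the chosen device

# The addition runs on whichever device holds the operands.
y_dev = x_dev + x_dev
print(y_dev.device)
```

Both operands of an operation need to live on the same device, so it is usually easiest to move all tensors once, right after creating them.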

In [2]:
A = torch.ones(size=(2, 2), dtype=torch.float32)
B = torch.tensor([[1, 2],[3, 4]], dtype=torch.float32)

In [3]:
A

tensor([[1., 1.],
        [1., 1.]])

In [4]:
B

tensor([[1., 2.],
        [3., 4.]])

## Common Matrix Operations

Addition works as expected: $ \mathbf{A} + \mathbf{B} $. Corresponding elements of the two matrices are added.

In [5]:
A + B

tensor([[2., 3.],
        [4., 5.]])

Subtraction is just as easy to grasp: $ \mathbf{A} - \mathbf{B} $. Corresponding elements of the two matrices are subtracted.

In [6]:
A - B

tensor([[ 0., -1.],
        [-2., -3.]])

The multiplication operator `*` should not be confused with matrix multiplication. When we apply `*` to two matrices, corresponding elements are multiplied.

In [7]:
A * B

tensor([[1., 2.],
        [3., 4.]])

The same goes for the division operator `/`. When we apply it, corresponding elements are divided.

In [8]:
A / B

tensor([[1.0000, 0.5000],
        [0.3333, 0.2500]])

In addition to the operators above, PyTorch provides methods for the same operations: `add()`, `subtract()`, `multiply()`, and `divide()`.

In [9]:
print(A.add(B))
print(A.subtract(B))
print(A.multiply(B))
print(A.divide(B))

tensor([[2., 3.],
        [4., 5.]])
tensor([[ 0., -1.],
        [-2., -3.]])
tensor([[1., 2.],
        [3., 4.]])
tensor([[1.0000, 0.5000],
        [0.3333, 0.2500]])
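
The method forms can do slightly more than the operators. As one example, `add()` accepts an optional `alpha` keyword that scales the second tensor before adding it. The sketch below uses the variable name `scaled`, which is illustrative.

```python
import torch

A = torch.ones(size=(2, 2), dtype=torch.float32)
B = torch.tensor([[1, 2], [3, 4]], dtype=torch.float32)

# A.add(B, alpha=2) computes A + 2 * B in a single call.
scaled = A.add(B, alpha=2)
print(scaled)
print(torch.equal(scaled, A + 2 * B))  # True
```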


Each of these methods has a counterpart that changes the tensor in place. The in-place methods always end with `_`: `add_()`, `subtract_()`, `multiply_()`, `divide_()`.

In [10]:
test = torch.tensor([[1, 2], [4, 4]], dtype=torch.float32)

In [11]:
test.add_(A)
# the test tensor was changed
print(test)

tensor([[2., 3.],
        [5., 5.]])
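
To make the difference between the two variants concrete, the sketch below compares the memory address of the result with that of the original tensor via `data_ptr()`. The names `t`, `ones`, `out_of_place` and `in_place` are illustrative.

```python
import torch

t = torch.tensor([[1, 2], [4, 4]], dtype=torch.float32)
ones = torch.ones(size=(2, 2), dtype=torch.float32)

# add() allocates a new tensor and leaves t untouched.
out_of_place = t.add(ones)
print(out_of_place.data_ptr() == t.data_ptr())  # False

# add_() writes the result into the memory of t.
in_place = t.add_(ones)
print(in_place.data_ptr() == t.data_ptr())  # True
print(t)
```

In-place operations save memory, but they overwrite values that other code may still rely on, so the out-of-place variants are the safer default.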


If we want to apply matrix multiplication $ \mathbf{A} \mathbf{B} $, we use the `matmul` method.

In [12]:
A.matmul(B)

tensor([[4., 6.],
        [4., 6.]])

In [13]:
B.matmul(A)

tensor([[3., 3.],
        [7., 7.]])

In [14]:
torch.matmul(A, B)

tensor([[4., 6.],
        [4., 6.]])

Alternatively, we can use the `@` operator as a convenient shorthand for matrix multiplication.

In [15]:
A @ B

tensor([[4., 6.],
        [4., 6.]])
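
The results above can be checked by hand: entry $(i, j)$ of the matrix product is the dot product of row $i$ of the first matrix and column $j$ of the second. The following sketch rebuilds the product with explicit loops (the name `manual` is illustrative) and contrasts it with the element-wise `*` operator.

```python
import torch

A = torch.ones(size=(2, 2), dtype=torch.float32)
B = torch.tensor([[1, 2], [3, 4]], dtype=torch.float32)

# Build the matrix product entry by entry from row-column dot products.
manual = torch.zeros(2, 2)
for i in range(2):
    for j in range(2):
        manual[i, j] = torch.dot(A[i, :], B[:, j])

print(torch.equal(manual, A @ B))  # True: matches the built-in product
print(torch.equal(A @ B, B @ A))   # False: the order of the operands matters
print(torch.equal(A * B, A @ B))   # False: element-wise product differs
```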

## Broadcasting

Broadcasting is a technique by which tensors of different shapes can still be combined in mathematical operations like addition and subtraction.

In [16]:
A = torch.ones(size=(2,), dtype=torch.float32)
B = torch.tensor([[1, 2],[3, 4]], dtype=torch.float32)

In [17]:
A.shape

torch.Size([2])

In [18]:
B.shape

torch.Size([2, 2])

The shapes of the two tensors are different, so in strict linear algebra terms the addition is not defined. Yet below we see that the addition still works in PyTorch. This is due to broadcasting. PyTorch conceptually "duplicates" the smaller tensor to match the shape of the larger tensor. Broadcasting is an advanced topic, but the operations and the "duplication" process usually work in an intuitive way.

In [19]:
A + B

tensor([[2., 3.],
        [4., 5.]])
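
The "duplication" can be made explicit with `expand()`, which views the vector as a matrix with repeated rows. The sketch below also shows a column vector being broadcast across columns and a shape combination that fails. The names `a`, `a_expanded`, `col`, `col_sum` and `failed` are illustrative.

```python
import torch

a = torch.ones(size=(2,), dtype=torch.float32)
B = torch.tensor([[1, 2], [3, 4]], dtype=torch.float32)

# View the vector of shape (2,) as a (2, 2) matrix with two identical rows.
a_expanded = a.expand(2, 2)
print(a_expanded.shape)
print(torch.equal(a + B, a_expanded + B))  # True

# A (2, 1) column is repeated along the columns instead.
col = torch.tensor([[10.0], [20.0]])
col_sum = col + B
print(col_sum)

# Sizes are compared from the last dimension backwards and must match or be 1.
failed = False
try:
    torch.ones(3) + B  # 3 does not match 2, so this cannot be broadcast
except RuntimeError:
    failed = True
print(failed)  # True
```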

## Tensor Functions

PyTorch also provides a wide variety of mathematical functions that can be applied to tensors element by element, for example `exp()`, `log()`, `sin()` and `cos()`.

In [20]:
B

tensor([[1., 2.],
        [3., 4.]])

In [21]:
B.exp()

tensor([[ 2.7183,  7.3891],
        [20.0855, 54.5981]])

In [22]:
B.log()

tensor([[0.0000, 0.6931],
        [1.0986, 1.3863]])

In [23]:
B.cos()

tensor([[ 0.5403, -0.4161],
        [-0.9900, -0.6536]])
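
These functions exist both as tensor methods and as functions in the `torch` namespace. The short sketch below checks that both forms agree, that `log()` undoes `exp()` up to rounding, and what `log()` returns outside its domain. The name `edge` is illustrative.

```python
import torch

B = torch.tensor([[1, 2], [3, 4]], dtype=torch.float32)

# Method form and function form compute the same values.
print(torch.equal(B.exp(), torch.exp(B)))  # True

# log is the inverse of exp, up to floating-point rounding.
print(torch.allclose(B.exp().log(), B))  # True

# Outside its domain, log returns -inf for 0 and nan for negative inputs.
edge = torch.tensor([0.0, -1.0]).log()
print(edge)
```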

## Reduction Operations

Often in deep learning we need to reduce a tensor through certain operations, for example to find the maximum value in a tensor or to calculate the average. We often use the `dim` parameter to reduce the tensor along a specific dimension. Below we can see some examples.

Below we look for the index that corresponds to the maximum value in a tensor using the `argmax(dim)` method: $ \arg\max A $. With `dim=1` we get one index per row.

In [24]:
# argmax, argmin return the index
A = torch.tensor([[3, 4, 1, 22, 9, 2, 5], [3, 4, 1, 22, 9, 2, 88]])
A.shape

torch.Size([2, 7])

In [25]:
torch.argmax(A, dim=1)

tensor([3, 6])

Below we look for the maximum value in the tensor using the `amax(dim)` method: $ \max A $.

In [26]:
# amax, amin return the max, min value.
A = torch.tensor([[3, 4, 1, 22, 9, 2, 5], [3, 4, 1, 22, 9, 2, 88]])
torch.amax(A, dim=1)

tensor([22, 88])

The `max(dim)` method combines the `amax()` and the `argmax()` methods: it returns both the maximum values and their indices.

In [27]:
# max, min return both indices and values
A = torch.tensor([[3, 4, 1, 22, 9, 2, 5], [3, 4, 1, 22, 9, 2, 88]])
torch.max(A, dim=1)

torch.return_types.max(
values=tensor([22, 88]),
indices=tensor([3, 6]))
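
The returned object behaves like a named tuple, so it can be unpacked directly. The sketch below does so and then uses `gather()` to confirm that the indices point at the returned values. The names `values`, `indices` and `picked` are illustrative.

```python
import torch

A = torch.tensor([[3, 4, 1, 22, 9, 2, 5], [3, 4, 1, 22, 9, 2, 88]])

# Unpack the result into the maxima and their positions.
values, indices = torch.max(A, dim=1)
print(values)   # tensor([22, 88])
print(indices)  # tensor([3, 6])

# Pick the element at each row's index; this reproduces the maxima.
picked = A.gather(1, indices.unsqueeze(1)).squeeze(1)
print(torch.equal(picked, values))  # True
```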

PyTorch also provides `sum(dim)`, `mean(dim)` and many more.

In [28]:
A = torch.tensor([1, 2, 3, 4, 5], dtype=torch.float32)

In [29]:
A.mean()

tensor(3.)

In [30]:
A.sum()

tensor(15.)
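
The calls above reduce over all elements. As a small sketch of what the `dim` parameter changes, the example below reduces a 2x3 matrix along each dimension and uses `keepdim=True` to retain the reduced dimension with size 1. The tensor `M` and the result names are illustrative.

```python
import torch

M = torch.tensor([[1, 2, 3], [4, 5, 6]], dtype=torch.float32)

total = M.sum()                         # all elements: tensor(21.)
col_sums = M.sum(dim=0)                 # collapses the rows: tensor([5., 7., 9.])
row_sums = M.sum(dim=1)                 # collapses the columns: tensor([ 6., 15.])
row_means = M.mean(dim=1, keepdim=True) # shape (2, 1) instead of (2,)

print(total, col_sums, row_sums)
print(row_means.shape)
```

Keeping the reduced dimension is handy when the result should be broadcast back against the original tensor, for example `M - row_means` to center each row.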