In [4]:
import torch
from torch import einsum

### 1. Transpose

##### Example 1

In [66]:
a = torch.arange(6).reshape(2, 3)

In [71]:
a.shape

torch.Size([2, 3])

Transpose matrix `a` using `einsum`, then explain the notation.

In [72]:
output = einsum('ij->ji', a)

**Explain**: `ij -> ji` means that the element `a_ij` of the input becomes the element `output_ji` of the output, so rows and columns are swapped.

In [73]:
output.shape

torch.Size([3, 2])
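As a quick check, the einsum result can be compared against the built-in transpose and an explicit loop. This is a minimal sketch; the names `expected` and `looped` are illustrative and do not come from the original notebook.

```python
import torch
from torch import einsum

a = torch.arange(6).reshape(2, 3)
output = einsum('ij->ji', a)

# The built-in transpose should agree with the einsum result.
expected = a.T

# Explicit loop form of 'ij->ji': output[j, i] = a[i, j]
looped = torch.empty(3, 2, dtype=a.dtype)
for i in range(2):
    for j in range(3):
        looped[j, i] = a[i, j]

print(torch.equal(output, expected))  # True
print(torch.equal(output, looped))    # True
```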

### 2. Sum

##### Example 2

In [75]:
a = torch.arange(6).reshape(2, 3)

In [76]:
a.shape

torch.Size([2, 3])

In [78]:
output = einsum('ij->', a)

**Explain**: `ij ->` means that both `i` and `j` are missing from the output, so every element `a_ij` is summed into a single scalar.
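The same rule, that an index missing from the output is summed over, also gives row and column sums. A small sketch, with `total`, `row_sums` and `col_sums` as illustrative names:

```python
import torch
from torch import einsum

a = torch.arange(6).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]

total = einsum('ij->', a)      # sum over i and j
row_sums = einsum('ij->i', a)  # sum over j only
col_sums = einsum('ij->j', a)  # sum over i only

print(total.item())       # 15
print(row_sums.tolist())  # [3, 12]
print(col_sums.tolist())  # [3, 5, 7]
```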

### 3. Multiplication

##### Example 1

In [10]:
a = torch.tensor([0, 1, 2, 3])

In [11]:
b = torch.tensor([0, 1, 2, 3])

Multiply every `a_i` by every `b_j` and sum over `j`. Since `i` is kept, the output has the same size as `a`.

In [12]:
einsum('i,j -> i', a, b)

tensor([ 0,  6, 12, 18])

With no output index, the products are summed over both `i` and `j`, so the output is a scalar.

In [14]:
einsum('i,j ->', a, b)

tensor(36)

In [15]:
einsum('i,j -> ij', a, b)

tensor([[0, 0, 0, 0],
        [0, 1, 2, 3],
        [0, 2, 4, 6],
        [0, 3, 6, 9]])
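The three calls above can be cross-checked against ordinary tensor operations. The sketch below is only a sanity check under the same inputs; the variable names are illustrative, and the final dot product (`'i,i->'`, where both vectors share one index) is added for comparison and is not in the original notebook.

```python
import torch
from torch import einsum

a = torch.tensor([0, 1, 2, 3])
b = torch.tensor([0, 1, 2, 3])

scaled = einsum('i,j->i', a, b)   # each a[i] times the sum of b
total = einsum('i,j->', a, b)     # sum of a times sum of b
outer = einsum('i,j->ij', a, b)   # outer product, nothing summed
dot = einsum('i,i->', a, b)       # shared index: dot product

print(torch.equal(scaled, a * b.sum()))       # True
print(torch.equal(total, a.sum() * b.sum()))  # True
print(torch.equal(outer, torch.outer(a, b)))  # True
print(dot.item())                             # 14
```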

##### Example 2

In [47]:
a, b = torch.randn(2, 3), torch.randn(2, 3)

In [48]:
a.shape, b.shape

(torch.Size([2, 3]), torch.Size([2, 3]))

Do matrix multiplication between `a` and the transpose of `b`

In [52]:
a@b.T

tensor([[ 2.9083,  1.0462],
        [ 0.2614, -0.3838]])

In [53]:
einsum('ij, ji -> i', a, b)

RuntimeError: einsum(): subscript j has size 2 for operand 1 which does not broadcast with previously seen size 3

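In `ij, ji`, the labels are assigned as follows.

For `A`
- `i` represents the first dim of `A`
- `j` represents the second dim of `A`

For `B`
- `j` represents the first dim of `B`
- `i` represents the second dim of `B`

A label must have the same size wherever it appears. Here `j` has size 3 in `A` but size 2 in `B`, which is why the call above fails. `ij, ji` only works when `B` has the transposed shape of `A`, for example `(3, 2)`.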

**Notations**
- `ij, ji` means: multiply each element `A_ij` by the element `B_ji`
- `-> ji` means: store the result in the element `output_ji` of the output matrix; an index that is missing from the output is summed over
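One way to reproduce `a @ b.T` with `einsum` for two `(2, 3)` inputs is to give the two row dimensions distinct labels and share only the label of the dimension being summed. A minimal sketch, where the label `k`, the name `out`, and the seed are choices made here for illustration:

```python
import torch
from torch import einsum

torch.manual_seed(0)  # arbitrary seed, only for repeatability
a, b = torch.randn(2, 3), torch.randn(2, 3)

# i: rows of a, k: rows of b, j: shared dimension (summed because it is not in the output)
out = einsum('ij,kj->ik', a, b)

print(out.shape)                                # torch.Size([2, 2])
print(torch.allclose(out, a @ b.T, atol=1e-6))  # True
```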

##### Example 3

In [24]:
a, b = torch.tensor([[1, 2], [3, 4]]), torch.tensor([[5, 6], [7, 8]])

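Do elementwise multiplication between `a` and the transpose of `b`: each element `A_ij` is multiplied by `B_ji`.

If the output has no indices (`ij,ji ->`), all the products are summed into a scalar:

$\begin{aligned}
& \text{initialize output as } 0 \\
& \text{for each } i: \\
& \quad \text{for each } j: \\
& \quad \quad \text{output} \mathrel{+}= A_{ij} * B_{ji}
\end{aligned}$

If the output keeps both indices (`-> ij` or `-> ji`), nothing is summed and each product is stored on its own:

$\text{output}_{ij} = A_{ij} * B_{ji}$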

In [31]:
a, b

(tensor([[1, 2],
         [3, 4]]),
 tensor([[5, 6],
         [7, 8]]))

In [32]:
einsum('ij,ji -> ij', a, b)

tensor([[ 5, 14],
        [18, 32]])

In [29]:
einsum('ij,ji -> ji', a, b)

tensor([[ 5, 18],
        [14, 32]])
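The two results above can be reproduced with an explicit loop, which also makes the relation between them visible. This is a sketch under the same inputs. The names `out_ij`, `trace` and `matmul` are illustrative, and the last two calls (`'ij,ji->'` and `'ij,jk->ik'`) are added for comparison rather than taken from the notebook.

```python
import torch
from torch import einsum

a = torch.tensor([[1, 2], [3, 4]])
b = torch.tensor([[5, 6], [7, 8]])

# Loop form of 'ij,ji->ij': one product per output element, no summation.
out_ij = torch.zeros(2, 2, dtype=a.dtype)
for i in range(2):
    for j in range(2):
        out_ij[i, j] = a[i, j] * b[j, i]

print(torch.equal(out_ij, einsum('ij,ji->ij', a, b)))    # True
print(torch.equal(out_ij, a * b.T))                      # True
print(torch.equal(out_ij.T, einsum('ij,ji->ji', a, b)))  # True

# Dropping both output indices sums every product: the trace of a @ b.
trace = einsum('ij,ji->', a, b)
print(trace.item())     # 69

# An actual matrix product needs a third label for the columns of b.
matmul = einsum('ij,jk->ik', a, b)
print(matmul.tolist())  # [[19, 22], [43, 50]]
```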