In [2]:
import torch
import numpy as np

# Broadcasting


In [23]:
x = torch.tensor([1, 2, 3, 4])
print(x==1)

tensor([ True, False, False, False])


The code above is equivalent to the following:

In [24]:
y = torch.tensor([1]*x.numel())  # Manual equivalent of what broadcasting does implicitly
print(x, y, x==y, sep='\n')

tensor([1, 2, 3, 4])
tensor([1, 1, 1, 1])
tensor([ True, False, False, False])


Why is it called broadcasting?

Because comparing the whole tensor `x` to the scalar `1` as a single mathematical object is not very interesting: a 4-element tensor is never equal to a scalar, so the answer would always be False. Instead, PyTorch "broadcasts" the scalar `1` to every cell: it behaves as if there were a tensor `y` with the same dimensions as `x`, filled with `1`, compares `x` and `y` cell by cell, and returns a boolean mask tensor.

When we apply operators to tensors with different dimensions, one of them (or both) is broadcast before the operation is performed: its dimensions are expanded, and the operation runs only once both tensors have the same dimensions.
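A minimal sketch of that expansion step, assuming `torch.broadcast_tensors` is used to make it visible. The name `one` is introduced here only for illustration.

```python
import torch

x = torch.tensor([1, 2, 3, 4])
one = torch.tensor(1)  # 0-dimensional (scalar) tensor, hypothetical name

# broadcast_tensors returns both inputs expanded to a common shape
x_b, one_b = torch.broadcast_tensors(x, one)
print(x_b.shape, one_b.shape)  # both have shape [4]
print(one_b)                   # the scalar repeated in every cell

# Comparing against the scalar gives the same mask as comparing the expanded tensors
same_mask = torch.equal(x == 1, x_b == one_b)
print(same_mask)
```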

## Broadcasting - example 1

In [25]:
A=torch.arange(9).reshape(3,3)
B=torch.arange(3)
print(A, B, A+B, sep='\n') # Add A+B

tensor([[0, 1, 2],
        [3, 4, 5],
        [6, 7, 8]])
tensor([0, 1, 2])
tensor([[ 0,  2,  4],
        [ 3,  5,  7],
        [ 6,  8, 10]])


### Explanation - `expand_as()`

The tensor `B` is expanded (<b>broadcast</b>) to a higher dimension:

In [26]:
B.expand_as(A)

tensor([[0, 1, 2],
        [0, 1, 2],
        [0, 1, 2]])

Only after this broadcasting step does PyTorch add `A` and `B`:

In [27]:
A + B.expand_as(A) # Equivalent to: A+B

tensor([[ 0,  2,  4],
        [ 3,  5,  7],
        [ 6,  8, 10]])

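But why is `B` broadcast as [0, 1, 2] in every row, and not as [0, 0, 0], [1, 1, 1], [2, 2, 2] in the rows (the transpose)? Shapes are aligned starting from the last dimension, so `B` with size 3 is treated as 1x3 and repeated along the rows.

Would the answer change with the other layout? Yes, it would. To get that layout, reshape `B` into a 3x1 column first: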

In [28]:
B.reshape(3,1).expand_as(A)

tensor([[0, 0, 0],
        [1, 1, 1],
        [2, 2, 2]])
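To see that the two layouts really give different sums, here is a short sketch that reuses `A` and `B` from example 1. The names `row_sum` and `col_sum` are illustrative.

```python
import torch

A = torch.arange(9).reshape(3, 3)
B = torch.arange(3)

row_sum = A + B                # B treated as shape (1, 3): repeated down the rows
col_sum = A + B.reshape(3, 1)  # B as shape (3, 1): repeated across the columns

print(row_sum)
print(col_sum)
print(torch.equal(row_sum, col_sum))  # the two results differ
```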

## Broadcasting - example 2

Notice: `A` is a single column (3x1) and `B` is a single row (1x4), yet `A+B` is a full 3x4 matrix, because both tensors are broadcast.

In [29]:
A=torch.arange(3).reshape(3,1)
B=torch.arange(4).reshape(1,4)
print(A,B,A+B,sep='\n')

tensor([[0],
        [1],
        [2]])
tensor([[0, 1, 2, 3]])
tensor([[0, 1, 2, 3],
        [1, 2, 3, 4],
        [2, 3, 4, 5]])
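In this example both operands are expanded, not just one. A small sketch that makes the two expansions explicit with `expand` (the names `A_b` and `B_b` are illustrative):

```python
import torch

A = torch.arange(3).reshape(3, 1)
B = torch.arange(4).reshape(1, 4)

A_b = A.expand(3, 4)  # the column is repeated 4 times across
B_b = B.expand(3, 4)  # the row is repeated 3 times down

print(A_b)
print(B_b)
print(torch.equal(A + B, A_b + B_b))  # same result as the implicit broadcast
```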


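## Broadcasting - example 3 - can't add different sizes

You can't add just any two tensors. In the previous example a column and a row could be added because every mismatched dimension had size 1 on one side. Here dimension 1 has size 2 for `A` and 5 for `B`; neither is 1, so the shapes can't be broadcast.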

In [30]:
A=torch.arange(6).reshape(3,2)
B=torch.arange(5).reshape(1,5)
print(A,B,sep='\n')
A+B

tensor([[0, 1],
        [2, 3],
        [4, 5]])
tensor([[0, 1, 2, 3, 4]])


RuntimeError: The size of tensor a (2) must match the size of tensor b (5) at non-singleton dimension 1
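The failure can be caught like any other exception. The sketch below also shows one possible workaround, under the assumption that what was wanted is every element of `A` combined with every element of `B`; the names `failed` and `C` are illustrative.

```python
import torch

A = torch.arange(6).reshape(3, 2)
B = torch.arange(5).reshape(1, 5)

try:
    A + B
    failed = False
except RuntimeError as err:
    failed = True
    print("Broadcast failed:", err)

# Assumption: an outer-style sum is intended. Giving each tensor its own axis
# means every pair of aligned sizes contains a 1, so broadcasting is valid.
C = A.reshape(3, 2, 1) + B.reshape(1, 1, 5)
print(C.shape)
```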

# Rules for broadcasting

![](img/rules.png)
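As a rough sketch, the standard PyTorch broadcasting rules can be written as a small pure-Python function: align the shapes from the last dimension, treat missing leading dimensions as 1, and require each pair of sizes to be equal or to contain a 1. The function name `broadcast_shape` is hypothetical and is not part of PyTorch.

```python
def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or None if the shapes are incompatible."""
    result = []
    # Walk both shapes from the trailing dimension towards the leading one.
    for i in range(1, max(len(shape_a), len(shape_b)) + 1):
        a = shape_a[-i] if i <= len(shape_a) else 1  # missing dims count as 1
        b = shape_b[-i] if i <= len(shape_b) else 1
        if a != b and a != 1 and b != 1:
            return None  # sizes differ and neither is 1
        result.append(b if a == 1 else a)
    return tuple(reversed(result))


print(broadcast_shape((3, 3), (3,)))     # example 1
print(broadcast_shape((3, 1), (1, 4)))   # example 2
print(broadcast_shape((3, 2), (1, 5)))   # example 3: incompatible
```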

## Broadcasting - example 4 - summary


In [6]:
A=torch.arange(24).reshape(2,3,4)
B=torch.tensor([0,1,2,3])
print("A", A.size(), A,sep='\n')
print("B", B.size(), B, sep='\n')

C=B.expand_as(A)
print("C", C.size(), C, sep='\n')

D=A+B
print("D", D.size(), D, sep='\n')

A
torch.Size([2, 3, 4])
tensor([[[ 0,  1,  2,  3],
         [ 4,  5,  6,  7],
         [ 8,  9, 10, 11]],

        [[12, 13, 14, 15],
         [16, 17, 18, 19],
         [20, 21, 22, 23]]])
B
torch.Size([4])
tensor([0, 1, 2, 3])
C
torch.Size([2, 3, 4])
tensor([[[0, 1, 2, 3],
         [0, 1, 2, 3],
         [0, 1, 2, 3]],

        [[0, 1, 2, 3],
         [0, 1, 2, 3],
         [0, 1, 2, 3]]])
D
torch.Size([2, 3, 4])
tensor([[[ 0,  2,  4,  6],
         [ 4,  6,  8, 10],
         [ 8, 10, 12, 14]],

        [[12, 14, 16, 18],
         [16, 18, 20, 22],
         [20, 22, 24, 26]]])


Notice: The last dimension of `A` equals the last (and only) dimension of `B`; both are 4.

So `A+B` is a valid broadcast.

`expand_as` makes the implicit step explicit: it expands `B` from size (4) to the size of `A` (2x3x4), which is therefore also the size of `C`. `A+B` performs the same expansion internally before adding.

*Notice: For every i,j,k, `C[i,j,k] == B[k]` is True (the last index runs over the last dimension), because `expand_as` repeats `B` along the two new leading dimensions.*
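The claim about `C[i,j,k]` can be checked by brute force. The sketch below also prints the strides of `C`: `expand_as` returns a view in which the expanded dimensions have stride 0, so no data is copied. The name `all_match` is illustrative.

```python
import torch

A = torch.arange(24).reshape(2, 3, 4)
B = torch.tensor([0, 1, 2, 3])
C = B.expand_as(A)

# Every element of C along the last axis is the matching element of B.
all_match = all(
    bool(C[i, j, k] == B[k])
    for i in range(2) for j in range(3) for k in range(4)
)
print(all_match)

# Expanded dimensions get stride 0: C is a view on B's four values.
print(C.stride())
print(torch.equal(A + B, A + C))
```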

# Question 1

![](img/q1.png)

## Answer 1.a.

We check `A` first:
- A,B can't be broadcast, because the last dimension of A is 2 and the last dimension of B is 3; they are not equal and neither is 1.
- A,C can be broadcast:
  - 2,1 is OK because one of them is 1
  - Move one dimension to the left
  - 1,5 is OK because one of them is 1
  - Move one dimension to the left
  - 2,(missing) is OK because a missing leading dimension is always treated as 1: `C = 5x1` behaves like `C = 1x5x1`
- A,D can't be broadcast, last dimension mismatch
- A,E can be broadcast:
  - We can treat the dimension of E as `1x1x1`
  - Because all of its sizes are 1, it matches `2x1x2`
- A,F can't be broadcast, last dimension mismatch

Now we check starting with `B`:
- B, C can be broadcasted:
  - `B = 2x1x1x3`
  - `C = 5x1` but also `C = 1x1x5x1`
  - `2x1x1x3` is match with `1x1x5x1`
- B, D can't be broadcasted, last dimension mismatch
- B, E can be broadcasted:
  - E is 1
- B, F can be broadcasted:
  - `B = 2x1x1x3`, `F = 1x1x5x3`

Now we check starting with `C`:
- C, D can be broadcasted:
  - `C = 5x1`, `D = 1x5`
- C, E can be broadcasted:
  - `C = 5x1`, `E = 1x1`
- C, F can be broadcasted:
  - `C = 5x1`, `F = 5x3`
  - Notice, the pair 1,3 is OK (one of them is 1), and so is the pair 5,5 (they are equal)


The remaining pairs are checked the same way.

In [4]:
A = torch.arange(4).reshape(2,1,2)
B = torch.arange(6).reshape(2,1,1,3)
C = torch.arange(5).reshape(5,1)
D = torch.arange(5).reshape(1,5)
E = torch.arange(1)
F = torch.arange(15).reshape(5,3)

In [18]:
C + F

tensor([[ 0,  1,  2],
        [ 4,  5,  6],
        [ 8,  9, 10],
        [12, 13, 14],
        [16, 17, 18]])
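The manual checks above can be verified by simply trying every pair. This is a sketch that treats a `RuntimeError` from the addition as "cannot be broadcast"; the dictionary names `tensors` and `compatible` are illustrative.

```python
import itertools
import torch

tensors = {
    "A": torch.arange(4).reshape(2, 1, 2),
    "B": torch.arange(6).reshape(2, 1, 1, 3),
    "C": torch.arange(5).reshape(5, 1),
    "D": torch.arange(5).reshape(1, 5),
    "E": torch.arange(1),
    "F": torch.arange(15).reshape(5, 3),
}

# Map each pair name (e.g. "AC") to the result shape, or None if the sum fails.
compatible = {}
for (name_1, t1), (name_2, t2) in itertools.combinations(tensors.items(), 2):
    try:
        compatible[name_1 + name_2] = tuple((t1 + t2).shape)
    except RuntimeError:
        compatible[name_1 + name_2] = None

for pair, shape in compatible.items():
    print(pair, shape if shape is not None else "cannot be broadcast")
```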

## Answer 1.b.

<b>For each dimension, the result takes the larger of the two sizes</b>
- `A+C = 2x5x2`
  - `A = 2x1x2`
  - `C = 1x5x1`
- `A+E = 2x1x2`
  - `A = 2x1x2`
  - `E = 1x1x1`
- `B+C = 2x1x5x3`
  - `B = 2x1x1x3`
  - `C = 1x1x5x1`

And so on for the remaining pairs.

In [22]:
Z = B+C
Z.size()

torch.Size([2, 1, 5, 3])
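The result sizes can also be computed without building any tensors, assuming a PyTorch version that provides `torch.broadcast_shapes`. The variable names are illustrative.

```python
import torch

# Shapes of A, B, C, E from the question
shape_ac = torch.broadcast_shapes((2, 1, 2), (5, 1))
shape_ae = torch.broadcast_shapes((2, 1, 2), (1,))
shape_bc = torch.broadcast_shapes((2, 1, 1, 3), (5, 1))

print(shape_ac)  # A + C
print(shape_ae)  # A + E
print(shape_bc)  # B + C
```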