We now illustrate broadcasting. Both PyTorch and NumPy support the same broadcasting convention.

In [1]:
import torch
from torch import tensor

a = tensor([10., 6, -4])
b = tensor([2., 8, 7])
c = tensor([10.,20,30])
m = tensor([[1., 2, 3], [4, 5, 6], [7, 8, 9]])
a, b, c, m


(tensor([10.,  6., -4.]),
 tensor([2., 8., 7.]),
 tensor([10., 20., 30.]),
 tensor([[1., 2., 3.],
         [4., 5., 6.],
         [7., 8., 9.]]))

In [2]:
a

tensor([10.,  6., -4.])

In [3]:
a > 0

tensor([ True,  True, False])

In [4]:
a + 1


tensor([11.,  7., -3.])

In [5]:
m


tensor([[1., 2., 3.],
        [4., 5., 6.],
        [7., 8., 9.]])

In [6]:
2 * m


tensor([[ 2.,  4.,  6.],
        [ 8., 10., 12.],
        [14., 16., 18.]])

We now illustrate broadcasting a vector to a matrix.

In [7]:
m, c, m.shape, c.shape


(tensor([[1., 2., 3.],
         [4., 5., 6.],
         [7., 8., 9.]]),
 tensor([10., 20., 30.]),
 torch.Size([3, 3]),
 torch.Size([3]))

In [8]:
m + c, c + m

(tensor([[11., 22., 33.],
         [14., 25., 36.],
         [17., 28., 39.]]),
 tensor([[11., 22., 33.],
         [14., 25., 36.],
         [17., 28., 39.]]))

Here is `c` expanded as though it were broadcast to the shape of `m`.

In [9]:
t = c.expand_as(m)
t

tensor([[10., 20., 30.],
        [10., 20., 30.],
        [10., 20., 30.]])

In [10]:
m + t, m + c

(tensor([[11., 22., 33.],
         [14., 25., 36.],
         [17., 28., 39.]]),
 tensor([[11., 22., 33.],
         [14., 25., 36.],
         [17., 28., 39.]]))

Note that `t` shares its underlying storage with `c`; `expand_as` does not copy any data:

In [11]:
t.storage()


 10.0
 20.0
 30.0
[torch.storage.TypedStorage(dtype=torch.float32, device=cpu) of size 3]

Array elements are stored contiguously in memory, so each dimension has a `stride`: the number of storage elements to skip when the index along that dimension increases by one. A broadcast dimension has a stride of `0`, so every index along it reads the same elements.

In [12]:
m.stride(), m.shape, t.stride(), t.shape


((3, 1), torch.Size([3, 3]), (0, 1), torch.Size([3, 3]))
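To make the role of the stride concrete, here is a minimal pure-Python sketch of how a strided view could look up elements in flat storage. The helper `element_at` and the list `storage` are hypothetical names for illustration only; this is not how PyTorch implements views internally, just the arithmetic the strides imply.

```python
# Hypothetical flat storage shared by `c` and its expanded view `t`.
storage = [10.0, 20.0, 30.0]

def element_at(storage, strides, index, offset=0):
    """Return the element at `index` of a strided view over flat `storage`."""
    # Position in storage = offset + sum(index[d] * strides[d]) over dimensions.
    position = offset + sum(i * s for i, s in zip(index, strides))
    return storage[position]

# Strides (0, 1): moving along dimension 0 does not advance in storage,
# so every row of the 3x3 view reads the same three elements.
t_rows = [[element_at(storage, (0, 1), (i, j)) for j in range(3)]
          for i in range(3)]

# Strides (1, 0): moving along dimension 1 does not advance in storage,
# so every column of the 3x3 view reads the same three elements.
t_cols = [[element_at(storage, (1, 0), (i, j)) for j in range(3)]
          for i in range(3)]
```

With strides `(0, 1)` the view reproduces `c.expand_as(m)`; with strides `(1, 0)` it repeats each element of `c` across a row instead.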

When broadcasting, we may need to change which dimension a lower-dimensional tensor occupies so that it is broadcast along the intended dimension.

Use `unsqueeze(dim)` to insert a dimension of size 1 at position `dim`, or index with `None` at that position to do the same.

In [13]:
c.unsqueeze(0), c[None, :]


(tensor([[10., 20., 30.]]), tensor([[10., 20., 30.]]))

In [14]:
c.shape, c.unsqueeze(0).shape


(torch.Size([3]), torch.Size([1, 3]))

In [15]:
c.unsqueeze(1), c[:, None]


(tensor([[10.],
         [20.],
         [30.]]),
 tensor([[10.],
         [20.],
         [30.]]))

In [16]:
c.shape, c.unsqueeze(1).shape


(torch.Size([3]), torch.Size([3, 1]))

You can use `...`, the literal for the built-in `Ellipsis` object, to stand for all the dimensions not indexed explicitly (here, all preceding dimensions), and you can omit trailing `:`s.

In [17]:
c[None].shape, c[..., None].shape


(torch.Size([1, 3]), torch.Size([3, 1]))

In [18]:
c[:, None].expand_as(m)


tensor([[10., 10., 10.],
        [20., 20., 20.],
        [30., 30., 30.]])

In [19]:
m + c[:, None]


tensor([[11., 12., 13.],
        [24., 25., 26.],
        [37., 38., 39.]])

In [20]:
m + c[None, :]


tensor([[11., 22., 33.],
        [14., 25., 36.],
        [17., 28., 39.]])
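The two results above differ only in which index of `m` selects the element of `c`. As a rough illustration, the same arithmetic can be written with explicit loops over nested Python lists; the names `row_broadcast` and `col_broadcast` are made up for this sketch.

```python
m = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
c = [10.0, 20.0, 30.0]

# Like m + c[None, :]: c acts as a row vector, so c[j] is added to column j.
row_broadcast = [[m[i][j] + c[j] for j in range(3)] for i in range(3)]

# Like m + c[:, None]: c acts as a column vector, so c[i] is added to row i.
col_broadcast = [[m[i][j] + c[i] for j in range(3)] for i in range(3)]
```

Broadcasting yields the same values without writing the loops or materializing the repeated copies of `c`.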

When operating on two arrays/tensors, NumPy/PyTorch compares their shapes element-wise. It starts with the **trailing dimensions** and works its way left toward the leading ones. Two dimensions are **compatible** when

- they are equal, or
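- one of them is 1, in which case that dimension is broadcast to match the other.

If one shape has fewer dimensions, it is treated as though padded with leading dimensions of size 1, which is why `c` of shape `[3]` broadcasts against `m` of shape `[3, 3]`.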
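The compatibility rule can be sketched as a small pure-Python function. `broadcast_shape` is a hypothetical helper written for illustration, and it assumes the usual convention that the shorter shape is padded with leading dimensions of size 1. In practice, `torch.broadcast_shapes` and `numpy.broadcast_shapes` perform this check for you.

```python
def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or raise ValueError if incompatible."""
    result = []
    # Walk both shapes from the trailing dimension; missing dimensions count as 1.
    for k in range(1, max(len(shape_a), len(shape_b)) + 1):
        dim_a = shape_a[-k] if k <= len(shape_a) else 1
        dim_b = shape_b[-k] if k <= len(shape_b) else 1
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            result.append(dim_b)
        else:
            raise ValueError(f"incompatible dimensions {dim_a} and {dim_b}")
    # Dimensions were collected right to left, so reverse them.
    return tuple(reversed(result))
```

For example, `broadcast_shape((3, 3), (3,))` and `broadcast_shape((3, 1), (1, 3))` both give `(3, 3)`, while `broadcast_shape((3,), (4,))` raises `ValueError`.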
