# NumPy Broadcasting

The term broadcasting refers to how NumPy treats arrays with different shapes during arithmetic operations. Subject to certain constraints, the smaller array is broadcast across the larger array so that they have compatible shapes.

In [1]:
import numpy as np

In [2]:
a = np.array([0,1,2,3])
b = np.array([1,1,1,1])

In [7]:
c = a * b
c

array([0, 1, 2, 3])
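
In the cell above, `a` and `b` already have the same shape, so nothing is stretched. As a minimal sketch of what broadcasting adds, the snippet below multiplies `a` by a one-element array and compares the result with the same stretch written out explicitly via `np.broadcast_to`. The names `s`, `implicit`, and `explicit` are illustrative and not part of the original notebook.

```python
import numpy as np

a = np.array([0, 1, 2, 3])  # shape (4,)
s = np.array([2])           # shape (1,), stretched to (4,) when combined with a

implicit = a * s                            # NumPy broadcasts s automatically
explicit = a * np.broadcast_to(s, a.shape)  # the same stretch, written out

print(implicit)
print(np.array_equal(implicit, explicit))
```

`np.broadcast_to` returns a read-only view rather than a copy, so the stretch does not duplicate the data in memory.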

## Example 1: One-Dimensional Array

In [21]:
import numpy as np
a = np.array([17, 11, 19]) # 1-D array of shape (3,)
print(a)

[17 11 19]


In [22]:
b = 3 
print(b)

3


In [23]:
# Broadcasting happens because the shapes differ:
# the scalar b is stretched to match the shape of a.
c = a + b
print(c)

[20 14 22]


## Example 2: Two-Dimensional Array

In [24]:
A = np.array([[11, 22, 33], [10, 20, 30]])

In [25]:
A.shape

(2, 3)

In [26]:
print(A)

[[11 22 33]
 [10 20 30]]


In [28]:
b = 4
print(b)

4


In [29]:
C = A + b
print(C)

[[15 26 37]
 [14 24 34]]


## Example 3

In [30]:
v = np.array([12, 24, 36]) 
w = np.array([45, 55])   

In [31]:
print(v.shape)
print(w.shape)

(3,)
(2,)


In [33]:
# To compute an outer product, we first reshape v
# to a column vector of shape 3x1, then broadcast
# it against w to yield an output of shape 3x2,
# which is the outer product of v and w.
print(np.reshape(v, (3, 1)))

[[12]
 [24]
 [36]]


In [32]:
print(np.reshape(v, (3, 1)) * w)

[[ 540  660]
 [1080 1320]
 [1620 1980]]
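
As a quick cross-check of the reshape-and-broadcast approach above, the sketch below compares it with NumPy's dedicated `np.outer` function. The variable names `by_broadcast` and `by_outer` are illustrative only.

```python
import numpy as np

v = np.array([12, 24, 36])  # shape (3,)
w = np.array([45, 55])      # shape (2,)

by_broadcast = np.reshape(v, (3, 1)) * w  # (3, 1) against (2,) gives (3, 2)
by_outer = np.outer(v, w)                 # built-in outer product

print(by_broadcast.shape)
print(np.array_equal(by_broadcast, by_outer))
```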


In [37]:
x = np.array([[12, 22, 33], [45, 55, 66]])
print(x.shape)

(2, 3)


In [38]:
x

array([[12, 22, 33],
       [45, 55, 66]])

In [None]:
# x has shape 2x3 and v has shape (3,),
# so they broadcast to shape 2x3.

In [39]:
print(x + v)

[[ 24  46  69]
 [ 57  79 102]]


In [40]:
# Add a vector to each column of a matrix.
# x has shape 2x3 and w has shape (2,), so x + w fails.
# If we transpose x, it has shape 3x2 and can be
# broadcast against w to yield a result of shape 3x2.

In [43]:
# Transposing that result back yields the final
# result of shape 2x3: x with w added to each column.
print((x.T + w).T)


[[ 57  67  78]
 [100 110 121]]


In [44]:
# Another solution is to reshape w into a column
# vector of shape 2x1. We can then broadcast it
# directly against x to produce the same output.
print(x + np.reshape(w, (2, 1)))

[[ 57  67  78]
 [100 110 121]]
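
The transpose and reshape workarounds are needed because adding `x` and `w` directly is not a valid broadcast. The sketch below, offered as an illustration rather than part of the original notebook, shows the failure and a third spelling of the fix that uses `np.newaxis` indexing to give `w` a trailing axis of length 1.

```python
import numpy as np

x = np.array([[12, 22, 33], [45, 55, 66]])  # shape (2, 3)
w = np.array([45, 55])                      # shape (2,)

try:
    x + w  # trailing sizes 3 and 2 differ and neither is 1
except ValueError as err:
    print("broadcast failed:", err)

result = x + w[:, np.newaxis]  # w becomes shape (2, 1), then stretches to (2, 3)
print(result)
```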


In [45]:
# Multiply a matrix by a constant. x has shape 2x3.
# NumPy treats scalars as arrays of shape ();
# these can be broadcast together to shape 2x3.
print(x * 2)

[[ 24  44  66]
 [ 90 110 132]]
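
To check what shape a broadcast will produce without building any arrays, `np.broadcast_shapes` can be handy. The sketch below assumes NumPy 1.20 or later, where that function is available, and reuses the shape pairs from the examples in this section.

```python
import numpy as np

outer_shape = np.broadcast_shapes((3, 1), (2,))    # reshaped v against w
row_shape = np.broadcast_shapes((2, 3), (3,))      # x against v
column_shape = np.broadcast_shapes((2, 3), (2, 1)) # x against reshaped w
scalar_shape = np.broadcast_shapes((2, 3), ())     # x against a scalar

print(outer_shape, row_shape, column_shape, scalar_shape)
```

Passing incompatible shapes, such as `(2, 3)` and `(2,)`, raises a `ValueError` instead of returning a shape.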


# PyTorch Broadcasting

In [1]:
import torch

Broadcasting in PyTorch is borrowed from NumPy. It allows arithmetic operations on tensors that are not the same size. PyTorch automatically broadcasts the ‘smaller’ tensor to the size of the ‘larger’ tensor, but only if certain constraints are met.

## Example

In [2]:
a = torch.tensor([1,2,3,4])
b = torch.tensor([1,2,3])
print(a)
print(b)

tensor([1, 2, 3, 4])
tensor([1, 2, 3])


In [6]:
print(b.shape)

torch.Size([3])


In [7]:
print(a.shape)

torch.Size([4])


In [8]:
c = a + b
c

RuntimeError: The size of tensor a (4) must match the size of tensor b (3) at non-singleton dimension 0

The addition fails because the tensors have sizes 4 and 3 along dimension 0: the sizes differ and neither of them is 1, so the tensors cannot be broadcast.

## Example

In [9]:
a = torch.tensor([1,2,3])
b = torch.tensor([1])
print(a)
print(b)

tensor([1, 2, 3])
tensor([1])


In [11]:
c = a + b
c

tensor([2, 3, 4])

The value 1 is added to every element of the first tensor.

This happens because the smaller tensor, which has size 1, is broadcast to the size of the larger tensor.

## Rules for Broadcasting

Two tensors are compatible for broadcasting if, when their shapes are compared starting from the trailing dimension, every pair of dimensions meets one of these conditions:

* The sizes of the two dimensions are equal.
* The size of one of the dimensions is 1.
* The dimension does not exist in one of the tensors.
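
The rules above can be sketched in plain Python, with no NumPy or PyTorch required. The helper `broadcast_shape` below is hypothetical, written only for illustration and not a library function. It walks two shapes from the trailing dimension and treats a dimension that is missing from the shorter shape as size 1, which is how both libraries handle shapes of different lengths.

```python
from itertools import zip_longest


def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or raise ValueError if incompatible."""
    result = []
    # Compare sizes from the trailing dimension; pad the shorter shape with 1s.
    pairs = zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1)
    for dim_a, dim_b in pairs:
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            result.append(dim_b)
        else:
            raise ValueError(f"sizes {dim_a} and {dim_b} cannot be broadcast")
    return tuple(reversed(result))


print(broadcast_shape((2, 2), (1, 2)))
print(broadcast_shape((1, 2), (2, 1)))
print(broadcast_shape((2, 2), (1,)))
```

Calling `broadcast_shape((4,), (3,))` raises a `ValueError`, because the sizes 4 and 3 differ and neither is 1.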

In [12]:
tensor1 = torch.tensor([[1, 2], [0, 3]])
tensor2 = torch.tensor([[3, 1]])
tensor3 = torch.tensor([[5], [2]])
tensor4 = torch.tensor([7])

In [15]:
print(tensor1.shape)
print(tensor2.shape)
print(tensor3.shape)
print(tensor4.shape)

torch.Size([2, 2])
torch.Size([1, 2])
torch.Size([2, 1])
torch.Size([1])


In [16]:
print(tensor1 + tensor2)

tensor([[4, 3],
        [3, 4]])


In [17]:
print(tensor1 + tensor3)

tensor([[6, 7],
        [2, 5]])


In [18]:
print(tensor2 + tensor3)

tensor([[8, 6],
        [5, 3]])


In [19]:
print(tensor1 + tensor4)

tensor([[ 8,  9],
        [ 7, 10]])
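
To see the intermediate step that broadcasting performs, `torch.broadcast_tensors` returns the expanded versions of its inputs. The sketch below applies it to `tensor2` and `tensor3` from above. The names `expanded2` and `expanded3` are illustrative only.

```python
import torch

tensor2 = torch.tensor([[3, 1]])    # shape (1, 2)
tensor3 = torch.tensor([[5], [2]])  # shape (2, 1)

# Both inputs are expanded to the common shape (2, 2).
expanded2, expanded3 = torch.broadcast_tensors(tensor2, tensor3)

print(expanded2)
print(expanded3)
print(torch.equal(expanded2 + expanded3, tensor2 + tensor3))
```

The expanded tensors are views of the originals, so they are best treated as read-only.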
