
In [1]:
import numpy as np
np.set_printoptions(suppress=True)

# Motivation

Start with the "Pairwise Diff Problem": compute every pairwise difference X[i] - X[j], first with a nested for loop and then with broadcasting.


In [2]:
N = 1000 #Change this to 10,000 to see a bigger effect!
X = np.random.rand(N)

In [3]:
%%time
PairwiseDiff = np.zeros((N,N))
for i in range(N):
  for j in range(N):
    PairwiseDiff[i,j] = X[i] - X[j]

CPU times: user 691 ms, sys: 7.4 ms, total: 698 ms
Wall time: 738 ms


In [4]:
%%time
PairwiseDiff_2 = X.reshape((N,1)) - X.reshape((1,N))

CPU times: user 3.29 ms, sys: 3.04 ms, total: 6.34 ms
Wall time: 8.35 ms


In [5]:
%%time
PairwiseDiff_3 = np.expand_dims(X,axis=-1) - X

CPU times: user 4.73 ms, sys: 3.09 ms, total: 7.82 ms
Wall time: 5.72 ms


In [21]:
print(PairwiseDiff - PairwiseDiff_2)

[[0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 ...
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]]
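
Printing the difference only shows the corners of the matrix, so a programmatic check can be more convincing. The following is a minimal sketch that compares all three versions with `np.allclose`. The names `N_small`, `X_small` and the seeded generator are assumptions made here for a quick, reproducible check; they are not part of the cells above.

```python
import numpy as np

# Assumption: a small, seeded example so the check runs quickly and reproducibly
rng = np.random.default_rng(0)
N_small = 200
X_small = rng.random(N_small)

# Version 1: nested for loop
loop_diff = np.zeros((N_small, N_small))
for i in range(N_small):
    for j in range(N_small):
        loop_diff[i, j] = X_small[i] - X_small[j]

# Version 2: reshape both operands, shapes (N,1) and (1,N)
reshape_diff = X_small.reshape((N_small, 1)) - X_small.reshape((1, N_small))

# Version 3: expand only the left operand, shapes (N,1) and (N,)
expand_diff = np.expand_dims(X_small, axis=-1) - X_small

# All three should agree entry by entry
all_match = np.allclose(loop_diff, reshape_diff) and np.allclose(loop_diff, expand_diff)
print(all_match)
```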


# Level 0: Same ndim, Same shape

In this case NumPy just does the operation entrywise.

In [7]:
A_2x3 = np.array([[10,20,30],[40,50,60]])
B_2x3 = np.array([[5,6,7],[8,9,0]])
print(A_2x3+B_2x3)

[[15 26 37]
 [48 59 60]]


# Level 1: Same ndim, Different shape

In this case, any dimension of size 1 is BROADCAST (stretched) to match the size of the other array along that dimension. Then the operation is done entrywise, exactly as in Level 0.

In [8]:
A_2x3 = np.array([[10,20,30],[40,50,60]])
B_1x3 = np.array([[7,8,9]])
print(A_2x3+B_1x3)

[[17 28 39]
 [47 58 69]]


In [9]:
A_2x3 = np.array([[10,20,30],[40,50,60]])
B_2x1 = np.array([[7],[8]])
print(A_2x3+B_2x1)

[[17 27 37]
 [48 58 68]]


In [10]:
A_1x3 = np.array([[10,20,30]])
B_2x1 = np.array([[4],[5]])
print(A_1x3+B_2x1)

[[14 24 34]
 [15 25 35]]
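
To see what the stretching in the last example looks like, one option is `np.broadcast_to`, which returns a read-only view of an array at a requested shape. This is only an illustrative sketch; the names `A_stretched` and `B_stretched` are introduced here and do not appear elsewhere in the notebook.

```python
import numpy as np

A_1x3 = np.array([[10, 20, 30]])
B_2x1 = np.array([[4], [5]])

# Stretch each size-1 dimension up to the common shape (2,3)
A_stretched = np.broadcast_to(A_1x3, (2, 3))  # the single row is repeated twice
B_stretched = np.broadcast_to(B_2x1, (2, 3))  # the single column is repeated three times
print(A_stretched)
print(B_stretched)

# Adding the stretched arrays entrywise matches the broadcast sum
same_result = np.array_equal(A_stretched + B_stretched, A_1x3 + B_2x1)
print(same_result)
```

NumPy does not actually copy the data when it broadcasts, which is part of why the broadcast version is cheap.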


In [11]:
X = np.array([3,5,7])
print(X.reshape((3,1)) - X.reshape((1,3)))

[[ 0 -2 -4]
 [ 2  0 -2]
 [ 4  2  0]]


Note that if there is a dimension where the two sizes are not equal and neither of them is 1, the broadcasting does not work!

In [12]:
A_1x3x4 = np.random.rand(1,3,4)
B_2x1x5 = np.random.rand(2,1,5)
try:
  print(A_1x3x4+B_2x1x5)
except ValueError as error:
  print(error)

operands could not be broadcast together with shapes (1,3,4) (2,1,5) 
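
It can be handy to test whether shapes are compatible without building the arrays. Here is a small sketch using `np.broadcast_shapes` (available in NumPy 1.20 and later, as far as I know). The helper `can_broadcast` is a hypothetical name made up for this example, not a NumPy function.

```python
import numpy as np

def can_broadcast(*shapes):
    """Hypothetical helper: True if the given shapes broadcast together."""
    try:
        np.broadcast_shapes(*shapes)
        return True
    except ValueError:
        return False

# Last dimension is 4 vs 5: not equal and neither is 1, so this fails
bad = can_broadcast((1, 3, 4), (2, 1, 5))
print(bad)

# Last dimension is 4 vs 4: every dimension is now equal or has a 1
good = can_broadcast((1, 3, 4), (2, 1, 4))
print(good)

# The broadcast shape takes the larger size in each dimension
out_shape = np.broadcast_shapes((1, 3, 4), (2, 1, 4))
print(out_shape)
```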


# Level 2: Different ndim, Different shape

In this case the shapes get right-aligned, and 1's are added on the left of the shorter shape until both arrays have the same ndim. Then the Level 1 rules apply.

In [13]:
import numpy as np
A_2x3 = np.array([[10,20,30],[40,50,60]])
B_3vec = np.array([7,8,9])
print(A_2x3+B_3vec)

[[17 28 39]
 [47 58 69]]
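
The right-alignment rule can be made explicit by adding the leading 1 by hand. This sketch reshapes the 3-vector to shape (1,3) and checks that the result is the same as letting NumPy do it implicitly. The name `B_1x3_explicit` is introduced here for illustration.

```python
import numpy as np

A_2x3 = np.array([[10, 20, 30], [40, 50, 60]])
B_3vec = np.array([7, 8, 9])

# Shapes are right-aligned:   (2, 3)
#                                (3,)  -> treated as (1, 3)
B_1x3_explicit = B_3vec.reshape((1, 3))

implicit_sum = A_2x3 + B_3vec          # NumPy adds the leading 1 for us
explicit_sum = A_2x3 + B_1x3_explicit  # we add the leading 1 ourselves
print(np.array_equal(implicit_sum, explicit_sum))
```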


Note that the 1's are always added on the LEFT, so even if you think it might make sense to broadcast, it will not work unless the RIGHT dimensions line up.

In [14]:
A_2x3 = np.array([[10,20,30],[40,50,60]])
B_2vec = np.array([7,8])
try:
  print(A_2x3+B_2vec)
except ValueError as error:
  print(error)

operands could not be broadcast together with shapes (2,3) (2,) 
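
If the intent was to add one number per row, one possible fix is to give the 2-vector a trailing dimension of size 1 yourself, so that its shape is (2,1) instead of (2,). A minimal sketch, where the name `B_2x1_fixed` is introduced only for this example:

```python
import numpy as np

A_2x3 = np.array([[10, 20, 30], [40, 50, 60]])
B_2vec = np.array([7, 8])

# Add a new axis on the RIGHT: shape (2,) becomes (2,1)
B_2x1_fixed = B_2vec[:, np.newaxis]
print(B_2x1_fixed.shape)

# Now (2,3) and (2,1) broadcast: 7 is added to row 0 and 8 to row 1
row_sum = A_2x3 + B_2x1_fixed
print(row_sum)
```

`B_2vec.reshape((2,1))` or `np.expand_dims(B_2vec, axis=-1)` should give the same (2,1) array.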


Here is how you can do the pairwise difference by relying on this rule, but this is a lot harder to read and easy for bugs to happen!!

In [15]:
X = np.array([3,5,7])
print(X.reshape((3,1)) - X)
print(np.expand_dims(X,axis=-1) - X) 

[[ 0 -2 -4]
 [ 2  0 -2]
 [ 4  2  0]]
[[ 0 -2 -4]
 [ 2  0 -2]
 [ 4  2  0]]
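
If readability is the concern, two alternatives that spell out both axes may be worth considering. This is a sketch, not a recommendation from the original notebook: the first indexes with `np.newaxis` on both operands, and the second uses the `outer` method of the `np.subtract` ufunc.

```python
import numpy as np

X = np.array([3, 5, 7])

# Both operands show which axis they vary along: (3,1) minus (1,3)
diff_newaxis = X[:, np.newaxis] - X[np.newaxis, :]

# ufunc.outer applies the operation to every pair (X[i], X[j])
diff_outer = np.subtract.outer(X, X)

print(diff_newaxis)
print(np.array_equal(diff_newaxis, diff_outer))
```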


# Broadcasting and keepdims=True

Imagine you have an array of counts of how often each outcome was observed, and you want to normalize the rows to sum to 1.

In [16]:
Counts = np.array([[2,3,5],[10,20,70],[300,100,600]])

In [17]:
TotalSum = np.sum(Counts) #This is the total sum, not the row sum!

In [18]:
RowSumsVec = np.sum(Counts,axis=1) #Must specify the axis to be collapsed! 
print("RowSumsVec:\n", RowSumsVec)

WRONG_Percents = Counts / RowSumsVec #This runs, but the answer is wrong: shape (3,) is treated as (1,3), so column j gets divided by the sum of row j!
print("Wrong Percents:\n", WRONG_Percents)

RowSumsVec:
 [  10  100 1000]
Wrong Percents:
 [[ 0.2    0.03   0.005]
 [ 1.     0.2    0.07 ]
 [30.     1.     0.6  ]]


In [19]:
RowSumsKeepDims = np.sum(Counts,axis=1,keepdims=True) #keepdims=True keeps the collapsed axis with size 1, so the shape is (3,1)
print("RowSumsKeepDims:\n", RowSumsKeepDims)

Percents = Counts / RowSumsKeepDims #This works: shape (3,1) broadcasts across the columns, so row i is divided by the sum of row i
print("Correct Percents:\n", Percents)

RowSumsKeepDims:
 [[  10]
 [ 100]
 [1000]]
Correct Percents:
 [[0.2 0.3 0.5]
 [0.1 0.2 0.7]
 [0.3 0.1 0.6]]
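
A quick sanity check is to sum the normalized rows and confirm that each one is 1. The sketch below does that, and also shows the column version: with `axis=0` the collapsed shape (3,) happens to right-align correctly with (3,3), so `keepdims=True` is not strictly needed there, although using it anyway keeps the intent clear. The names `row_totals`, `ColPercents` and `ColPercentsKeep` are introduced here for illustration.

```python
import numpy as np

Counts = np.array([[2, 3, 5], [10, 20, 70], [300, 100, 600]])

# Row normalization: needs keepdims=True so the sums have shape (3,1)
Percents = Counts / np.sum(Counts, axis=1, keepdims=True)
row_totals = Percents.sum(axis=1)
print(row_totals)  # every row should sum to 1

# Column normalization: shape (3,) is treated as (1,3), which is what we want
ColPercents = Counts / np.sum(Counts, axis=0)
ColPercentsKeep = Counts / np.sum(Counts, axis=0, keepdims=True)
print(np.allclose(ColPercents, ColPercentsKeep))
print(ColPercents.sum(axis=0))  # every column should sum to 1
```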
