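Broadcasting is a technique NumPy uses to apply element-wise operations (add, subtract, multiply, divide) to arrays whose shapes differ. np.dot and np.matmul have their own shape rules for 1-D and scalar arguments, which are covered here as well.
Let's start with a matrix product where the first operand is a (3,2) matrix and the second is a 1-D array of shape (2,). Instead of throwing an error, NumPy returns a sensible result.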



In [30]:
import numpy as np

a = np.array([[1, 2],
              [3, 4],
              [5, 6]])

# Vector b with shape (2,)
b = np.array([1, 2])


c = np.matmul(a,b)
print(c)
print(a.shape, b.shape)


[ 5 11 17]
(3, 2) (2,)


The result above is not a (3,3) matrix but a 1-D array of shape (3,). Because the second argument is a 1-D array, matmul takes the dot product of each row of a with b.
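
To make the row-by-row behaviour concrete, here is a minimal sketch that computes one dot product per row by hand and compares it with np.matmul. It reuses the a and b arrays from the cell above; the name row_dots is only illustrative.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4],
              [5, 6]])
b = np.array([1, 2])

# One dot product per row of a, computed explicitly.
row_dots = np.array([np.dot(row, b) for row in a])

print(row_dots)                                    # [ 5 11 17]
print(np.array_equal(row_dots, np.matmul(a, b)))   # True
print(np.matmul(a, b).shape)                       # (3,)
```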

Now consider adding two arrays whose dimensions do not match. NumPy treats the 1-D array as if its row were repeated to match the shape of the 2-D matrix, so [1,1] behaves like [[1,1],[1,1]], and the two are then added element by element.


In [25]:
s1 = np.array([[2,3], [4,5]])
s2 = np.array([1,1])

print(np.add(s1,s2))

[[3 4]
 [5 6]]


In [26]:
# Note that np.multiply is not the same as np.matmul. It is element-wise multiplication: each element of s1 is multiplied by the corresponding element of the broadcast s2.

print(np.multiply(s1,s2))

[[2 3]
 [4 5]]


The general principle of broadcasting: if we add, subtract, multiply, or divide an (m,n) matrix by a (1,n) or (m,1) matrix, the smaller one is stretched to (m,n) and the operation is then applied element by element.
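
A small sketch of this rule, using a hypothetical (3,2) matrix with a (1,2) row and a (3,1) column. The array names are illustrative only. np.broadcast_to shows the stretched operand explicitly, and the last step shows that a shape which matches neither rule raises an error rather than broadcasting.

```python
import numpy as np

m = np.arange(6).reshape(3, 2)           # shape (3, 2): [[0, 1], [2, 3], [4, 5]]
row = np.array([[10, 20]])               # shape (1, 2)
col = np.array([[100], [200], [300]])    # shape (3, 1)

row_sum = m + row   # row is repeated down the 3 rows
col_sum = m + col   # col is repeated across the 2 columns

print(row_sum)      # [[10 21] [12 23] [14 25]]
print(col_sum)      # [[100 101] [202 203] [304 305]]

# The stretched operand, made explicit (no data is actually copied).
print(np.broadcast_to(row, m.shape))

# A (2, 1) array is compatible with neither axis of a (3, 2) matrix.
error_message = None
try:
    m + np.ones((2, 1))
except ValueError as err:
    error_message = str(err)
print("ValueError:", error_message)
```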



In [27]:
# When one argument is a scalar, np.dot is equivalent to np.multiply: every element is multiplied by 20. No dot product is computed.
print(np.dot([1,1],20))

[20 20]
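
As a quick check of what np.dot does with a scalar, the sketch below compares it with np.multiply and with a true dot product against [20, 20]. The variable names are illustrative.

```python
import numpy as np

v = np.array([1, 1])

scaled = np.dot(v, 20)                     # scalar argument: same as np.multiply
true_dot = np.dot(v, np.array([20, 20]))   # two 1-D arrays: inner product

print(scaled)                                       # [20 20]
print(np.array_equal(scaled, np.multiply(v, 20)))   # True
print(true_dot)                                     # 40
```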


### Matmul when one argument is a 1-D (rank 1) array
With a 1-D argument, matmul does not broadcast; it promotes the array instead. If the first argument is 1-D, it is promoted to a matrix by prepending a 1 to its dimensions. After matrix multiplication the prepended 1 is removed.
If the second argument is 1-D, it is promoted to a matrix by appending a 1 to its dimensions. After matrix multiplication the appended 1 is removed.

In [28]:
a = np.array([[1,2],[3,4]])

# b has shape (2,). For the matrix multiplication to work, it is treated as a (2,1) matrix.
b = np.array([0,1])

c = np.matmul(a,b)
print(c)
print(a.shape, b.shape)

[2 4]
(2, 2) (2,)
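
The promotion can be reproduced by hand with reshape, which may help to see where the extra dimension appears and disappears. This is only a sketch of the equivalent steps, not how NumPy implements matmul internally; the intermediate names are illustrative.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([0, 1])

# Second argument 1-D: append a 1 -> (2, 1), multiply, then drop the trailing axis.
as_column = np.matmul(a, b.reshape(2, 1))   # shape (2, 1)
right = as_column[:, 0]                     # shape (2,)

# First argument 1-D: prepend a 1 -> (1, 2), multiply, then drop the leading axis.
as_row = np.matmul(b.reshape(1, 2), a)      # shape (1, 2)
left = as_row[0]                            # shape (2,)

print(right, np.matmul(a, b))   # [2 4] [2 4]
print(left, np.matmul(b, a))    # [3 4] [3 4]
```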


# Important note when using vectors in NumPy

Data with a shape like (2,) can lead to confusing outcomes. It is neither a row vector nor a column vector. It is called a rank 1 array and can cause subtle issues. Instead, always initialise the data structure with an explicit shape like (2,1) or (1,2).

Reshape is a very inexpensive operation, so we should call it as often as needed to make sure we are operating on data of the right shape.
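
One reason reshape is cheap: for a contiguous array like the one below it returns a view onto the same memory instead of copying the data. A minimal sketch, with illustrative names, using np.shares_memory to check this:

```python
import numpy as np

v = np.arange(5)          # rank 1 array, shape (5,)
col = v.reshape(5, 1)     # column vector, shape (5, 1)
row = v.reshape(1, -1)    # row vector, shape (1, 5); -1 lets NumPy infer the length

print(col.shape, row.shape)        # (5, 1) (1, 5)
print(np.shares_memory(v, col))    # True: no data was copied

# Writing through the view changes the original array as well.
col[0, 0] = 99
print(v[0])                        # 99
```

Because the reshaped array shares memory with the original here, take a copy first if the two need to be modified independently.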




In [29]:
do_not_do = np.random.randn(5)

#(5,) - This can cause issues.
print(do_not_do.shape)

do_this = np.random.randn(5,1)
# (5,1) shape.
print(do_this, do_this.shape)

# The transpose of this has shape (1,5):
print(do_this.T)

# Example of the issue: transposing a rank 1 array does nothing, so the dot product of do_not_do with its transpose is just the inner product. The result is a scalar rather than a matrix.
print(np.dot(do_not_do, do_not_do.T))




(5,)
[[ 0.60840169]
 [ 1.17064274]
 [ 0.81525293]
 [-0.31086274]
 [-0.8785493 ]] (5, 1)
[[ 0.60840169  1.17064274  0.81525293 -0.31086274 -0.8785493 ]]
0.8895087950932811
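
For comparison, here is a sketch of the same products with an explicit (5,1) column vector, where the shapes behave as linear algebra suggests. The seed is arbitrary and only there so that repeated runs give the same numbers; the names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)       # arbitrary seed
col = rng.standard_normal((5, 1))    # explicit column vector, shape (5, 1)

outer = np.dot(col, col.T)   # (5, 1) times (1, 5) gives a (5, 5) matrix
inner = np.dot(col.T, col)   # (1, 5) times (5, 1) gives a (1, 1) matrix

print(outer.shape, inner.shape)   # (5, 5) (1, 1)

# The rank 1 version: .T leaves the shape unchanged, so only the inner product is reachable.
flat = col.ravel()
print(flat.T.shape)                                     # (5,)
print(np.isclose(np.dot(flat, flat.T), inner[0, 0]))    # True
```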
