As you learned in the previous lecture, operations on NumPy arrays are carried out elementwise. For example:

In [2]:
import numpy as np

A = np.array([1, 2, 3])
B = np.array([10, 20, 30])
A + B

array([11, 22, 33])

This is possible because both arrays have the same shape. So, what about this:

In [3]:
# These two arrays have different shapes:
C = np.array([1, 2, 3])
D = np.array([[10, 20, 30], [40, 50, 60]])
C + D

array([[11, 22, 33],
       [41, 52, 63]])

Well, it looks like this works too. But not all arrays of different shapes behave like this. Binary operations such as addition, subtraction and multiplication between two arrays are only possible if the arrays can be broadcast to the same shape.
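For contrast, here is a minimal sketch of a pair of arrays that cannot be combined. The names P and Q are made up for this illustration and are not used elsewhere in the lecture.

```python
import numpy as np

# Hypothetical arrays for illustration: the lengths 3 and 2 differ
# and neither of them is 1, so NumPy cannot broadcast them.
P = np.array([1, 2, 3])
Q = np.array([10, 20])

try:
    P + Q
    failed = False
except ValueError as error:
    failed = True
    print("Broadcasting failed:", error)
```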

To start our discussion of broadcasting, let's first look at operations between an array and a scalar. This is the most basic case of broadcasting:

In [4]:
# Let's create an array:
E = np.array([1, 2, 3])

# And now let's add 5 to each element of it:
E + 5

array([6, 7, 8])

Here broadcasting consists of expanding the scalar into an array with the same shape as the other operand. In our case the array has the shape (3,), so the scalar is treated as if it were [5, 5, 5]. NumPy does not actually build this array in memory; it only behaves as if it had. So, effectively, the operation above gives exactly the same result as this:

In [6]:
E + np.array([5, 5, 5])

array([6, 7, 8])
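If you want to look at the expanded scalar yourself, np.broadcast_to is one way to do it. This is only a sketch to make the idea visible; the names expanded and same are made up here, and you never need this call for ordinary arithmetic.

```python
import numpy as np

E = np.array([1, 2, 3])

# np.broadcast_to returns a read-only view of the scalar with the requested shape.
expanded = np.broadcast_to(5, E.shape)
print(expanded)        # [5 5 5]
print(expanded.shape)  # (3,)

# Adding the expanded view gives the same result as adding the scalar directly.
same = np.array_equal(E + expanded, E + 5)
print(same)            # True
```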

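Now, if the operation is between two arrays, it is only possible if their shapes are compatible, which means that the arrays can be broadcast, i.e. expanded along axes of length 1, to one common shape. Often this simply means that the smaller array is expanded so that it matches the shape of the larger array.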

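Broadcasting can occur only if, comparing the shapes of the two arrays axis by axis starting from the last axis, the two lengths are either equal or one of them is 1. If one array has fewer axes than the other, its missing leading axes are treated as having length 1. How about an example?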
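The rule can be written down in a few lines of plain Python. The sketch below compares two shapes starting from the last axis, which is how NumPy lines the axes up. The function name can_broadcast is hypothetical; it is not part of NumPy.

```python
def can_broadcast(shape_a, shape_b):
    """Return True if two shapes are compatible under the broadcasting rule."""
    # Walk over the axes from the last one to the first one.
    # zip stops at the shorter shape, so missing leading axes are always accepted.
    for len_a, len_b in zip(reversed(shape_a), reversed(shape_b)):
        if len_a != len_b and len_a != 1 and len_b != 1:
            return False
    return True

print(can_broadcast((4, 3), (3,)))    # True
print(can_broadcast((4, 3), (4, 1)))  # True
print(can_broadcast((4, 3), (4,)))    # False
```

The last call fails because the single axis of the second shape (length 4) is compared with the last axis of the first shape (length 3).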

In [7]:
# Let's create two arrays: a 2-dimensional one and a vector.
F = np.arange(1, 13).reshape(4, 3)
F

array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])

In [8]:
G = np.arange(10, 31, 10)
G

array([10, 20, 30])

In [9]:
# Now let's add them together.
F + G

array([[11, 22, 33],
       [14, 25, 36],
       [17, 28, 39],
       [20, 31, 42]])
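One way to picture what happened to G in this sum is to build the expanded array by hand with np.tile and compare the results. This is a sketch for illustration only; the name G_stacked is made up, and NumPy itself does not create such a copy when it broadcasts.

```python
import numpy as np

F = np.arange(1, 13).reshape(4, 3)
G = np.arange(10, 31, 10)

# Stack G vertically 4 times by hand to get an explicit (4, 3) array.
G_stacked = np.tile(G, (4, 1))
print(G_stacked.shape)                       # (4, 3)

# The explicit version gives the same sum as the broadcast version.
print(np.array_equal(F + G, F + G_stacked))  # True
```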

In [13]:
# Works fine because of broadcasting. The smaller array, G, was expanded to match the larger array, F.
# So, the G array worked as if it were stacked vertically 4 times so that its shape matched that of F.
# Here broadcasting is possible because the only axis of G has the same length (3) as the last axis (columns) of F.

# Now, the shape of F is (4, 3) and the shape of G is (3,). Let's create another 2-dimensional array with the shape (4, 1),
# so that its first axis has the same length as the first axis of F:
H = np.array([100, 200, 300, 400])[:, np.newaxis]
H

array([[100],
       [200],
       [300],
       [400]])

In [14]:
# And now let's add F and H together.
F + H

array([[101, 102, 103],
       [204, 205, 206],
       [307, 308, 309],
       [410, 411, 412]])

This time the first axes have the same length (4) and the second axis of H has length 1, so broadcasting works.
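Broadcasting can also expand both operands at once. As a small sketch, adding the G and H arrays from above to each other gives a result with a shape that neither of them has. The name both is made up for this illustration.

```python
import numpy as np

G = np.arange(10, 31, 10)                          # shape (3,)
H = np.array([100, 200, 300, 400])[:, np.newaxis]  # shape (4, 1)

# G is stretched along the rows and H along the columns,
# so the result has shape (4, 3) although neither input does.
both = G + H
print(both.shape)  # (4, 3)
print(both)
```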

How about this:

In [15]:
# Let's create an array with shape (6, 4).
I = np.arange(1, 25).reshape(6, 4)
I

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12],
       [13, 14, 15, 16],
       [17, 18, 19, 20],
       [21, 22, 23, 24]])

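The lengths of the axes are 6 and 4. So, according to the broadcasting rule, a second array with 2 dimensions and a different shape must have one of the following axis length configurations if we want broadcasting to work (a 1 x 1 array, a 1-dimensional array of length 4 or 1, and a scalar would work too):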

6 x 1
1 x 4

In [31]:
# Let's try them both out. First a 6 x 1 array:
J = np.arange(100, 700, 100).reshape(6, 1)
J

array([[100],
       [200],
       [300],
       [400],
       [500],
       [600]])

In [32]:
I + J

array([[101, 102, 103, 104],
       [205, 206, 207, 208],
       [309, 310, 311, 312],
       [413, 414, 415, 416],
       [517, 518, 519, 520],
       [621, 622, 623, 624]])

In [33]:
# As you can see, broadcasting works. Now let's create an array with shape (1, 4):
K = np.arange(100, 500, 100).reshape(1, 4)
K

array([[100, 200, 300, 400]])

In [34]:
I + K

array([[101, 202, 303, 404],
       [105, 206, 307, 408],
       [109, 210, 311, 412],
       [113, 214, 315, 416],
       [117, 218, 319, 420],
       [121, 222, 323, 424]])

This works too.
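A 1-dimensional array can be combined with I as well, but only if its length matches the last axis. The sketch below uses two made-up arrays, row and col, to show the difference.

```python
import numpy as np

I = np.arange(1, 25).reshape(6, 4)

# A 1-dimensional array of length 4 lines up with the last axis of I.
row = np.array([100, 200, 300, 400])
print((I + row).shape)  # (6, 4)

# A 1-dimensional array of length 6 does not: its only axis is compared
# with the last axis of I (length 4), not with the first one.
col = np.arange(100, 700, 100)
try:
    I + col
    col_works = True
except ValueError:
    col_works = False
print(col_works)  # False
```

To add col along the first axis you would first have to reshape it to (6, 1), as was done with J above.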

EXERCISE

Try to predict, just by looking at the following pairs of arrays, whether broadcasting will work or not. Then add the arrays in each pair together to check your answer.

1) X1 = np.arange(1, 22).reshape(7, 3)    and    Y1 = np.arange(1, 22).reshape(3, 7)
2) X2 = np.arange(1, 13).reshape(4, 3)    and    Y2 = np.arange(1, 5).reshape(4, 1)
3) X3 = np.arange(1, 9).reshape(4, 2)    and    Y3 = np.arange(1, 5).reshape(1, 4)

SOLUTION

In [35]:
# 1) X1 = np.arange(1, 22).reshape(7, 3)    and    Y1 = np.arange(1, 22).reshape(3, 7) - doesn't work
X1 = np.arange(1, 22).reshape(7, 3)
Y1 = np.arange(1, 22).reshape(3, 7)
X1 + Y1

ValueError: operands could not be broadcast together with shapes (7,3) (3,7) 

In [38]:
# 2) X2 = np.arange(1, 13).reshape(4, 3)    and    Y2 = np.arange(1, 5).reshape(4, 1) - works
X2 = np.arange(1, 13).reshape(4, 3)
Y2 = np.arange(1, 5).reshape(4, 1)
X2 + Y2

array([[ 2,  3,  4],
       [ 6,  7,  8],
       [10, 11, 12],
       [14, 15, 16]])

In [39]:
# 3) X3 = np.arange(1, 9).reshape(4, 2)    and    Y3 = np.arange(1, 5).reshape(1, 4) - doesn't work
X3 = np.arange(1, 9).reshape(4, 2)
Y3 = np.arange(1, 5).reshape(1, 4)
X3 + Y3

ValueError: operands could not be broadcast together with shapes (4,2) (1,4) 
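As a follow-up to pairs 1 and 3, one possible way to make the shapes compatible is to transpose the second array. This is just a sketch; whether transposing is the right fix depends on what the data means. The names sum1 and sum3 are made up here.

```python
import numpy as np

X1 = np.arange(1, 22).reshape(7, 3)
Y1 = np.arange(1, 22).reshape(3, 7)
X3 = np.arange(1, 9).reshape(4, 2)
Y3 = np.arange(1, 5).reshape(1, 4)

# Y1.T has shape (7, 3), the same as X1, so this is a plain elementwise sum.
sum1 = X1 + Y1.T
print(sum1.shape)  # (7, 3)

# Y3.T has shape (4, 1), so it is broadcast across the 2 columns of X3.
sum3 = X3 + Y3.T
print(sum3)
```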