## Broadcasting

-   Basic operations on `numpy` arrays (addition, etc.) are elementwise.

-   This works on arrays of the same size.

    **Nevertheless**, it's also possible to do operations on arrays of
    different sizes if *NumPy* can transform these arrays so that they all
    have the same size: this conversion is called **broadcasting**.


<img src="images/numpy_broadcasting.png" class="align-center" alt="image" style="height: 500px; width: 800px" />

Let's verify:

In [12]:
import numpy as np
a = np.tile(np.arange(0, 40, 10), (3, 1)).T
a

array([[ 0,  0,  0],
       [10, 10, 10],
       [20, 20, 20],
       [30, 30, 30]])

In [5]:
b = np.array([0, 1, 2])

In [9]:
b

array([0, 1, 2])

In [11]:
a, b, a + b

(array([[ 0,  0,  0],
        [10, 10, 10],
        [20, 20, 20],
        [30, 30, 30]]),
 array([0, 1, 2]),
 array([[ 0,  1,  2],
        [10, 11, 12],
        [20, 21, 22],
        [30, 31, 32]]))

We have already used broadcasting without knowing it:

In [None]:
a = np.ones((4, 5))
a[0] = 2  # assign a scalar (0-d) to a row (1-d): the scalar is broadcast
a

array([[2.,  2.,  2.,  2.,  2.],
       [1.,  1.,  1.,  1.,  1.],
       [1.,  1.,  1.,  1.,  1.],
       [1.,  1.,  1.,  1.,  1.]])

Recall that for arrays of the same size, binary operations are performed element by element:

In [2]:
a = np.array([0, 1, 2])
b = np.array([5, 5, 5])
a + b

array([5, 6, 7])

Broadcasting allows these types of binary operations to be performed on arrays of different sizes. For example, we can just as easily add a scalar (think of it as a zero-dimensional array) to an array:

In [3]:
a + 5

array([5, 6, 7])

We can think of this as an operation that stretches or duplicates the value ``5`` into the array ``[5, 5, 5]``, and adds the results.
The advantage of NumPy's broadcasting is that this duplication of values does not actually take place, but it is a useful mental model as we think about broadcasting.
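
As a rough check of the claim that no duplication takes place, the sketch below uses `np.broadcast_to` to build the stretched version of the scalar explicitly. The names `five` and `stretched` are illustrative only. A stride of zero means every element of the result refers to the same stored value.

```python
import numpy as np

# Build the "stretched" version of the scalar 5 without copying it.
five = np.array(5)
stretched = np.broadcast_to(five, (3,))

print(stretched)                  # [5 5 5]
print(stretched.strides)          # (0,): all elements map to the same memory
print(stretched.flags.writeable)  # False: the broadcast view is read-only
```

This is only meant to support the mental model; ordinary arithmetic such as `a + 5` does the broadcasting for you.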

We can similarly extend this to arrays of higher dimension. Observe the result when we add a one-dimensional array to a two-dimensional array:

In [4]:
M = np.ones((3, 3))
M

array([[ 1.,  1.,  1.],
       [ 1.,  1.,  1.],
       [ 1.,  1.,  1.]])

In [5]:
M + a

array([[ 1.,  2.,  3.],
       [ 1.,  2.,  3.],
       [ 1.,  2.,  3.]])

Here the one-dimensional array ``a`` is stretched, or broadcast, across the rows (along the first dimension) in order to match the shape of ``M``.

While these examples are relatively easy to understand, more complicated cases can involve broadcasting of both arrays. Consider the following example:

In [6]:
a = np.arange(3)
b = np.arange(3)[:, np.newaxis]

print(a)
print(b)

[0 1 2]
[[0]
 [1]
 [2]]


In [7]:
a + b

array([[0, 1, 2],
       [1, 2, 3],
       [2, 3, 4]])

Just as before we stretched or broadcast one value to match the shape of the other, here we've stretched *both* ``a`` and ``b`` to match a common shape, and the result is a two-dimensional array!

The light boxes represent the broadcast values: again, this extra memory is not actually allocated in the course of the operation, but it can be useful conceptually to imagine that it is.
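
To see the common shape that both operands are stretched to, one option is `np.broadcast_arrays`, which returns broadcast views of its inputs. This is a small sketch with illustrative names (`row`, `col`), not part of the original example.

```python
import numpy as np

row = np.arange(3)                 # shape (3,)
col = np.arange(3)[:, np.newaxis]  # shape (3, 1)

# Each returned array is a view of its input, stretched to the common shape.
row_b, col_b = np.broadcast_arrays(row, col)

print(row_b.shape, col_b.shape)            # (3, 3) (3, 3)
print(row_b.strides[0], col_b.strides[1])  # 0 0: the stretched axes reuse memory
```

The zero strides on the stretched axes are consistent with the point that the extra memory is not actually allocated.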

## Rules of Broadcasting

Broadcasting in NumPy follows a strict set of rules to determine the interaction between the two arrays:

- Rule 1: If the two arrays differ in their number of dimensions, the shape of the one with fewer dimensions is *padded* with ones on its leading (left) side.
- Rule 2: If the shape of the two arrays does not match in any dimension, the array with shape equal to 1 in that dimension is stretched to match the other shape.
- Rule 3: If in any dimension the sizes disagree and neither is equal to 1, an error is raised.
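
The three rules can be sketched as a small function on shape tuples. `broadcast_shape` is a hypothetical helper written for illustration, not a NumPy function; the result is cross-checked against the shape reported by `np.broadcast`.

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: apply the three broadcasting rules to two shapes."""
    ndim = max(len(shape_a), len(shape_b))
    # Rule 1: pad the shorter shape with ones on its left side.
    padded_a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    padded_b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)

    result = []
    for size_a, size_b in zip(padded_a, padded_b):
        if size_a == size_b:
            result.append(size_a)
        elif size_a == 1:
            result.append(size_b)  # Rule 2: stretch the size-1 dimension
        elif size_b == 1:
            result.append(size_a)  # Rule 2, other operand
        else:
            # Rule 3: sizes disagree and neither is 1
            raise ValueError(f"cannot broadcast {shape_a} with {shape_b}")
    return tuple(result)

print(broadcast_shape((2, 3), (3,)))  # (2, 3)
print(broadcast_shape((3, 1), (3,)))  # (3, 3)

# Cross-check against NumPy itself.
print(np.broadcast(np.ones((2, 3)), np.arange(3)).shape)  # (2, 3)
```

Calling `broadcast_shape((3, 2), (3,))` raises `ValueError`, which corresponds to rule 3.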

To make these rules clear, let's consider a few examples in detail.

### Broadcasting example 1

Let's look at adding a two-dimensional array to a one-dimensional array:

In [8]:
M = np.ones((2, 3))
a = np.arange(3)

Let's consider an operation on these two arrays. The shapes of the arrays are:

- ``M.shape = (2, 3)``
- ``a.shape = (3,)``

We see by rule 1 that the array ``a`` has fewer dimensions, so we pad it on the left with ones:

- ``M.shape -> (2, 3)``
- ``a.shape -> (1, 3)``

By rule 2, we now see that the first dimension disagrees, so we stretch this dimension to match:

- ``M.shape -> (2, 3)``
- ``a.shape -> (2, 3)``

The shapes match, and we see that the final shape will be ``(2, 3)``:

In [9]:
M + a

array([[ 1.,  2.,  3.],
       [ 1.,  2.,  3.]])

### Broadcasting example 2

Let's take a look at an example where both arrays need to be broadcast:

In [13]:
a = np.arange(3).reshape((3, 1))
b = np.arange(3)
a, b

(array([[0],
        [1],
        [2]]),
 array([0, 1, 2]))

Again, we'll start by writing out the shape of the arrays:

- ``a.shape = (3, 1)``
- ``b.shape = (3,)``

Rule 1 says we must pad the shape of ``b`` with ones:

- ``a.shape -> (3, 1)``
- ``b.shape -> (1, 3)``

And rule 2 tells us that we upgrade each of these ones to match the corresponding size of the other array:

- ``a.shape -> (3, 3)``
- ``b.shape -> (3, 3)``

Because the resulting shapes match, the arrays are compatible. We can see this here:

In [11]:
a + b

array([[0, 1, 2],
       [1, 2, 3],
       [2, 3, 4]])

### Broadcasting example 3

Now let's take a look at an example in which the two arrays are not compatible:

In [14]:
M = np.ones((3, 2))
a = np.arange(3)
a, M

(array([0, 1, 2]),
 array([[1., 1.],
        [1., 1.],
        [1., 1.]]))

This is just a slightly different situation than in the first example: the matrix ``M`` is transposed.
How does this affect the calculation? The shapes of the arrays are:

- ``M.shape = (3, 2)``
- ``a.shape = (3,)``

Again, rule 1 tells us that we must pad the shape of ``a`` with ones:

- ``M.shape -> (3, 2)``
- ``a.shape -> (1, 3)``

By rule 2, the first dimension of ``a`` is stretched to match that of ``M``:

- ``M.shape -> (3, 2)``
- ``a.shape -> (3, 3)``

Now we hit rule 3: the final shapes do not match, so these two arrays are incompatible, as we can observe by attempting this operation:

In [15]:
M + a

ValueError: operands could not be broadcast together with shapes (3,2) (3,) 

Note the potential confusion here: you could imagine making ``a`` and ``M`` compatible by, say, padding ``a``'s shape with ones on the right rather than the left.
But this is not how the broadcasting rules work!
That sort of flexibility might be useful in some cases, but it would lead to potential areas of ambiguity.
If right-side padding is what you'd like, you can do this explicitly by reshaping the array (we'll use the ``np.newaxis`` object introduced in [The Basics of NumPy Arrays](02.02-The-Basics-Of-NumPy-Arrays.ipynb)):

In [14]:
a[:, np.newaxis].shape

(3, 1)

In [15]:
M + a[:, np.newaxis]

array([[ 1.,  1.],
       [ 2.,  2.],
       [ 3.,  3.]])

A useful trick:

In [None]:
a = np.arange(0, 40, 10)
a.shape

(4,)

In [None]:
a = a[:, np.newaxis]  # adds a new axis -> 2D array
a.shape

(4, 1)

In [None]:
a

array([[ 0],
       [10],
       [20],
       [30]])

In [None]:
a + b

array([[ 0,  1,  2],
       [10, 11, 12],
       [20, 21, 22],
       [30, 31, 32]])

Tip

Broadcasting seems a bit magical, but it is actually quite natural to
use it when we want to solve a problem whose output data is an array
with more dimensions than input data.
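
As one illustration of output with more dimensions than input, the sketch below builds a two-dimensional table of pairwise distances from a one-dimensional array of positions. The values in `points` are made up for the example.

```python
import numpy as np

# Four positions along a line (made-up values).
points = np.array([0, 10, 30, 60])

# Shape (4, 1) minus shape (1, 4) broadcasts to a (4, 4) table.
distances = np.abs(points[:, np.newaxis] - points[np.newaxis, :])

print(distances.shape)  # (4, 4)
print(distances[0])     # [ 0 10 30 60]
```

Entry `distances[i, j]` is the distance between position `i` and position `j`, so the table is symmetric with zeros on the diagonal.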

## Array shape manipulation

### Flattening

In [16]:
a = np.array([[1, 2, 3], [4, 5, 6]])
a, a.ravel()

(array([[1, 2, 3],
        [4, 5, 6]]),
 array([1, 2, 3, 4, 5, 6]))

In [5]:
a.T

array([[1, 4],
       [2, 5],
       [3, 6]])

In [6]:
a.T.ravel()

array([1, 4, 2, 5, 3, 6])
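
The element order seen with `a.T.ravel()` can also be requested directly through the `order` argument of `ravel`. A small sketch, using a separate array `m` so the running example is left untouched:

```python
import numpy as np

m = np.array([[1, 2, 3], [4, 5, 6]])

print(m.ravel())           # [1 2 3 4 5 6]  row by row (C order, the default)
print(m.ravel(order="F"))  # [1 4 2 5 3 6]  column by column (Fortran order)
```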

### Reshaping

The inverse operation to flattening:

In [7]:
a.shape

(2, 3)

In [17]:
b = a.ravel()
b

array([1, 2, 3, 4, 5, 6])

In [18]:
b = b.reshape((2, 3))
b

array([[1, 2, 3],
       [4, 5, 6]])

Or,

In [9]:
a.reshape((2, -1))    # unspecified (-1) value is inferred

array([[1, 2, 3],
       [4, 5, 6]])

Warning

`ndarray.reshape` **may** return a view (cf. `help(np.reshape)`) or a
copy.

Tip

When `reshape` returns a view, modifying it also modifies the original array:

In [None]:
b[0, 0] = 99
a

array([[99,  2,  3],
       [ 4,  5,  6]])

Beware: `reshape` may also return a copy:

In [None]:
a = np.zeros((3, 2))
b = a.T.reshape(3*2)
b[0] = 9
a

array([[0.,  0.],
       [0.,  0.],
       [0.,  0.]])
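
If it is unclear whether a particular `reshape` produced a view or a copy, one way to check is `np.shares_memory`. The sketch below uses illustrative names (`base`, `flat_view`, `flat_copy`).

```python
import numpy as np

base = np.zeros((3, 2))

flat_view = base.reshape(6)    # contiguous data: a view is possible
flat_copy = base.T.reshape(6)  # transposed data: NumPy has to copy

print(np.shares_memory(base, flat_view))  # True
print(np.shares_memory(base, flat_copy))  # False
```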

### Adding a dimension

Indexing with the `np.newaxis` object allows us to add an axis to an
array (you have seen this already above in the broadcasting section):

In [None]:
z = np.array([1, 2, 3])
z

array([1, 2, 3])


In [None]:
z[:, np.newaxis]

array([[1],
       [2],
       [3]])


In [None]:
z[np.newaxis, :]

array([[1, 2, 3]])
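
For reference, `np.newaxis` is simply an alias for `None`, and `np.expand_dims` offers a function form of the same operation. A brief sketch with an illustrative array `v`:

```python
import numpy as np

v = np.array([1, 2, 3])

print(np.newaxis is None)               # True
print(v[:, None].shape)                 # (3, 1), same as v[:, np.newaxis]
print(np.expand_dims(v, axis=0).shape)  # (1, 3), same as v[np.newaxis, :]
```

Writing `np.newaxis` rather than `None` is usually preferred because it states the intent.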

### Resizing

The size of an array can be changed with `ndarray.resize`:

In [22]:
a = np.arange(4)
a.resize((8,))
a

array([0, 1, 2, 3, 0, 0, 0, 0])

However, the array must not be referenced anywhere else:

In [23]:
b = a
a.resize((4,)) 

ValueError: cannot resize an array that references or is referenced
by another array in this way.
Use the np.resize function or refcheck=False
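
The error message points to the `np.resize` function as an alternative. The sketch below (names `src`, `alias` and `grown` are illustrative) shows that it returns a new array and leaves the original alone. Note one difference: `np.resize` fills the extra space with repeated copies of the data, whereas `ndarray.resize` pads with zeros.

```python
import numpy as np

src = np.arange(4)
alias = src  # a second reference, which would block src.resize(...)

# np.resize returns a new array instead of modifying src in place.
grown = np.resize(src, (8,))

print(grown)      # [0 1 2 3 0 1 2 3]
print(src.shape)  # (4,): the original array is unchanged
```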

## Sorting data

Sorting along an axis:

In [26]:
a = np.array([[4, 3, 5], [1, 2, 1]])
b = np.sort(a, axis=1)
b

array([[3, 4, 5],
       [1, 1, 2]])

In [25]:
a = np.array([[4, 3, 5], [1, 2, 1]])
b = np.sort(a, axis=0)
b

array([[1, 2, 1],
       [4, 3, 5]])

Note

`axis=1` sorts each row separately; `axis=0` sorts each column separately.

In-place sort:

In [None]:
a.sort(axis=1)
a

array([[3, 4, 5],
       [1, 1, 2]])

Sorting with fancy indexing:

In [None]:
a = np.array([4, 3, 1, 2])
j = np.argsort(a)
j

array([2, 3, 1, 0])

In [None]:
a[j]

array([1, 2, 3, 4])
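
A common use of `argsort` is to reorder one array according to the sorted order of another. A minimal sketch, with made-up `scores` and `labels`:

```python
import numpy as np

scores = np.array([4, 3, 1, 2])
labels = np.array(["d", "c", "a", "b"])

# Indices that would sort scores in ascending order.
order = np.argsort(scores)

print(labels[order])  # ['a' 'b' 'c' 'd']
print(scores[order])  # [1 2 3 4]
```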