# Numpy Broadcasting and boolean indexing

The goal of this notebook is to understand how numpy automatically (and virtually) expands arrays when it performs an operation between two arrays of different shapes. It should also cover boolean indexing, but that part currently has little detail.


In [3]:
import numpy as np
SEP = "\n======================\n"

In [4]:
%%html
<style>
/* *********************************************************** */
/* styling the notebook, you can ignore it if it does not work */
/* *********************************************************** */
h3 { color: #60a5fa !important; text-decoration: underline; font-variant-caps: small-caps;}
.jp-OutputArea-output { border-left: 10px solid grey; margin-left: 20px; }
.spoiler { background: black; color: black;  margin-bottom: .1em; }
.spoiler:hover { background: white; transition: background 1s 1s;}
</style>

### Tip: another way of reshaping for adding axes

We will often need to add dimensions to an existing array, e.g., transform a 1d array into a 2d array with a single row (or column), and likewise with more dimensions.
This is often useful in combination with broadcasting. It can be done by indexing with `None`.

In [6]:
a = np.random.uniform(-1, 1, (51, 42))
# we want to add a third axis to a
b = a.reshape((51, 42, 1))
c = a[:,:,None]
# here we check that b and c are equal
np.testing.assert_equal(b, c)
print(b.shape)

(51, 42, 1)


In [7]:
# with more dimensions, it works too
a2 = np.random.uniform(-1, 1, (49, 51, 42)) # 3d
b2 = a2.reshape((1, 49,   1,    1, 51,   1, 42,   1,   1)) # 9d
c2 =        a2[None, :, None, None, :, None, :, None, None]
np.testing.assert_equal(b2, c2)
print(b2.shape)

(1, 49, 1, 1, 51, 1, 42, 1, 1)


NB
- If this is used a lot, a trick is to assign `None` to a variable with a short name to get a more compact notation.
- Numpy additionally accepts Python's `...` (ellipsis) in an index as a replacement for several `:`

In [8]:
a3 = np.random.uniform(-1, 1, (49, 51, 42)) # 3d
Z = None
b3 = a3[Z, :, Z, Z, :, :, Z, Z]
# Python also allows non-latin letters
ⵁ = None
c3 = a3[ⵁ, :, ⵁ, ⵁ, :, :, ⵁ, ⵁ]
print(c3.shape)

# with the ellipsis, numpy inserts as many ":" as needed (here 2: a3 has 3 dimensions and one ":" is already written on the left)
d3 = a3[ⵁ, :, ⵁ, ⵁ, ..., ⵁ, ⵁ]
print(d3.shape)

(1, 49, 1, 1, 51, 42, 1, 1)
(1, 49, 1, 1, 51, 42, 1, 1)
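
For comparison, here is a small sketch of other spellings that add the same trailing axis. The variable names (`by_none`, `by_newaxis`, ...) are only illustrative; `np.newaxis` and `np.expand_dims` are standard numpy names.

```python
import numpy as np

a = np.random.uniform(-1, 1, (51, 42))

# np.newaxis is simply an alias for None
assert np.newaxis is None

by_none = a[:, :, None]
by_newaxis = a[:, :, np.newaxis]
by_expand = np.expand_dims(a, axis=-1)   # insert a new last axis
by_reshape = a.reshape(a.shape + (1,))   # shape built from a.shape, no hard-coded sizes

# all four spellings give the same array
for other in (by_newaxis, by_expand, by_reshape):
    np.testing.assert_equal(by_none, other)
print(by_none.shape)
```

Indexing with `None` is usually the shortest to write, while `np.expand_dims` may read better when the position of the new axis is computed.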


### Numpy Broadcasting

Imagine we have a 2d matrix and want to subtract from each value the minimum of its column.

We can start by computing the minimum of each column (we start from a random matrix).

In [9]:
data = np.random.uniform(10, 20, (6, 4))
m = np.min(data, axis=0) # min over the rows (axis 0), so we get one value per column
print(data, m, sep=SEP, end=SEP)

[[11.54448359 12.22853877 18.6722278  10.98080832]
 [17.91521505 12.06056368 18.82292917 11.64309615]
 [12.73514833 17.40008437 10.67817168 12.9948881 ]
 [13.4962982  11.7942246  11.12903364 13.85446318]
 [10.71745962 15.78673231 18.5436297  18.0463786 ]
 [16.1090122  16.43036427 12.51605652 11.14857854]]
[10.71745962 11.7942246  10.67817168 10.98080832]


In [10]:
# numpy automatically changes the shape of m and replicates it when we do an operation
print(data.shape, m.shape, sep=SEP, end=SEP)
r1 = data - m
print(r1)

(6, 4)
(4,)
[[0.82702398 0.43431417 7.99405612 0.        ]
 [7.19775544 0.26633908 8.14475749 0.66228783]
 [2.01768871 5.60585977 0.         2.01407978]
 [2.77883858 0.         0.45086196 2.87365487]
 [0.         3.99250771 7.86545802 7.06557028]
 [5.39155258 4.63613967 1.83788484 0.16777022]]
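
To see why the replication is only "virtual", the sketch below makes it explicit with `np.broadcast_to`, which returns a read-only view rather than a copy. The arrays are regenerated here so that the block is self-contained; `m_big` is an illustrative name.

```python
import numpy as np

data = np.random.uniform(10, 20, (6, 4))
m = np.min(data, axis=0)                 # shape (4,)

# np.broadcast_to performs explicitly what "data - m" does implicitly
m_big = np.broadcast_to(m, data.shape)   # shape (6, 4), read-only view
print(m_big.shape, m_big.strides)        # the stride along the first axis is 0
print(np.shares_memory(m, m_big))        # the view reuses the memory of m

# subtracting the explicit view gives the same result as plain broadcasting
np.testing.assert_equal(data - m, data - m_big)
```

A stride of 0 means that moving along that axis always reads the same 4 values, so no extra memory is allocated for the 6 "copies" of `m`.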


However, if we try the same with rows without further care, we get an error.

In [11]:
m2 = np.min(data, axis=1)
print(data.shape, m2.shape, sep=SEP, end=SEP)
print("we will get a shape error (cannot broadcast)")
r2 = data - m2
# ^ error

(6, 4)
(6,)
we will get a shape error (cannot broadcast)


ValueError: operands could not be broadcast together with shapes (6,4) (6,) 

Here we could use the `keepdims` parameter of aggregation functions to solve the problem, but in general we may need to reshape to achieve our goal.

In [12]:
m3 = np.min(data, axis=1, keepdims=True)
print(data.shape, m3.shape, sep=SEP, end=SEP)
r3 = data - m3
print(r3)
# or with a reshaping of m2
r4 = data - m2[:, None]
np.testing.assert_equal(r3, r4)

(6, 4)
(6, 1)
[[0.56367527 1.24773045 7.69141948 0.        ]
 [6.2721189  0.41746753 7.17983302 0.        ]
 [2.05697665 6.72191269 0.         2.31671642]
 [2.36726456 0.66519096 0.         2.72542954]
 [0.         5.0692727  7.82617008 7.32891898]
 [4.96043366 5.28178573 1.36747798 0.        ]]


### Broadcasting rules?
- For an operation such as `a + b` to work
    - for each axis i (the roles of a and b are symmetric)
        - either both arrays have the same number of elements (`a.shape[i] == b.shape[i]`)
        - or one of them has size `1` (`a.shape[i] == 1 or b.shape[i] == 1`)
- Example
```python
a = np.random.uniform(0, 1, (10, 42,  1,  1, 12, 98))
b = np.random.uniform(0, 1, (10,  1, 21,  1,  1,  1))
c = np.random.uniform(0, 1, (10,  1, 21,  1,  4,  1))
d = a + b  # Ok
e = a + c  # not Ok (4 vs 12 ━━━━━━━━━━━━━━━━━┛)
f = c + b  # Ok
```
- Additional case with a different number of dimensions
    - if an array is "smaller" (fewer dimensions/axes),
    - numpy automatically prepends as many `1`s as needed to its shape
```python
a = np.random.uniform(0, 1, (              12, 98))
b = np.random.uniform(0, 1, (       21, 1,  1,  1))
c = np.random.uniform(0, 1, (10, 1, 21, 1,  4,  1))
d = a + b  # Ok, a ->              ( 1, 1, 12, 98)
e = a + c  # Not ok 4 vs 12 ( 1, 1,  1, 1, 12, 98)
f = c + b  # Ok, b ->       ( 1, 1, 21, 1,  1,  1)
```
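
The rules above can be checked without allocating any array, for instance with `np.broadcast_shapes` (available in NumPy 1.20 and later, to the best of my knowledge). This is a minimal sketch that reuses the shapes of the second example; `shape_a`, `shape_b`, `shape_c` and `compatible` are illustrative names.

```python
import numpy as np

shape_a = (12, 98)
shape_b = (21, 1, 1, 1)
shape_c = (10, 1, 21, 1, 4, 1)

# shape of the result of a + b and of c + b
print(np.broadcast_shapes(shape_a, shape_b))
print(np.broadcast_shapes(shape_c, shape_b))

# incompatible shapes raise a ValueError, exactly like a + c would
try:
    np.broadcast_shapes(shape_a, shape_c)
    compatible = True
except ValueError:
    compatible = False
print(compatible)
```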


### Examples

In [13]:
# computing a multiplication table
oneten = np.arange(1, 11)
print(oneten[:, None] * oneten[None, :], end=SEP)
print(oneten[:, None] * oneten, end=SEP)

[[  1   2   3   4   5   6   7   8   9  10]
 [  2   4   6   8  10  12  14  16  18  20]
 [  3   6   9  12  15  18  21  24  27  30]
 [  4   8  12  16  20  24  28  32  36  40]
 [  5  10  15  20  25  30  35  40  45  50]
 [  6  12  18  24  30  36  42  48  54  60]
 [  7  14  21  28  35  42  49  56  63  70]
 [  8  16  24  32  40  48  56  64  72  80]
 [  9  18  27  36  45  54  63  72  81  90]
 [ 10  20  30  40  50  60  70  80  90 100]]
[[  1   2   3   4   5   6   7   8   9  10]
 [  2   4   6   8  10  12  14  16  18  20]
 [  3   6   9  12  15  18  21  24  27  30]
 [  4   8  12  16  20  24  28  32  36  40]
 [  5  10  15  20  25  30  35  40  45  50]
 [  6  12  18  24  30  36  42  48  54  60]
 [  7  14  21  28  35  42  49  56  63  70]
 [  8  16  24  32  40  48  56  64  72  80]
 [  9  18  27  36  45  54  63  72  81  90]
 [ 10  20  30  40  50  60  70  80  90 100]]
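
Another common use, sketched here on made-up data (the `counts` array and the other variable names are illustrative), is centering columns and normalizing rows. Both rely on the same mechanism as the subtraction of minima shown earlier.

```python
import numpy as np

rng = np.random.default_rng(0)
counts = rng.integers(1, 10, (4, 3)).astype(float)

# centering each column: (4, 3) - (3,) broadcasts along the rows
col_mean = counts.mean(axis=0)
centered = counts - col_mean

# making each row sum to 1: (4, 3) / (4, 1) broadcasts along the columns
row_sum = counts.sum(axis=1, keepdims=True)
proportions = counts / row_sum

print(centered.mean(axis=0))     # close to 0 for every column
print(proportions.sum(axis=1))   # close to 1 for every row
```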


### Challenge

We start with 6 points in $\mathbb{R}^{42}$ and want to compute their pairwise distances, producing a 6×6 matrix. Here we have 6 points, but the code should work with 1000 points too. We must not use any loop.

In [14]:
np.random.seed(42000)
X = np.random.normal(0, 1, (6, 42))
print(X.shape)

(6, 42)


In [34]:
# TODO for you: try to implement the computation using broadcasting, operations and aggregation.

X1 = X.reshape(1, 6, 42)
X2 = X.reshape(6, 1, 42)

# print(X1)
# print(X2)

res = np.sum((X1 - X2)**2, axis=2)**0.5
print(res)

[[ 0.          9.35375341  8.24845551  9.84140797  9.11748407  9.50108475]
 [ 9.35375341  0.          9.77851827  9.9004302   8.98842626 10.21625238]
 [ 8.24845551  9.77851827  0.         10.14422513  8.1842549  10.04970856]
 [ 9.84140797  9.9004302  10.14422513  0.          8.75705682 10.87224785]
 [ 9.11748407  8.98842626  8.1842549   8.75705682  0.          9.47147629]
 [ 9.50108475 10.21625238 10.04970856 10.87224785  9.47147629  0.        ]]


Spoiler / help (hover over a box for a few seconds to reveal it)

<div class="spoiler">
First: make two versions of X but in 3d, so that they share the "feature" axis but not the "point" axis.
</div>
<div class="spoiler">
First, more: shapes could be (1, 6, 42) and (6, 1, 42)
</div>
<div class="spoiler">
Second: subtract, square.
</div>
<div class="spoiler">
Third: sum over the feature axis.
</div>
<div class="spoiler">
Third, more: sum over the feature axis which is 2.
</div>
<div class="spoiler">
Finally, take the square root.
</div>

In [35]:
expected = np.array([[0.0,               9.353753413010876,  8.248455506246838,  9.841407966770515,  9.117484070378321, 9.501084747277666],
                     [9.353753413010876, 0.0,                9.778518269006355,  9.90043019925482,   8.988426256845749, 10.216252382778956],
                     [8.248455506246838, 9.778518269006355,  0.0,                10.144225130751577, 8.184254902310725, 10.049708564600387],
                     [9.841407966770515, 9.90043019925482,   10.144225130751577, 0.0,                8.75705682231122,  10.872247848486728],
                     [9.117484070378321, 8.988426256845749,  8.184254902310725,  8.75705682231122,   0.0,               9.471476285979634],
                     [9.501084747277666, 10.216252382778956, 10.049708564600387, 10.872247848486728, 9.471476285979634, 0.0]])
np.testing.assert_allclose(res, expected)
print("Good job!")

Good job!
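
To check that the approach does not depend on the number of points, the computation can be wrapped in a function and run on a larger input. This is a sketch: the name `pairwise_distances` and the choice of 300 points are assumptions made for illustration. The intermediate array has shape (n, n, d), so memory grows quickly with n (for 1000 points in 42 dimensions it holds 42 million float64 values, roughly 336 MB).

```python
import numpy as np

def pairwise_distances(points):
    """Euclidean distances between all rows of an (n, d) array, as an (n, n) array."""
    diff = points[:, None, :] - points[None, :, :]   # broadcast to shape (n, n, d)
    return np.sqrt(np.sum(diff**2, axis=-1))         # sum over the feature axis

rng = np.random.default_rng(0)
big = rng.normal(0, 1, (300, 42))
dist = pairwise_distances(big)
print(dist.shape)

# cross-check one entry against np.linalg.norm, and check the symmetry
np.testing.assert_allclose(dist[3, 7], np.linalg.norm(big[3] - big[7]))
np.testing.assert_allclose(dist, dist.T)
```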


### Boolean indexing

It is possible to access some elements of an array using a boolean mask (of the same shape).
The result is always a 1d array. Such a mask can also be used on the left-hand side of an assignment.

In [36]:
t1 = np.random.randint(-9, 10, (3, 10))
print(t1)

[[-6  8  0  5 -1  5  5 -8  0  0]
 [ 2  1  9 -9  2  3 -9  7  5  6]
 [ 7  0  0 -2  3 -6  9 -5 -7 -5]]


In [37]:
# Creating boolean arrays... is just applying an existing comparison operator
t2 = t1 > 5
t3 = t1 < 0
t4 = t1**2 < 3
print(t2, t3, t4, t4*1, sep=SEP, end=SEP)

[[False  True False False False False False False False False]
 [False False  True False False False False  True False  True]
 [ True False False False False False  True False False False]]
[[ True False False False  True False False  True False False]
 [False False False  True False False  True False False False]
 [False False False  True False  True False  True  True  True]]
[[False False  True False  True False False False  True  True]
 [False  True False False False False False False False False]
 [False  True  True False False False False False False False]]
[[0 0 1 0 1 0 0 0 1 1]
 [0 1 0 0 0 0 0 0 0 0]
 [0 1 1 0 0 0 0 0 0 0]]


In [38]:
# Using a boolean array (mask) to access the elements (where the value is true)
t10 = t1[t2]
t11 = t1[t3]
t12 = t1[t4]
t13 = t1[t1%2 == 0]
print(t10, t11, t12, t13, sep=SEP, end=SEP)

[8 9 7 6 7 9]
[-6 -1 -8 -9 -9 -2 -6 -5 -7 -5]
[ 0 -1  0  0  1  0  0]
[-6  8  0 -8  0  0  2  2  6  0  0 -2 -6]
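
Masks can also be combined element-wise with `&` (and), `|` (or) and `~` (not). The sketch below uses a small hand-written array `t` (illustrative) so that the results are easy to verify by eye. The parentheses matter because `&` and `|` bind more tightly than comparisons.

```python
import numpy as np

t = np.array([[-6, 8, 0, 5],
              [ 2, 1, 9, -9]])

both = t[(t > 0) & (t % 2 == 0)]   # positive AND even
either = t[(t < 0) | (t > 5)]      # negative OR greater than 5
negated = t[~(t > 0)]              # NOT positive
print(both, either, negated)

# a mask is an array of booleans, so summing it counts the True values
n_positive = np.sum(t > 0)
print(n_positive)
```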


In [39]:
t20 = np.random.randint(-9, 10, (5, 10)) * 111
print(t20, end=SEP)
t20[t20<0] = 1
print(t20, end=SEP)
t20[t20 % 2 == 0] = 2
print(t20, end=SEP)
# running the same assignment again changes nothing: the even values are already 2
t20[t20 % 2 == 0] = 2
print(t20, end=SEP)

[[-222  888  111  333  444 -333  444  666  222 -777]
 [-999  555  999 -333  888 -111  666 -999 -666  111]
 [ 888  111 -111 -222  777 -555  999  111 -666 -999]
 [-888 -222    0  999  888 -888  111  111  888    0]
 [   0 -444 -111 -222 -999 -999 -555 -222  777  777]]
[[  1 888 111 333 444   1 444 666 222   1]
 [  1 555 999   1 888   1 666   1   1 111]
 [888 111   1   1 777   1 999 111   1   1]
 [  1   1   0 999 888   1 111 111 888   0]
 [  0   1   1   1   1   1   1   1 777 777]]
[[  1   2 111 333   2   1   2   2   2   1]
 [  1 555 999   1   2   1   2   1   1 111]
 [  2 111   1   1 777   1 999 111   1   1]
 [  1   1   2 999   2   1 111 111   2   2]
 [  2   1   1   1   1   1   1   1 777 777]]
[[  1   2 111 333   2   1   2   2   2   1]
 [  1 555 999   1   2   1   2   1   1 111]
 [  2 111   1   1 777   1 999 111   1   1]
 [  1   1   2 999   2   1 111 111   2   2]
 [  2   1   1   1   1   1   1   1 777 777]]
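
Assignment through a mask modifies the array in place. When the original must be kept, one option is `np.where`, which builds a new array from a condition. This is a minimal sketch on a small hand-written array; `t` and `clipped` are illustrative names.

```python
import numpy as np

t = np.array([[-222, 888, 111],
              [0, -333, 444]])

# 1 where the condition holds, the value from t elsewhere
clipped = np.where(t < 0, 1, t)
print(clipped)
print(t)   # t itself is unchanged
```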
