In [1]:
import numpy as np

## 1. Advanced Indexing

NumPy offers more indexing facilities than regular Python sequences. In addition to indexing by integers and slices, arrays can be indexed by arrays of integers and by arrays of booleans.

In [5]:
X = np.arange(5)
X

array([0, 1, 2, 3, 4])

We create an array holding the positions of the values we want to retrieve.

In [8]:
indices = np.array([3, 2])
X[indices]

array([3, 2])

We can repeat an index to select the same element more than once.

In [9]:
indices = np.array([3, 2, 2])
X[indices]

array([3, 2, 2])

Negative indices are allowed and count from the end of the array.

In [10]:
indices = np.array([2, -1])
X[indices]

array([2, 4])
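
The result of indexing with an integer array takes the shape of the index array, not the shape of the array being indexed. A minimal sketch of this, where the names `data` and `idx` are illustrative and not part of the notebook:

```python
import numpy as np

data = np.arange(5) * 10          # array([ 0, 10, 20, 30, 40])
idx = np.array([[0, 1],
                [3, -1]])         # 2-D array of positions

picked = data[idx]                # result has the shape of idx: (2, 2)
print(picked.shape)               # (2, 2)
print(picked)                     # [[ 0 10]
                                  #  [30 40]]
```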

For n-dimensional arrays, we can pass one index array per axis.

In [17]:
X = np.arange(20).reshape(4, 5)
X

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

In [25]:
rows = np.array([0, 1])
cols = np.array([1, 2])
X[rows, cols]  # Equivalent to selecting X[0, 1] and X[1, 2]

array([1, 7])

This is not equivalent to the following slice, which selects a rectangular sub-array defined by ranges of rows and columns.

In [31]:
X[:2, 1:3]

array([[1, 2],
       [6, 7]])
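
To obtain the same rectangular block from integer arrays rather than slices, one option is `np.ix_`, which builds index arrays that broadcast into the full rows-by-columns grid. A short sketch, re-creating the array so the example stands alone (the names `pairs` and `block` are illustrative):

```python
import numpy as np

X = np.arange(20).reshape(4, 5)
rows = np.array([0, 1])
cols = np.array([1, 2])

pairs = X[rows, cols]            # element-wise pairing: X[0, 1] and X[1, 2]
block = X[np.ix_(rows, cols)]    # every combination of rows and cols

print(pairs)                     # [1 7]
print(block)                     # [[1 2]
                                 #  [6 7]]
```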

In fact, rows and cols must have the same shape, or shapes that can be broadcast together.

In [36]:
rows = np.array([0, 1, 1])
cols = np.array([1, 2])
X[rows, cols]  # Raises IndexError: shapes (3,) and (2,) cannot be broadcast together

IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (3,) (2,) 
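
The index arrays do not need identical shapes, only broadcastable ones. As a sketch on the same 4x5 array: a column of three row indices with shape (3, 1) broadcasts against two column indices with shape (2,), giving a (3, 2) result. The variable `result` is illustrative.

```python
import numpy as np

X = np.arange(20).reshape(4, 5)
rows = np.array([0, 1, 1])[:, np.newaxis]   # shape (3, 1)
cols = np.array([1, 2])                     # shape (2,)

result = X[rows, cols]                      # index arrays broadcast to (3, 2)
print(result)                               # [[1 2]
                                            #  [6 7]
                                            #  [6 7]]
```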

We can combine slices and index arrays in the same expression.

In [38]:
X[:2, np.array([1, 2])]

array([[1, 2],
       [6, 7]])
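
Although both forms return the same values here, they differ in one practical respect: basic slicing returns a view onto the original data, while indexing with an integer array returns a copy. A small sketch that checks this with `np.shares_memory` (the names `sliced` and `fancy` are illustrative):

```python
import numpy as np

X = np.arange(20).reshape(4, 5)

sliced = X[:2, 1:3]                  # basic slicing -> view of X
fancy = X[:2, np.array([1, 2])]      # integer-array indexing -> copy

print(np.shares_memory(X, sliced))   # True
print(np.shares_memory(X, fancy))    # False

fancy[0, 0] = -1                     # modifies only the copy
print(X[0, 1])                       # 1, the original is unchanged
```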

## 2. Masking

Masking is a specific type of indexing used frequently to manipulate or filter data. A mask is a boolean array that matches the shape of the array you want to filter.

In [52]:
X = np.arange(12)
X

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

In [42]:
mask = np.ones_like(X, dtype=bool)  # start with every element selected
mask[:5] = False  # deselect the first five elements
mask

array([False, False, False, False, False,  True,  True,  True,  True,
        True,  True,  True])

In [43]:
X[mask]

array([ 5,  6,  7,  8,  9, 10, 11])

We can generate a mask by evaluating a condition on the array.

In [45]:
mask_greater_than_5 = X > 5
X[mask_greater_than_5]

array([ 6,  7,  8,  9, 10, 11])

In [46]:
mask_is_even = X % 2 == 0
X[mask_is_even]

array([ 0,  2,  4,  6,  8, 10])

And we can combine conditions using the element-wise operators & (AND) and | (OR).

In [47]:
# AND
mask_even_and_greater_than_5 = mask_greater_than_5 & mask_is_even
X[mask_even_and_greater_than_5]

array([ 6,  8, 10])

In [48]:
# OR
mask_even_or_greater_than_5 = mask_greater_than_5 | mask_is_even
X[mask_even_or_greater_than_5]

array([ 0,  2,  4,  6,  7,  8,  9, 10, 11])
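
Negation follows the same pattern with the `~` operator, which inverts a boolean mask element by element. A brief sketch, with `odd` and `small_odd` as illustrative names:

```python
import numpy as np

X = np.arange(12)
mask_is_even = X % 2 == 0

# NOT: invert the mask to keep the odd values
odd = X[~mask_is_even]
print(odd)                               # [ 1  3  5  7  9 11]

# NOT combined with AND: odd values smaller than 6
small_odd = X[~mask_is_even & (X < 6)]
print(small_odd)                         # [1 3 5]
```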

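There is no need to build the masks beforehand; we can write the combined condition directly. Each comparison must be wrapped in parentheses because & binds more tightly than the comparison operators: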

In [50]:
combined = (X > 5) & (X % 2 == 0)
X[combined]

array([ 6,  8, 10])
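
The parentheses around each comparison matter. Without them, Python evaluates `5 & X % 2` first and then treats the rest as a chained comparison, which asks for the truth value of a whole array and fails. A sketch of what that looks like (the variable `error_message` is illustrative):

```python
import numpy as np

X = np.arange(12)

error_message = None
try:
    # Parsed as X > (5 & (X % 2)) == 0, a chained comparison on arrays
    X[X > 5 & X % 2 == 0]
except ValueError as err:
    error_message = str(err)

print(error_message)

# With parentheses, each comparison yields a mask before & is applied
print(X[(X > 5) & (X % 2 == 0)])    # [ 6  8 10]
```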

Masks can also be used to assign values to the selected elements:

In [53]:
X[X > 5] = 5
X

array([0, 1, 2, 3, 4, 5, 5, 5, 5, 5, 5, 5])
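
Masked assignment modifies the array in place. When the original values should be kept, one alternative is `np.where`, which builds a new array from a condition. A minimal sketch, with `capped` as an illustrative name:

```python
import numpy as np

X = np.arange(12)

# Take 5 where the condition holds, otherwise the original value
capped = np.where(X > 5, 5, X)

print(capped)    # [0 1 2 3 4 5 5 5 5 5 5 5]
print(X)         # [ 0  1  2  3  4  5  6  7  8  9 10 11], unchanged
```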

Masking works the same way for n-dimensional arrays, except that the selected elements are returned as a flattened 1-D array.

In [55]:
X = np.arange(16).reshape(4, 4)
X

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])

In [58]:
mask = X > 5
mask

array([[False, False, False, False],
       [False, False,  True,  True],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True]])

In [59]:
X[mask]

array([ 6,  7,  8,  9, 10, 11, 12, 13, 14, 15])
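
A full-shape mask flattens the selection, but two related patterns keep the 2-D structure: a 1-D boolean mask along the first axis selects whole rows, and masked assignment changes values without changing the shape. A sketch on the same 4x4 array, where `row_mask`, `selected_rows` and `Y` are illustrative names:

```python
import numpy as np

X = np.arange(16).reshape(4, 4)

# One boolean per row, based on the first column: [False, False, True, True]
row_mask = X[:, 0] > 5
selected_rows = X[row_mask]      # whole rows are kept, shape (2, 4)
print(selected_rows)             # [[ 8  9 10 11]
                                 #  [12 13 14 15]]

# Masked assignment on a copy keeps the (4, 4) shape
Y = X.copy()
Y[Y > 5] = 0
print(Y)                         # [[0 1 2 3]
                                 #  [4 5 0 0]
                                 #  [0 0 0 0]
                                 #  [0 0 0 0]]
```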

-----