# 1.7 Boolean Indexing

In [1]:
import numpy as np

Consider an example with a two-dimensional array of data and a one-dimensional array of names that contains duplicates.

In [None]:
names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Joe', 'Will', 'Joe'])
data = np.array([[4, 7], [0, 2], [-5, 6], [0, 0], [1, 2], [-12, -4], [3, 4]])

print(f"Names array: {names}")
print(f"Data array:\n{data}")

Suppose each name corresponds to a row in the `data` array, and we want to select all rows whose name is `'Bob'`. Like arithmetic operations, comparisons such as `==` are vectorized over arrays, so comparing `names` with the string `'Bob'` yields a boolean array.

In [3]:
bob_mask = names == 'Bob'
print(f"Boolean mask for 'Bob': {bob_mask}")

Boolean mask for 'Bob': [ True False False  True False False False]
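
Because the mask is an ordinary NumPy array of booleans, it can be inspected like any other array. The following is a minimal sketch that reuses the `names` array from above; the variable `bob_count` is an illustrative name, not part of the original example.

```python
import numpy as np

names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Joe', 'Will', 'Joe'])

# The comparison is applied element by element and returns a new array
bob_mask = names == 'Bob'

print(bob_mask.dtype)   # bool
print(bob_mask.shape)   # (7,), one entry per name

# True counts as 1 when summed, so the sum is the number of matches
bob_count = int(bob_mask.sum())
print(bob_count)        # 2
```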


## 1.7.1 Selecting Data with a Boolean Mask

Passing this boolean array as an index to `data` selects the rows where the mask is `True`.

In [None]:
print(f"Data for 'Bob':\n{data[bob_mask]}")

The boolean array must be of the same length as the array axis it's indexing. You can even mix and match boolean arrays with slices or integers.

In [None]:
# Select rows where names == 'Bob' and all columns from index 1
print(f"Columns from index 1 for 'Bob':\n{data[names == 'Bob', 1:]}")

# Select rows where names == 'Bob' and only column at index 1
print(f"Column at index 1 for 'Bob': {data[names == 'Bob', 1]}")
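
The length requirement can be checked directly. The sketch below assumes a recent NumPy release, where a mask of the wrong length raises an `IndexError`; the names `short_mask` and `col_mask` are made up for illustration.

```python
import numpy as np

data = np.array([[4, 7], [0, 2], [-5, 6], [0, 0], [1, 2], [-12, -4], [3, 4]])

# data has 7 rows, but this hypothetical mask has only 3 entries
short_mask = np.array([True, False, True])

try:
    data[short_mask]
except IndexError as exc:
    print(f"IndexError: {exc}")

# A mask applied to the second axis needs one entry per column instead
col_mask = np.array([False, True])
print(data[:, col_mask].shape)   # (7, 1)
```

Note that the column mask keeps the result two-dimensional, whereas the integer index `1` used above drops the column axis.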

## 1.7.2 Inverting a Boolean Mask

To select everything but `'Bob'`, you can use `!=` or negate the condition using `~`.

In [None]:
# Using !=
print(f"Mask for names != 'Bob': {names != 'Bob'}")

# Using ~
print(f"Inverted mask for 'Bob': {~(names == 'Bob')}")

# Applying the inverted mask
print(f"Data for everyone except 'Bob':\n{data[~(names == 'Bob')]}")
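
When the same condition is needed both as-is and inverted, it can help to store it in a variable first. This is a small sketch of that pattern; the variable name `cond` is an assumption for the example.

```python
import numpy as np

names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Joe', 'Will', 'Joe'])
data = np.array([[4, 7], [0, 2], [-5, 6], [0, 0], [1, 2], [-12, -4], [3, 4]])

# Store the condition once, then reuse it
cond = names == 'Bob'

# ~cond flips every element of the boolean array
print(data[~cond])

# On a boolean array, ~ and != give the same mask
print(np.array_equal(~cond, names != 'Bob'))   # True
```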

## 1.7.3 Combining Boolean Conditions

To select two of the three names, combine multiple boolean conditions using the boolean arithmetic operators `&` (and) and `|` (or).

In [None]:
mask = (names == 'Bob') | (names == 'Will')
print(f"Combined mask for 'Bob' or 'Will': {mask}")
print(f"Data for 'Bob' or 'Will':\n{data[mask]}")
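
The `&` operator works the same way. As a sketch, the example below selects the rows for `'Joe'` whose first column is positive; the variable `joe_positive` is an illustrative name. Each comparison is wrapped in parentheses because `&` and `|` bind more tightly than `==` and `>`.

```python
import numpy as np

names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Joe', 'Will', 'Joe'])
data = np.array([[4, 7], [0, 2], [-5, 6], [0, 0], [1, 2], [-12, -4], [3, 4]])

# Both conditions must hold: name is 'Joe' AND first column > 0
joe_positive = (names == 'Joe') & (data[:, 0] > 0)

print(joe_positive)         # [False False False False  True False  True]
print(data[joe_positive])   # rows [1 2] and [3 4]
```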

> **Note**: Selecting data with a boolean array **always creates a copy** of the data, even if the returned array is unchanged. The Python keywords `and` and `or` do not work with boolean arrays; use `&` and `|` instead.
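
Both points in the note can be verified with a short experiment. This is a sketch rather than a definitive test; `bob_rows` is a name introduced only for this example.

```python
import numpy as np

names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Joe', 'Will', 'Joe'])
data = np.array([[4, 7], [0, 2], [-5, 6], [0, 0], [1, 2], [-12, -4], [3, 4]])

bob_rows = data[names == 'Bob']

# The selection does not share memory with data ...
print(np.shares_memory(bob_rows, data))   # False

# ... so modifying it leaves the original array untouched
bob_rows[:] = 99
print(data[0])                            # [4 7]

# `or` asks for the truth value of a whole array, which is ambiguous
try:
    (names == 'Bob') or (names == 'Will')
except ValueError as exc:
    print(f"ValueError: {exc}")
```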


## 1.7.4 Setting Values with Boolean Indexing

Setting values with boolean arrays works by assigning the value or values on the right-hand side to the locations where the boolean array is `True`.

In [None]:
# Set all negative values in data to 0
print(f"Original data:\n{data}")
data[data < 0] = 0
print(f"Data after setting negative values to 0:\n{data}")

In [None]:
# Set all rows for 'Joe' to 7
data[names == 'Joe'] = 7
print(f"Data after setting 'Joe' rows to 7:\n{data}")
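
The right-hand side does not have to be a single number. As a sketch, the example below works on a copy of the original data (named `scratch` here, an illustrative name) and assigns a row-shaped value and a single column.

```python
import numpy as np

names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Joe', 'Will', 'Joe'])
data = np.array([[4, 7], [0, 2], [-5, 6], [0, 0], [1, 2], [-12, -4], [3, 4]])

# Work on a copy so the original data stays available for comparison
scratch = data.copy()

# A row-shaped right-hand side is broadcast to every selected row
scratch[names == 'Bob'] = [10, 20]

# Combining the mask with a column index sets only that column
scratch[names == 'Will', 0] = -1

print(scratch)
```

Unlike selecting with a mask, assigning through a mask modifies the indexed array in place, which is why the copy is made first.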
