In [39]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn

# Comparisons, Masks, and Boolean Logic

Python's comparison operators can be used with NumPy arrays to create boolean arrays, or masks. These masks are useful for filtering data.

Furthermore, Python's bitwise operators (`&`, `|`, `^`, `~`) act as element-wise boolean operators on masks, which makes it possible to build more complex filtering conditions.

In [40]:
rainfall = pd.read_csv('../data/Seattle2014.csv')['PRCP'].values

In [41]:
inch_rainfall = rainfall / 254 # 1/10th of a millimeter --> inches
print(inch_rainfall.shape)

(365,)


In [42]:
# Comparison Operators for Boolean Arrays
# Example: Element-Wise comparison w/ comparison operators
x = np.array([1, 2, 3, 4, 5])
print(x < 3) # --> the scalar 3 is broadcast across all 5 elements of x, then compared element-wise

# Example: Comparison Between Arrays
print(x == x)

[ True  True False False False]
[ True  True  True  True  True]
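
As a small sketch of the broadcasting involved, the comparison below sets a 2-D array against a 1-D array of per-column thresholds. The names `grid` and `thresholds` are illustrative and not part of the original notebook.

```python
import numpy as np

# Hypothetical data: a 2x3 grid and one threshold per column
grid = np.array([[1, 2, 3],
                 [4, 5, 6]])
thresholds = np.array([2, 2, 5])

# `thresholds` (shape (3,)) is broadcast across both rows of `grid` (shape (2, 3))
above = grid > thresholds

print(above.dtype, above.shape)
print(above)
```

The result is a boolean array with the broadcast shape, so it can be used as a mask on `grid` directly.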


In [43]:
# Operations on Boolean Arrays
y = np.array([
    [5, 0, 3, 3],
    [7, 9, 3, 5],
    [2, 4, 7, 6]
])

# Example: Finding the number of elements fulfilling a specific condition
print(np.count_nonzero(y < 6)) # True = 1; False = 0

# Example: Any/All condition fulfillment
print(np.any(y == 0)) # `True` if any element in array `y == 0` is True
print(np.all(y > 0)) # `True` if all elements in array `y > 0` are True

8
True
False
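
The same reductions can also be applied along a single axis. The sketch below reuses the `y` array from the cell above and relies on `True` counting as 1 when summed; the result variable names are illustrative.

```python
import numpy as np

y = np.array([
    [5, 0, 3, 3],
    [7, 9, 3, 5],
    [2, 4, 7, 6]
])

# Count of values below 6 in each row (True sums as 1)
below_six_per_row = np.sum(y < 6, axis=1)

# Does each column contain a zero?
zero_in_column = np.any(y == 0, axis=0)

# Is every value in each row positive?
row_all_positive = np.all(y > 0, axis=1)

print(below_six_per_row)
print(zero_in_column)
print(row_all_positive)
```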


### Masks as a means of selecting subsets of data (boolean indexing)
A powerful feature of boolean arrays is their ability to act as masks for selecting subsets of data. A mask is a boolean array that has the same shape as the data array. When you index the data array with the mask, only the elements at positions where the mask is `True` are returned.

In [44]:
# Example: Using a mask to select values less than 5
x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
mask = x < 5
print(f'Original array: {x}')
print(f'Mask: {mask}')
print(f'Masked array: {x[mask]}')

Original array: [ 1  2  3  4  5  6  7  8  9 10]
Mask: [ True  True  True  True False False False False False False]
Masked array: [1 2 3 4]
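
Masking works the same way on multi-dimensional arrays, with one detail worth seeing: the selected values come back as a one-dimensional array, in row-major order. A minimal sketch, using a small 2-D array and illustrative names (`table`, `table_mask`, `selected`):

```python
import numpy as np

table = np.array([
    [5, 0, 3, 3],
    [7, 9, 3, 5],
    [2, 4, 7, 6]
])

# The mask has the same shape as the data: (3, 4)
table_mask = table < 6

# Indexing with the mask flattens the selection into a 1-D array
selected = table[table_mask]

print(table_mask.shape)
print(selected)
print(selected.shape)
```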


### Combining Masks with Boolean Operators
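You can combine multiple masks using boolean operators like `&` (and) and `|` (or) to create more complex selections. These Python bitwise operators work element by element. Because they bind more tightly than comparison operators, wrap each comparison in parentheses when combining them inline, as in `(x > 3) & (x < 8)`.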

Notably, the `and` and `or` keywords do not work the same way. Python uses `and`/`or` to evaluate the truthiness of a **whole** object, and for a NumPy array with more than one element that is ambiguous:
- Should the array be `True` because *some* elements are `True`?
- Or should it be `False` because *not all* elements are `True`?

Rather than guess, NumPy raises a `ValueError`.
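
A minimal sketch of the difference, using an illustrative array named `values`. The `and` keyword triggers the error, while `&` combines the two masks element by element:

```python
import numpy as np

values = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# `and` asks for the truth value of a whole array, which NumPy refuses to guess
raised = False
try:
    (values > 3) and (values < 8)
except ValueError as exc:
    raised = True
    print(f'ValueError: {exc}')

# `&` compares the two masks element by element
combined = (values > 3) & (values < 8)
print(values[combined])
```

`np.logical_and` gives the same element-wise result as `&` on boolean arrays, for readers who prefer a named function.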

In [45]:
# Example: Combining masks to select values between 3 and 8
mask1 = x > 3
mask2 = x < 8
print(f'Mask 1 (x > 3): {mask1}')
print(f'Mask 2 (x < 8): {mask2}')
print(f'Combined mask ((x > 3) & (x < 8)): {mask1 & mask2}')
print(f'Masked array: {x[mask1 & mask2]}')

Mask 1 (x > 3): [False False False  True  True  True  True  True  True  True]
Mask 2 (x < 8): [ True  True  True  True  True  True  True False False False]
Combined mask ((x > 3) & (x < 8)): [False False False  True  True  True  True False False False]
Masked array: [4 5 6 7]


### Using Masks to Modify Values
You can also use masks to modify values in an array. When you assign a value to a mask-indexed selection such as `x[mask]`, the value is written to every element of the original array where the mask is `True`.

In [46]:
# Example: Modifying values using a mask
x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print(f'Original array: {x}')
x[x > 5] = 0
print(f'Modified array: {x}')

Original array: [ 1  2  3  4  5  6  7  8  9 10]
Modified array: [1 2 3 4 5 0 0 0 0 0]
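
Assignment through a mask changes the array in place. When the original should stay untouched, one option is `np.where`, which builds a new array instead. The sketch below also shows that a mask-indexed selection on the right-hand side is a copy, so editing it does not affect the source. The names `zeroed` and `large` are illustrative.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# np.where builds a new array: 0 where the condition holds, the original value elsewhere
zeroed = np.where(x > 5, 0, x)

# Boolean indexing on the right-hand side returns a copy, so editing it leaves x alone
large = x[x > 5]
large[:] = -1

print(x)
print(zeroed)
print(large)
```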


In [47]:
# Boolean Operators
# Example: Find the number of days with rainfall strictly between 0.5 and 1 inches
print(np.count_nonzero((inch_rainfall > 0.5) & (inch_rainfall < 1)))

# Note: `and` and `or` evaluate the truthiness of an entire object once, while `&` and `|` operate element-wise. With boolean arrays, the element-wise operators are the ones to use.

# Example: Find the number of days with rainfall less than 0.5 inches OR more than 1 inch
print(np.count_nonzero((inch_rainfall < 0.5) | (inch_rainfall > 1)))

# Example: Find the number of days with zero rainfall
print(np.count_nonzero(inch_rainfall == 0))

# Example: Find the number of days with non-zero rainfall
print(np.count_nonzero(inch_rainfall != 0))

29
333
215
150
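
The `~` operator (element-wise NOT) rounds out the set. The sketch below uses a small hypothetical array, `sample`, rather than the Seattle data, to show that negating the "between" mask is not the same as the strict "less than 0.5 OR more than 1" mask: values exactly equal to 0.5 or 1 satisfy neither strict condition.

```python
import numpy as np

# Hypothetical daily rainfall in inches (not the Seattle data)
sample = np.array([0.0, 0.2, 0.5, 0.7, 1.0, 1.3])

between = (sample > 0.5) & (sample < 1)           # strictly inside (0.5, 1)
strictly_outside = (sample < 0.5) | (sample > 1)  # excludes the endpoints
not_between = ~between                            # element-wise NOT, includes the endpoints

# De Morgan: NOT (a AND b) is the same as (NOT a) OR (NOT b)
assert np.array_equal(not_between, (sample <= 0.5) | (sample >= 1))

print(np.count_nonzero(between))
print(np.count_nonzero(strictly_outside))
print(np.count_nonzero(not_between))
```

Here `between` and `not_between` always add up to the array length, while `between` and `strictly_outside` fall short by the number of values that sit exactly on a boundary.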
