## Comparison, Masking and Boolean Logic

This notebook explains how to use NumPy to examine and manipulate array values based on a criterion. It covers the following:

1. How to use comparison operators as element-wise universal functions.
2. How to use boolean arrays with composite conditions.
3. How to use boolean arrays as masks to extract the required values.

In [2]:
import numpy as np

Just as the arithmetic operators work element-wise on arrays, NumPy implements the comparison operators as element-wise universal functions. Each comparison evaluates the condition for every element and returns a boolean array of the same shape.

In [5]:
##defining an array
arr = np.array([10,20,30,40,50])

arr <= 20

array([ True,  True, False, False, False])

In [6]:
arr == 30

array([False, False,  True, False, False])

In [8]:
##np.greater is the ufunc equivalent of the > operator, so this is the same as arr > 30
np.greater(arr, 30)

array([False, False, False,  True,  True])
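
Every comparison operator has a ufunc counterpart, not just `>`. The sketch below checks each operator against its NumPy function on the same five-element array used above. The names `pairs` and `all_match` are illustrative only.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

# Each entry pairs the operator result with the result of its ufunc counterpart
pairs = {
    "==": (arr == 30, np.equal(arr, 30)),
    "!=": (arr != 30, np.not_equal(arr, 30)),
    "<":  (arr < 30,  np.less(arr, 30)),
    "<=": (arr <= 30, np.less_equal(arr, 30)),
    ">":  (arr > 30,  np.greater(arr, 30)),
    ">=": (arr >= 30, np.greater_equal(arr, 30)),
}

# Report whether the operator and the ufunc agree element for element
for op, (by_operator, by_ufunc) in pairs.items():
    print(op, np.array_equal(by_operator, by_ufunc))

all_match = all(np.array_equal(a, b) for a, b in pairs.values())
```

Both forms return an array of dtype `bool`. The operator form is usually easier to read.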

### Boolean Arrays

In [13]:
##creating a 2D array of random integers with values < 10
arr = np.random.randint(10, size=(3,4))
arr

array([[4, 6, 0, 6],
       [0, 5, 0, 2],
       [6, 4, 2, 9]])

In [16]:
##counting number of values that are greater than 5 using np.sum()
np.sum(arr > 5)

4

In [18]:
##counting number of values that are greater than 5 in each row
np.sum(arr > 5, axis=1)

array([2, 0, 2])

In [20]:
##checking whether any (np.any) or all (np.all) values are above a threshold
np.any(arr > 5)

True

In [21]:
np.all(arr > 5)

False
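
Counting with `np.sum()` works because `True` is treated as 1 and `False` as 0 when a boolean array is summed. The following sketch hard-codes the array printed above (so the results are reproducible without the random generator) and also shows `np.count_nonzero`, an alternative that does the same counting, along with the `axis` argument for `np.any`.

```python
import numpy as np

# The same values as the random array shown above, written out explicitly
arr = np.array([[4, 6, 0, 6],
                [0, 5, 0, 2],
                [6, 4, 2, 9]])

mask = arr > 5                             # boolean array, same shape as arr

total = np.count_nonzero(mask)             # same count as np.sum(mask)
per_row = np.count_nonzero(mask, axis=1)   # one count per row
per_col = np.sum(mask, axis=0)             # one count per column
any_per_row = np.any(mask, axis=1)         # does each row contain a value > 5?
all_below_10 = np.all(arr < 10)            # is every value below 10?

print(total)         # 4
print(per_row)       # [2 0 2]
print(per_col)       # [1 1 0 2]
print(any_per_row)   # [ True False  True]
print(all_below_10)  # True
```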

### Boolean Operators to Handle Multiple Conditions

In [33]:
##math test scores of students in a class
test_scores = np.random.randint(100, size=(30))
test_scores


array([68, 35, 91, 47, 21,  5, 57, 37, 85, 81, 72, 20, 87, 72, 47, 80, 43,
       65, 53,  9, 48, 77, 96, 54, 55, 50, 19, 26, 99,  5])

In [34]:
##number of students who scored more than 50 but less than 75

np.sum((test_scores > 50) & (test_scores < 75))

8

In [40]:
##extracting the indices of all scores that are above 90 or below 40
np.where((test_scores > 90) | (test_scores < 40))

(array([ 1,  2,  4,  5,  7, 11, 19, 22, 26, 27, 28, 29]),)
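
Two details in the cells above are easy to miss. Each comparison is wrapped in parentheses because `&` and `|` bind more tightly than the comparison operators, and the Python keywords `and` / `or` cannot be used because they ask for a single truth value for the whole array. The sketch below illustrates this on a short, hard-coded sample (the first ten values of `test_scores` above, stored under the illustrative name `scores`).

```python
import numpy as np

# First ten values of the test_scores array shown above
scores = np.array([68, 35, 91, 47, 21, 5, 57, 37, 85, 81])

# Element-wise AND, written with the operator and with the equivalent ufunc
mid = (scores > 50) & (scores < 75)
mid_ufunc = np.logical_and(scores > 50, scores < 75)

# Element-wise OR, and its element-wise negation with ~
extreme = (scores > 90) | (scores < 40)
not_extreme = ~extreme

# Python's `and` needs one truth value per operand, so it fails on arrays
try:
    (scores > 50) and (scores < 75)
    raised = False
except ValueError:
    raised = True

print(np.sum(mid))              # 2
print(np.flatnonzero(extreme))  # [1 2 4 5 7]
print(raised)                   # True
```

`np.flatnonzero(extreme)` returns the same indices as `np.where(extreme)[0]` for a one-dimensional array.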

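### Masking

A boolean array can be used as a mask: indexing an array with it returns only the elements at positions where the mask is `True`.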

In [45]:
##extracting all the values at positions where the mask is True
mask = test_scores > 90
test_scores[mask]

array([91, 96, 99])

In [54]:
##extracting the scores of average students (above 50 and below 85)
test_scores[(test_scores > 50) & (test_scores < 85)]

array([68, 57, 81, 72, 72, 80, 65, 53, 77, 54, 55])
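
Masks can also be used to change values, not only to extract them. The sketch below is a minimal illustration. It assumes a cap of 80 and a pass mark of 40, both invented for the example, and uses a short hard-coded sample (the first ten values of `test_scores`, stored as `scores`).

```python
import numpy as np

# First ten values of the test_scores array shown above
scores = np.array([68, 35, 91, 47, 21, 5, 57, 37, 85, 81])

# Indexing with a mask returns a copy, so editing it leaves scores untouched
selected = scores[scores > 80]
selected[0] = 0

# Assigning through a mask modifies the array in place (here, a copy of scores)
capped = scores.copy()
capped[capped > 80] = 80        # hypothetical cap of 80

# Three-argument np.where picks a value per element based on the condition
labels = np.where(scores >= 40, "pass", "fail")   # hypothetical pass mark of 40

# A mask can feed straight into an aggregate
passed_mean = scores[scores >= 40].mean()

print(scores[2])     # 91 (unchanged by the edit to selected)
print(capped)        # [68 35 80 47 21  5 57 37 80 80]
print(passed_mean)   # 71.5
```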