# Searching and Filtering Arrays

This lesson covers **searching** and **filtering** NumPy arrays, two operations that come up frequently in everyday NumPy code.

In [1]:
import numpy as np

## Searching arrays

Besides looking up *one* specific value, NumPy lets us search an array for every element that satisfies a *condition*, using `numpy.where()`. For instance, we can find the indices of all even elements of an array.

`numpy.where()` has the following syntax:

```python
np.where(condition)
```

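Called with only a condition, the function returns a **tuple of index arrays**, one per dimension of the input. For a 1-D array, the tuple holds a single array containing the indices of the elements that match the condition, which is why the outputs below are wrapped in parentheses.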

In [2]:
arr = np.array(range(10))
print(np.where(arr % 2 == 0))
print(np.where(arr > 5))

(array([0, 2, 4, 6, 8], dtype=int64),)
(array([6, 7, 8, 9], dtype=int64),)


In [3]:
arr = np.array([1, 2, 3, 3, 4, 3])
print(np.where(arr == 3))

(array([2, 3, 5], dtype=int64),)
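Because `np.where(condition)` returns a tuple, it is often convenient to unpack it. The sketch below shows this for the 1-D array above and for a small 2-D array; `grid` is an illustrative array that does not appear elsewhere in this lesson.

```python
import numpy as np

arr = np.array([1, 2, 3, 3, 4, 3])

# np.where(condition) returns a tuple with one index array per dimension.
result = np.where(arr == 3)
indices = result[0]  # the single index array for a 1-D input
print(len(result))   # 1
print(indices)       # [2 3 5]

# For a 2-D array, the tuple holds row indices and column indices.
grid = np.array([[1, 8, 3],
                 [9, 2, 7]])  # illustrative data
rows, cols = np.where(grid > 5)
print(rows)  # [0 1 1]
print(cols)  # [1 0 2]
```

Each `(rows[i], cols[i])` pair is the position of one matching element, so `grid[rows, cols]` gives back the matching values themselves.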


## Filtering arrays

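Filtering is the process of creating a new array out of *some* elements of an existing array. For instance, if we have an array of ages, we can use filtering to create a new array containing only the ages that are 18 or above. We do this by building a boolean *mask* (an array of `True`/`False` values, one per element) and then indexing the original array with it, which keeps only the elements where the mask is `True`.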

In [4]:
ages = np.array([7, 9, 11, 13, 14, 19, 22, 24, 26, 29, 30, 32])
print(ages)

[ 7  9 11 13 14 19 22 24 26 29 30 32]


In [5]:
mask = ages >= 18
print(mask)

[False False False False False  True  True  True  True  True  True  True]


In [6]:
print(ages[mask])

[19 22 24 26 29 30 32]


Alternatively, we can build the mask and apply it in a single line, as shown below.

In [7]:
print(ages[ages >= 18])

[19 22 24 26 29 30 32]
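Searching and filtering are closely related: the indices returned by `np.where()` select the same elements as the boolean mask. The following sketch checks this and also shows that the filtered result is a copy, so modifying it does not change the original array. The variable names other than `ages` and `mask` are illustrative.

```python
import numpy as np

ages = np.array([7, 9, 11, 13, 14, 19, 22, 24, 26, 29, 30, 32])

mask = ages >= 18
adults_from_mask = ages[mask]

# Indexing with the index array from np.where selects the same elements.
adult_indices = np.where(mask)[0]
adults_from_indices = ages[adult_indices]
print(np.array_equal(adults_from_mask, adults_from_indices))  # True

# The filtered result is a copy, so changing it leaves `ages` untouched.
adults_from_mask[0] = 0
print(ages[5])  # 19
```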


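We can also combine conditions. Python's `and` and `or` keywords do not work element-wise on arrays, so NumPy uses the `&` (and) and `|` (or) operators instead. Note that the parentheses around each condition are required: `&` and `|` bind more tightly than comparison operators such as `>=` and `<`, so without parentheses the expression is grouped incorrectly.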

In [8]:
print(ages[(ages >= 18) & (ages < 26)])
print(ages[(ages < 18) | (ages >= 26)])

[19 22 24]
[ 7  9 11 13 14 26 29 30 32]
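To see why the parentheses matter, the sketch below drops them. Python then evaluates `18 & ages` first and treats the rest as a chained comparison, which requires a single `True`/`False` value from a whole array and raises a `ValueError`. The `raised` flag is only there for illustration.

```python
import numpy as np

ages = np.array([7, 9, 11, 13, 14, 19, 22, 24, 26, 29, 30, 32])

raised = False
try:
    # Parsed as: ages >= (18 & ages) < 26, a chained comparison on arrays.
    ages[ages >= 18 & ages < 26]
except ValueError as err:
    raised = True
    print("ValueError:", err)

# With parentheses, each comparison is evaluated before `&` combines them.
print(ages[(ages >= 18) & (ages < 26)])  # [19 22 24]
```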


`~` is the element-wise `not` operator in NumPy: applied to a boolean array, it flips every `True` to `False` and vice versa.

In [9]:
print(ages[(ages >= 18) & ~(ages >= 26)])

[19 22 24]
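NumPy also provides function equivalents of these operators: `np.logical_and()`, `np.logical_or()`, and `np.logical_not()`. As a minimal sketch, the filter above can be written with them instead, which avoids the precedence issue since each condition is passed as a separate argument.

```python
import numpy as np

ages = np.array([7, 9, 11, 13, 14, 19, 22, 24, 26, 29, 30, 32])

# Operator form, as used in the lesson.
with_operators = ages[(ages >= 18) & ~(ages >= 26)]

# Function form: the same element-wise logic spelled out by name.
with_functions = ages[np.logical_and(ages >= 18, np.logical_not(ages >= 26))]

print(with_functions)                                  # [19 22 24]
print(np.array_equal(with_operators, with_functions))  # True
```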


## `numpy.any()` and `numpy.all()`

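The `numpy.any()` and `numpy.all()` functions determine whether **any** or **all** of the elements in an array evaluate to `True`. Both accept an optional `axis` parameter: with `axis=0` the test is applied down each column, and with `axis=1` it is applied across each row. Without `axis`, the whole array is reduced to a single boolean.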

In [10]:
scores = np.array([
    [93, 91, 94, 84, 86, 97, 83, 80],
    [90, 78, 98, 82, 90, 93, 90, 76]
])
print(scores)

a = scores >= 90
print(a)

[[93 91 94 84 86 97 83 80]
 [90 78 98 82 90 93 90 76]]
[[ True  True  True False False  True False False]
 [ True False  True False  True  True  True False]]


In [11]:
print(np.any(a))
print(np.any(a, axis=0))
print(np.any(a, axis=1))

True
[ True  True  True False  True  True  True False]
[ True  True]


In [12]:
print(np.all(a))
print(np.all(a, axis=0))
print(np.all(a, axis=1))

False
[ True False  True False False  True False False]
[False False]
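The results of `any()` and `all()` are themselves boolean arrays, so they can be used as masks for filtering. The sketch below uses the method form (`a.any()`, `a.all()`), which behaves like the functions above, and then keeps only the columns of `scores` in which every value is at least 90. The name `all_high` is illustrative.

```python
import numpy as np

scores = np.array([
    [93, 91, 94, 84, 86, 97, 83, 80],
    [90, 78, 98, 82, 90, 93, 90, 76]
])
a = scores >= 90

# Method form: equivalent to np.any(a, axis=1).
print(a.any(axis=1))  # [ True  True]

# Use the per-column result as a mask to keep columns where all values are >= 90.
all_high = a.all(axis=0)
print(scores[:, all_high])
# [[93 94 97]
#  [90 98 93]]
```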


## Summary

Today, we learned about

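1. Searching through arrays with conditions using `numpy.where()`
2. Filtering arrays using boolean masks and combined conditions (`&`, `|`, `~`)
3. Checking whether any or all elements satisfy a condition using `numpy.any()` and `numpy.all()`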