### Filtering Data

Suppose you work as a meteorologist at a weather service. Your task is to maintain hourly rainfall data in millimeters (mm) in an array like this one:
```python
rain_mm_by_hour = np.array([0, 0, 0, 0, 0, 0, 0, 4, 2, 2, 4, 4, 0, 0, 0, 6, 0, 3, 6, 0, 0, 0, 3, 0])
```

You want to find the number of hours with heavy rainfall (i.e., rainfall >= 4 mm). One way to do that is to use a for loop with an if statement (similar to how it is done for lists). But you can do this more efficiently using a filtering operation.
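For comparison, the loop-based approach mentioned above might look like the following minimal sketch. The names `threshold_mm` and `heavy_hours` are illustrative and do not appear in the original.

```python
import numpy as np

rain_mm_by_hour = np.array([0, 0, 0, 0, 0, 0, 0, 4, 2, 2, 4, 4,
                            0, 0, 0, 6, 0, 3, 6, 0, 0, 0, 3, 0])

threshold_mm = 4  # assumed cutoff for "heavy" rainfall, as defined above

# Loop-based count: visit every hour and test it individually.
heavy_hours = 0
for mm in rain_mm_by_hour:
    if mm >= threshold_mm:
        heavy_hours += 1

print(heavy_hours)  # 5
```

The loop works, but it tests one element at a time in Python, whereas a filtering operation applies the comparison to the whole array at once.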


In [1]:
import numpy as np

#### Index Arrays

An index array filters an array by position: you index the original array with a second array of integer indexes, and each index is replaced by the value at that position in the original array. The result is a copy, not a view.

In [2]:
arr1 = np.array([2, 4, 6, 8, 10])
arr1

array([ 2,  4,  6,  8, 10])

In [3]:
arr2 = arr1[np.array([1, 2, 4])]
arr2

array([ 4,  6, 10])
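As a small illustration of the copy behavior described above, the sketch below indexes with repeated, out-of-order indexes and then modifies the result. The name `picked` is hypothetical.

```python
import numpy as np

arr1 = np.array([2, 4, 6, 8, 10])

# Indexes may repeat and appear in any order; the result follows the index array.
picked = arr1[np.array([4, 0, 0])]
print(picked)  # [10  2  2]

# The result is a copy: changing it leaves arr1 untouched.
picked[0] = -1
print(arr1)  # [ 2  4  6  8 10]
print(np.shares_memory(arr1, picked))  # False
```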

#### Boolean Arrays

Another way to filter data is to index an array with a second array of Booleans of the same length. An element of the original array is returned only if the corresponding Boolean is True. The result is a copy.

In [4]:
arr1 = np.array([2, 4, 6, 8, 10])
arr1

array([ 2,  4,  6,  8, 10])

In [5]:
arr2 = arr1[np.array([True, False, False, True, True])]
arr2

array([ 2,  8, 10])
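One way to relate Boolean arrays to index arrays is sketched below: `np.flatnonzero` returns the positions where a mask is True, and indexing with those positions selects the same elements. The names `mask`, `selected`, `positions`, and `same` are illustrative.

```python
import numpy as np

arr1 = np.array([2, 4, 6, 8, 10])
mask = np.array([True, False, False, True, True])

# The mask must have the same length as the array it filters.
selected = arr1[mask]

# np.flatnonzero returns the positions where the mask is True,
# so indexing with those positions gives the same result.
positions = np.flatnonzero(mask)
same = arr1[positions]

print(positions)  # [0 3 4]
print(np.array_equal(selected, same))  # True
```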

#### Conditional Filtering

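Another way to filter data is to index an array with a condition on its values. The condition evaluates to a Boolean array, and only the elements that meet it are included. The result is a copy.

Conditions can be combined with the following element-wise operators, with each condition wrapped in parentheses: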

- `&` - and
- `|` - or

In [6]:
arr1 = np.arange(1, 11, 2)
arr1

array([1, 3, 5, 7, 9])

In [9]:
arr2 = arr1[arr1 % 3 == 0]
arr2

array([3, 9])

In [10]:
arr3 = arr1[(arr1 % 3 == 0) | (arr1 % 5 == 0)]
arr3

array([3, 5, 9])
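The `|` example above has an `&` counterpart, sketched here under the same setup. The names `between` and `not_div_by_3` are hypothetical, and the `~` operator is an addition not covered in the list above.

```python
import numpy as np

arr1 = np.arange(1, 11, 2)  # array([1, 3, 5, 7, 9])

# `&` keeps elements that satisfy both conditions.
# Each comparison needs its own parentheses because `&` and `|`
# bind more tightly than `==`, `<`, and `>`.
between = arr1[(arr1 > 2) & (arr1 < 8)]
print(between)  # [3 5 7]

# `~` inverts a Boolean array element by element.
not_div_by_3 = arr1[~(arr1 % 3 == 0)]
print(not_div_by_3)  # [1 5 7]
```

Python's `and` and `or` keywords do not work element-wise on arrays, which is why the `&` and `|` operators are used instead.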

### Did I forget my umbrella again?

You are asked to report the number of hours with heavy rainfall (i.e., rainfall of at least 4 mm).

In [11]:
rain_mm_by_hour = np.array([0, 0, 0, 0, 0, 0, 0, 4, 2, 2, 4, 4, 0, 0, 0, 6, 0, 3, 6, 0, 0, 0, 3, 0])

In [16]:
heavy_rain_hours = rain_mm_by_hour[rain_mm_by_hour >= 4].size

In [17]:
heavy_rain_hours

5

In [18]:
rain_mm_by_hour > 4

array([False, False, False, False, False, False, False, False, False,
       False, False, False, False, False, False,  True, False, False,
        True, False, False, False, False, False])
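Because a Boolean array like the one above can be counted directly, the filtered copy is not strictly needed to get the number of heavy-rain hours. The following is a minimal sketch of that alternative; the name `heavy_mask` is illustrative.

```python
import numpy as np

rain_mm_by_hour = np.array([0, 0, 0, 0, 0, 0, 0, 4, 2, 2, 4, 4,
                            0, 0, 0, 6, 0, 3, 6, 0, 0, 0, 3, 0])

# Boolean mask: True for every hour with at least 4 mm of rain.
heavy_mask = rain_mm_by_hour >= 4

# True counts as 1 when summed, so both lines count the True entries.
print(heavy_mask.sum())              # 5
print(np.count_nonzero(heavy_mask))  # 5

# The hours (array indexes) at which heavy rain occurred.
print(np.flatnonzero(heavy_mask))    # [ 7 10 11 15 18]
```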

### Let's Call it Even.

You want to know how much it rained during the even-numbered hours (i.e., the hours at even indexes in the array).


In [19]:
rain_mm_by_hour[::2].sum()

15

In [20]:
rain_mm_by_hour[0::2].sum()

15

In [21]:
rain_mm_by_hour[np.arange(0, len(rain_mm_by_hour), 2)].sum()

15
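The same total can also be reached with a Boolean mask built from the indexes, which ties this exercise back to conditional filtering. The sketch below is one possible variant; the names `hours`, `even_mask`, `even_total`, and `odd_total` are hypothetical, and the odd-hour total is included only as a consistency check.

```python
import numpy as np

rain_mm_by_hour = np.array([0, 0, 0, 0, 0, 0, 0, 4, 2, 2, 4, 4,
                            0, 0, 0, 6, 0, 3, 6, 0, 0, 0, 3, 0])

# Build the hour indexes, then mark the even ones with a condition.
hours = np.arange(rain_mm_by_hour.size)
even_mask = hours % 2 == 0

even_total = rain_mm_by_hour[even_mask].sum()
odd_total = rain_mm_by_hour[1::2].sum()  # slice starting at index 1

print(even_total)  # 15
print(odd_total)   # 19

# Even and odd hours together cover the whole day.
print(even_total + odd_total == rain_mm_by_hour.sum())  # True
```

Note that the slice `[::2]` returns a view, while the index-array and Boolean-mask versions return copies; for a one-off sum the difference does not matter.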