
# Isin, Any, and All

### Introduction

Now that we know how to filter by one column, let's see how we can write more advanced queries with numpy.  We'll do so using functions like `np.isin`, `np.any` and `np.all`, and we'll see how to use more complex conditionals.

### Checking for Multiple Values

Let's start by creating a grid of data.  We'll initialize an array of numbers 1 through 25, and then reshape this into a 5x5 array.

In [0]:
import numpy as np
increasing_grid = np.arange(1, 26).reshape(5, 5)
increasing_grid

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20],
       [21, 22, 23, 24, 25]])

We already know how to check whether the items at a specific location equal a given value.

In [0]:
increasing_grid[0, :] == 1

array([ True, False, False, False, False])

For example, above we selected the first row, all columns, and checked if each item in the row equals 1.

Now, what if we want to check whether each item in the row matches 1 or 5?  To do so, we can use the `np.isin` function.

In [0]:
np.isin(increasing_grid[0, :], [1, 5])

array([ True, False, False, False,  True])

So above, we found that the first and last items match a 1 or a 5.

In [0]:
# [ 1,  2,  3,  4,  5]

This also works across rows.  Below, we select all rows where the first item in the row is 1 or 6.

In [0]:
increasing_grid[np.isin(increasing_grid[:, 0], [1, 6])]

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10]])
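To make the behavior of `np.isin` more concrete, here is a small sketch comparing it with equality checks joined by `|`, and showing the function's `invert` argument for selecting the rows that do *not* match. The variable names (`first_column`, `mask_isin`, `mask_or`, `other_rows`) are just illustrative.

```python
import numpy as np

increasing_grid = np.arange(1, 26).reshape(5, 5)
first_column = increasing_grid[:, 0]          # array([ 1,  6, 11, 16, 21])

# np.isin against [1, 6] gives the same mask as or-ing two equality checks.
mask_isin = np.isin(first_column, [1, 6])
mask_or = (first_column == 1) | (first_column == 6)
print(np.array_equal(mask_isin, mask_or))     # True

# invert=True flips the mask: rows whose first item is NOT 1 or 6.
other_rows = increasing_grid[np.isin(first_column, [1, 6], invert=True)]
print(other_rows)
```

With only two values the `|` version is still readable, but `np.isin` scales better as the list of values grows.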

### One Query per Row

Now what if we want to ask whether, across the **entire row**, there is any match?  We may think that we can use `isin` to ask this question.

In [0]:
first_row = increasing_grid[0]
first_row

np.isin(first_row, [10])

array([False, False, False, False, False])

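But this still returns a True or False value for each item.  Really, we want to use `np.any`, which reduces a boolean array to a single value.  Below, we ask whether any item in the row is greater than 10.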

In [0]:
np.any(first_row > 10)

False

So with `np.isin` we get a boolean value for each item in the array.  

With `np.any`, we get back a single boolean value, depending on whether any item in the array is greater than 10.

> With `np.any` we ask a question of the *entire array*.  Only if all of the items in the array return False, does `np.any` return False.  Otherwise `np.any` returns True.

In [0]:
np.any(first_row > 4)

True

Let's take a moment to unpack how `np.any` works.  The comparison first produces a boolean array, and `np.any` then reduces that array to a single value.

In [0]:
first_row

array([1, 2, 3, 4, 5])

In [0]:
first_row > 10

array([False, False, False, False, False])

In [0]:
np.any(first_row > 10)

False

So with `np.any`, if there is any True value in the array, we get a return value of True.
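Since `np.isin` also returns a boolean array, the two functions can be combined to ask whether a row contains a given value at all. The following is a minimal sketch of that idea; the names `contains_10` and `contains_1_or_10` are made up for illustration.

```python
import numpy as np

first_row = np.arange(1, 6)    # array([1, 2, 3, 4, 5])

# np.isin gives one boolean per item; np.any collapses them to one answer.
contains_10 = np.any(np.isin(first_row, [10]))          # no item equals 10
contains_1_or_10 = np.any(np.isin(first_row, [1, 10]))  # the item 1 matches
print(contains_10, contains_1_or_10)   # False True
```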

### Querying Multiple Rows

So we just saw that we can use `np.any` to see if any item in the row returns True for a query.  Now let's try to use `np.any` to perform this query for each row.

In [0]:
import numpy as np
increasing_grid = np.arange(1, 26).reshape(5, 5)
increasing_grid

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20],
       [21, 22, 23, 24, 25]])

Our first attempt is to use `np.any` on the entire array. 

In [0]:
np.any(increasing_grid > 10)

True

But this asks if *any* item in the entire array is greater than 10.

So instead we must provide `axis = 1`, which applies `np.any` across the columns of each row and returns one value per row.

In [0]:
np.any(increasing_grid > 10, axis = 1)

array([False, False,  True,  True,  True])

We can then use boolean indexing to select the rows where any item is greater than 10.

In [0]:
increasing_grid[np.any(increasing_grid > 10, axis = 1)]

array([[11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20],
       [21, 22, 23, 24, 25]])
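To see what the `axis` argument is doing, it can help to compare `axis=1` with `axis=0` on the same boolean array. This is a small sketch; `mask`, `per_row` and `per_column` are illustrative names.

```python
import numpy as np

increasing_grid = np.arange(1, 26).reshape(5, 5)
mask = increasing_grid > 10

per_row = np.any(mask, axis=1)     # collapse the columns: one answer per row
per_column = np.any(mask, axis=0)  # collapse the rows: one answer per column
print(per_row)      # [False False  True  True  True]
print(per_column)   # [ True  True  True  True  True]
```

Every column contains at least one number greater than 10, so the `axis=0` result is True throughout, while only the last three rows do.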

> The `np.all` function works similarly, but only returns True **if every item** meets the criterion.

In [0]:
increasing_grid[np.all(increasing_grid < 10, axis = 1)]

array([[1, 2, 3, 4, 5]])
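One way to relate the two functions: asking whether every item is below 10 is the same as asking whether no item is 10 or more. The sketch below checks this on our grid, using `~` to negate the boolean result; the variable names are illustrative.

```python
import numpy as np

increasing_grid = np.arange(1, 26).reshape(5, 5)

# "Every item in the row is below 10" ...
all_below_10 = np.all(increasing_grid < 10, axis=1)
# ... is the same as "no item in the row is 10 or more".
none_at_least_10 = ~np.any(increasing_grid >= 10, axis=1)

print(all_below_10)                                     # [ True False False False False]
print(np.array_equal(all_below_10, none_at_least_10))   # True
```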

### Summary

In this lesson, we saw how we can use numpy for more advanced queries.  We saw how we can use `np.isin` to check if an item in the array matches one of multiple values, and how to use functions like `np.any` or `np.all` to see if any or all items in an array are True.