# Agenda: Mask indexes

- Comparisons
- Broadcasts and comparisons
- Using that to filter our series with a "boolean index" or a "mask index"
- Complex comparisons with "and" and "or"

In [1]:
import numpy as np
import pandas as pd
from pandas import Series, DataFrame

In [2]:
s = Series([10, 20, 30, 40, 50, 60, 70],
           index=list('abcdefg'))

In [3]:
s

a    10
b    20
c    30
d    40
e    50
f    60
g    70
dtype: int64

In [4]:
# I can retrieve any element of the series with either .loc (based on the index) or .iloc (based on the position)

In [5]:
s.loc['d']

40

In [6]:
s.iloc[4]

50

In [7]:
# Inside of the [], I can put a list of locations that I want to retrieve
# this is known as "fancy indexing"

s.loc[['a', 'd']]

a    10
d    40
dtype: int64

In [8]:
s.iloc[[2, 5]]

c    30
f    60
dtype: int64
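
As a small sketch of the list-based retrieval shown above: the result follows the order of the list you pass, and a label may appear more than once. The series is rebuilt here so the snippet runs on its own; the names `by_label` and `by_position` are only for illustration.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40, 50, 60, 70], index=list('abcdefg'))

# The result follows the order of the list, and repeats are allowed
by_label = s.loc[['d', 'a', 'a']]
print(by_label.tolist())          # [40, 10, 10]

# .iloc does the same thing, but with positions instead of labels
by_position = s.iloc[[5, 2]]
print(list(by_position.index))    # ['f', 'c']
```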

In [9]:
# there is another way that we can retrieve values, though
# we can pass a list of boolean values (True and False)

s.loc[ [True, False, False, True, True, False, True] ]

a    10
d    40
e    50
g    70
dtype: int64

# Boolean/mask index

The idea here is:
- Pass, inside of `[]`, a list of booleans, one for each element of the series
- Wherever there is a True value, we get the value from the original series
- Wherever there is a False value, the original value is ignored
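
The rules above can be sketched in plain Python, to show that nothing magical is happening. This is only an illustration of the idea, not how pandas implements it internally; the names `kept_by_hand` and `kept_by_pandas` are made up for the example.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40, 50, 60, 70], index=list('abcdefg'))
mask = [True, False, False, True, True, False, True]

# By hand: keep each value whose matching mask entry is True
kept_by_hand = [value for value, keep in zip(s.tolist(), mask) if keep]

# The same selection, using the list as a mask index
kept_by_pandas = s.loc[mask]

print(kept_by_hand)                  # [10, 40, 50, 70]
print(list(kept_by_pandas.index))    # ['a', 'd', 'e', 'g']
```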

This is used all of the time, but you will almost never actually be typing True and False into square brackets.

That's because we can ask pandas to create the list of booleans for us automatically.

How? With comparison operators.

In [10]:
s > 30    # this is a comparison operation, broadcast across all values of s

a    False
b    False
c    False
d     True
e     True
f     True
g     True
dtype: bool
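
A short sketch of what the comparison gives us back: a boolean series with the same index as `s`. Because `True` counts as 1, summing the result counts the matches. The other comparison operators broadcast the same way. The variable name `mask` is just a convenient label for this example.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40, 50, 60, 70], index=list('abcdefg'))

mask = s > 30                        # boolean series, same index as s

print(mask.dtype)                    # bool
print(mask.index.equals(s.index))    # True
print(mask.sum())                    # 4, because each True counts as 1

# Other comparison operators are broadcast in the same way
print((s == 30).sum())               # 1
print((s <= 30).sum())               # 3
```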

In [12]:
# I can take this boolean series and use it as a mask index with .loc

s.loc[s > 30]   # only one pair of [] here, because s > 30 already gives us a boolean series

# say this as: show the values of s where s > 30

d    40
e    50
f    60
g    70
dtype: int64
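
A sketch of two natural follow-ups, under the assumption that they match where the agenda is heading: the mask can be stored in a variable before it is used, and two comparisons can be combined with `&` ("and") or `|` ("or"). Each comparison needs its own parentheses, because `&` and `|` bind more tightly than `>` and `<`. The names `mask`, `middle` and `edges` are only for illustration.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40, 50, 60, 70], index=list('abcdefg'))

# Storing the mask first gives the same result as s.loc[s > 30]
mask = s > 30
print(s.loc[mask].tolist())    # [40, 50, 60, 70]

# "and": both conditions must be True
middle = s.loc[(s > 20) & (s < 60)]
print(middle.tolist())         # [30, 40, 50]

# "or": at least one condition must be True
edges = s.loc[(s < 20) | (s > 60)]
print(edges.tolist())          # [10, 70]
```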