# Agenda: Mask Index

- Comparisons
- Broadcasts and comparisons
- Using that to filter our series with a "boolean index" or a "mask index"
- Complex comparisons with "and" and "or"

In [1]:
import numpy as np
import pandas as pd
from pandas import Series, DataFrame

In [7]:
s = Series([10,20,30,40,50,60,70], index=list('abcdefg'))

In [9]:
s.loc['d']

40

In [10]:
s.iloc[4]

50

In [11]:
#inside of the [], I can put a list of locations that I want to retrieve

s.loc[['a','d']]

a    10
d    40
dtype: int64

In [12]:
s.iloc[[2,5]]

c    30
f    60
dtype: int64

In [13]:
#there is another way to retrieve values: we can pass a list of boolean values

s.loc[[True,False,False,True,True,False,True]]

a    10
d    40
e    50
g    70
dtype: int64

# Boolean/mask index

The idea here is:
- Pass, inside of [], a list of booleans, one for each element of the series
- Wherever there is a True value, we get the value from the original series
- Wherever there is a False value, the original value is ignored
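
To make the True/False rule concrete, here is a rough plain-Python equivalent of what the mask does. It is only a sketch, assuming the same seven-element series as above; the names `mask`, `kept`, and `result` are illustrative and do not come from the notebook.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40, 50, 60, 70], index=list('abcdefg'))
mask = [True, False, False, True, True, False, True]

# Pair each (label, value) with its mask entry and keep only the True ones
kept = {label: value for (label, value), keep in zip(s.items(), mask) if keep}

# The mask index should select exactly the same labels and values
result = s.loc[mask]
```

The mask is matched to the series by position here, which is why the list needs one boolean per element.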

In [14]:
s > 30 #This is a comparison operation, broadcast across all values of s

a    False
b    False
c    False
d     True
e     True
f     True
g     True
dtype: bool

In [16]:
# I can take this boolean and use it as a mask index with .loc

s.loc[s>30] #only one pair of [] here, because s>30 already gives us a series back

#say this as: show the values of s where s>30

d    40
e    50
f    60
g    70
dtype: int64

# How to read a mask index expression

- First, look at the stuff inside of the []. What expression is there, and what does it return?
- Next, think of it as an existing boolean series
- Then apply that boolean series to the series on the outside
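
The three reading steps above can be sketched as separate statements. This assumes the same series `s` as in the earlier cells; the intermediate names `mask`, `n_true`, and `filtered` are made up for illustration.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40, 50, 60, 70], index=list('abcdefg'))

# Step 1: evaluate the expression inside the [] on its own
mask = s > 30            # a boolean series with the same index as s

# Step 2: treat it as an ordinary boolean series
n_true = mask.sum()      # True counts as 1, so this is how many values will be kept

# Step 3: apply the boolean series to the series on the outside
filtered = s.loc[mask]   # same result as s.loc[s > 30]
```

Naming the mask first is purely a readability choice; writing `s.loc[s > 30]` in one line does the same thing.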

In [18]:
#Let's find all of the values that are greater than the mean

s>s.mean()

a    False
b    False
c    False
d    False
e     True
f     True
g     True
dtype: bool

In [19]:
#this gives us all of the values where the value is greater than the mean of the series

s.loc[s>s.mean()]

e    50
f    60
g    70
dtype: int64

In [21]:
#this does not work with integers: .loc treats 1 and 0 as index labels, not as True/False

s.loc[[1,0,1,0,1,1,0]]

KeyError: "None of [Index([1, 0, 1, 0, 1, 1, 0], dtype='int64')] are in the [index]"
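
If the flags arrive as 1s and 0s, one possible workaround is to convert them to real booleans before indexing. This is a minimal sketch, not something the notebook itself does; `flags`, `mask`, and `selected` are hypothetical names.

```python
import numpy as np
import pandas as pd

s = pd.Series([10, 20, 30, 40, 50, 60, 70], index=list('abcdefg'))

# Integer flags: 1 means "keep", 0 means "skip"
flags = np.array([1, 0, 1, 0, 1, 1, 0])

# Convert to booleans so that .loc sees a mask rather than a list of labels
mask = flags.astype(bool)

selected = s.loc[mask]
```

With this conversion, `selected` holds the values at labels a, c, e, and f.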
