## Boolean indexing

Boolean indexing selects rows by applying a "mask" to the DataFrame. The mask holds one `True` or `False` value per row, so it must have the same length as the DataFrame it is applied to. Applying the mask keeps only the rows where the mask is `True`.

This is useful for selecting only the rows that match a certain condition:

In [1]:
import pandas as pd

example_df = pd.DataFrame({"A": [1.1, 2.1, 3.1], "B": [12, 22, 32]},
                          index=['obj1', 'obj2', 'obj3'])  # example DataFrame
print(example_df)

        A   B
obj1  1.1  12
obj2  2.1  22
obj3  3.1  32


In [2]:
mask = example_df['A'] > 2  # determine which rows have a value larger than 2 in column A
print(mask)

obj1    False
obj2     True
obj3     True
Name: A, dtype: bool


In [3]:
example_df[mask] # select the rows that have True in the mask

        A   B
obj2  2.1  22
obj3  3.1  32
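
The mask does not have to come from a comparison. As a minimal sketch (the names `manual_mask`, `selected`, and `error_message` are illustrative and not part of the course material), a plain list of Booleans can also be used, and pandas refuses a list whose length does not match the number of rows:

```python
import pandas as pd

example_df = pd.DataFrame({"A": [1.1, 2.1, 3.1], "B": [12, 22, 32]},
                          index=["obj1", "obj2", "obj3"])

# A hand-written mask: one Boolean per row
manual_mask = [False, True, True]
selected = example_df[manual_mask]
print(selected)

# A list mask with too few entries cannot be applied
error_message = None
try:
    example_df[[True, False]]
except ValueError as err:
    error_message = str(err)
print(error_message)
```

In practice the mask is almost always built from a condition, as in `In [2]`, because the result then automatically has one entry per row.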


You can combine conditions with the Boolean operators you learned in the first week. For masks, use the element-wise operators `&` (and) and `|` (or), and wrap each condition in parentheses:

In [4]:
mask = (example_df['A'] > 2) & (example_df['B'] < 30)
print(mask)

obj1    False
obj2     True
obj3    False
dtype: bool


In [5]:
print(example_df[mask])

        A   B
obj2  2.1  22
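
The parentheses matter because `&` and `|` bind more tightly than comparisons such as `>` and `<`. The sketch below is an illustration rather than part of the original exercise (the names `either` and `and_failed` are made up here). It combines two conditions with `|` and shows that Python's `and` keyword does not work on whole columns, assuming a recent pandas version in which this raises a `ValueError`:

```python
import pandas as pd

example_df = pd.DataFrame({"A": [1.1, 2.1, 3.1], "B": [12, 22, 32]},
                          index=["obj1", "obj2", "obj3"])

# Element-wise "or": keep rows where at least one condition holds
either = (example_df["A"] > 3) | (example_df["B"] == 22)
print(example_df[either])

# `and` needs a single True/False, which a whole Series cannot provide
and_failed = False
try:
    (example_df["A"] > 2) and (example_df["B"] < 30)
except ValueError:
    and_failed = True
print(and_failed)
```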


With the negation operator `~`, you can flip the `True` and `False` values of a mask:

In [6]:
print(~mask)

obj1     True
obj2    False
obj3     True
dtype: bool


In [7]:
print(example_df[~mask])

        A   B
obj1  1.1  12
obj3  3.1  32
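
Negating a comparison gives the same mask as writing the opposite comparison. A small sketch of this, using the illustrative names `negated` and `opposite` and assuming the column has no missing values:

```python
import pandas as pd

example_df = pd.DataFrame({"A": [1.1, 2.1, 3.1], "B": [12, 22, 32]},
                          index=["obj1", "obj2", "obj3"])

# "not (A > 2)" and "A <= 2" describe the same rows
negated = ~(example_df["A"] > 2)
opposite = example_df["A"] <= 2
print(negated.equals(opposite))

# The negated mask selects the rows the original condition left out
print(example_df[negated])
```

The condition is wrapped in parentheses because `~` binds more tightly than a comparison, so it would otherwise be applied to the column itself rather than to the result of the comparison.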


Let's practise Boolean indexing:

#### Which of the statements below selects both the first and the third row?

In [8]:
%%mc
mask = example_df['A'] >= 3.1
mask = example_df['B'] < 20
mask = (example_df['A'] >= 3.1) & (example_df['B'] < 20)
mask = (example_df['A'] >= 3.1) | (example_df['B'] < 20)


In [9]:
%%check 
hashresult == 292691022 



#### Create a mask that is `True` for the rows where column `B` is greater than 20 and assign it to `mask`. Then use it to select these rows.

In [10]:
%%assignment
### ENTER YOUR CODE HERE


In [11]:
%%check
hashresult == 3593389503
mask.dtype == 'bool'
3.1 in example_df[mask].values 

