# Problem Set 2.4: Selection by Boolean Columns

[Click here to open this notebook in your browser](https://leifwalsh.github.io/data-analysis-problem-sets/lab/index.html?path=2-pandas-basics/2.4-selection-by-boolean-columns/2.4-selection-by-boolean-columns.ipynb)

Explore how to select rows in a DataFrame based on a condition, rather than
their position (or label).

We'll use the same 'deck.csv' from [Problem Set
2.3](../2.3-row-and-column-selection-by-location/2.3-row-and-column-selection-by-location.ipynb).

In [None]:
import pandas as pd

deck_df = pd.read_csv('deck.csv')
deck_df

## Comparing columns

Just like you can add something to a column, you can compare a column to
something. This will return a Series of True and False values, where each
position is True if the value in that row matches your condition.

In [None]:
ten_plus = deck_df['value'] >= 10
ten_plus

This is True whenever the `>= 10` comparison was true, and False otherwise.
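
To see what such a comparison produces without needing `deck.csv`, here is a minimal sketch on a hand-built stand-in. The `mini_df` name and its four cards are assumptions made up for illustration; only the column names come from this notebook. It suggests that the result is an ordinary Series of dtype `bool`, and that summing it counts the True values.

```python
import pandas as pd

# Hypothetical stand-in for deck.csv: four cards with made-up values.
mini_df = pd.DataFrame({
    'rank': ['A', '5', '10', 'K'],
    'suit': ['hearts', 'clubs', 'hearts', 'spades'],
    'value': [14, 5, 10, 13],
})

# One True/False per row, in the same order as the rows of mini_df.
mask = mini_df['value'] >= 10

print(mask.dtype)     # bool
print(mask.tolist())  # [True, False, True, True]
print(mask.sum())     # 3, because True counts as 1 and False as 0
```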

## Selection by Boolean Columns

Boolean Series are very powerful when used with `.loc[]`. You can use a Series
of True and False values to select rows from a DataFrame: only the rows where
the Series is True are kept.

In [None]:
# Select rows where 'value' is greater than or equal to 10
selected_rows = deck_df.loc[ten_plus]
selected_rows

This operation is similar to applying a WHERE clause in SQL, where only rows that satisfy the specified condition are selected.

Often you'll see the condition written directly in the `.loc[]` call, like
this:

In [None]:
selected_rows = deck_df.loc[deck_df['value'] >= 10]
selected_rows
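
One detail worth noticing is that the selected rows keep their original index labels. The sketch below illustrates this on a small hand-built frame; `mini_df` and its values are assumptions for illustration, not the contents of `deck.csv`.

```python
import pandas as pd

# Hypothetical stand-in for deck.csv (values are made up for illustration).
mini_df = pd.DataFrame({
    'rank': ['A', '5', '10', 'K'],
    'suit': ['hearts', 'clubs', 'hearts', 'spades'],
    'value': [14, 5, 10, 13],
})

big_cards = mini_df.loc[mini_df['value'] >= 10]

# Row 1 (the 5 of clubs) is dropped; the remaining rows keep labels 0, 2, 3.
print(big_cards.index.tolist())    # [0, 2, 3]
print(big_cards['rank'].tolist())  # ['A', '10', 'K']
```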

Do you remember all the comparison operators in Python?
- `==` for equality
- `!=` for inequality
- `>` for greater than
- `<` for less than
- `>=` for greater than or equal to
- `<=` for less than or equal to

We can get a winning poker hand by selecting all aces:

In [None]:
deck_df.loc[deck_df['rank'] == 'A']

### Combining selectors

If you're familiar with SQL, you might know you can combine multiple conditions
with AND and OR. For example, `WHERE a = 1 AND b = 2` would select only the
rows where both `a` is 1 and `b` is 2, and `WHERE a = 1 OR b = 2` would select
all the rows where either `a` is 1 or `b` is 2.

You can do the same thing in pandas, using the `&` operator for AND and the `|`
operator for OR.

Note: Python experts may know that in normal Python, "and" and "or" are
keywords; you just write those words in your program:

```python
# This is just the word!
#         vvv
if a == 1 and b == 1:
    print("Both a and b are 1")
```

But in pandas (for technical reasons), you need to use `&` and `|` instead, and
it's good practice (and often necessary) to put each condition in
parentheses.

So, we can select across multiple columns with these combining operators:

In [None]:
deck_df.loc[(deck_df['value'] < 7) & (deck_df['suit'] == 'hearts')]

A straight flush!

Python experts will note that in pandas, these `&` and `|` are overloading the
bitwise AND and OR operators - in Python you can't overload what the `and` and
`or` keywords mean, but you can overload bitwise operators.
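
As a rough illustration of the difference, the sketch below tries both spellings on a small hand-built frame. The `mini_df` name and its values are assumptions for illustration only. Using `and` asks Python to reduce each whole Series to a single True or False, which pandas refuses to do, while `&` and `|` work row by row.

```python
import pandas as pd

# Hypothetical stand-in for deck.csv (values are made up for illustration).
mini_df = pd.DataFrame({
    'rank': ['A', '5', '10', 'K'],
    'suit': ['hearts', 'clubs', 'hearts', 'spades'],
    'value': [14, 5, 10, 13],
})

low = mini_df['value'] < 11
hearts = mini_df['suit'] == 'hearts'

# The `and` keyword needs one truth value per operand, so pandas raises.
try:
    low and hearts
    keyword_worked = True
except ValueError as err:
    keyword_worked = False
    print('ValueError:', err)

# The overloaded operators combine the two Series element by element.
both = low & hearts
either = low | hearts
print(both.tolist())    # [False, False, True, False]
print(either.tolist())  # [True, True, True, False]
```

The parentheses matter because `&` binds more tightly than `<` or `==`: without them, `a < 11 & b == 'hearts'` is read as `a < (11 & b) == 'hearts'`, which is not the intended condition.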

Similarly, pandas overloads the `~` operator for NOT, so you can select all the
non-clubs like this:

In [None]:
deck_df.loc[~(deck_df['suit'] == 'clubs')]

Of course, you also could have used `!=` to do the same thing:

In [None]:
deck_df.loc[deck_df['suit'] != 'clubs']

But in some cases, you'll end up wanting to use `~` to negate a more complex
condition.
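
For instance, here is a minimal sketch of negating a combined condition, again on a hand-built frame whose name (`mini_df`) and values are assumptions for illustration. It selects every card that is *not* a low heart, and checks that this matches the longer spelling obtained by flipping each comparison and swapping `&` for `|`.

```python
import pandas as pd

# Hypothetical stand-in for deck.csv (values are made up for illustration).
mini_df = pd.DataFrame({
    'rank': ['A', '5', '10', 'K'],
    'suit': ['hearts', 'clubs', 'hearts', 'spades'],
    'value': [14, 5, 10, 13],
})

is_low_heart = (mini_df['value'] < 11) & (mini_df['suit'] == 'hearts')

# Negate the whole combined condition in one step.
everything_else = mini_df.loc[~is_low_heart]
print(everything_else['rank'].tolist())  # ['A', '5', 'K']

# The same mask written without `~`: flip each comparison and use `|`.
by_hand = (mini_df['value'] >= 11) | (mini_df['suit'] != 'hearts')
print((~is_low_heart).equals(by_hand))   # True
```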

## Exercises

Let's use Boolean columns to solve the same problems we solved in [the previous notebook](../2.3-row-and-column-selection-by-location/2.3-row-and-column-selection-by-location.ipynb). This will help us see how Boolean slicing can achieve the same results more easily.

### Exercise 1

Select the 'symbol', 'rank', and 'value' columns for the hearts with ranks 10 and above.

### Exercise 2

Select the rows where the 'suit' is 'hearts' and the 'rank' is 'A'.

### Exercise 3

Select the 'suit', 'rank', and 'value' columns for the cards with 'symbol' '♠'.

By slicing with Boolean columns, we can answer the same questions, but now our
code is easier to write, and expresses our *intent*. This way, if we got a
different deck of cards, the code would still conceptually do the same thing.

### Exercise 4

Stack the deck! Deal yourself a good poker hand.