<img src='images/pandas.png' width='300px' align=left>
<img src='images/gdd-logo.png' width='200px' align='right' style="padding: 15px">



# Frequently asked questions about Pandas

## Is there a `between` method when filtering?

If you've worked with `SQL`, you'll know about the `BETWEEN` keyword, which filters rows where a column's value lies between two values. Pandas has something very similar:

In [None]:
import pandas as pd

chickweight = (
    pd.read_csv('data/chickweight.csv')
    .rename(columns=str.lower)
)

chickweight.head()

Imagine a task that requires analysing only days 8 to 12 (column: `time`). This data could be filtered using two separate filters, one for values *at or above 8* and one for values *at or below 12*:

In [None]:
(
    chickweight
    .loc[lambda df: df['time'] >= 8]
    .loc[lambda df: df['time'] <= 12]
)

This result can be achieved in a simpler way, in one line. 

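Unfortunately, it is not as simple as writing greater than and less than as one chained comparison. This, for example, will not work, because Python tries to reduce the comparison to a single `True`/`False` value and pandas raises a `ValueError`: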

In [None]:
# (
#     chickweight
#     .loc[lambda df: 8 <= df['time'] <= 12]
# )
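To see why the chained comparison fails without touching the real data, here is a minimal sketch on a small made-up frame (`toy` and its values are hypothetical stand-ins for `chickweight`). It captures the error and then shows that the two conditions can be combined element-wise with `&`, which works but is fairly verbose:

```python
import pandas as pd

# Hypothetical stand-in for the `time` column of chickweight
toy = pd.DataFrame({'time': [6, 8, 10, 12, 14]})

# `8 <= s <= 12` is evaluated as `(8 <= s) and (s <= 12)`;
# `and` needs a single True/False, which a Series cannot provide
try:
    toy.loc[lambda df: 8 <= df['time'] <= 12]
    chained_error = None
except ValueError as exc:
    chained_error = exc

print(type(chained_error).__name__)

# Element-wise alternative: combine two boolean masks with `&`
combined = toy.loc[lambda df: (df['time'] >= 8) & (df['time'] <= 12)]
print(combined['time'].tolist())
```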

Instead, a method can be used:

In [None]:
(
    chickweight
    .loc[lambda df: df['time'].between(8, 12)]
)
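As a quick sanity check, the sketch below compares the two chained filters with the single `.between()` filter. It uses a small made-up frame (`toy`, with invented `time` and `weight` values) rather than the real dataset, so treat it as an illustration only:

```python
import pandas as pd

# Hypothetical data; the values are invented for illustration
toy = pd.DataFrame({
    'time': [6, 8, 10, 12, 14],
    'weight': [60, 75, 90, 110, 130],
})

# Two separate filters, as in the first approach
two_filters = (
    toy
    .loc[lambda df: df['time'] >= 8]
    .loc[lambda df: df['time'] <= 12]
)

# One filter using the between method
one_filter = toy.loc[lambda df: df['time'].between(8, 12)]

# Both approaches should select exactly the same rows
print(two_filters.equals(one_filter))
```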

Note that in the above example (and in every filter so far), both the lower and the upper bound are included. You can confirm this by calling the `unique` method on the `time` column:

In [None]:
(
    chickweight
    .loc[lambda df: df['time'].between(8, 12)]
    ['time'].unique()
)

In the initial example with two filters, you could exclude the bounds by switching from the **or equal** operators to the strict ones (`>` instead of `>=`, and `<` instead of `<=`):

In [None]:
(
    chickweight
    .loc[lambda df: df['time'] > 8]
    .loc[lambda df: df['time'] < 12]
)

With the `.between()` method, you can control whether the bounds are included using the `inclusive=` parameter.

The options are:
- `'both'` (default)
- `'neither'`
- `'left'`
- `'right'`

In [None]:
(
    chickweight
    .loc[lambda df: df['time'].between(8, 12, inclusive='neither')]
)

### <mark>Exercise: Try it out!</mark>

Try changing the parameter value (`'neither'`) to each of the other options listed above - does the output look as you expect?

In [None]:
(
    chickweight
    .loc[lambda df: df['time'].between(8, 12, inclusive='neither')]
)

# Conclusion

You now know the `between` method, which filters a column and returns the rows whose values lie between two specified values.

The `between` method is called on the column, e.g. `df['column'].between(n1, n2)`, inside the `.loc[]` indexer.

You can also include the parameter `inclusive=` which takes the values `'both'` (default), `'neither'`, `'left'`, `'right'`.
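As a recap, the sketch below pairs each `inclusive=` option with the comparison operators it corresponds to. It uses a small made-up series (`times` is hypothetical, not the real data) and assumes a pandas version that accepts the string options listed above:

```python
import pandas as pd

# Hypothetical values standing in for the `time` column
times = pd.Series([6, 8, 10, 12, 14], name='time')

# Each option written out with explicit comparison operators
equivalents = {
    'both': (times >= 8) & (times <= 12),
    'neither': (times > 8) & (times < 12),
    'left': (times >= 8) & (times < 12),
    'right': (times > 8) & (times <= 12),
}

kept = {}
for option, expected_mask in equivalents.items():
    mask = times.between(8, 12, inclusive=option)
    # The between mask should match the hand-written comparison
    assert mask.equals(expected_mask)
    kept[option] = times[mask].tolist()
    print(option, kept[option])
```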