In [1]:
import pandas as pd
import numpy as np
np.random.seed(seed=42)

### Pandas Query

Pandas Query (`DataFrame.query`) is another way to filter data. You probably don't reach for it as often as boolean indexing, but it's worth considering.

We will run through 3 examples:
1. Simple filter for a column
2. Filtering columns based off of each other
3. Using a Python variable to filter

First, let's create our DataFrame

In [5]:
df = pd.DataFrame.from_dict({"Name": ['Liho Liho', 'Tompkins', 'The Square', 'Chambers'],
                             "Mon": np.random.randint(10,200, size=(1,4))[0],
                             "Tues": np.random.randint(12,200, size=(1,4))[0],
                             "Wed": np.random.randint(12,200, size=(1,4))[0],
                             "Thurs": np.random.randint(12,200, size=(1,4))[0]}, orient='columns')
df

Unnamed: 0,Name,Mon,Tues,Wed,Thurs
0,Liho Liho,159,169,32,100
1,Tompkins,62,49,172,60
2,The Square,11,141,69,70
3,Chambers,97,199,33,181


### 1. Simple filter for a column
To query (filter) your data, all you need to do is pass a string containing a conditional expression. This is very similar to writing a formula in an Excel cell.

Notice here I'm querying my data for the rows where the "Mon" column is greater than 90. Only two rows satisfy this filter, and they are returned.

In [9]:
df.query('Mon > 90')

Unnamed: 0,Name,Mon,Tues,Wed,Thurs
0,Liho Liho,159,169,32,100
3,Chambers,97,199,33,181


Notice that all the other rows, which *don't* satisfy this query, are *not returned*.
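As a small sketch of what "not returned" means in practice: `query` gives back a new DataFrame and leaves the original alone, and the surviving rows keep their original index labels. The values below are copied by hand from the output shown earlier so the example does not depend on the random seed; the names `busy_mondays` and `renumbered` are just illustrative.

```python
import pandas as pd

# Values hard-coded from the DataFrame output above (not regenerated randomly).
df = pd.DataFrame({
    "Name": ["Liho Liho", "Tompkins", "The Square", "Chambers"],
    "Mon": [159, 62, 11, 97],
    "Tues": [169, 49, 141, 199],
    "Wed": [32, 172, 69, 33],
    "Thurs": [100, 60, 70, 181],
})

# query returns a filtered copy; df itself still has all four rows.
busy_mondays = df.query("Mon > 90")
print(len(df), len(busy_mondays))

# The matching rows keep their original index labels (0 and 3).
print(list(busy_mondays.index))

# If you want a fresh 0..n-1 index on the result, reset it explicitly.
renumbered = busy_mondays.reset_index(drop=True)
print(list(renumbered.index))
```

If you would rather overwrite `df` with the filtered result, assigning the result back (`df = df.query(...)`) is one straightforward option.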

### 2. Filtering columns based off of each other
You can also compare two columns against each other. In this case I'm querying for rows where the "Mon" column is greater than the "Tues" column.

In [10]:
df.query('Mon > Tues')

Unnamed: 0,Name,Mon,Tues,Wed,Thurs
1,Tompkins,62,49,172,60


You can also achieve this result via the traditional filtering method.

In [12]:
filter_1 = df['Mon'] > df['Tues']
df[filter_1]

Unnamed: 0,Name,Mon,Tues,Wed,Thurs
1,Tompkins,62,49,172,60


### 3. Using a Python variable to filter
If needed, you can also use a regular Python variable from your notebook or script to filter your data. Make sure to put an "@" sign in front of the variable name within the query string.

In [13]:
dinner_limit=85
df.query('Thurs > @dinner_limit')

Unnamed: 0,Name,Mon,Tues,Wed,Thurs
0,Liho Liho,159,169,32,100
3,Chambers,97,199,33,181


Here I set a variable, `dinner_limit`, to 85. I used this variable within my query by writing "@dinner_limit" and comparing the 'Thurs' column against it.
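To see why the "@" matters, here is a minimal sketch that runs the same query with and without it. The DataFrame values are copied by hand from the output above. The assumption here is that a recent pandas version reports the unprefixed name with an error that subclasses `NameError`, since it tries to treat `dinner_limit` as a column; the flag `missing_at_failed` is only there for illustration.

```python
import pandas as pd

# Values hard-coded from the DataFrame output above (not regenerated randomly).
df = pd.DataFrame({
    "Name": ["Liho Liho", "Tompkins", "The Square", "Chambers"],
    "Mon": [159, 62, 11, 97],
    "Tues": [169, 49, 141, 199],
    "Wed": [32, 172, 69, 33],
    "Thurs": [100, 60, 70, 181],
})

dinner_limit = 85

# With "@", pandas looks up dinner_limit among your Python variables.
with_at = df.query("Thurs > @dinner_limit")
print(with_at["Name"].tolist())

# Without "@", pandas looks for a column called dinner_limit and fails.
# Assumption: the raised error is a NameError subclass in your pandas version.
try:
    df.query("Thurs > dinner_limit")
    missing_at_failed = False
except NameError as exc:
    missing_at_failed = True
    print(f"Query without '@' failed: {exc}")
```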