# Advanced Filters

In this module, we take a closer look at common filtering patterns. The list is based on the Common Filter Operations section of the [SQLAlchemy tutorial](https://docs.sqlalchemy.org/en/latest/orm/tutorial.html), which is copyright © by SQLAlchemy authors and contributors. SQLAlchemy and its documentation are licensed under the MIT license.

In [1]:
import pandas as pd
from dfply import *
import seaborn as sns
%matplotlib inline

### Common Filter Operators

Most filters consist of the following operations.

* Equals/not equals and other inequalities
* Like/ilike
* In/not in
* Is Null/is not null
* And/or
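
As a rough orientation, the sketch below maps each operation in the list to one plain `pandas` expression. It is only an illustration under stated assumptions: the frame `toy` is a small hand-built stand-in for the heroes data, and using `str.contains(..., case=False)` as the counterpart of `ilike` is a simplification, since SQL `LIKE` patterns and substring matching are not identical.

```python
import pandas as pd

# Hand-built stand-in for a few rows of the heroes table (illustration only).
toy = pd.DataFrame({
    "name": ["A-Bomb", "Abe Sapien", "Abin Sur", "Abraxas"],
    "Eye_color": ["yellow", "blue", "blue", "blue"],
    "Publisher": ["Marvel Comics", "Dark Horse Comics", "DC Comics", "Marvel Comics"],
    "Height": [203.0, 191.0, 185.0, None],  # None becomes NaN (missing)
})

# Equals / other inequalities: ordinary comparison operators on a column.
is_blue = toy["Eye_color"] == "blue"
is_tall = toy["Height"] > 190

# Like / ilike (approximation): substring match, case-insensitive with case=False.
like_marvel = toy["Publisher"].str.contains("marvel", case=False)

# In / not in: membership in a list of values; negate with ~ for "not in".
in_big_two = toy["Publisher"].isin(["Marvel Comics", "DC Comics"])

# Is null / is not null.
height_missing = toy["Height"].isna()

# And / or: combine boolean masks with & and |.
blue_and_tall = is_blue & is_tall
blue_or_tall = is_blue | is_tall

print(toy[blue_and_tall])
```

Each expression produces a boolean mask with one entry per row; indexing the frame with a mask keeps the rows where the mask is `True`.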


## Inequalities

In this (short) lecture, we will review filtering based on inequalities.

In [8]:
from more_dfply import fix_names
# NOTE: a relative path did not resolve in this environment, so the full absolute path is used here.
heroes_raw = pd.read_csv('/home/fahad/module-4-lectures-nameer1811/data/heroes_information.csv', na_values=['-', '-99.0', ''])
heroes = (heroes_raw >> fix_names)
heroes.head()

Unnamed: 0,Unnamed_0,name,Gender,Eye_color,Race,Hair_color,Height,Publisher,Skin_color,Alignment,Weight
0,0,A-Bomb,Male,yellow,Human,No Hair,203.0,Marvel Comics,,good,441.0
1,1,Abe Sapien,Male,blue,Icthyo Sapien,No Hair,191.0,Dark Horse Comics,blue,good,65.0
2,2,Abin Sur,Male,blue,Ungaran,No Hair,185.0,DC Comics,red,good,90.0
3,3,Abomination,Male,green,Human / Radiation,No Hair,203.0,Marvel Comics,,bad,441.0
4,4,Abraxas,Male,blue,Cosmic Entity,Black,,Marvel Comics,,bad,


## Category 1 - Equality and Inequality

In all three frameworks, equalities/inequalities are performed using the regular Python operators on column expressions.
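
To make "regular Python operators on column expressions" concrete, here is a minimal plain-`pandas` sketch. It assumes a tiny hand-built frame named `toy` (not the real heroes data) and shows that a comparison on a column evaluates to a boolean mask, which is what a filter step consumes.

```python
import pandas as pd

# Hand-built stand-in frame (illustration only).
toy = pd.DataFrame({
    "name": ["A-Bomb", "Abe Sapien", "Abin Sur"],
    "Eye_color": ["yellow", "blue", "blue"],
})

# The comparison is evaluated element-wise and yields one boolean per row.
mask = toy["Eye_color"] == "blue"

# The method form .eq() builds the same mask as the == operator.
mask_method = toy["Eye_color"].eq("blue")

# Indexing with the mask keeps only the rows where it is True.
blue_eyed = toy[mask]
print(blue_eyed)
```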

#### equals:

In [9]:
(heroes
 >> filter_by(X.Eye_color == 'blue')
 >> head(2))

Unnamed: 0,Unnamed_0,name,Gender,Eye_color,Race,Hair_color,Height,Publisher,Skin_color,Alignment,Weight
1,1,Abe Sapien,Male,blue,Icthyo Sapien,No Hair,191.0,Dark Horse Comics,blue,good,65.0
2,2,Abin Sur,Male,blue,Ungaran,No Hair,185.0,DC Comics,red,good,90.0


In [10]:
(heroes
 >> filter_by(X.Eye_color.eq('blue'))
 >> head(2))

Unnamed: 0,Unnamed_0,name,Gender,Eye_color,Race,Hair_color,Height,Publisher,Skin_color,Alignment,Weight
1,1,Abe Sapien,Male,blue,Icthyo Sapien,No Hair,191.0,Dark Horse Comics,blue,good,65.0
2,2,Abin Sur,Male,blue,Ungaran,No Hair,185.0,DC Comics,red,good,90.0


#### not equals:

In [11]:
(heroes
 >> filter_by(X.Eye_color != 'blue')
 >> head(2))

Unnamed: 0,Unnamed_0,name,Gender,Eye_color,Race,Hair_color,Height,Publisher,Skin_color,Alignment,Weight
0,0,A-Bomb,Male,yellow,Human,No Hair,203.0,Marvel Comics,,good,441.0
3,3,Abomination,Male,green,Human / Radiation,No Hair,203.0,Marvel Comics,,bad,441.0


In [12]:
(heroes
 >> filter_by(~X.Eye_color.eq('blue'))
 >> head(2))

Unnamed: 0,Unnamed_0,name,Gender,Eye_color,Race,Hair_color,Height,Publisher,Skin_color,Alignment,Weight
0,0,A-Bomb,Male,yellow,Human,No Hair,203.0,Marvel Comics,,good,441.0
3,3,Abomination,Male,green,Human / Radiation,No Hair,203.0,Marvel Comics,,bad,441.0
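
One detail worth keeping in mind with "not equals" filters is how missing values behave. The sketch below uses plain `pandas` on a small hand-built frame (an assumption for illustration, with a default object/string column) to show that both `!=` and `~ ... .eq()` keep rows whose value is missing, and how to drop them explicitly if that is not wanted.

```python
import numpy as np
import pandas as pd

# Hand-built stand-in frame; the last row has a missing eye color.
toy = pd.DataFrame({
    "name": ["A-Bomb", "Abe Sapien", "Alien"],
    "Eye_color": ["yellow", "blue", np.nan],
})

# Both spellings of "not equal" treat a missing value as "not blue".
not_blue_op = toy["Eye_color"] != "blue"
not_blue_method = ~toy["Eye_color"].eq("blue")

# To require a known, non-blue eye color, also demand a non-missing value.
known_not_blue = not_blue_op & toy["Eye_color"].notna()

print(toy[not_blue_op])      # includes the row with the missing value
print(toy[known_not_blue])   # excludes it
```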


#### Other inequalities

In [13]:
(heroes
 >> filter_by(X.Height > 200)
 >> filter_by(X.Weight <= 440)
 >> head(2))

Unnamed: 0,Unnamed_0,name,Gender,Eye_color,Race,Hair_color,Height,Publisher,Skin_color,Alignment,Weight
17,17,Alien,Male,,Xenomorph XX121,No Hair,244.0,Dark Horse Comics,black,bad,169.0
19,19,Amazo,Male,red,Android,,257.0,DC Comics,,bad,173.0
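
Chaining two filter steps, as above, keeps only rows that pass both conditions, so it behaves like a logical "and". The plain-`pandas` sketch below illustrates this on a small hand-built frame (an assumption for illustration) and also shows that comparisons against missing values evaluate to `False`, so such rows are dropped.

```python
import numpy as np
import pandas as pd

# Hand-built stand-in frame; the last row has missing height and weight.
toy = pd.DataFrame({
    "name": ["A-Bomb", "Alien", "Amazo", "Abraxas"],
    "Height": [203.0, 244.0, 257.0, np.nan],
    "Weight": [441.0, 169.0, 173.0, np.nan],
})

# Two filters applied one after the other.
chained = toy[toy["Height"] > 200]
chained = chained[chained["Weight"] <= 440]

# One filter with both conditions joined by &.
# Parentheses are required because & binds tighter than > and <=.
combined = toy[(toy["Height"] > 200) & (toy["Weight"] <= 440)]

print(combined)
```

Both approaches select the same rows here; the chained form reads well in a pipeline, while the combined form states the whole condition in one place.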


## Reminder - Referencing Constructed Column

Recall that in `pandas` + `dfply` we can use `X` to reference a column constructed earlier in the same pipeline, for example one created with `mutate`.

In [14]:
(heroes
 >> mutate(Weight_kg = X.Weight/2.2046)
 >> filter_by(X.Weight_kg <= 200)
 >> head(2))

Unnamed: 0,Unnamed_0,name,Gender,Eye_color,Race,Hair_color,Height,Publisher,Skin_color,Alignment,Weight,Weight_kg
1,1,Abe Sapien,Male,blue,Icthyo Sapien,No Hair,191.0,Dark Horse Comics,blue,good,65.0,29.483807
2,2,Abin Sur,Male,blue,Ungaran,No Hair,185.0,DC Comics,red,good,90.0,40.823732
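
For comparison, a similar pattern can be written in plain `pandas` without `dfply`. This is a sketch on a small hand-built frame (an assumption for illustration): `assign` plays the role of `mutate`, and a `lambda` passed to `.loc` receives the intermediate frame, so it can see the newly constructed column even though the original frame does not have it.

```python
import pandas as pd

# Hand-built stand-in frame (illustration only).
toy = pd.DataFrame({
    "name": ["A-Bomb", "Abe Sapien", "Abin Sur"],
    "Weight": [441.0, 65.0, 90.0],
})

light = (
    toy
    # Construct the new column, using the same conversion factor as the cell above.
    .assign(Weight_kg=lambda df: df["Weight"] / 2.2046)
    # The lambda is called with the frame produced by assign,
    # so Weight_kg is available here.
    .loc[lambda df: df["Weight_kg"] <= 200]
)

print(light)
```

Writing `toy["Weight_kg"]` inside the filter would fail, because `toy` itself never gains the new column; the `lambda` defers the lookup to the intermediate result, much as `X` does in `dfply`.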


## <font color="red"> Exercise 4.1.1 - The Super Hero Dating Game - Part 1</font>

Yesterday, you noticed a singles ad in the local paper that reads

> SWF looking for BESHM (blue-eyed super hero male).  Must be tall (70+ inches).  Only interested in bad boys! Must list height (in inches) in reply!

Write a query in each framework to help find candidates for this personal ad.

In [20]:
# Your solution here
(heroes
 >> mutate(Height_inches = X.Height*0.3937)  # convert height from cm to inches
 >> filter_by(X.Eye_color == "blue")
 >> filter_by(X.Height_inches >= 70)         # "70+ inches" includes exactly 70
 >> filter_by(X.Alignment == "bad")
 >> filter_by(X.Gender.eq('Male'))
 >> select(X.name, X.Height_inches)
)

Unnamed: 0,name,Height_inches
5,Absorbing Man,75.9841
11,Air-Walker,74.0156
48,Atlas,77.9526
109,Blackwing,72.8345
140,Bullseye,72.0471
180,Clock King,70.0786
236,Electro,70.866
248,Exodus,72.0471
298,Green Goblin,70.866
299,Green Goblin II,70.0786
