![title](./pic/selecting/query/1_title.png)

In [64]:
import pandas as pd

In [65]:
df = pd.read_csv('./csv/titanic.csv')
df.head()

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
0,892,0,3,"Kelly, Mr. James",male,34.5,0,0,330911,7.8292,,Q
1,893,1,3,"Wilkes, Mrs. James (Ellen Needs)",female,47.0,1,0,363272,7.0,,S
2,894,0,2,"Myles, Mr. Thomas Francis",male,62.0,0,0,240276,9.6875,,Q
3,895,0,3,"Wirz, Mr. Albert",male,27.0,0,0,315154,8.6625,,S
4,896,1,3,"Hirvonen, Mrs. Alexander (Helga E Lindqvist)",female,22.0,1,1,3101298,12.2875,,S


---

Once you have understood how conditional selection works in `Pandas`, you can build on that knowledge. `Pandas` offers the `.query()` method, which we already met briefly when reading `.csv` files. It is useful beyond that case: it also simplifies queries against an existing `DataFrame`.

![title](./pic/selecting/query/2_query.png)

### Query without `.query()`: girls aged 3 or younger

In [66]:
df_titanic = df.copy()

In [67]:
df_titanic[(df_titanic['Age'] <= 3) & (df_titanic['Sex'] == "female")]

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
117,1009,1,3,"Sandstrom, Miss. Beatrice Irene",female,1.0,1,1,PP 9549,16.7,G6,S
250,1142,1,2,"West, Miss. Barbara J",female,0.92,1,2,C.A. 34651,27.75,,S
263,1155,1,3,"Klasen, Miss. Gertrud Emilia",female,1.0,1,1,350405,12.1833,,S
284,1176,1,3,"Rosblom, Miss. Salli Helena",female,2.0,1,1,370129,20.2125,,S
296,1188,1,2,"Laroche, Miss. Louise",female,1.0,1,2,SC/Paris 2123,41.5792,,C
354,1246,1,3,"Dean, Miss. Elizabeth Gladys Millvina""""",female,0.17,1,2,C.A. 2315,20.575,,S
409,1301,1,3,"Peacock, Miss. Treasteall",female,3.0,1,1,SOTON/O.Q. 3101315,13.775,,S


---

### Query with `.query()`

In [68]:
df_titanic.query('Age <= 3 & Sex == "female"')

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
117,1009,1,3,"Sandstrom, Miss. Beatrice Irene",female,1.0,1,1,PP 9549,16.7,G6,S
250,1142,1,2,"West, Miss. Barbara J",female,0.92,1,2,C.A. 34651,27.75,,S
263,1155,1,3,"Klasen, Miss. Gertrud Emilia",female,1.0,1,1,350405,12.1833,,S
284,1176,1,3,"Rosblom, Miss. Salli Helena",female,2.0,1,1,370129,20.2125,,S
296,1188,1,2,"Laroche, Miss. Louise",female,1.0,1,2,SC/Paris 2123,41.5792,,C
354,1246,1,3,"Dean, Miss. Elizabeth Gladys Millvina""""",female,0.17,1,2,C.A. 2315,20.575,,S
409,1301,1,3,"Peacock, Miss. Treasteall",female,3.0,1,1,SOTON/O.Q. 3101315,13.775,,S


As you can see, `.query()` is not only easier to read, it is also noticeably shorter than the basic version. Nevertheless, you should master the basic version first: in a longer `.query()` call the whole condition sits in a single string, so it is still easy to lose track.
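
To check that both notations really select the same rows, here is a minimal sketch. It assumes a small, made-up `df_demo` instead of the Titanic file; the names and values are purely illustrative.

```python
import pandas as pd

# Hypothetical mini sample, NOT the real Titanic data
df_demo = pd.DataFrame({
    "Name": ["Anna", "Ben", "Clara", "David"],
    "Sex": ["female", "male", "female", "male"],
    "Age": [2.0, 1.0, 30.0, 45.0],
})

# Basic version: boolean mask, each condition wrapped in parentheses
mask_result = df_demo[(df_demo["Age"] <= 3) & (df_demo["Sex"] == "female")]

# .query() version: the same condition written as a single string
query_result = df_demo.query('Age <= 3 & Sex == "female"')

# Compare the two results row by row
print(mask_result.equals(query_result))
```

Inside the query string the column names are used directly, so the repeated `df_demo[...]` and the extra parentheses disappear.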

---

### Dynamic query with `.query()`

It is also useful to know that you can build the query string separately and pass it to `.query()` later. This lets you, for example, construct dynamic queries at runtime with f-strings and use them interactively.

In [69]:
altersgrenze = 10
geschlecht = "female"

In [70]:
query = f'Age <= {altersgrenze} & Sex == "{geschlecht}"'

In [71]:
df.query(query)

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
117,1009,1,3,"Sandstrom, Miss. Beatrice Irene",female,1.0,1,1,PP 9549,16.7,G6,S
140,1032,1,3,"Goodwin, Miss. Jessie Allis",female,10.0,5,2,CA 2144,46.9,,S
203,1095,1,2,"Quick, Miss. Winifred Vera",female,8.0,1,1,26360,26.0,,S
250,1142,1,2,"West, Miss. Barbara J",female,0.92,1,2,C.A. 34651,27.75,,S
263,1155,1,3,"Klasen, Miss. Gertrud Emilia",female,1.0,1,1,350405,12.1833,,S
283,1175,1,3,"Touma, Miss. Maria Youssef",female,9.0,1,1,2650,15.2458,,C
284,1176,1,3,"Rosblom, Miss. Salli Helena",female,2.0,1,1,370129,20.2125,,S
296,1188,1,2,"Laroche, Miss. Louise",female,1.0,1,2,SC/Paris 2123,41.5792,,C
354,1246,1,3,"Dean, Miss. Elizabeth Gladys Millvina""""",female,0.17,1,2,C.A. 2315,20.575,,S
409,1301,1,3,"Peacock, Miss. Treasteall",female,3.0,1,1,SOTON/O.Q. 3101315,13.775,,S
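
As an alternative to f-strings, `.query()` can also read Python variables directly when they are prefixed with `@`. The following sketch compares both variants on a made-up `df_demo` (an assumption for illustration, not the Titanic file).

```python
import pandas as pd

# Hypothetical mini sample, NOT the real Titanic data
df_demo = pd.DataFrame({
    "Name": ["Anna", "Ben", "Clara", "David", "Eva"],
    "Sex": ["female", "male", "female", "male", "female"],
    "Age": [2.0, 1.0, 30.0, 45.0, 9.0],
})

altersgrenze = 10
geschlecht = "female"

# Variant 1: values are pasted into the string by the f-string
f_string_result = df_demo.query(f'Age <= {altersgrenze} & Sex == "{geschlecht}"')

# Variant 2: @name refers to a Python variable, no manual quoting needed
at_result = df_demo.query("Age <= @altersgrenze & Sex == @geschlecht")

print(at_result)
```

With `@` you do not have to put quotes around string values yourself, which tends to help when the values come from user input.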


---

### Dynamic query with `.query()`, `kwargs` & `inplace`


In [77]:
altersgrenze = 10
geschlecht = "female"
kwargs = "& Pclass == 3"  # optional extra condition, appended to the query string
# kwargs = ""  # alternative: no extra condition

In [78]:
query = f'Age <= {altersgrenze} & Sex == "{geschlecht}" {kwargs}'

In [79]:
df.query(query)

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
117,1009,1,3,"Sandstrom, Miss. Beatrice Irene",female,1.0,1,1,PP 9549,16.7,G6,S
140,1032,1,3,"Goodwin, Miss. Jessie Allis",female,10.0,5,2,CA 2144,46.9,,S
263,1155,1,3,"Klasen, Miss. Gertrud Emilia",female,1.0,1,1,350405,12.1833,,S
283,1175,1,3,"Touma, Miss. Maria Youssef",female,9.0,1,1,2650,15.2458,,C
284,1176,1,3,"Rosblom, Miss. Salli Helena",female,2.0,1,1,370129,20.2125,,S
354,1246,1,3,"Dean, Miss. Elizabeth Gladys Millvina""""",female,0.17,1,2,C.A. 2315,20.575,,S
409,1301,1,3,"Peacock, Miss. Treasteall",female,3.0,1,1,SOTON/O.Q. 3101315,13.775,,S
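
If there are several optional conditions, appending string fragments by hand can get fiddly because of the leading `&`. One possible approach, sketched here with a hypothetical helper `build_query` and a made-up `df_demo`, is to collect the conditions in a list and join them at the end.

```python
import pandas as pd

# Hypothetical mini sample, NOT the real Titanic data
df_demo = pd.DataFrame({
    "Name": ["Anna", "Ben", "Clara", "Eva"],
    "Sex": ["female", "male", "female", "female"],
    "Age": [2.0, 1.0, 30.0, 9.0],
    "Pclass": [3, 3, 1, 2],
})


def build_query(age_limit, sex, pclass=None):
    """Hypothetical helper: combine the required and optional conditions."""
    conditions = [f"Age <= {age_limit}", f'Sex == "{sex}"']
    if pclass is not None:
        # Only added when a passenger class is actually requested
        conditions.append(f"Pclass == {pclass}")
    return " & ".join(conditions)


query_with_class = build_query(10, "female", pclass=3)
query_without_class = build_query(10, "female")

print(query_with_class)
print(df_demo.query(query_with_class))
print(df_demo.query(query_without_class))
```

Because the `&` is only inserted between existing conditions, the query string stays valid no matter which optional parts are present.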


In [80]:
# inplace=False (the default) returns a new DataFrame and leaves df unchanged
df_girls = df.query(query, inplace=False)

In [81]:
df_girls

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
117,1009,1,3,"Sandstrom, Miss. Beatrice Irene",female,1.0,1,1,PP 9549,16.7,G6,S
140,1032,1,3,"Goodwin, Miss. Jessie Allis",female,10.0,5,2,CA 2144,46.9,,S
263,1155,1,3,"Klasen, Miss. Gertrud Emilia",female,1.0,1,1,350405,12.1833,,S
283,1175,1,3,"Touma, Miss. Maria Youssef",female,9.0,1,1,2650,15.2458,,C
284,1176,1,3,"Rosblom, Miss. Salli Helena",female,2.0,1,1,370129,20.2125,,S
354,1246,1,3,"Dean, Miss. Elizabeth Gladys Millvina""""",female,0.17,1,2,C.A. 2315,20.575,,S
409,1301,1,3,"Peacock, Miss. Treasteall",female,3.0,1,1,SOTON/O.Q. 3101315,13.775,,S
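
The difference between `inplace=False` and `inplace=True` can be made visible with a short sketch. The `df_demo` below is again a made-up sample, and the variable names are chosen only for illustration.

```python
import pandas as pd

# Hypothetical mini sample, NOT the real Titanic data
df_demo = pd.DataFrame({
    "Name": ["Anna", "Clara", "Eva"],
    "Age": [2.0, 30.0, 9.0],
})

# inplace=False (the default): a new DataFrame is returned, df_demo is untouched
filtered = df_demo.query("Age <= 10", inplace=False)
print(len(df_demo), len(filtered))

# inplace=True: the DataFrame itself is filtered and the call returns None
working_copy = df_demo.copy()
return_value = working_copy.query("Age <= 10", inplace=True)
print(return_value, len(working_copy))
```

Since `inplace=True` returns `None`, assigning its result to a variable (as with `df_girls` above) only makes sense with `inplace=False`.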
