# Filtering the DataFrame with Boolean Arrays (Masks)

- Keep [this cheat sheet](https://www.craft.do/s/G80r1dqrQKrjTb/b/F80131CD-4914-414F-8B93-C03B5D1AFCD5/DataFrame) open while you work through the following exercises.

In this chapter, you will learn how to select specific rows of a DataFrame (masking) based on the condition stated in each exercise.

Framework for masking a DataFrame:

1. Identify the column that the condition refers to
2. Access the column: `df.column`
3. Compare the column values against the condition: `df.column == value`. Available operators:
    1. Equal `==`
    2. Not equal `!=`
    3. Greater `>`
    4. Greater or equal `>=`
    5. Less `<`
    6. Less or equal `<=`
4. Save the resulting boolean array as the mask: `mask = df.column == value`
5. Filter the DataFrame with the mask: `df[mask]`

```python
df.column
df.column == value
mask = df.column == value
df[mask]
```
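
The five steps above can be sketched end to end on a tiny table. The DataFrame `df_toy` and its values are made up for illustration only; they are not taken from the Champions League dataset used below.

```python
import pandas as pd

# Hypothetical toy data, only for illustration
df_toy = pd.DataFrame(
    {"team": ["A", "A", "B", "C"], "goals": [3, 0, 5, 1]},
    index=["p1", "p2", "p3", "p4"],
)

# Steps 1-2: pick the column the condition refers to
goals = df_toy.goals

# Steps 3-4: comparing the column yields a boolean Series, stored as the mask
mask = goals >= 3

# Step 5: keep only the rows where the mask is True
filtered = df_toy[mask]

print(mask)
print(filtered)
```

The mask has one `True`/`False` entry per row and shares the index of the DataFrame, which is how `df_toy[mask]` knows which rows to keep.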

## Load the data

The data is taken from [this Kaggle dataset](https://www.kaggle.com/datasets/azminetoushikwasi/ucl-202122-uefa-champions-league?select=goals.csv).

In [1]:
import pandas as pd

df_players = pd.read_csv('key_stats.csv', index_col='player_name')
df_players

| player_name | club | position | minutes_played | match_played | goals | assists | distance_covered |
|---|---|---|---|---|---|---|---|
| Courtois | Real Madrid | Goalkeeper | 1230 | 13 | 0 | 0 | 64.2 |
| Vinícius Júnior | Real Madrid | Forward | 1199 | 13 | 4 | 6 | 133.0 |
| Benzema | Real Madrid | Forward | 1106 | 12 | 15 | 1 | 121.5 |
| Modrić | Real Madrid | Midfielder | 1077 | 13 | 0 | 4 | 124.5 |
| Éder Militão | Real Madrid | Defender | 1076 | 12 | 0 | 0 | 110.4 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| Gil Dias | Benfica | Midfielder | 1 | 1 | 0 | 0 | 0.7 |
| Rodrigo Ribeiro | Sporting CP | Forward | 1 | 1 | 0 | 0 | 0.7 |
| Cojocari | Sheriff | Defender | 1 | 1 | 0 | 0 | 0.5 |
| Maouassa | Club Brugge | Defender | 1 | 1 | 0 | 0 | 0.2 |


## Simple conditions

### Players who scored 10 or more goals

In [3]:
mask_goals = df_players.goals >= 10

In [7]:
df_players[mask_goals]

| player_name | club | position | minutes_played | match_played | goals | assists | distance_covered |
|---|---|---|---|---|---|---|---|
| Benzema | Real Madrid | Forward | 1106 | 12 | 15 | 1 | 121.5 |
| Lewandowski | Bayern | Forward | 876 | 10 | 13 | 3 | 99.7 |
| Haller | Ajax | Forward | 668 | 8 | 11 | 1 | 82.2 |


### Players who assisted 5 or more times

In [10]:
mask_assist = df_players.assists >= 5

In [12]:
mask_assist

player_name
Courtois           False
Vinícius Júnior     True
Benzema            False
Modrić             False
Éder Militão       False
                   ...  
Gil Dias           False
Rodrigo Ribeiro    False
Cojocari           False
Maouassa           False
Zesiger            False
Name: assists, Length: 747, dtype: bool
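
Because `True` counts as 1 and `False` as 0, a mask can be summed to count the matching rows before filtering. A minimal sketch, assuming a made-up `assists` Series (the values are hypothetical, not from the dataset):

```python
import pandas as pd

# Hypothetical toy data, only for illustration
assists = pd.Series([0, 6, 1, 5], index=["p1", "p2", "p3", "p4"], name="assists")

mask = assists >= 5

# Number of rows that satisfy the condition (count of True values)
n_matches = mask.sum()

# Fraction of rows that satisfy the condition
share = mask.mean()

print(n_matches, share)
```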

In [14]:
df_players[mask_assist]

| player_name | club | position | minutes_played | match_played | goals | assists | distance_covered |
|---|---|---|---|---|---|---|---|
| Vinícius Júnior | Real Madrid | Forward | 1199 | 13 | 4 | 6 | 133.0 |
| Sané | Bayern | Midfielder | 798 | 10 | 6 | 6 | 94.0 |
| Antony | Ajax | Forward | 577 | 7 | 2 | 5 | 65.1 |
| Bruno Fernandes | Man. United | Midfielder | 520 | 7 | 0 | 7 | 58.4 |


## Multiple conditions

### Filter the goalkeepers who gave at least one assist

In [17]:
# Position labels are capitalized in the data and the comparison is case-sensitive
mask_goalkeeper = df_players.position == 'Goalkeeper'

In [19]:
mask_assist = df_players.assists > 0

In [22]:
# `&` keeps only the rows where both masks are True
df_players[mask_goalkeeper & mask_assist]
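
Combining two conditions with `&` can be sketched on a small made-up table. The DataFrame `df_toy` and its values are hypothetical. The sketch also shows two common pitfalls: inline comparisons need parentheses because `&` binds more tightly than `==` or `>`, and string comparisons are case-sensitive.

```python
import pandas as pd

# Hypothetical toy data, only for illustration
df_toy = pd.DataFrame(
    {
        "position": ["Goalkeeper", "Forward", "Goalkeeper", "Defender"],
        "assists": [1, 3, 0, 2],
    },
    index=["p1", "p2", "p3", "p4"],
)

# One mask per condition, then combine them with & (logical AND)
mask_position = df_toy.position == "Goalkeeper"
mask_assists = df_toy.assists > 0
both = df_toy[mask_position & mask_assists]

# Same filter written inline: each comparison must be wrapped in parentheses
inline = df_toy[(df_toy.position == "Goalkeeper") & (df_toy.assists > 0)]

# A lowercase label matches nothing here, so the result is empty
no_match = df_toy[df_toy.position == "goalkeeper"]

print(both)
print(len(no_match))
```

Naming each mask separately, as in the exercises, also makes it easier to spot when the same mask has been combined with itself by mistake.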


### Forwards with at least 700 minutes played

In [31]:
# One mask per condition: position and minutes played
mask_forward = df_players.position == 'Forward'
mask_minutes = df_players.minutes_played >= 700

In [32]:
df_players[mask_forward & mask_minutes]

The same kind of filter can also be written with `DataFrame.query`, where `@` refers to a Python variable. The following query selects the goalkeepers with no assists:

In [26]:
posicion = "Goalkeeper"  # position to match (labels are capitalized in the data)

In [29]:
asistencias = 0  # maximum number of assists

In [33]:
df_players.query('position == @posicion & assists <= @asistencias')
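
To see how `query` relates to masks, the sketch below runs the same filter both ways on a made-up table. The DataFrame `df_toy` and the variable names `position_wanted` and `max_assists` are hypothetical and chosen only for this illustration.

```python
import pandas as pd

# Hypothetical toy data, only for illustration
df_toy = pd.DataFrame(
    {
        "position": ["Goalkeeper", "Forward", "Goalkeeper", "Defender"],
        "assists": [1, 3, 0, 2],
    },
    index=["p1", "p2", "p3", "p4"],
)

position_wanted = "Goalkeeper"
max_assists = 0

# query: column names are written bare, Python variables are prefixed with @
by_query = df_toy.query("position == @position_wanted and assists <= @max_assists")

# Equivalent filter built from boolean masks
by_mask = df_toy[(df_toy.position == position_wanted) & (df_toy.assists <= max_assists)]

print(by_query)
print(by_mask)
```

Both approaches select the same rows; `query` tends to read more compactly when several conditions are chained, while explicit masks are easier to inspect one at a time.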


### Real Madrid players who scored

### FC Barcelona players who scored

### Real Madrid players who scored and assisted

### FC Barcelona players who scored and assisted

### Defenders who scored and assisted

## Combine masks with unions and intersections
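
As a warm-up for the exercises in this section, here is a minimal sketch of the three ways masks can be combined: `|` for a union (either condition), `&` for an intersection (both conditions) and `~` for a negation. The DataFrame `df_toy` and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical toy data, only for illustration
df_toy = pd.DataFrame(
    {"goals": [2, 0, 0, 1], "assists": [0, 3, 0, 1]},
    index=["p1", "p2", "p3", "p4"],
)

mask_scored = df_toy.goals > 0
mask_assisted = df_toy.assists > 0

# Union: rows where at least one mask is True
either = df_toy[mask_scored | mask_assisted]

# Intersection: rows where both masks are True
both = df_toy[mask_scored & mask_assisted]

# Negation: rows where neither condition holds
neither = df_toy[~(mask_scored | mask_assisted)]

print(either)
print(both)
print(neither)
```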

### FC Barcelona players who scored or assisted

### Liverpool players who scored or assisted