# Selecting Data
- `table[]`: bracket selection
- `table.loc[]`: label-based selection
- `table.iloc[]`: integer-based (positional) selection

In [344]:
import pandas as pd

# Load the dataset
filename = '../data/Pokemon.csv'
PKMN = pd.read_csv(filename)

## Bracket Selection
- Syntax: `X[selector]`
- To select columns, `selector` can be a column label or a list of column labels
- To select rows, `selector` can be a slice or a boolean mask
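
The four selector forms can be tried side by side on a small hand-built frame. This is only a sketch: the name `toy` is hypothetical and the frame holds just a few columns from the first four rows of the dataset.

```python
import pandas as pd

# Toy frame (hypothetical name `toy`): first four rows of the dataset, three columns only
toy = pd.DataFrame({
    'Name': ['Bulbasaur', 'Ivysaur', 'Venusaur', 'VenusaurMega Venusaur'],
    'Type 1': ['Grass'] * 4,
    'Attack': [49, 62, 82, 100],
})

one_column = toy['Name']               # column label    -> Series
two_columns = toy[['Name', 'Attack']]  # list of labels  -> DataFrame
first_rows = toy[:2]                   # slice           -> rows at positions 0 and 1
strong = toy[toy['Attack'] >= 80]      # boolean mask    -> rows where the mask is True

print(type(one_column).__name__, two_columns.shape, len(first_rows), list(strong['Name']))
```

Note that a single label returns a Series, while a list of labels returns a DataFrame even when the list has one element.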

### Single Column

In [345]:
PKMN['Name']  # Selecting a single column by label

0                  Bulbasaur
1                    Ivysaur
2                   Venusaur
3      VenusaurMega Venusaur
4                 Charmander
               ...          
795                  Diancie
796      DiancieMega Diancie
797      HoopaHoopa Confined
798       HoopaHoopa Unbound
799                Volcanion
Name: Name, Length: 800, dtype: object

### Multiple Columns

In [346]:
PKMN[['Name', 'Type 1', 'Type 2']] # Selecting multiple columns

,Name,Type 1,Type 2
0,Bulbasaur,Grass,Poison
1,Ivysaur,Grass,Poison
2,Venusaur,Grass,Poison
3,VenusaurMega Venusaur,Grass,Poison
4,Charmander,Fire,
...,...,...,...
795,Diancie,Rock,Fairy
796,DiancieMega Diancie,Rock,Fairy
797,HoopaHoopa Confined,Psychic,Ghost
798,HoopaHoopa Unbound,Psychic,Dark


### Slice Indexing

In [347]:
PKMN[:4]  # First 4 rows

,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
0,1,Bulbasaur,Grass,Poison,318,45,49,49,65,65,45,1,False
1,2,Ivysaur,Grass,Poison,405,60,62,63,80,80,60,1,False
2,3,Venusaur,Grass,Poison,525,80,82,83,100,100,80,1,False
3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,122,120,80,1,False


### Boolean Mask

In [348]:
PKMN[PKMN['Attack'] >= 170] # Boolean mask for Attack >= 170

,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
163,150,MewtwoMega Mewtwo X,Psychic,Fighting,780,106,190,100,154,100,130,1,True
232,214,HeracrossMega Heracross,Bug,Fighting,600,80,185,115,40,105,75,2,False
424,383,GroudonPrimal Groudon,Ground,Fire,770,100,180,160,150,90,90,3,True
426,384,RayquazaMega Rayquaza,Dragon,Flying,780,105,180,100,180,100,115,3,True
429,386,DeoxysAttack Forme,Psychic,,600,50,180,20,180,20,150,3,True
494,445,GarchompMega Garchomp,Dragon,Ground,700,108,170,115,120,95,92,4,False
711,646,KyuremBlack Kyurem,Dragon,Ice,700,125,170,100,120,90,95,5,True
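
It can help to look at the mask on its own before using it as a selector. The sketch below assumes a hypothetical toy frame `toy` with a lower threshold than the cell above, so that the small sample has matches.

```python
import pandas as pd

# Toy frame (hypothetical): Attack values taken from the first four rows of the dataset
toy = pd.DataFrame({
    'Name': ['Bulbasaur', 'Ivysaur', 'Venusaur', 'VenusaurMega Venusaur'],
    'Attack': [49, 62, 82, 100],
})

mask = toy['Attack'] >= 80   # boolean Series that shares the frame's index
print(mask.tolist())         # [False, False, True, True]
print(int(mask.sum()))       # True counts as 1, so this is the number of matches: 2

selected = toy[mask]         # keeps only the rows where the mask is True
print(list(selected.index))  # original index labels are preserved: [2, 3]
```

Because the selected rows keep their original index labels, the output above shows labels such as 163 and 232 rather than a fresh 0, 1, 2 numbering.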


## Label Based Selection with `loc[]`
- Syntax: `X.loc[rowselector, columnselector]`, where `columnselector` is optional
- `rowselector` can be a single index label (string, int, date, etc.), a list of labels, a slice of labels, or a boolean mask
- `columnselector` can be a column label or a list of column labels
- `.loc` is a more general and powerful alternative to `[]`, with one difference to keep in mind: a `.loc` slice is based on index labels rather than integer positions, and it includes both endpoints
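
The slicing difference is easiest to see on a small frame. A minimal sketch, assuming a hypothetical toy frame `toy` built from the first four rows of the dataset:

```python
import pandas as pd

# Toy frame (hypothetical) with the default integer index 0..3
toy = pd.DataFrame({
    'Name': ['Bulbasaur', 'Ivysaur', 'Venusaur', 'VenusaurMega Venusaur'],
    'Attack': [49, 62, 82, 100],
})

by_position = toy[0:2]    # positional slice: rows 0 and 1, end excluded
by_label = toy.loc[0:2]   # label slice: rows labelled 0, 1 and 2, end included
print(len(by_position), len(by_label))  # 2 3

# With a string index, a .loc slice runs from one label to the other, inclusive
named = toy.set_index('Name')
label_slice = named.loc['Ivysaur':'Venusaur', 'Attack']
print(label_slice.tolist())  # [62, 82]
```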

In [349]:
PKMN.loc[0:3] # Selecting with a slice of indexes

,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
0,1,Bulbasaur,Grass,Poison,318,45,49,49,65,65,45,1,False
1,2,Ivysaur,Grass,Poison,405,60,62,63,80,80,60,1,False
2,3,Venusaur,Grass,Poison,525,80,82,83,100,100,80,1,False
3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,122,120,80,1,False


In [350]:
PKMN.loc[0] # Selecting a single row by index

#                     1
Name          Bulbasaur
Type 1            Grass
Type 2           Poison
Total               318
HP                   45
Attack               49
Defense              49
Sp. Atk              65
Sp. Def              65
Speed                45
Generation            1
Legendary         False
Name: 0, dtype: object

In [351]:
PKMN.loc[[0, 7, 700]] # Selecting multiple rows by index

,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
0,1,Bulbasaur,Grass,Poison,318,45,49,49,65,65,45,1,False
7,6,CharizardMega Charizard X,Fire,Dragon,634,78,130,111,130,85,100,1,False
700,639,Terrakion,Rock,Fighting,580,91,129,90,72,90,108,5,True


In [352]:
PKMN_2 = PKMN.set_index('Name')
PKMN_2.loc[['Bulbasaur', 'Mew', 'Pikachu']]  # Selecting multiple rows by label

Name,#,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
Bulbasaur,1,Grass,Poison,318,45,49,49,65,65,45,1,False
Mew,151,Psychic,,600,100,100,100,100,100,100,1,False
Pikachu,25,Electric,,320,35,55,40,50,50,90,1,False
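
Two details of this pattern are worth checking on a small frame: `set_index` returns a new DataFrame rather than modifying the original, and asking `.loc` for a label that is not in the index raises a `KeyError`. The sketch below uses a hypothetical toy frame, and `'Missingno'` is a made-up label chosen because it is absent from the index.

```python
import pandas as pd

# Toy frame (hypothetical): three rows taken from the dataset
toy = pd.DataFrame({
    'Name': ['Bulbasaur', 'Ivysaur', 'Venusaur'],
    'Total': [318, 405, 525],
})

by_name = toy.set_index('Name')          # new frame; `toy` keeps its integer index
print(list(toy.columns))                 # 'Name' is still a column in the original
print(by_name.loc['Ivysaur', 'Total'])   # single row label and column label -> scalar

# A list containing a label that is not in the index raises KeyError
try:
    by_name.loc[['Bulbasaur', 'Missingno']]
    missing_raised = False
except KeyError:
    missing_raised = True
print(missing_raised)
```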


In [353]:
PKMN_2.loc[['Arcanine', 'Charizard', 'Blaziken'], ['Type 1', 'Type 2']]  # Selecting multiple rows and columns by label

Name,Type 1,Type 2
Arcanine,Fire,
Charizard,Fire,Flying
Blaziken,Fire,Fighting


In [354]:
PKMN.loc[(PKMN['Type 1']=='Fire') & (PKMN['Type 2']=='Flying'), ['Name', 'Generation']] # Selecting rows with a condition and specific columns

,Name,Generation
6,Charizard,1
8,CharizardMega Charizard Y,1
158,Moltres,1
270,Ho-oh,2
730,Fletchinder,6
731,Talonflame,6
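
Conditions are combined element-wise with `&` (and), `|` (or) and `~` (not). Each comparison needs its own parentheses because these operators bind more tightly than `==`. A minimal sketch on a hypothetical toy frame, with the masks stored in variables (`is_fire`, `is_flying`) purely for readability:

```python
import pandas as pd

# Toy frame (hypothetical): four rows from the dataset; Charmander has no Type 2
toy = pd.DataFrame({
    'Name': ['Bulbasaur', 'Charmander', 'Charizard', 'CharizardMega Charizard X'],
    'Type 1': ['Grass', 'Fire', 'Fire', 'Fire'],
    'Type 2': ['Poison', None, 'Flying', 'Dragon'],
})

is_fire = toy['Type 1'] == 'Fire'
is_flying = toy['Type 2'] == 'Flying'   # a missing Type 2 compares as False

fire_and_flying = toy.loc[is_fire & is_flying, 'Name']
fire_not_flying = toy.loc[is_fire & ~is_flying, 'Name']
grass_or_flying = toy.loc[(toy['Type 1'] == 'Grass') | is_flying, 'Name']

print(fire_and_flying.tolist())   # ['Charizard']
print(fire_not_flying.tolist())   # ['Charmander', 'CharizardMega Charizard X']
print(grass_or_flying.tolist())   # ['Bulbasaur', 'Charizard']
```

Python's `and`, `or` and `not` do not work here, since they try to reduce a whole Series to a single truth value.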


## Integer Based Selection with `iloc[]`
- Syntax: `X.iloc[rowselector, columnselector]`, where `columnselector` is optional
- `rowselector` and `columnselector` can each be an integer position, a list of integer positions, or a slice; slices exclude the end position, as in plain Python
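
Each selector form can be checked on a small frame. The sketch below assumes a hypothetical toy frame `toy` indexed by name, to show that `.iloc` ignores the index labels and works on positions only.

```python
import pandas as pd

# Toy frame (hypothetical): first four rows of the dataset, indexed by Name
toy = pd.DataFrame({
    'Name': ['Bulbasaur', 'Ivysaur', 'Venusaur', 'VenusaurMega Venusaur'],
    'Type 1': ['Grass'] * 4,
    'Total': [318, 405, 525, 625],
}).set_index('Name')

first_row = toy.iloc[0]          # single integer -> Series for the row at position 0
picked = toy.iloc[[0, 2], [1]]   # lists -> rows at positions 0 and 2, column at position 1
head = toy.iloc[:2, :1]          # slices -> first 2 rows, first column (end excluded)
last_total = toy.iloc[-1, 1]     # negative positions count from the end

print(first_row['Total'], list(picked.index), head.shape, last_total)
```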

In [355]:
PKMN_2.iloc[:10, :3]  # Selecting the first 10 rows and first 3 columns

Name,#,Type 1,Type 2
Bulbasaur,1,Grass,Poison
Ivysaur,2,Grass,Poison
Venusaur,3,Grass,Poison
VenusaurMega Venusaur,3,Grass,Poison
Charmander,4,Fire,
Charmeleon,5,Fire,
Charizard,6,Fire,Flying
CharizardMega Charizard X,6,Fire,Dragon
CharizardMega Charizard Y,6,Fire,Flying
Squirtle,7,Water,
