# **loc and iloc in pandas for selecting data**

`loc` and `iloc` are two important indexers in pandas used for selecting data. They are accessed with square brackets rather than called like functions.
- `loc` is label-based: you specify rows and columns by their row and column labels.
- `iloc` is integer-position-based: you specify rows and columns by their integer position, just as you would index a 2D NumPy array.
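
To make the distinction concrete, here is a minimal sketch on a small made-up DataFrame (the name `toy` and its values are illustrative assumptions, not part of the Titanic data). It uses a string index so that labels and positions cannot be confused.

```python
import pandas as pd

# Hypothetical toy data; the string index makes labels differ from positions.
toy = pd.DataFrame(
    {"age": [22, 38, 26], "fare": [7.25, 71.28, 7.93]},
    index=["a", "b", "c"],
)

by_label = toy.loc["b", "fare"]   # row labelled "b", column labelled "fare"
by_position = toy.iloc[1, 1]      # second row, second column (zero-based)

print(by_label, by_position)
```

Both expressions pick the same cell; only the way it is addressed differs.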

`loc` is usually easier to read than `iloc` because the labels state what is being selected; `iloc` is the better fit when only the position of the data matters.

> **`loc`** means *location* and **`iloc`** means *integer location*.

## When to use `loc` and `iloc`?

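- `loc` is used when you want to access a group of rows and columns by label(s).
  - Example:
    - `df.loc[0:5, ['column1','column2']]` returns the rows labelled 0 through 5 of column1 and column2. Label slices include both endpoints, so with the default integer index that is 6 rows, not 5.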
- `iloc` is used when you want to access a group of rows and columns by their integer index.
  - Example:
    - `df.iloc[0:5, 0:2]` returns the first 5 rows of the first 2 columns.
    - `0:5` means positions 0 to 4; the end position 5 is excluded, so rows 0, 1, 2, 3, 4 are returned.
    - `0:2` means positions 0 to 1; the end position 2 is excluded, so columns 0 and 1 are returned.
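
The different slice endpoints are easy to check on a small frame. The sketch below assumes a made-up DataFrame `demo` with the default integer index; the column names are placeholders.

```python
import pandas as pd

# Hypothetical frame with the default integer index 0..9.
demo = pd.DataFrame({"x": range(10), "y": range(10, 20), "z": range(20, 30)})

label_slice = demo.loc[0:5, ["x", "y"]]   # labels 0 through 5, end label included
position_slice = demo.iloc[0:5, 0:2]      # positions 0 through 4, end position excluded

print(len(label_slice), len(position_slice))
```

With this data the script prints `6 5`: the same `0:5` slice yields one more row under `loc` than under `iloc`.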

In [1]:
import pandas as pd

df = pd.read_csv('../00_datasets/titanic.csv')
df

Unnamed: 0,survived,pclass,sex,age,sibsp,parch,fare,embarked,class,who,adult_male,deck,embark_town,alive,alone
0,0,3,male,22.0,1,0,7.2500,S,Third,man,True,,Southampton,no,False
1,1,1,female,38.0,1,0,71.2833,C,First,woman,False,C,Cherbourg,yes,False
2,1,3,female,26.0,0,0,7.9250,S,Third,woman,False,,Southampton,yes,True
3,1,1,female,35.0,1,0,53.1000,S,First,woman,False,C,Southampton,yes,False
4,0,3,male,35.0,0,0,8.0500,S,Third,man,True,,Southampton,no,True
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
886,0,2,male,27.0,0,0,13.0000,S,Second,man,True,,Southampton,no,True
887,1,1,female,19.0,0,0,30.0000,S,First,woman,False,B,Southampton,yes,True
888,0,3,female,,1,2,23.4500,S,Third,woman,False,,Southampton,no,False
889,1,1,male,26.0,0,0,30.0000,C,First,man,True,C,Cherbourg,yes,True


In [5]:
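# Using loc: rows labelled 0 through 5, columns 'sex' through 'fare' (both ends included)
# For comparison, plain column selection would be: df[['sex', 'age']]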
df.loc[0:5, 'sex':'fare']

Unnamed: 0,sex,age,sibsp,parch,fare
0,male,22.0,1,0,7.25
1,female,38.0,1,0,71.2833
2,female,26.0,0,0,7.925
3,female,35.0,1,0,53.1
4,male,35.0,0,0,8.05
5,male,,0,0,8.4583


In [6]:
# The label slice 'fare':'deck' covers every column from 'fare' to 'deck', so it is equivalent to:
# df.loc[0:15, ['fare', 'embarked', 'class', 'who', 'adult_male', 'deck']]
df.loc[0:15, 'fare':'deck']

Unnamed: 0,fare,embarked,class,who,adult_male,deck
0,7.25,S,Third,man,True,
1,71.2833,C,First,woman,False,C
2,7.925,S,Third,woman,False,
3,53.1,S,First,woman,False,C
4,8.05,S,Third,man,True,
5,8.4583,Q,Third,man,True,
6,51.8625,S,First,man,True,E
7,21.075,S,Third,child,False,
8,11.1333,S,Third,woman,False,
9,30.0708,C,Second,child,False,


In [9]:
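# With loc, -50 is treated as a label, not as a position counted from the end.
# No row has that label, but the index is sorted, so the slice starts at the first existing label (0).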
df.loc[-50:10, 'fare':'deck']

Unnamed: 0,fare,embarked,class,who,adult_male,deck
0,7.25,S,Third,man,True,
1,71.2833,C,First,woman,False,C
2,7.925,S,Third,woman,False,
3,53.1,S,First,woman,False,C
4,8.05,S,Third,man,True,
5,8.4583,Q,Third,man,True,
6,51.8625,S,First,man,True,E
7,21.075,S,Third,child,False,
8,11.1333,S,Third,woman,False,
9,30.0708,C,Second,child,False,
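
The following sketch contrasts how negative numbers behave in the two indexers. It assumes a made-up frame `demo` with the default, sorted integer index; it is an illustration, not the Titanic data.

```python
import pandas as pd

# Hypothetical frame with the default integer index 0..19.
demo = pd.DataFrame({"x": range(100, 120)})

# loc: -50 is a label. It does not exist, but because the index is sorted
# the slice simply covers every existing label between -50 and 3.
label_slice = demo.loc[-50:3]

# iloc: -3 is a position counted from the end, so this is the last 3 rows.
tail_slice = demo.iloc[-3:]

print(list(label_slice.index))
print(list(tail_slice.index))
```

Here `label_slice` holds the rows labelled 0 to 3, while `tail_slice` holds the rows labelled 17 to 19. To count rows from the end, `iloc` is the indexer to reach for.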


In [10]:
# Using iloc: first 5 rows (positions 0-4) and first 3 columns (positions 0-2)
df.iloc[0:5, 0:3]

Unnamed: 0,survived,pclass,sex
0,0,3,male
1,1,1,female
2,1,3,female
3,1,1,female
4,0,3,male
