## Difference between loc and iloc in Pandas
loc is label-based, which means that we select rows and columns by their names (labels).
Example:
Let’s say we ask for the rows whose index is 1, 2 or 100. We will not necessarily get the first, second or hundredth row. Instead, we get the rows whose index label is 1, 2 or 100, and only if those labels exist in the index.

On the other hand, iloc is integer position-based. Here, we select rows and columns by their integer position, counted from 0.
Example:
Let’s say we ask for the rows at positions 1, 2 or 100. Because positions start at 0, iloc returns the second, third and 101st rows, regardless of the names or labels in the index of our dataset.
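
The distinction is easiest to see when the index labels do not match the row positions. The following is a minimal sketch with a small made-up frame (the name `df` and its index values are assumptions chosen for illustration, not part of the dataset used below):

```python
import pandas as pd

# Hypothetical frame whose index labels deliberately differ from row positions.
df = pd.DataFrame(
    {"age": [10, 22, 13, 21]},
    index=[100, 2, 1, 7],
)

# loc looks rows up by label: the rows whose index labels are 1 and 2.
by_label = df.loc[[1, 2]]

# iloc looks rows up by position: the second and third rows (positions 1 and 2).
by_position = df.iloc[[1, 2]]

print(by_label)
print(by_position)
```

Both calls receive the same list `[1, 2]`, yet they return the rows in a different order because one matches labels and the other counts positions.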

In [1]:
import pandas as pd
import numpy as np

In [3]:
data = pd.DataFrame({
    'age' :     [ 10, 22, 13, 21, 12, 11, 17],
    'section' : [ 'A', 'B', 'C', 'B', 'B', 'A', 'A'],
    'city' :    [ 'Gurgaon', 'Delhi', 'Mumbai', 'Delhi', 'Mumbai', 'Delhi', 'Mumbai'],
    'gender' :  [ 'M', 'F', 'F', 'M', 'M', 'M', 'F'],
    'favourite_color' : [ 'red', np.nan, 'yellow', np.nan, 'black', 'green', 'red']
})
dataset = data.copy()

In [9]:
dataset.head(20)

Unnamed: 0,age,section,city,gender,favourite_color
0,10,A,Gurgaon,M,red
1,22,B,Delhi,F,
2,13,C,Mumbai,F,yellow
3,21,B,Delhi,M,
4,12,B,Mumbai,M,black
5,11,A,Delhi,M,green
6,17,A,Mumbai,F,red


## Find all the rows based on any condition in a column
Filtering the data on a given condition is one of the most common operations when we’re exploring a dataset. For example, we might need to find all the rows in our dataset where age is more than x years, or the city is Delhi, and so on.

We can solve these types of queries with a single line of code using pandas.DataFrame.loc[]. We just need to pass the condition within the loc statement.

In [12]:
dataset.loc[dataset.age > 13]

Unnamed: 0,age,section,city,gender,favourite_color
1,22,B,Delhi,F,
3,21,B,Delhi,M,
6,17,A,Mumbai,F,red


In [19]:
dataset.loc[dataset['gender'] == 'F']

Unnamed: 0,age,section,city,gender,favourite_color
1,22,B,Delhi,F,
2,13,C,Mumbai,F,yellow
6,17,A,Mumbai,F,red


## Find all the rows with more than one condition

In [16]:
dataset.loc[(dataset.age > 13) & (dataset.gender == 'F')]

Unnamed: 0,age,section,city,gender,favourite_color
1,22,B,Delhi,F,
6,17,A,Mumbai,F,red
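
Conditions can be combined with operators other than `&`. The sketch below, which rebuilds a trimmed copy of the same data under the assumed name `df`, shows `|` (or), `~` (not) and `Series.isin`. Each comparison needs its own parentheses because `&`, `|` and `~` bind more tightly than `>` and `==` in Python.

```python
import pandas as pd

# Trimmed copy of the example data (name `df` is an assumption for this sketch).
df = pd.DataFrame({
    "age": [10, 22, 13, 21, 12, 11, 17],
    "city": ["Gurgaon", "Delhi", "Mumbai", "Delhi", "Mumbai", "Delhi", "Mumbai"],
    "gender": ["M", "F", "F", "M", "M", "M", "F"],
})

# OR: older than 13, or living in Gurgaon.
older_or_gurgaon = df.loc[(df["age"] > 13) | (df["city"] == "Gurgaon")]

# NOT: everyone who is not from Delhi.
not_delhi = df.loc[~(df["city"] == "Delhi")]

# Membership: isin() replaces a chain of == comparisons joined with |.
delhi_or_mumbai = df.loc[df["city"].isin(["Delhi", "Mumbai"])]

print(older_or_gurgaon)
print(not_delhi)
print(delhi_or_mumbai)
```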


## Select a range of rows using loc

In [17]:
dataset.loc[1:3]

Unnamed: 0,age,section,city,gender,favourite_color
1,22,B,Delhi,F,
2,13,C,Mumbai,F,yellow
3,21,B,Delhi,M,


## Select only required columns with a condition
We can also return only the columns we need for the rows that satisfy our condition.


In [20]:
dataset.loc[(dataset.age > 13), ['city','gender']]

Unnamed: 0,city,gender
1,Delhi,F
3,Delhi,M
6,Mumbai,F


## Update the values of a particular column on selected rows

In [21]:
dataset.loc[(dataset.age > 13), ['section']] = 'M'
dataset.head(12)

Unnamed: 0,age,section,city,gender,favourite_color
0,10,A,Gurgaon,M,red
1,22,M,Delhi,F,
2,13,C,Mumbai,F,yellow
3,21,M,Delhi,M,
4,12,B,Mumbai,M,black
5,11,A,Delhi,M,green
6,17,M,Mumbai,F,red
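
Note that the update is applied to dataset, which was created with data.copy(), so the original data frame keeps its values. A minimal sketch of that behaviour, using a shortened version of the frame (the three-row data here is an assumption for brevity):

```python
import pandas as pd

# Shortened stand-in for the example data.
data = pd.DataFrame({"age": [10, 22, 13], "section": ["A", "B", "C"]})
dataset = data.copy()  # independent copy; edits do not propagate back to `data`

# Assign through loc: row selector first, column label second.
dataset.loc[dataset["age"] > 13, "section"] = "M"

print(dataset)
print(data)
```

Selecting the rows and the column in a single loc call, as above, writes directly into dataset rather than into a temporary intermediate object.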


## Update the values of multiple columns on selected rows

In [22]:
dataset.loc[(dataset.age > 13), ['section','favourite_color']] = ('M','Orange')

In [24]:
dataset.head(12)

Unnamed: 0,age,section,city,gender,favourite_color
0,10,A,Gurgaon,M,red
1,22,M,Delhi,F,Orange
2,13,C,Mumbai,F,yellow
3,21,M,Delhi,M,Orange
4,12,B,Mumbai,M,black
5,11,A,Delhi,M,green
6,17,M,Mumbai,F,Orange


## Select rows with indices using iloc

In [25]:
data.iloc[[0,2]]

Unnamed: 0,age,section,city,gender,favourite_color
0,10,A,Gurgaon,M,red
2,13,C,Mumbai,F,yellow


## Select rows with particular indices and particular columns

In [26]:
data.iloc[[0,2],[1,3]]

Unnamed: 0,section,gender
0,A,M
2,C,F
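
With iloc, columns are also addressed by position, so 1 and 3 refer to the second and fourth columns. One possible way to check which names those positions correspond to is to index `DataFrame.columns` with the same positions and pass the result to loc. This is a sketch on a shortened frame (the name `df` and the three rows are assumptions):

```python
import pandas as pd

# Shortened stand-in with the same column order as the example data.
df = pd.DataFrame({
    "age": [10, 22, 13],
    "section": ["A", "B", "C"],
    "city": ["Gurgaon", "Delhi", "Mumbai"],
    "gender": ["M", "F", "F"],
})

# Select by position: rows 0 and 2, columns 1 and 3.
by_position = df.iloc[[0, 2], [1, 3]]

# Translate column positions 1 and 3 into their names, then select by label.
column_names = df.columns[[1, 3]]
by_label = df.loc[[0, 2], column_names]

print(list(column_names))
print(by_label.equals(by_position))
```

The two selections match here only because the default index labels (0, 1, 2) happen to coincide with the row positions.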


## Select a range of rows using iloc

In [27]:
data.iloc[1:3]

Unnamed: 0,age,section,city,gender,favourite_color
1,22,B,Delhi,F,
2,13,C,Mumbai,F,yellow
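
The same slice 1:3 returned three rows with loc earlier but only two rows here. A loc slice is by label and includes the end label, while an iloc slice is by position and excludes the end position, like ordinary Python slicing. A small sketch of the difference (the frame `df` is an assumption for illustration):

```python
import pandas as pd

df = pd.DataFrame({"age": [10, 22, 13, 21, 12]})

label_slice = df.loc[1:3]      # labels 1, 2 and 3: end label is included
position_slice = df.iloc[1:3]  # positions 1 and 2: end position is excluded

print(len(label_slice), len(position_slice))
```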


## Select a range of rows and columns using iloc

In [29]:
data.iloc[1:3,2:4]

Unnamed: 0,city,gender
1,Delhi,F
2,Mumbai,F
