# Difference between loc and iloc
###### Resource: https://www.analyticsvidhya.com/blog/2020/02/loc-iloc-pandas/?

### loc is label-based, which means that we have to specify the names (labels) of the rows and columns that we want to select.
'''
For example, let’s say we search for the rows whose index is 1, 2 or 100. loc does not treat these numbers as row positions,
so they do not pick rows by counting from the top. Instead, we get results only for rows whose index label is 1, 2 or 100.
So, we can filter the data using loc in Pandas even if the index labels in our dataset are not integers.

'''
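
The label lookup described above can be sketched on a small made-up frame. The frames `df` and `named`, their index labels, and the names in them are assumptions chosen for illustration; they are not part of the dataset used below.

```python
import pandas as pd

# Hypothetical frame whose index labels are 100, 2 and 1 (labels, not positions).
df = pd.DataFrame({'age': [10, 22, 13]}, index=[100, 2, 1])

# loc looks rows up by label, so this returns the rows labelled 1 and 100,
# in the order the labels were requested.
by_label = df.loc[[1, 100]]

# Labels do not have to be integers: here the index holds strings.
named = pd.DataFrame({'age': [10, 22]}, index=['asha', 'ravi'])
ravi_age = named.loc['ravi', 'age']
```

Here `by_label` holds the last and first physical rows of `df`, because those are the rows carrying the labels 1 and 100.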


### On the other hand, iloc is integer position-based. So here, we have to specify rows and columns by their zero-based integer position.

'''
Let’s say we search for the rows at positions 1, 2 or 100. Because positions start at 0, iloc will return the second, third
and hundred-and-first rows, regardless of the names or labels we have in the index of our dataset.

'''
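
A minimal sketch of positional lookup, again on a made-up frame (the frame `df` and its index labels are assumptions for illustration only):

```python
import pandas as pd

# Hypothetical frame whose index labels (100, 2, 1) differ from the row positions.
df = pd.DataFrame({'age': [10, 22, 13]}, index=[100, 2, 1])

# iloc counts rows from 0, so positions 1 and 2 are the second and third rows,
# whatever their labels happen to be.
by_position = df.iloc[[1, 2]]

# Position 0 is always the first physical row (labelled 100 in this frame).
first_row = df.iloc[0]
```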

In [2]:
import pandas as pd
import numpy as np


In [3]:
data = pd.DataFrame({
    'age' :     [ 10, 22, 13, 21, 12, 11, 17],
    'section' : [ 'A', 'B', 'C', 'B', 'B', 'A', 'A'],
    'city' :    [ 'Gurgaon', 'Delhi', 'Mumbai', 'Delhi', 'Mumbai', 'Delhi', 'Mumbai'],
    'gender' :  [ 'M', 'F', 'F', 'M', 'M', 'M', 'F'],
    'favourite_color' : [ 'red', np.nan, 'yellow', np.nan, 'black', 'green', 'red']
})

In [4]:
data

Unnamed: 0,age,section,city,gender,favourite_color
0,10,A,Gurgaon,M,red
1,22,B,Delhi,F,
2,13,C,Mumbai,F,yellow
3,21,B,Delhi,M,
4,12,B,Mumbai,M,black
5,11,A,Delhi,M,green
6,17,A,Mumbai,F,red


In [5]:
data.loc[data.age>=15]

Unnamed: 0,age,section,city,gender,favourite_color
1,22,B,Delhi,F,
3,21,B,Delhi,M,
6,17,A,Mumbai,F,red


In [6]:
data.loc[(data.age >= 12) & (data.gender == 'M')]

Unnamed: 0,age,section,city,gender,favourite_color
3,21,B,Delhi,M,
4,12,B,Mumbai,M,black


In [9]:
data.loc[(data.age >= 12) | (data.gender == 'M')]

Unnamed: 0,age,section,city,gender,favourite_color
0,10,A,Gurgaon,M,red
1,22,B,Delhi,F,
2,13,C,Mumbai,F,yellow
3,21,B,Delhi,M,
4,12,B,Mumbai,M,black
5,11,A,Delhi,M,green
6,17,A,Mumbai,F,red


In [10]:
data.loc[1:3] 

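''' Using loc, we can also slice the Pandas dataframe over a range of index labels.
Unlike ordinary Python slices, a loc slice includes both end labels, so 1:3 returns the rows labelled 1, 2 and 3.
If the index is not sorted, loc returns every row located between the start label and the end label
(both labels must exist and be unique), not only the rows labelled 1 and 3.
Label slices also work when the labels are not numbers, for example strings.
To slice by row position instead of by label, we use iloc.'''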



Unnamed: 0,age,section,city,gender,favourite_color
1,22,B,Delhi,F,
2,13,C,Mumbai,F,yellow
3,21,B,Delhi,M,
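
To illustrate how loc slices behave on an unsorted index and on string labels, here is a small sketch. The frames `unsorted` and `lettered` and their labels are hypothetical and exist only for this example.

```python
import pandas as pd

# Hypothetical frame with a unique but unsorted integer index.
unsorted = pd.DataFrame({'age': [10, 22, 13, 21, 12]}, index=[5, 1, 4, 3, 2])

# loc finds where label 1 and label 3 sit and returns everything between them,
# both ends included: the rows labelled 1, 4 and 3.
between = unsorted.loc[1:3]

# Label slicing also works with string labels; 'c' is included in the result.
lettered = pd.DataFrame({'age': [10, 22, 13, 21]}, index=['a', 'b', 'c', 'd'])
b_to_c = lettered.loc['b':'c']
```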


In [12]:
# select few columns with a condition
data.loc[(data.age >= 12), ['city', 'gender','section']]

Unnamed: 0,city,gender,section
1,Delhi,F,B
2,Mumbai,F,C
3,Delhi,M,B
4,Mumbai,M,B
6,Mumbai,F,A


In [13]:
data

Unnamed: 0,age,section,city,gender,favourite_color
0,10,A,Gurgaon,M,red
1,22,B,Delhi,F,
2,13,C,Mumbai,F,yellow
3,21,B,Delhi,M,
4,12,B,Mumbai,M,black
5,11,A,Delhi,M,green
6,17,A,Mumbai,F,red


In [16]:
# update a column with condition
data.loc[(data.age == 22), ['section']] = 'M'
data

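# Note: the condition above matches only age == 22. The age == 12 row also shows 'M' below,
# presumably because an earlier run (cells 14 and 15 are not shown) used data.age == 12.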

Unnamed: 0,age,section,city,gender,favourite_color
0,10,A,Gurgaon,M,red
1,22,M,Delhi,F,
2,13,C,Mumbai,F,yellow
3,21,B,Delhi,M,
4,12,M,Mumbai,M,black
5,11,A,Delhi,M,green
6,17,A,Mumbai,F,red


In [17]:
data.loc[(data.age == 10), ['section', 'city']] = ['A+','Pune']
data

Unnamed: 0,age,section,city,gender,favourite_color
0,10,A+,Pune,M,red
1,22,M,Delhi,F,
2,13,C,Mumbai,F,yellow
3,21,B,Delhi,M,
4,12,M,Mumbai,M,black
5,11,A,Delhi,M,green
6,17,A,Mumbai,F,red
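
As a self-contained sketch of the conditional updates above, the snippet below applies the same two assignments to a fresh three-row frame (the frame `df` is a reduced, hypothetical copy of the data, used so the result does not depend on earlier cells):

```python
import pandas as pd

# Reduced, hypothetical copy of the dataset with three rows.
df = pd.DataFrame({
    'age': [10, 22, 12],
    'section': ['A', 'B', 'B'],
    'city': ['Gurgaon', 'Delhi', 'Mumbai'],
})

# Only rows where the condition is True are written to.
df.loc[df.age == 22, 'section'] = 'M'

# Several columns can be updated at once by passing one value per column.
df.loc[df.age == 10, ['section', 'city']] = ['A+', 'Pune']
```

On a fresh frame, the row with age 12 keeps its original section, since neither condition matches it.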


'''
Select rows with indices using iloc
================================================
When we are using iloc, we need to specify the rows and columns by their integer position, counting from 0.
If we want to select only the first and third rows, we put their positions (0 and 2) into a list in the iloc statement with
our dataframe:

'''

In [18]:
data.iloc[[0,2]]

Unnamed: 0,age,section,city,gender,favourite_color
0,10,A+,Pune,M,red
2,13,C,Mumbai,F,yellow


In [19]:
# selecting particular rows and columns by position

data.iloc[[0,2],[1,3]]

Unnamed: 0,section,gender
0,A+,M
2,C,F


'''
Select a range of rows using iloc:
======================================

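We can slice a dataframe using iloc as well. The slice takes a start position and an end position, and the end position is
excluded, so to include the row at end_index we pass end_index+1. Because iloc ignores the index labels, it selects the rows
from the start position up to (but not including) the end position even if the labels are unsorted or not numbers: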

'''

In [20]:
data.iloc[1:3]

Unnamed: 0,age,section,city,gender,favourite_color
1,22,M,Delhi,F,
2,13,C,Mumbai,F,yellow
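
A brief sketch of the same positional slice on a frame whose labels are unsorted strings (the frame `df` and its labels are made up for illustration):

```python
import pandas as pd

# Hypothetical frame with unsorted string labels.
df = pd.DataFrame({'age': [10, 22, 13, 21]}, index=['d', 'a', 'c', 'b'])

# Positions 1 and 2 are returned; position 3 (the end of the slice) is excluded.
rows = df.iloc[1:3]
```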


'''
Slice the data frame over both rows and columns. In the below example, we select the rows at positions 1 and 2
and the columns at positions 2 and 3 (the end of each slice is excluded).
'''

In [22]:
data.iloc[1:3,2:4]

Unnamed: 0,city,gender
1,Delhi,F
2,Mumbai,F
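
For comparison, the same selection can be written with loc by using labels instead of positions. This is a sketch on a reduced, hypothetical copy of the data (`df`) with the default 0, 1, 2 index; note that the loc slices include their end labels while the iloc slices exclude their end positions.

```python
import pandas as pd

# Reduced, hypothetical copy of the dataset with the default integer index.
df = pd.DataFrame({
    'age': [10, 22, 13],
    'section': ['A', 'B', 'C'],
    'city': ['Gurgaon', 'Delhi', 'Mumbai'],
    'gender': ['M', 'F', 'F'],
})

# Positional slice: rows 1-2 and columns 2-3 (end positions excluded).
by_position = df.iloc[1:3, 2:4]

# Label slice covering the same cells (end labels included).
by_label = df.loc[1:2, 'city':'gender']

# True when both selections hold the same rows, columns and values.
same = by_position.equals(by_label)
```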
