
**How to Use loc and iloc to Select Data in Pandas**

The Pandas library contains multiple methods for convenient data filtering – loc and iloc among them. 
Using these, we can do practically any data selection task on Pandas dataframes.

**loc is label-based**, which means that we specify rows and columns by their labels (names). It also accepts a boolean condition, which is how the filtering examples in this notebook work.

**iloc is integer position-based**. So here, we specify rows and columns by their integer position, counting from 0.
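
The difference is easiest to see when the row labels are not the default 0, 1, 2, .... The following is a minimal sketch; the frame df and its labels s1 to s3 are made up for illustration and are not part of the student dataset used in this notebook:

```python
import pandas as pd

# Hypothetical frame with string row labels (illustration only)
df = pd.DataFrame(
    {"age": [10, 22, 13], "city": ["Gurgaon", "Delhi", "Mumbai"]},
    index=["s1", "s2", "s3"],
)

# loc: row label "s2", column label "city"
by_label = df.loc["s2", "city"]

# iloc: second row, second column (positions count from 0)
by_position = df.iloc[1, 1]

print(by_label, by_position)
```

Both lookups reach the same cell; only the way the cell is addressed differs.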


We will create a sample student dataset consisting of 5 columns (age, section, city, gender, and name).
This dataset contains both numerical and categorical variables:

In [None]:
# importing pandas and numpy
import pandas as pd
import numpy as np

# create a sample dataframe
data = pd.DataFrame({
    'age' :     [ 10, 22, 13, 21, 12, 11, 17],
    'section' : [ 'A', 'B', 'C', 'B', 'B', 'A', 'A'],
    'city' :    [ 'Gurgaon', 'Delhi', 'Mumbai', 'Delhi', 'Mumbai', 'Delhi', 'Mumbai'],
    'gender' :  [ 'M', 'F', 'F', 'M', 'M', 'M', 'F'],
    'name' : [ 'Aadil', 'Anu', 'Anju', 'Akil', 'Abay', 'Arun', 'Adhira']
})

# view the data
data

Unnamed: 0,age,section,city,gender,name
0,10,A,Gurgaon,M,Aadil
1,22,B,Delhi,F,Anu
2,13,C,Mumbai,F,Anju
3,21,B,Delhi,M,Akil
4,12,B,Mumbai,M,Abay
5,11,A,Delhi,M,Arun
6,17,A,Mumbai,F,Adhira


**Find all the rows based on any condition in a column**

Filtering the data on a given condition is something we do almost every time we explore a dataset. For example, we might need to find all the rows in our dataset where age is more than x years, or the city is Delhi, and so on.

Let’s try to find the rows where the value of age is greater than or equal to 15:


In [None]:
# select all rows with a condition
import pandas as pd
import numpy as np

# create a sample dataframe
data = pd.DataFrame({
    'age' :     [ 10, 22, 13, 21, 12, 11, 17],
    'section' : [ 'A', 'B', 'C', 'B', 'B', 'A', 'A'],
    'city' :    [ 'Gurgaon', 'Delhi', 'Mumbai', 'Delhi', 'Mumbai', 'Delhi', 'Mumbai'],
    'gender' :  [ 'M', 'F', 'F', 'M', 'M', 'M', 'F'],
    'name' : [ 'Aadil', 'Anu', 'Anju', 'Akil', 'Abay', 'Arun', 'Adhira']
})
data.loc[data.age >= 15]


Unnamed: 0,age,section,city,gender,name
1,22,B,Delhi,F,Anu
3,21,B,Delhi,M,Akil
6,17,A,Mumbai,F,Adhira


**Find all the rows with more than one condition**

Similarly, we can combine multiple conditions to filter our data, such as finding all the rows where the age is greater than or equal to 15 and the gender is male. Each condition goes in its own parentheses, and the conditions are joined with &:

In [None]:
# select with multiple conditions
data.loc[(data.age >= 15) & (data.gender == 'M')]

Unnamed: 0,age,section,city,gender,name
3,21,B,Delhi,M,Akil
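
Conditions can be combined in other ways too. The sketch below rebuilds a reduced copy of the sample data and shows | (or), ~ (not), and isin for membership tests; the variable names either, not_delhi, and in_metros are chosen here for illustration:

```python
import pandas as pd

# Reduced copy of the sample student data
data = pd.DataFrame({
    'age':    [10, 22, 13, 21, 12, 11, 17],
    'city':   ['Gurgaon', 'Delhi', 'Mumbai', 'Delhi', 'Mumbai', 'Delhi', 'Mumbai'],
    'gender': ['M', 'F', 'F', 'M', 'M', 'M', 'F'],
})

# OR: age is at least 15, or the city is Gurgaon
either = data.loc[(data.age >= 15) | (data.city == 'Gurgaon')]

# NOT: every row whose city is not Delhi
not_delhi = data.loc[~(data.city == 'Delhi')]

# Membership: city is one of several values
in_metros = data.loc[data.city.isin(['Delhi', 'Mumbai'])]

print(list(either.index))
print(list(not_delhi.index))
print(list(in_metros.index))
```

The parentheses around each comparison matter because &, | and ~ bind more tightly than comparison operators in Python.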


**Select only required columns with a condition**

We can also restrict the result to specific columns for the rows that satisfy our condition.

For example, if our dataset contains hundreds of columns and we want to view only a few of them, then we can add a list of columns after the condition within the loc statement itself:

In [None]:
# select few columns with a condition
data.loc[(data.age >= 15), ['name','city', 'gender']]

Unnamed: 0,name,city,gender
1,Anu,Delhi,F
3,Akil,Delhi,M
6,Adhira,Mumbai,F


**Update the values of a particular column on selected rows**

For example, if the values in age are greater than or equal to 15, then we want to update the values of the column section to “M”.

We could do this with a for loop as well, but on a large dataset a row-by-row Python loop is slow. Using loc in Pandas, the update is a single vectorized assignment, which is typically much faster.

We just need to specify the condition followed by the target column and then assign the value with which we want to update:

In [None]:
# update a column with condition
data.loc[(data.age >= 15), ['section']] = 'M'
data

Unnamed: 0,age,section,city,gender,name
0,10,A,Gurgaon,M,Aadil
1,22,M,Delhi,F,Anu
2,13,C,Mumbai,F,Anju
3,21,M,Delhi,M,Akil
4,12,B,Mumbai,M,Abay
5,11,A,Delhi,M,Arun
6,17,M,Mumbai,F,Adhira
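
To make the comparison with a loop concrete, the sketch below applies the same update both ways on a small two-column copy of the data and checks that the results match. The frame names looped and vectorized are illustrative; no timing is measured here:

```python
import pandas as pd

# Two identical copies of a reduced sample frame
looped = pd.DataFrame({
    'age':     [10, 22, 13, 21, 12, 11, 17],
    'section': ['A', 'B', 'C', 'B', 'B', 'A', 'A'],
})
vectorized = looped.copy()

# Row-by-row update: one Python-level check and assignment per row
for idx in looped.index:
    if looped.at[idx, 'age'] >= 15:
        looped.at[idx, 'section'] = 'M'

# The same update as a single loc assignment
vectorized.loc[vectorized.age >= 15, 'section'] = 'M'

# Both approaches should produce the same frame
print(looped.equals(vectorized))
```

The loop does its work one row at a time in Python, while the loc version hands the whole column to Pandas at once.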


**Update the values of multiple columns on selected rows**

If we want to update multiple columns with different values, then we can use the syntax below.

In this example, if the value in the column age is greater than or equal to 15, then loc updates the values in the column section to “S” and the values in the column city to “Pune”:

In [None]:
# update multiple columns with condition
data.loc[(data.age >= 15), ['section', 'city']] = ['S','Pune']
data
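
The list on the right-hand side is matched to the listed columns in order, and each value is repeated for every selected row. A self-contained sketch that rebuilds the sample data and inspects only the affected rows (the name mask is introduced here for readability):

```python
import pandas as pd

data = pd.DataFrame({
    'age':     [10, 22, 13, 21, 12, 11, 17],
    'section': ['A', 'B', 'C', 'B', 'B', 'A', 'A'],
    'city':    ['Gurgaon', 'Delhi', 'Mumbai', 'Delhi', 'Mumbai', 'Delhi', 'Mumbai'],
})

# Boolean mask for the rows to update
mask = data.age >= 15

# 'S' goes to section and 'Pune' goes to city, for every masked row
data.loc[mask, ['section', 'city']] = ['S', 'Pune']

# Show only the rows that were changed
print(data.loc[mask])
```

Rows that do not satisfy the condition keep their original section and city.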

**Select rows with indices using iloc**

When we are using iloc, we need to specify the rows and columns by their **integer position**, counting from 0.

If we want to select only the first and third rows (positions 0 and 2), we simply pass those positions as a list to iloc:


In [None]:
# select rows with indexes
data.iloc[[0,2]]

Unnamed: 0,age,section,city,gender,name
0,10,A,Gurgaon,M,Aadil
2,13,C,Mumbai,F,Anju


**Select rows with particular indices and particular columns**

Earlier, we selected a few columns from the dataset using loc.

We can do the same using iloc.

Keep in mind that we need to provide the integer position of each column instead of the column name:

In [None]:
# select rows with particular indexes and particular columns
data.iloc[[0,2],[1,3]]

Unnamed: 0,section,gender
0,A,M
2,C,F
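
Counting column positions by hand gets error-prone on wide frames. One option, sketched below, is to look the positions up from the column names with the columns.get_indexer method and pass the result to iloc; the variable names col_positions and subset are illustrative:

```python
import pandas as pd

data = pd.DataFrame({
    'age':     [10, 22, 13, 21, 12, 11, 17],
    'section': ['A', 'B', 'C', 'B', 'B', 'A', 'A'],
    'city':    ['Gurgaon', 'Delhi', 'Mumbai', 'Delhi', 'Mumbai', 'Delhi', 'Mumbai'],
    'gender':  ['M', 'F', 'F', 'M', 'M', 'M', 'F'],
    'name':    ['Aadil', 'Anu', 'Anju', 'Akil', 'Abay', 'Arun', 'Adhira'],
})

# Translate column names into integer positions
col_positions = data.columns.get_indexer(['section', 'gender'])

# Use those positions with iloc, together with row positions 0 and 2
subset = data.iloc[[0, 2], col_positions]

print(col_positions)
print(subset)
```

This selects the same cells as data.iloc[[0,2],[1,3]] without hard-coding the column numbers.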


**Select a range of rows using iloc**

We can slice a dataframe using iloc as well.

We need to provide a start position and an end position. The start is included and the end is excluded, so 1:3 returns the rows at positions 1 and 2.

Because iloc counts positions rather than labels, this works the same way even when the index labels are not sorted numbers:

In [None]:
# select a range of rows
data.iloc[1:3]

Unnamed: 0,age,section,city,gender,name
1,22,S,Pune,F,Anu
2,13,C,Mumbai,F,Anju
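
The point about unsorted index labels can be checked on a small frame. In this sketch the frame df and its shuffled labels 7, 3, 9, 1 are made up for illustration; it also contrasts the iloc slice with a loc slice, which works on labels and includes both endpoints:

```python
import pandas as pd

# Hypothetical frame whose index labels are not in sorted order
df = pd.DataFrame({'age': [10, 22, 13, 21]}, index=[7, 3, 9, 1])

# iloc slices by position: positions 1 and 2, position 3 is excluded
by_position = df.iloc[1:3]

# loc slices by label: from label 3 through label 1, both included
by_label = df.loc[3:1]

print(list(by_position.index))
print(list(by_label.index))
```

The iloc slice ignores the label values entirely, while the loc slice walks the index from one label to the other in stored order.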


**Select a range of rows and columns using iloc**

We can also slice the dataframe over both rows and columns.

In the example below, 1:5 selects the rows at positions 1 to 4 and 2:5 selects the columns at positions 2 to 4.

In [None]:
# select a range of rows and columns
data.iloc[1:5,2:5]

Unnamed: 0,city,gender,name
1,Pune,F,Anu
2,Mumbai,F,Anju
3,Pune,M,Akil
4,Mumbai,M,Abay
