### Pandas - Tips and Tricks - df.loc, df.iloc

This notebook is part of the __Pandas - Tips and Tricks__ mini-series, which covers different aspects of the __pandas__ library in __Python__. The examples below select data with the __.loc__ and __.iloc__ indexers.

&nbsp;&nbsp;__.loc:__ primarily label-based indexing.

&nbsp;&nbsp;__.iloc:__ primarily integer-position-based indexing.

Previous blog posts on the topic:
[Data import with Python, using pandas DataFrame – Part 1](http://codewithmax.com/2017/07/06/data-import-with-python-using-pandas-dataframe-part-1/)

Let's start by importing __pandas__ and loading the data:

In [1]:
# Loading the library
import pandas as pd

# I am using the data from WHO as an example
df = pd.read_csv('Data/SuicBoth.csv')

# Checking the DataFrame shape
print(df.shape)

# Checking the imported data
df.head()

(183, 6)


|  | Country | Sex | 2015 | 2010 | 2005 | 2000 |
|---|---|---|---|---|---|---|
| 0 | Afghanistan | Both sexes | 5.5 | 5.2 | 5.4 | 4.8 |
| 1 | Albania | Both sexes | 4.3 | 5.3 | 6.3 | 6.0 |
| 2 | Algeria | Both sexes | 3.1 | 3.4 | 3.6 | 3.0 |
| 3 | Angola | Both sexes | 20.5 | 20.7 | 20.0 | 18.4 |
| 4 | Antigua and Barbuda | Both sexes | 0.0 | 0.2 | 1.6 | 2.3 |


__.loc__ and __.iloc__ are used with square brackets, in the format df.loc[[rows], [columns]]; the column part is optional. Let's look at a few examples below:

In [2]:
# To access any row we can call it by the row label
df.loc[2]

Country       Algeria
Sex        Both sexes
2015              3.1
2010              3.4
2005              3.6
2000                3
Name: 2, dtype: object

In [3]:
# or by its integer position. Here the row labels are the default 0, 1, 2, ...,
# so label and position coincide and the results are the same
df.iloc[2]

Country       Algeria
Sex        Both sexes
2015              3.1
2010              3.4
2005              3.6
2000                3
Name: 2, dtype: object
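
The label and the position only coincide while the rows keep their default order. As a minimal sketch, assuming a small hypothetical `toy` frame (not the WHO file, only a few values copied from the rows shown earlier), sorting makes the two lookups diverge:

```python
import pandas as pd

# Hypothetical toy frame; values copied from the first rows shown earlier
toy = pd.DataFrame({
    'Country': ['Afghanistan', 'Albania', 'Algeria', 'Angola'],
    '2015': [5.5, 4.3, 3.1, 20.5],
})

# Sorting reorders the rows but every row keeps its original label
toy_sorted = toy.sort_values('2015')

# .loc finds the row whose label is 2
by_label = toy_sorted.loc[2, 'Country']

# .iloc finds the third row in the current (sorted) order
by_position = toy_sorted.iloc[2, 0]

print(by_label)
print(by_position)
```

After the sort, the row labels are no longer in ascending order, so the same number 2 points at different rows depending on the indexer.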

In [4]:
# However, df.loc[-1] cannot be used to access the last row: -1 is not
# one of the row labels, so it raises a KeyError. Negative values only
# work with the position-based .iloc
df.iloc[-1]

Country      Zimbabwe
Sex        Both sexes
2015             10.5
2010             11.2
2005             11.3
2000             12.1
Name: 182, dtype: object
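
The failure mode described in the comment can be checked on a small frame. This is a sketch on a hypothetical `toy` DataFrame with the default integer labels, not on the WHO data; it also shows one label-based way to reach the last row, through `toy.index[-1]`:

```python
import pandas as pd

# Hypothetical toy frame with default labels 0, 1, 2
toy = pd.DataFrame({
    'Country': ['Afghanistan', 'Albania', 'Algeria'],
    '2015': [5.5, 4.3, 3.1],
})

# -1 is not among the labels, so the label-based lookup fails
try:
    toy.loc[-1]
    raised = False
except KeyError:
    raised = True

# Position-based lookup accepts negative positions
last_by_position = toy.iloc[-1]

# Label-based alternative: look up the last label first
last_by_label = toy.loc[toy.index[-1]]

print(raised)
print(last_by_position['Country'])
```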

In [5]:
# Multiple rows can be selected by passing a list of labels;
# the result keeps the order of the list
df.loc[[3, 15, 146, 118]]

|  | Country | Sex | 2015 | 2010 | 2005 | 2000 |
|---|---|---|---|---|---|---|
| 3 | Angola | Both sexes | 20.5 | 20.7 | 20.0 | 18.4 |
| 15 | Belgium | Both sexes | 20.5 | 20.3 | 20.5 | 22.6 |
| 146 | Slovenia | Both sexes | 21.4 | 20.6 | 26.5 | 31.8 |
| 118 | Nigeria | Both sexes | 9.9 | 9.8 | 9.5 | 9.9 |


In [6]:
# Rows can also be selected with the slicing syntax (:).
# With .loc both endpoints are included, so row 20 is returned too
df.loc[14:20]

|  | Country | Sex | 2015 | 2010 | 2005 | 2000 |
|---|---|---|---|---|---|---|
| 14 | Belarus | Both sexes | 22.8 | 27.9 | 36.1 | 41.9 |
| 15 | Belgium | Both sexes | 20.5 | 20.3 | 20.5 | 22.6 |
| 16 | Belize | Both sexes | 7.3 | 6.1 | 5.1 | 8.0 |
| 17 | Benin | Both sexes | 9.4 | 9.0 | 9.0 | 8.5 |
| 18 | Bhutan | Both sexes | 11.7 | 11.4 | 12.2 | 13.0 |
| 19 | Bolivia (Plurinational State of) | Both sexes | 18.7 | 20.5 | 22.2 | 23.5 |
| 20 | Bosnia and Herzegovina | Both sexes | 6.0 | 6.0 | 8.0 | 9.8 |
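
Slices behave differently in the two indexers: a __.loc__ slice includes its end label, while an __.iloc__ slice follows the usual Python rule and excludes the end position. A short sketch on a hypothetical single-column `toy` frame:

```python
import pandas as pd

# Hypothetical toy frame with default labels 0..5
toy = pd.DataFrame({'value': [10, 11, 12, 13, 14, 15]})

# Label slice: both endpoints are included
label_slice = toy.loc[2:4]

# Position slice: the end position is excluded
position_slice = toy.iloc[2:4]

print(list(label_slice.index))
print(list(position_slice.index))
```

The same numbers therefore return three rows with __.loc__ and two rows with __.iloc__.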


Let's look at some examples regarding the columns:

In [7]:
# Using the colon to select all the rows, and listing
# the column labels in square brackets
df_new = df.loc[:, ['2015', '2010']]
df_new.head()

|  | 2015 | 2010 |
|---|---|---|
| 0 | 5.5 | 5.2 |
| 1 | 4.3 | 5.3 |
| 2 | 3.1 | 3.4 |
| 3 | 20.5 | 20.7 |
| 4 | 0.0 | 0.2 |


In [8]:
# The same result can be achieved with .iloc and the column positions
df_new = df.iloc[:, [2, 3]]
df_new.head()

|  | 2015 | 2010 |
|---|---|---|
| 0 | 5.5 | 5.2 |
| 1 | 4.3 | 5.3 |
| 2 | 3.1 | 3.4 |
| 3 | 20.5 | 20.7 |
| 4 | 0.0 | 0.2 |


In [9]:
# The range function can also be used.
# range(2, 4) selects column positions 2 and 3 (the end is excluded)
df_new = df.iloc[:, list(range(2, 4))]
df_new.head()

|  | 2015 | 2010 |
|---|---|---|
| 0 | 5.5 | 5.2 |
| 1 | 4.3 | 5.3 |
| 2 | 3.1 | 3.4 |
| 3 | 20.5 | 20.7 |
| 4 | 0.0 | 0.2 |


In [10]:
# A slice can also take a step, as in the example below.
# Here, we include all the rows and every other column (::2)
df_new = df.iloc[:, ::2]
df_new.head()

|  | Country | 2015 | 2005 |
|---|---|---|---|
| 0 | Afghanistan | 5.5 | 5.4 |
| 1 | Albania | 4.3 | 6.3 |
| 2 | Algeria | 3.1 | 3.6 |
| 3 | Angola | 20.5 | 20.0 |
| 4 | Antigua and Barbuda | 0.0 | 1.6 |
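
The row and column selectors can also be combined in one call, following the df.loc[[rows], [columns]] format. A minimal sketch on a hypothetical `toy` frame (a few values copied from the rows shown earlier), selecting two rows and two columns with each indexer:

```python
import pandas as pd

# Hypothetical toy frame; values copied from the first rows shown earlier
toy = pd.DataFrame({
    'Country': ['Afghanistan', 'Albania', 'Algeria', 'Angola'],
    'Sex': ['Both sexes'] * 4,
    '2015': [5.5, 4.3, 3.1, 20.5],
})

# Rows with labels 0 and 2, columns by name
by_label = toy.loc[[0, 2], ['Country', '2015']]

# Rows at positions 0 and 2, columns at positions 0 and 2
by_position = toy.iloc[[0, 2], [0, 2]]

print(by_label)
```

With the default row labels both calls return the same two-by-two selection.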
