
# Pandas `loc` and `iloc` for selecting data

This notebook accompanies the Medium article [How to use `loc` and `iloc` for selecting data in Pandas](https://bindichen.medium.com/how-to-use-loc-and-iloc-for-selecting-data-in-pandas-bd09cb4c3d79).

Please see the article for detailed instructions.

**License**: [BSD 2-Clause](https://opensource.org/licenses/BSD-2-Clause)

In [None]:
import pandas as pd

In [None]:
data = {
    'Weather': ['Sunny','Sunny','Sunny','Cloudy','Shower','Shower','Sunny'], 
    'Temperature': [78,76,78,68,70,71,82],
    'Wind': [13,28,16,11,26,27,20],
    'Humidity': [30,96,20,22,79,62,10],
}
df = pd.DataFrame(data, index = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun'])
df

## 1. Differences between loc and iloc

The main distinction between `loc` and `iloc` is:
* `loc` is label-based: you select rows and columns by their labels.
* `iloc` is integer position-based: you select rows and columns by their 0-based integer positions.
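To make the distinction concrete, here is a minimal sketch that translates labels into positions with `Index.get_loc` and then reads the same cell both ways. The two-row frame is a cut-down stand-in for the notebook's data, and `row_pos` and `col_pos` are illustrative names.

```python
import pandas as pd

# Cut-down stand-in for the notebook's weather data
df = pd.DataFrame(
    {'Weather': ['Sunny', 'Shower'], 'Temperature': [78, 70]},
    index=['Mon', 'Fri'],
)

# Translate labels into 0-based integer positions
row_pos = df.index.get_loc('Fri')
col_pos = df.columns.get_loc('Temperature')

# The same cell, addressed by label and by position
by_label = df.loc['Fri', 'Temperature']
by_position = df.iloc[row_pos, col_pos]
print(row_pos, col_pos, by_label, by_position)
```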

## 2. Selecting via a single value 

To get Friday's temperature:

In [None]:
# Pass label to `loc`
df.loc['Fri', 'Temperature']

In [None]:
# The equivalent `iloc` statement takes row position 4 and column position 1 (both 0-based)
df.iloc[4, 1]

Use `:` to select all rows or all columns:

In [None]:
# To get all rows
df.loc[:, 'Temperature']

In [None]:
# The equivalent `iloc` statement
df.iloc[:, 1]

In [None]:
# To get all columns
df.loc['Fri', :]

In [None]:
# The equivalent `iloc` statement
df.iloc[4, :]

## 3. Selecting via a list of values

In [None]:
# Multiple rows
df.loc[['Thu', 'Fri'], 'Temperature']

In [None]:
# Multiple columns
df.loc['Fri', ['Temperature', 'Wind']]

In [None]:
# Multiple rows using iloc
df.iloc[[3, 4], 1]

In [None]:
# Multiple columns using iloc
df.iloc[4, [1, 2]]

In [None]:
# Multiple rows and columns
rows = ['Thu', 'Fri']
cols = ['Temperature', 'Wind']

df.loc[rows, cols]

In [None]:
# the equivalent iloc statement
rows = [3, 4]
cols = [1, 2]
df.iloc[rows, cols]
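One detail worth noting about list selectors is the type of the result. As a small sketch on a two-row stand-in for the data above: a scalar row label reduces the result to a Series, while a one-item list keeps it as a DataFrame. The names `as_series` and `as_frame` are illustrative.

```python
import pandas as pd

# Two-row stand-in for the notebook's data
df = pd.DataFrame(
    {'Temperature': [68, 70], 'Wind': [11, 26]},
    index=['Thu', 'Fri'],
)

# Scalar row label: the row dimension is dropped, giving a Series
as_series = df.loc['Fri', ['Temperature', 'Wind']]

# One-item list of row labels: the result stays a DataFrame
as_frame = df.loc[['Fri'], ['Temperature', 'Wind']]

print(type(as_series).__name__, as_series.shape)
print(type(as_frame).__name__, as_frame.shape)
```

The same rule applies to `iloc`: `df.iloc[1, [0, 1]]` gives a Series and `df.iloc[[1], [0, 1]]` gives a DataFrame.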

## 4. Selecting a range of data via slice

With `loc`, we can use the syntax `A:B` to select data from label `A` to label `B` (both `A` and `B` are included):

In [None]:
# Slicing column labels
rows = ['Thu', 'Fri']
df.loc[rows, 'Temperature':'Humidity']

In [None]:
# Slicing row labels
cols = ['Temperature', 'Wind']
df.loc['Mon':'Thu', cols]

We can also use the syntax `A:B:S` to select data from label `A` to label `B` with step size `S` (both `A` and `B` are included):

In [None]:
# Slicing with step
df.loc['Mon':'Fri':2, :]

With `iloc`, we can use the syntax `n:m` to select data from position `n` (included) to position `m` (excluded). A step can be added as `n:m:s`.

In [None]:
# Rows at positions 1 and 2, columns at positions 0 to 2
df.iloc[[1, 2], 0:3]

In [None]:
# Every second row among positions 0 to 3, all columns
df.iloc[0:4:2, :]
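The endpoint rules for the two selectors are easy to mix up, so here is a small sketch that compares them side by side on the `Temperature` column from the data above. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {'Temperature': [78, 76, 78, 68, 70, 71, 82]},
    index=['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun'],
)

# Label slice: the end label 'Thu' is included
label_slice = df.loc['Mon':'Thu']

# Position slice: the end position 3 (the 'Thu' row) is excluded
position_slice = df.iloc[0:3]

# To match the label slice, the position slice must stop one past 'Thu'
matching = df.iloc[0:4]

print(len(label_slice), len(position_slice), label_slice.equals(matching))
```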

## 5. Selecting via conditions and callable

### 5.1 Conditions

In [None]:
# One condition
df.loc[df.Humidity > 50, :]

In [None]:
# Multiple conditions
df.loc[
    (df.Humidity > 50) & (df.Weather == 'Shower'), 
    ['Temperature','Wind'],
]

In [None]:
# Passing a boolean Series directly to `iloc` raises a ValueError
# df.iloc[df.Humidity > 50, :]

In [None]:
# Single condition
df.iloc[list(df.Humidity > 50)]

In [None]:
# Multiple conditions
df.iloc[
    list((df.Humidity > 50) & (df.Weather == 'Shower')), 
    :,
]
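Converting the mask with `list(...)` is one way to strip the index before handing it to `iloc`. As an alternative sketch, `Series.to_numpy()` yields a plain boolean array that `iloc` also accepts. The data is the same as in the notebook, and the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'Weather': ['Sunny', 'Sunny', 'Sunny', 'Cloudy', 'Shower', 'Shower', 'Sunny'],
        'Humidity': [30, 96, 20, 22, 79, 62, 10],
    },
    index=['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun'],
)

# Boolean Series, indexed by day label
mask = (df.Humidity > 50) & (df.Weather == 'Shower')

via_array = df.iloc[mask.to_numpy()]  # plain boolean NumPy array
via_list = df.iloc[list(mask)]        # plain Python list of booleans
via_loc = df.loc[mask]                # `loc` accepts the Series itself

print(list(via_array.index))
```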

### 5.2 Callable

In [None]:
# Selecting columns
df.loc[:, lambda df: ['Humidity', 'Wind']]

In [None]:
# With condition
df.loc[lambda df: df.Humidity > 50, :]

In [None]:
# Callable returning a list of row positions
df.iloc[lambda df: [0, 1], :]

In [None]:
# Callable returning a boolean list (a boolean Series would raise a ValueError)
df.iloc[lambda df: list(df.Humidity > 50), :]
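A callable is most useful in a method chain, where the intermediate DataFrame has no name to refer to. The sketch below is one possible illustration, not taken from the article: it adds a hypothetical `TempDelta` column with `assign` and then filters on it in the same chain.

```python
import pandas as pd

df = pd.DataFrame(
    {'Temperature': [78, 76, 78, 68, 70, 71, 82]},
    index=['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun'],
)

result = (
    df
    # `TempDelta` is a hypothetical column: difference from the mean temperature
    .assign(TempDelta=lambda d: d.Temperature - d.Temperature.mean())
    # The callable receives the frame produced by `assign`, including `TempDelta`
    .loc[lambda d: d.TempDelta > 0, ['Temperature', 'TempDelta']]
)

print(list(result.index))
```

Without the callable, the chain would have to be split so that the intermediate frame could be stored in a variable and referenced in the condition.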

## 6. `loc` and `iloc` are interchangeable when labels are 0-based integers

In [None]:
data = [
    ['Mon','Sunny',78,13,30],
    ['Tue','Sunny',76,28,96],
    ['Wed','Sunny',78,16,20],
    ['Thu','Cloudy',68,11,22],
    ['Fri','Shower',70,26,79],
    ['Sat','Shower',71,27,62],
    ['Sun','Sunny',82,20,10]]
df = pd.DataFrame(data)
df

Now, `loc`, a label-based data selector, can accept a single integer and a list of integer values.

In [None]:
df.loc[1, 2]

In [None]:
df.loc[1, [1, 2]]

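With default 0-based integer labels, `loc` and `iloc` return the same result when selecting via a single value or a list of values. Slices still differ: `loc` includes the end label, while `iloc` excludes the end position.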

In [None]:
df.loc[1, 2] == df.iloc[1, 2]

In [None]:
df.loc[1, [1, 2]] == df.iloc[1, [1, 2]]
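The equivalence only holds while labels and positions line up. The following sketch, on a small frame with default integer labels, shows two cases where they diverge: slicing, and selecting after the rows have been reordered. The variable names are illustrative.

```python
import pandas as pd

# Default integer labels 0..3
df = pd.DataFrame({'Temperature': [78, 76, 78, 68]})

# Slicing: `loc` includes the end label, `iloc` excludes the end position
label_rows = df.loc[1:3]
position_rows = df.iloc[1:3]

# After sorting, the labels travel with their rows but the positions do not
sorted_df = df.sort_values('Temperature')
first_by_label = sorted_df.loc[0, 'Temperature']  # row still labelled 0
first_by_position = sorted_df.iloc[0, 0]          # row now in first position

print(len(label_rows), len(position_rows), first_by_label, first_by_position)
```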

## Exercise

In [None]:
import numpy as np
import pandas as pd
from numpy.random import randn

np.random.seed(1234)
# The result of this call is discarded; it only advances the random state
np.random.randint(1, 100, 6)
df = pd.DataFrame(
    randn(4, 5),
    index=['IL', 'GA', 'MA', 'VT'],
    columns=['Sent', 'Used', 'Expired', 'Lost', 'Destroyed'],
)
df