# Selecting Subsets of Data from DataFrames with `iloc`

The `iloc` indexer is very similar to the `loc` indexer but only uses **integer location** to make its subset selections. The word `iloc` itself stands for integer location and can help remind you what it does.

## Simultaneous row and column subset selection

The `iloc` indexer is capable of making simultaneous row and column selections just like `loc`. Selection with `iloc` takes the following form, with a comma separating the row and column selections.

```python
df.iloc[rows, cols]
```

Let's read in some sample data and then begin making selections with integer location using `iloc`.

In [1]:
import pandas as pd
df = pd.read_csv('../data/sample_data.csv', index_col=0)
df

| name | state | color | food | age | height | score |
| --- | --- | --- | --- | --- | --- | --- |
| Jane | NY | blue | Steak | 30 | 165 | 4.6 |
| Niko | TX | green | Lamb | 2 | 70 | 8.3 |
| Aaron | FL | red | Mango | 12 | 120 | 9.0 |
| Penelope | AL | white | Apple | 4 | 80 | 3.3 |
| Dean | AK | gray | Cheese | 32 | 180 | 1.8 |
| Christina | TX | black | Melon | 33 | 172 | 9.5 |
| Cornelia | TX | red | Beans | 69 | 150 | 2.2 |


### What is integer location?

Integer location is the position of a row or column, counted from zero. The first row/column is at integer location 0, and each subsequent row/column is at the next integer. The last row/column is at integer location `n - 1`, where `n` is the number of rows/columns.
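Let's make the counting concrete with a minimal sketch. The three-row DataFrame is rebuilt inline from part of the sample data (an assumption made so the snippet runs without the CSV file), and the variable names are only illustrative.

```python
import pandas as pd

# A small slice of the sample data, rebuilt inline for illustration.
df = pd.DataFrame(
    {"state": ["NY", "TX", "FL"], "age": [30, 2, 12], "score": [4.6, 8.3, 9.0]},
    index=pd.Index(["Jane", "Niko", "Aaron"], name="name"),
)
n_rows, n_cols = df.shape           # 3 rows and 3 columns

first_row = df.iloc[0]              # row at integer location 0
last_row = df.iloc[n_rows - 1]      # row at integer location n - 1
last_col = df.iloc[:, n_cols - 1]   # column at integer location n - 1

print(first_row.name, last_row.name, last_col.name)  # Jane Aaron score
```

Asking for integer location `n` (here `df.iloc[3]`) raises an `IndexError`, because the valid locations stop at `n - 1`.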

### Select using a list for both rows and columns

Let's select rows with integer location 2 and 4 along with the first and last columns. It is possible to use negative integers in the same manner as Python lists. The integer location -1 refers to the last column below.

In [2]:
rows = [2, 4]
cols = [0, -1]
df.iloc[rows, cols]

| name | state | score |
| --- | --- | --- |
| Aaron | FL | 9.0 |
| Dean | AK | 1.8 |


### The possible types of selections for `iloc`

In the above example, we used a list of integers for both the row and column selection. You are not limited to just lists. All of the following are valid objects available for both row and column selections with `iloc`. Unlike `loc`, the `iloc` indexer does not accept a boolean Series for selection. It does accept a plain boolean array or list, but the integer-based selections below are the ones used throughout this chapter.

* A single integer
* A list of integers
* A slice with integers
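The difference in boolean handling between `loc` and `iloc` can be sketched as follows. The DataFrame is a small inline copy of part of the sample data and the names `mask`, `by_label`, and `by_position` are illustrative assumptions. The exact exception type raised by `iloc` may vary, so the sketch catches both types that I would expect.

```python
import pandas as pd

# A small slice of the sample data, rebuilt inline for illustration.
df = pd.DataFrame(
    {"state": ["NY", "TX", "FL", "AL"], "age": [30, 2, 12, 4]},
    index=pd.Index(["Jane", "Niko", "Aaron", "Penelope"], name="name"),
)

mask = df["age"] > 10        # boolean Series aligned on the name index

by_label = df.loc[mask]      # loc accepts the boolean Series directly

try:
    df.iloc[mask]            # iloc rejects a boolean Series
    series_accepted = True
except (ValueError, NotImplementedError):
    series_accepted = False

# Converting the mask to a plain NumPy array removes the index,
# and iloc accepts the result.
by_position = df.iloc[mask.to_numpy()]
```

Both `by_label` and `by_position` contain the rows for Jane and Aaron.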

### Slice the rows and use a list for the columns

Let's use slice notation to select rows with integer location 2 and 3 and a list to select columns with integer location 4 and 2. Notice that the stop integer location is **excluded** with `iloc`, which is exactly how slicing works with Python lists, tuples, and strings. Slicing with `loc` is **inclusive** of the stop label.

In [3]:
cols = [4, 2]
df.iloc[2:4, cols]

| name | height | food |
| --- | --- | --- |
| Aaron | 120 | Mango |
| Penelope | 80 | Apple |
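To see the exclusive and inclusive stop behavior side by side, here is a small sketch. It rebuilds the first five rows of the sample data inline (an assumption made so the snippet is self-contained) and slices them once by position and once by label.

```python
import pandas as pd

# The first five rows of the sample data, rebuilt inline for illustration.
df = pd.DataFrame(
    {"state": ["NY", "TX", "FL", "AL", "AK"], "age": [30, 2, 12, 4, 32]},
    index=pd.Index(["Jane", "Niko", "Aaron", "Penelope", "Dean"], name="name"),
)

# iloc: the stop location 4 is excluded, so only locations 2 and 3 come back.
by_position = df.iloc[2:4]

# loc: the stop label 'Dean' is included.
by_label = df.loc["Aaron":"Dean"]

print(list(by_position.index))  # ['Aaron', 'Penelope']
print(list(by_label.index))     # ['Aaron', 'Penelope', 'Dean']
```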


### Use a list for the rows and a slice for the columns

In this example, we use a list for the row selection and slice notation for the columns.

In [4]:
rows = [5, 2, 4]
df.iloc[rows, 3:]

| name | age | height | score |
| --- | --- | --- | --- |
| Christina | 33 | 172 | 9.5 |
| Aaron | 12 | 120 | 9.0 |
| Dean | 32 | 180 | 1.8 |


### Select all of the rows and some of the columns

You can use an empty slice (just the colon) to select all of the rows or columns. In the example below, we select all of the rows and some of the columns with a list.

In [5]:
cols = [2, 4]
df.iloc[:, cols]

| name | food | height |
| --- | --- | --- |
| Jane | Steak | 165 |
| Niko | Lamb | 70 |
| Aaron | Mango | 120 |
| Penelope | Apple | 80 |
| Dean | Cheese | 180 |
| Christina | Melon | 172 |
| Cornelia | Beans | 150 |


### Cannot do this with *just the brackets*

*Just the brackets* does select columns, but it only understands **labels** and not **integer location**. The following produces an error as pandas is looking for column names that are the integers 2 and 4.

In [6]:
df[cols]

KeyError: "None of [Int64Index([2, 4], dtype='int64')] are in the [columns]"
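If you do want *just the brackets* with integer locations, one possible workaround is to translate the locations into labels first by indexing `df.columns`. This is only a sketch on an inline copy of part of the sample data. The name `col_labels` is illustrative, and `iloc` remains the more direct tool.

```python
import pandas as pd

# Three rows of the sample data, rebuilt inline for illustration.
df = pd.DataFrame(
    {
        "state": ["NY", "TX", "FL"],
        "color": ["blue", "green", "red"],
        "food": ["Steak", "Lamb", "Mango"],
        "age": [30, 2, 12],
        "height": [165, 70, 120],
    },
    index=pd.Index(["Jane", "Niko", "Aaron"], name="name"),
)

cols = [2, 4]
col_labels = df.columns[cols]   # integer locations -> labels: food, height

via_brackets = df[col_labels]   # just the brackets, now given labels
via_iloc = df.iloc[:, cols]     # the same selection by integer location

print(via_brackets.equals(via_iloc))  # True
```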

### Select some of the rows and all of the columns

We can again use an empty slice, but do so to select all of the columns. We use a list to select some of the rows.

In [7]:
rows = [-3, -1, -2]
df.iloc[rows, :]

| name | state | color | food | age | height | score |
| --- | --- | --- | --- | --- | --- | --- |
| Dean | AK | gray | Cheese | 32 | 180 | 1.8 |
| Cornelia | TX | red | Beans | 69 | 150 | 2.2 |
| Christina | TX | black | Melon | 33 | 172 | 9.5 |


It is possible to rewrite the above without the column selection. pandas defaults to selecting all of the columns if a selection for them is not explicitly present.

In [8]:
df.iloc[rows]

| name | state | color | food | age | height | score |
| --- | --- | --- | --- | --- | --- | --- |
| Dean | AK | gray | Cheese | 32 | 180 | 1.8 |
| Cornelia | TX | red | Beans | 69 | 150 | 2.2 |
| Christina | TX | black | Melon | 33 | 172 | 9.5 |
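As a quick check, the sketch below confirms that the two forms return the same DataFrame and that the negative locations map onto positive ones. It uses an inline copy of two columns of the sample data, which is an assumption made to keep the snippet self-contained.

```python
import pandas as pd

# Two columns of the sample data, rebuilt inline for illustration.
names = ["Jane", "Niko", "Aaron", "Penelope", "Dean", "Christina", "Cornelia"]
df = pd.DataFrame(
    {
        "state": ["NY", "TX", "FL", "AL", "AK", "TX", "TX"],
        "age": [30, 2, 12, 4, 32, 33, 69],
    },
    index=pd.Index(names, name="name"),
)

rows = [-3, -1, -2]
with_colon = df.iloc[rows, :]       # explicit "all columns"
without_colon = df.iloc[rows]       # column selection omitted

# With 7 rows, -3, -1, and -2 are the same locations as 4, 6, and 5.
positive = df.iloc[[4, 6, 5]]

print(with_colon.equals(without_colon), with_colon.equals(positive))  # True True
```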


### Select a single row and a single column

We can select a single value in our DataFrame using `iloc` by providing a single integer for both the row and column selection. This returns the actual value by itself completely outside of a DataFrame or Series.

In [9]:
df.iloc[3, 2]

'Apple'

### Select a single row and a single column as a DataFrame

It is possible to select the above value as a DataFrame by using one-item lists for the row and column selections. The output looks a little bizarre, but it's just a DataFrame with one row and one column.

In [10]:
rows = [3]
cols = [2]
df.iloc[rows, cols]

| name | food |
| --- | --- |
| Penelope | Apple |


### Select some rows and a single column

In this example, a list of integers is used for the rows and a single integer for the columns. pandas returns a Series when a single integer selects either the rows or the columns and the other selection is a list or a slice.

In [11]:
rows = [2, 3, 5]
cols = 4
df.iloc[rows, cols]

name
Aaron        120
Penelope      80
Christina    172
Name: height, dtype: int64

### Select a single row or column as a DataFrame and NOT a Series

You can select a single row (or column) and return a DataFrame and not a Series if you use a list to make the selection. Let's replicate the selection from the previous example, but use a one-item list for the column selection.

In [12]:
rows = [2, 3, 5]
cols = [4]
df.iloc[rows, cols]

| name | height |
| --- | --- |
| Aaron | 120 |
| Penelope | 80 |
| Christina | 172 |
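The return type depends only on whether each selection is a single integer or a list/slice. Here is a brief sketch of the three cases, using an inline copy of part of the sample data. The variable names are illustrative.

```python
import pandas as pd

# Four rows and three columns of the sample data, rebuilt inline.
df = pd.DataFrame(
    {
        "state": ["NY", "TX", "FL", "AL"],
        "color": ["blue", "green", "red", "white"],
        "food": ["Steak", "Lamb", "Mango", "Apple"],
    },
    index=pd.Index(["Jane", "Niko", "Aaron", "Penelope"], name="name"),
)

value = df.iloc[3, 2]          # integer, integer -> the bare value 'Apple'
column = df.iloc[[2, 3], 2]    # list, integer    -> Series named 'food'
frame = df.iloc[[3], [2]]      # list, list       -> 1 x 1 DataFrame

print(type(value).__name__, type(column).__name__, type(frame).__name__)
# str Series DataFrame
```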


### Select a single row as a Series

We can select a single row by providing a single integer as the row selection for `iloc`. We use an empty slice to select all of the columns. Because we are selecting a single row, a Series is returned. Just as with `loc`, the returned output can be confusing as the original horizontal row is now displayed vertically.

In [13]:
df.iloc[2, :]

state        FL
color       red
food      Mango
age          12
height      120
score       9.0
Name: Aaron, dtype: object

To maintain the original orientation, we can select the row using a one-item list which returns a DataFrame.

In [14]:
df.iloc[[2], :]

| name | state | color | food | age | height | score |
| --- | --- | --- | --- | --- | --- | --- |
| Aaron | FL | red | Mango | 12 | 120 | 9.0 |
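Besides the orientation, the two forms differ in how they carry data types. The sketch below, on an inline copy of part of the sample data, suggests why the Series in the earlier output reports `dtype: object`: a single row mixes strings and numbers, so it has to fall back to one general type. The one-row DataFrame keeps a separate type per column.

```python
import pandas as pd

# Three rows of the sample data with mixed column types, rebuilt inline.
df = pd.DataFrame(
    {"state": ["NY", "TX", "FL"], "age": [30, 2, 12], "score": [4.6, 8.3, 9.0]},
    index=pd.Index(["Jane", "Niko", "Aaron"], name="name"),
)

row = df.iloc[2]               # Series: the column labels become its index
one_row_frame = df.iloc[[2]]   # DataFrame: one row, original orientation

print(row.name)                # Aaron
print(list(row.index))         # ['state', 'age', 'score']
print(row.dtype)               # object
print(one_row_frame.shape)     # (1, 3)
```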


## Summary of `iloc`

The `iloc` indexer is analogous to `loc` but only uses **integer location** for selection. The official pandas documentation refers to it as selection by **position**.

* Uses only integer location
* Selects rows and columns simultaneously with `df.iloc[rows, cols]`
* Selection can be
    * a single integer
    * a list of integers
    * a slice of integers
* A comma separates row and column selections
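A slice may also carry a step, written `start:stop:step`, just as with Python lists. The sketch below shows a few step slices on an inline copy of the sample data. The variable names are illustrative assumptions.

```python
import pandas as pd

# The sample data, rebuilt inline so the snippet runs without the CSV file.
names = ["Jane", "Niko", "Aaron", "Penelope", "Dean", "Christina", "Cornelia"]
df = pd.DataFrame(
    {
        "state": ["NY", "TX", "FL", "AL", "AK", "TX", "TX"],
        "color": ["blue", "green", "red", "white", "gray", "black", "red"],
        "food": ["Steak", "Lamb", "Mango", "Apple", "Cheese", "Melon", "Beans"],
        "age": [30, 2, 12, 4, 32, 33, 69],
        "height": [165, 70, 120, 80, 180, 172, 150],
        "score": [4.6, 8.3, 9.0, 3.3, 1.8, 9.5, 2.2],
    },
    index=pd.Index(names, name="name"),
)

every_other_row = df.iloc[::2]     # row locations 0, 2, 4, 6
reversed_cols = df.iloc[:, ::-1]   # all rows, columns in reverse order
subset = df.iloc[1:6:2, ::2]       # rows 1, 3, 5 and columns 0, 2, 4

print(list(subset.index))    # ['Niko', 'Penelope', 'Christina']
print(list(subset.columns))  # ['state', 'food', 'height']
```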

## Exercises

Read in the movie dataset by executing the cell below and use it for the following exercises.

In [15]:
pd.set_option('display.max_columns', 50)
movie = pd.read_csv('../data/movie.csv', index_col='title')
movie.head(3)

| title | year | color | content_rating | duration | director_name | director_fb | actor1 | actor1_fb | actor2 | actor2_fb | actor3 | actor3_fb | gross | genres | num_reviews | num_voted_users | plot_keywords | language | country | budget | imdb_score |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Avatar | 2009.0 | Color | PG-13 | 178.0 | James Cameron | 0.0 | CCH Pounder | 1000.0 | Joel David Moore | 936.0 | Wes Studi | 855.0 | 760505847.0 | Action\|Adventure\|Fantasy\|Sci-Fi | 723.0 | 886204 | avatar\|future\|marine\|native\|paraplegic | English | USA | 237000000.0 | 7.9 |
| Pirates of the Caribbean: At World's End | 2007.0 | Color | PG-13 | 169.0 | Gore Verbinski | 563.0 | Johnny Depp | 40000.0 | Orlando Bloom | 5000.0 | Jack Davenport | 1000.0 | 309404152.0 | Action\|Adventure\|Fantasy | 302.0 | 471220 | goddess\|marriage ceremony\|marriage proposal\|pi... | English | USA | 300000000.0 | 7.1 |
| Spectre | 2015.0 | Color | PG-13 | 148.0 | Sam Mendes | 0.0 | Christoph Waltz | 11000.0 | Rory Kinnear | 393.0 | Stephanie Sigman | 161.0 | 200074175.0 | Action\|Adventure\|Thriller | 602.0 | 275868 | bomb\|espionage\|sequel\|spy\|terrorist | English | UK | 245000000.0 | 6.8 |


### Exercise 1

<span  style="color:green; font-size:16px">Select the columns with integer location 10, 5, and 1.</span>

In [16]:
movie.iloc[:, [10,5,1]]

| title | actor3 | director_fb | color |
| --- | --- | --- | --- |
| Avatar | Wes Studi | 0.0 | Color |
| Pirates of the Caribbean: At World's End | Jack Davenport | 563.0 | Color |
| Spectre | Stephanie Sigman | 0.0 | Color |
| The Dark Knight Rises | Joseph Gordon-Levitt | 22000.0 | Color |
| Star Wars: Episode VII - The Force Awakens | NaN | 131.0 | NaN |
| ... | ... | ... | ... |
| Signed Sealed Delivered | Crystal Lowe | 2.0 | Color |
| The Following | Sam Underwood | NaN | Color |
| A Plague So Pleasant | David Chandler | 0.0 | Color |
| Shanghai Calling | Eliza Coupe | 0.0 | Color |


### Exercise 2

<span  style="color:green; font-size:16px">Select the rows with integer location 10, 5, and 1.</span>

In [17]:
movie.iloc[[10,5,1]]

| title | year | color | content_rating | duration | director_name | director_fb | actor1 | actor1_fb | actor2 | actor2_fb | actor3 | actor3_fb | gross | genres | num_reviews | num_voted_users | plot_keywords | language | country | budget | imdb_score |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Batman v Superman: Dawn of Justice | 2016.0 | Color | PG-13 | 183.0 | Zack Snyder | 0.0 | Henry Cavill | 15000.0 | Lauren Cohan | 4000.0 | Alan D. Purwin | 2000.0 | 330249062.0 | Action\|Adventure\|Sci-Fi | 673.0 | 371639 | based on comic book\|batman\|sequel to a reboot\|... | English | USA | 250000000.0 | 6.9 |
| John Carter | 2012.0 | Color | PG-13 | 132.0 | Andrew Stanton | 475.0 | Daryl Sabara | 640.0 | Samantha Morton | 632.0 | Polly Walker | 530.0 | 73058679.0 | Action\|Adventure\|Sci-Fi | 462.0 | 212204 | alien\|american civil war\|male nipple\|mars\|prin... | English | USA | 263700000.0 | 6.6 |
| Pirates of the Caribbean: At World's End | 2007.0 | Color | PG-13 | 169.0 | Gore Verbinski | 563.0 | Johnny Depp | 40000.0 | Orlando Bloom | 5000.0 | Jack Davenport | 1000.0 | 309404152.0 | Action\|Adventure\|Fantasy | 302.0 | 471220 | goddess\|marriage ceremony\|marriage proposal\|pi... | English | USA | 300000000.0 | 7.1 |


### Exercise 3

<span  style="color:green; font-size:16px">Select rows with integer location 100 to 104 along with the column integer location 5.</span>

In [19]:
movie.iloc[100:105, 5]

title
The Fast and the Furious                   357.0
The Curious Case of Benjamin Button      21000.0
X-Men: First Class                         905.0
The Hunger Games: Mockingjay - Part 2      508.0
The Sorcerer's Apprentice                  226.0
Name: director_fb, dtype: float64

### Exercise 4

<span  style="color:green; font-size:16px">Select the value at row integer location 100 and column integer location 4.</span>

In [20]:
movie.iloc[100,4]

'Rob Cohen'

### Exercise 5

<span  style="color:green; font-size:16px">Return the result of exercise 4 as a DataFrame.</span>

In [22]:
movie.iloc[[100], [4]]

| title | director_name |
| --- | --- |
| The Fast and the Furious | Rob Cohen |


### Exercise 6

<span  style="color:green; font-size:16px">Select the last 5 rows of the last 5 columns.</span>

In [24]:
movie.iloc[-5:, -5:]

| title | plot_keywords | language | country | budget | imdb_score |
| --- | --- | --- | --- | --- | --- |
| Signed Sealed Delivered | fraud\|postal worker\|prison\|theft\|trial | English | Canada | NaN | 7.7 |
| The Following | cult\|fbi\|hideout\|prison escape\|serial killer | English | USA | NaN | 7.5 |
| A Plague So Pleasant | NaN | English | USA | 1400.0 | 6.3 |
| Shanghai Calling | NaN | English | USA | NaN | 6.3 |
| My Date with Drew | actress name in title\|crush\|date\|four word tit... | English | USA | 1100.0 | 6.6 |


### Exercise 7

<span  style="color:green; font-size:16px">Select every 25th row between rows with integer location 100 and 200 along with every fifth column.</span>

In [25]:
movie.iloc[100:200:25, ::5]

| title | year | director_fb | actor3 | num_voted_users | imdb_score |
| --- | --- | --- | --- | --- | --- |
| The Fast and the Furious | 2001.0 | 357.0 | Jordana Brewster | 272223 | 6.7 |
| Frozen | 2013.0 | 69.0 | Livvy Stubenrauch | 421658 | 7.6 |
| Armageddon | 1998.0 | 0.0 | Will Patton | 322395 | 6.6 |
| The Incredible Hulk | 2008.0 | 255.0 | William Hurt | 326286 | 6.8 |


### Exercise 8

<span  style="color:green; font-size:16px">Select the column with integer location 7 as a Series.</span>

In [26]:
movie.iloc[:, 7]

title
Avatar                                         1000.0
Pirates of the Caribbean: At World's End      40000.0
Spectre                                       11000.0
The Dark Knight Rises                         27000.0
Star Wars: Episode VII - The Force Awakens      131.0
                                               ...   
Signed Sealed Delivered                         637.0
The Following                                   841.0
A Plague So Pleasant                              0.0
Shanghai Calling                                946.0
My Date with Drew                                86.0
Name: actor1_fb, Length: 4916, dtype: float64

### Exercise 9

<span  style="color:green; font-size:16px">Select the rows with integer location 999, 99, and 9 and the columns with integer location 9 and 19.</span>

In [27]:
movie.iloc[[999,99,9], [9,19]]

| title | actor2_fb | budget |
| --- | --- | --- |
| The Iron Giant | 631.0 | 70000000.0 |
| The Hobbit: An Unexpected Journey | 972.0 | 180000000.0 |
| Harry Potter and the Half-Blood Prince | 11000.0 | 250000000.0 |
