# Selecting Subsets of Data from DataFrames with `iloc`

## Getting started with `iloc`
The `iloc` indexer works much like `loc` but selects only by **integer location**. The name `iloc` stands for integer location, which should help you remember what it does.

### Simultaneous row and column subset selection with `iloc`
A selection with `iloc` looks like the following, with a comma separating the row and column selections.

```
df.iloc[rows, cols]
```

Let's read in some sample data and then begin making selections with integer location.

In [None]:
import pandas as pd
df = pd.read_csv('../data/sample_data.csv', index_col=0)
df

### Use a list for both rows and columns

Let's select rows with integer location 2 and 4 along with the first and last columns. It is possible to use negative integers in the same manner as Python lists.

In [None]:
rows = [2, 4]
cols = [0, -1]
df.iloc[rows, cols]
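
Since `sample_data.csv` is not reproduced here, the following is a minimal sketch of the same selection on a small made-up DataFrame. The name `demo` and all of its values are assumptions for illustration only. It shows that `-1` refers to the last column, just as it would with a Python list.

```python
import pandas as pd

# Hypothetical stand-in data; NOT the contents of sample_data.csv
demo = pd.DataFrame(
    {
        "state": ["NY", "TX", "CA", "FL", "WA"],
        "age": [30, 2, 12, 4, 32],
        "score": [4.6, 8.3, 9.0, 3.3, 1.8],
    },
    index=["Jane", "Niko", "Aaron", "Penelope", "Dean"],
)

# Rows at integer location 2 and 4, first and last columns
subset = demo.iloc[[2, 4], [0, -1]]

# -1 counts from the end, so this spells out the same selection
same = demo.iloc[[2, 4], [0, demo.shape[1] - 1]]

print(subset)
print(subset.equals(same))
```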

### The possible types of selections for `iloc`
In the above example, we used a list of integers for both the row and column selection. You are not limited to just lists. All of the following are valid objects for both row and column selections with `iloc`. Unlike `loc`, the `iloc` indexer does not accept a boolean Series; it does accept a plain boolean list or NumPy array, but boolean selection with `iloc` is not covered here.

* A single integer
* A list of integers
* A slice with integers
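
As a rough sketch of the three selection types listed above, the example below applies each one to a small made-up DataFrame. The name `demo` and its values are hypothetical and are not part of the sample data.

```python
import pandas as pd

# Hypothetical data for illustration only
demo = pd.DataFrame(
    {"a": [10, 20, 30, 40], "b": [1.5, 2.5, 3.5, 4.5], "c": list("wxyz")},
    index=list("pqrs"),
)

# A single integer for both rows and columns returns one value
single = demo.iloc[1, 2]

# Lists of integers return a DataFrame with exactly those positions
as_list = demo.iloc[[0, 3], [0, 2]]

# Slices of integers return a DataFrame; the stop position is excluded
as_slice = demo.iloc[1:3, 0:2]

print(single)
print(as_list)
print(as_slice)
```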

### Slice the rows and use a list for the columns
Let's use slice notation to select rows with integer location 2 and 3 and a list to select columns with integer location 4 and 2. Notice that the stop integer location is **excluded** with `iloc`, which is exactly how slicing works with Python lists, tuples, and strings. Slicing with `loc` is **inclusive** of the stop label.

In [None]:
cols = [4, 2]
df.iloc[2:4, cols]
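
To make the difference between the two slicing rules concrete, here is a small sketch that slices the same made-up DataFrame with `iloc`, with `loc`, and slices a plain Python list for comparison. The name `demo` and its labels are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical data: six rows labeled 'a' through 'f'
demo = pd.DataFrame({"x": range(6)}, index=list("abcdef"))

# iloc: positions 2 and 3 are selected; position 4 is excluded
by_position = demo.iloc[2:4]

# loc: labels 'c', 'd', and 'e' are selected; the stop label is included
by_label = demo.loc["c":"e"]

# A Python list slices the same way as iloc
py_list = list("abcdef")[2:4]

print(list(by_position.index))  # ['c', 'd']
print(list(by_label.index))     # ['c', 'd', 'e']
print(py_list)                  # ['c', 'd']
```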

### Use a list for the rows and a slice for the columns

In this example, we use a list for the row selection and slice notation for the columns.

In [None]:
rows = [5, 2, 4]
df.iloc[rows, 3:]

### Selecting some rows and all of the columns
If you leave the column selection empty, then all of the columns will be selected.

In [None]:
rows = [3, 2]
df.iloc[rows]

It is possible to rewrite the above with both row and column selections by using a colon to represent a slice of all of the columns. Just as with `loc`, this can be instructive and reinforces the concept that `iloc` does simultaneous row and column selection, with the row selection first.

In [None]:
df.iloc[rows, :]
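
The sketch below checks, on a small made-up DataFrame, that leaving out the column selection gives the same result as writing the colon explicitly. The name `demo` and its values are hypothetical.

```python
import pandas as pd

# Hypothetical data for illustration only
demo = pd.DataFrame(
    {"a": [1, 2, 3, 4], "b": [5, 6, 7, 8]},
    index=list("wxyz"),
)

rows = [3, 2]

implicit = demo.iloc[rows]      # column selection left out
explicit = demo.iloc[rows, :]   # colon selects all of the columns

# Both forms return the same rows, in the order given by the list
print(implicit.equals(explicit))
print(list(implicit.index))
```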

### Select all of the rows and some of the columns
Let's use a single colon as the row selection to select all of the rows, and a list to select two columns.

In [None]:
cols = [1, 5]
df.iloc[:, cols]

### Cannot do this with *just the brackets*
*Just the brackets* does select columns, but it only understands **labels**, not **integer location**. The following raises a `KeyError` because pandas looks for column names that are the integers `1` and `5`.

In [None]:
cols = [1, 5]
df[cols]
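
The following sketch reproduces the failure on a small made-up DataFrame with string column names and catches the exception so the code runs to completion. The name `demo`, its columns, and the positions used are assumptions for illustration.

```python
import pandas as pd

# Hypothetical data whose column names are all strings
demo = pd.DataFrame(
    {"state": ["NY", "TX"], "color": ["blue", "green"], "age": [30, 2]}
)

cols = [0, 2]

try:
    # Just the brackets looks for columns NAMED 0 and 2, which do not exist
    demo[cols]
    outcome = "no error"
except KeyError:
    outcome = "KeyError"

# iloc interprets the same integers as positions
by_position = demo.iloc[:, cols]

print(outcome)
print(list(by_position.columns))
```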

### Integer column names
pandas allows integers as column names, and you can even mix strings and integers (along with other types). If a column were named with the integer 1, you would select it by writing `df[1]`. I would avoid integer column names where possible, as they are not descriptive.
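
As an illustration of why integer column names can be confusing, this sketch builds a made-up DataFrame whose column names mix a string with the integers `1` and `0`. The name `demo` and its values are hypothetical. With *just the brackets* the integer is treated as a label, while with `iloc` it is treated as a position, so the same integer can refer to two different columns.

```python
import pandas as pd

# Hypothetical data: columns are named 'name', 1, and 0, in that order
demo = pd.DataFrame({"name": ["a", "b"], 1: [10, 20], 0: [0.5, 0.7]})

# Just the brackets: the column whose NAME is the integer 0
by_label = demo[0]

# iloc: the column at POSITION 0, which is the one named 'name'
by_position = demo.iloc[:, 0]

print(list(by_label))     # [0.5, 0.7]
print(by_position.name)   # name
```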

### Select some rows and a single column
In this example, a list of integers is used for the rows along with a single integer for the columns. pandas returns a Series when a single integer is used to select either a row or column.

In [None]:
rows = [2, 3, 5]
cols = 4
df.iloc[rows, cols]

### Select a single row or column as a DataFrame and NOT a Series
You can select a single row (or column) and return a DataFrame and not a Series if you use a list to make the selection. Let's replicate the selection from the previous example, but use a one-item list for the column selection.

In [None]:
rows = [2, 3, 5]
cols = [4]
df.iloc[rows, cols]
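
To see the difference in return types side by side, here is a short sketch on a made-up DataFrame. The name `demo` and its values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical data for illustration only
demo = pd.DataFrame(
    {"a": [1, 2, 3, 4, 5, 6], "b": [10, 20, 30, 40, 50, 60]},
    index=list("uvwxyz"),
)

rows = [2, 3, 5]

# A single integer for the column returns a Series
as_series = demo.iloc[rows, 1]

# A one-item list for the column returns a one-column DataFrame
as_frame = demo.iloc[rows, [1]]

# The same idea applies to rows: a one-item list keeps the DataFrame
one_row_frame = demo.iloc[[2]]

print(type(as_series).__name__)       # Series
print(type(as_frame).__name__)        # DataFrame
print(one_row_frame.shape)            # (1, 2)
```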

### Select a single row as a Series with `iloc`
By passing a single integer to `iloc`, it will select one row as a Series.

In [None]:
df.iloc[2]

## Summary of `iloc`
The `iloc` indexer is analogous to `loc` but only uses **integer location** for selection. The official pandas documentation refers to this as selection by **position**.

* Uses only integer location
* Selects rows and columns simultaneously
* Selection can be a single integer, a list of integers, or a slice of integers
* A comma separates row and column selections

## Exercises
* Use the movie dataset for the following exercises

### Exercise 1
<span  style="color:green; font-size:16px">Select the rows with integer location 10, 5, and 1</span>

In [2]:
import pandas as pd
movies = pd.read_csv('../data/movie.csv', index_col='title')
movies.head(3)

title,year,color,content_rating,duration,director_name,director_fb,actor1,actor1_fb,actor2,actor2_fb,...,actor3_fb,gross,genres,num_reviews,num_voted_users,plot_keywords,language,country,budget,imdb_score
Avatar,2009.0,Color,PG-13,178.0,James Cameron,0.0,CCH Pounder,1000.0,Joel David Moore,936.0,...,855.0,760505847.0,Action|Adventure|Fantasy|Sci-Fi,723.0,886204,avatar|future|marine|native|paraplegic,English,USA,237000000.0,7.9
Pirates of the Caribbean: At World's End,2007.0,Color,PG-13,169.0,Gore Verbinski,563.0,Johnny Depp,40000.0,Orlando Bloom,5000.0,...,1000.0,309404152.0,Action|Adventure|Fantasy,302.0,471220,goddess|marriage ceremony|marriage proposal|pi...,English,USA,300000000.0,7.1
Spectre,2015.0,Color,PG-13,148.0,Sam Mendes,0.0,Christoph Waltz,11000.0,Rory Kinnear,393.0,...,161.0,200074175.0,Action|Adventure|Thriller,602.0,275868,bomb|espionage|sequel|spy|terrorist,English,UK,245000000.0,6.8


In [3]:
rows = [10, 5, 1]
movies.iloc[rows, :]

title,year,color,content_rating,duration,director_name,director_fb,actor1,actor1_fb,actor2,actor2_fb,...,actor3_fb,gross,genres,num_reviews,num_voted_users,plot_keywords,language,country,budget,imdb_score
Batman v Superman: Dawn of Justice,2016.0,Color,PG-13,183.0,Zack Snyder,0.0,Henry Cavill,15000.0,Lauren Cohan,4000.0,...,2000.0,330249062.0,Action|Adventure|Sci-Fi,673.0,371639,based on comic book|batman|sequel to a reboot|...,English,USA,250000000.0,6.9
John Carter,2012.0,Color,PG-13,132.0,Andrew Stanton,475.0,Daryl Sabara,640.0,Samantha Morton,632.0,...,530.0,73058679.0,Action|Adventure|Sci-Fi,462.0,212204,alien|american civil war|male nipple|mars|prin...,English,USA,263700000.0,6.6
Pirates of the Caribbean: At World's End,2007.0,Color,PG-13,169.0,Gore Verbinski,563.0,Johnny Depp,40000.0,Orlando Bloom,5000.0,...,1000.0,309404152.0,Action|Adventure|Fantasy,302.0,471220,goddess|marriage ceremony|marriage proposal|pi...,English,USA,300000000.0,7.1


### Exercise 2
<span  style="color:green; font-size:16px">Select the columns with integer location 10, 5, and 1</span>

In [4]:
cols = [10, 5, 1]
movies.iloc[:, cols]

title,actor3,director_fb,color
Avatar,Wes Studi,0.0,Color
Pirates of the Caribbean: At World's End,Jack Davenport,563.0,Color
Spectre,Stephanie Sigman,0.0,Color
The Dark Knight Rises,Joseph Gordon-Levitt,22000.0,Color
Star Wars: Episode VII - The Force Awakens,,131.0,
...,...,...,...
Signed Sealed Delivered,Crystal Lowe,2.0,Color
The Following,Sam Underwood,,Color
A Plague So Pleasant,David Chandler,0.0,Color
Shanghai Calling,Eliza Coupe,0.0,Color


### Exercise 3
<span  style="color:green; font-size:16px">Select rows with integer location 100 up to, but not including, 104 along with the column at integer location 5.</span>

In [6]:
movies.iloc[100:104, [5]]

title,director_fb
The Fast and the Furious,357.0
The Curious Case of Benjamin Button,21000.0
X-Men: First Class,905.0
The Hunger Games: Mockingjay - Part 2,508.0
