# Selecting Subsets of Data from DataFrames with `iloc`

The `iloc` indexer is very similar to the `loc` indexer but only uses **integer location** to make its subset selections. The word `iloc` itself stands for integer location and can help remind you what it does.

## Simultaneous row and column subset selection

The `iloc` indexer is capable of making simultaneous row and column selections just like `loc`. Selection with `iloc` takes the following form, with a comma separating the row and column selections.

```python
df.iloc[rows, cols]
```

Let's read in some sample data and then begin making selections with integer location using `iloc`.

In [1]:
import pandas as pd
df = pd.read_csv('input/sample_data.csv', index_col=0)
df

### What is integer location?

Integer location refers to a row or column by its position rather than by its label. The first row/column is at integer location 0, and each subsequent row/column is at the next integer. The last row/column is at integer location `n - 1`, where `n` is the number of rows/columns.
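
As a minimal sketch of this numbering, the block below builds a small stand-in DataFrame (the name `demo` and its values are hypothetical and are not the contents of `sample_data.csv`) and selects the first and last positions on each axis.

```python
import pandas as pd

# Hypothetical stand-in data, not the tutorial's sample_data.csv
demo = pd.DataFrame(
    {"state": ["NY", "TX", "FL"], "age": [30, 2, 12], "score": [4.6, 8.3, 9.0]},
    index=["Jane", "Niko", "Aaron"],
)

n_rows, n_cols = demo.shape  # 3 rows, 3 columns

first_row = demo.iloc[0]             # integer location 0: the first row
last_row = demo.iloc[n_rows - 1]     # integer location n - 1: the last row
last_col = demo.iloc[:, n_cols - 1]  # the same rule applies to columns

# The labels play no part in the selection; they only come along with the result
print(first_row.name, last_row.name, last_col.name)  # Jane Aaron score
```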

### Select using a list for both rows and columns

Let's select the rows at integer locations 2 and 4 along with the first and last columns. Negative integers count from the end, just as they do with Python lists, so the integer location -1 below refers to the last column.

In [None]:
rows = [2, 4]
cols = [0, -1]
df.iloc[rows, cols]
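
To check the negative-integer rule independently of the sample file, here is a small sketch on a hypothetical stand-in DataFrame (`demo` and its values are assumptions for illustration). It compares `-1` with the explicit last position `n_cols - 1`.

```python
import pandas as pd

# Hypothetical stand-in data, not the tutorial's sample_data.csv
demo = pd.DataFrame(
    {
        "state": ["NY", "TX", "FL", "AL", "AK"],
        "age": [30, 2, 12, 4, 32],
        "score": [4.6, 8.3, 9.0, 3.3, 1.8],
    },
    index=["Jane", "Niko", "Aaron", "Penelope", "Dean"],
)

n_cols = demo.shape[1]

with_negative = demo.iloc[[2, 4], [0, -1]]          # -1 counts from the end
with_positive = demo.iloc[[2, 4], [0, n_cols - 1]]  # explicit last position

print(list(with_negative.columns))          # ['state', 'score']
print(with_negative.equals(with_positive))  # True
```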

### The possible types of selections for `iloc`

In the above example, we used a list of integers for both the row and column selection. You are not limited to just lists. All of the following are valid objects for both row and column selections with `iloc`. Unlike `loc`, `iloc` does not accept a boolean Series; boolean selection with `iloc` requires a plain boolean list or NumPy array instead.

* A single integer
* A list of integers
* A slice with integers

### Slice the rows and use a list for the columns

Let's use slice notation to select rows with integer location 2 and 3 and a list to select columns with integer location 4 and 2. Notice that the stop integer location is **excluded** with `iloc`, which is exactly how slicing works with Python lists, tuples, and strings. Slicing with `loc` is **inclusive** of the stop label.

In [None]:
cols = [4, 2]
df.iloc[2:4, cols]
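
The difference between the two slicing rules is easiest to see side by side. This is a sketch on a hypothetical stand-in DataFrame (`demo` and its labels are assumptions): the `iloc` slice stops before position 4, while the `loc` slice includes its stop label.

```python
import pandas as pd

# Hypothetical stand-in data, not the tutorial's sample_data.csv
demo = pd.DataFrame(
    {"age": [30, 2, 12, 4, 32]},
    index=["Jane", "Niko", "Aaron", "Penelope", "Dean"],
)

by_position = demo.iloc[2:4]         # integer locations 2 and 3; 4 is excluded
by_label = demo.loc["Aaron":"Dean"]  # the stop label "Dean" is included

print(list(by_position.index))  # ['Aaron', 'Penelope']
print(list(by_label.index))     # ['Aaron', 'Penelope', 'Dean']
```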

### Use a list for the rows and a slice for the columns

In this example, we use a list for the row selection and slice notation for the columns.

In [None]:
rows = [5, 2, 4]
df.iloc[rows, 3:]

### Select all of the rows and some of the columns

You can use an empty slice (just the colon) to select all of the rows or columns. In the example below, we select all of the rows and some of the columns with a list.

In [None]:
cols = [2, 4]
df.iloc[:, cols]

### Select some of the rows and all of the columns

We can again use an empty slice, but do so to select all of the columns. We use a list to select some of the rows.

In [None]:
rows = [-3, -1, -2]
df.iloc[rows, :]

It is possible to rewrite the above without the column selection. pandas defaults to selecting all of the columns if a selection for them is not explicitly present.

In [None]:
df.iloc[rows]
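
As a quick check that the two forms are equivalent, the sketch below runs both on a hypothetical stand-in DataFrame (`demo` is an assumption for illustration) and compares the results.

```python
import pandas as pd

# Hypothetical stand-in data, not the tutorial's sample_data.csv
demo = pd.DataFrame(
    {"age": [30, 2, 12, 4, 32], "score": [4.6, 8.3, 9.0, 3.3, 1.8]},
    index=["Jane", "Niko", "Aaron", "Penelope", "Dean"],
)

rows = [-3, -1, -2]

explicit = demo.iloc[rows, :]  # empty slice selects every column
implicit = demo.iloc[rows]     # column selection omitted entirely

print(list(implicit.index))       # ['Aaron', 'Dean', 'Penelope']
print(explicit.equals(implicit))  # True
```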

### Select a single row and a single column

We can select a single value in our DataFrame using `iloc` by providing a single integer for both the row and column selection. This returns the value itself as a scalar, not wrapped in a DataFrame or Series.

In [None]:
df.iloc[3, 2]

### Select a single row and a single column as a DataFrame

It is possible to select the above value as a DataFrame by using one-item lists for the row and column selections. The output looks a little bizarre, but it's just a DataFrame with one row and one column.

In [None]:
rows = [3]
cols = [2]
df.iloc[rows, cols]
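
The contrast between the scalar and the one-cell DataFrame can be sketched on a hypothetical stand-in DataFrame (`demo` and its values are assumptions, not the tutorial's data).

```python
import pandas as pd

# Hypothetical stand-in data, not the tutorial's sample_data.csv
demo = pd.DataFrame(
    {
        "state": ["NY", "TX", "FL", "AL"],
        "age": [30, 2, 12, 4],
        "score": [4.6, 8.3, 9.0, 3.3],
    },
    index=["Jane", "Niko", "Aaron", "Penelope"],
)

scalar = demo.iloc[3, 2]     # two single integers: the bare value
frame = demo.iloc[[3], [2]]  # two one-item lists: a 1 x 1 DataFrame

print(scalar)       # 3.3
print(frame.shape)  # (1, 1)
```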

### Select some rows and a single column

In this example, a list of integers is used for the rows and a single integer for the columns. pandas returns a Series when a single integer is used for exactly one of the row or column selections and the other selection is a list or slice.

In [None]:
rows = [2, 3, 5]
cols = 4
df.iloc[rows, cols]

### Select a single row or column as a DataFrame and NOT a Series

You can select a single row (or column) and return a DataFrame and not a Series if you use a list to make the selection. Let's replicate the selection from the previous example, but use a one-item list for the column selection.

In [None]:
rows = [2, 3, 5]
cols = [4]
df.iloc[rows, cols]
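
The effect of the one-item list on the return type can be sketched as follows, using a hypothetical stand-in DataFrame (`demo` and its values are assumptions for illustration).

```python
import pandas as pd

# Hypothetical stand-in data, not the tutorial's sample_data.csv
demo = pd.DataFrame(
    {
        "state": ["NY", "TX", "FL", "AL", "AK", "TX"],
        "color": ["blue", "green", "red", "white", "gray", "black"],
        "age": [30, 2, 12, 4, 32, 33],
        "height": [165, 70, 120, 80, 180, 172],
        "score": [4.6, 8.3, 9.0, 3.3, 1.8, 9.5],
    },
    index=["Jane", "Niko", "Aaron", "Penelope", "Dean", "Christina"],
)

rows = [2, 3, 5]

as_series = demo.iloc[rows, 4]   # single integer for the column: a Series
as_frame = demo.iloc[rows, [4]]  # one-item list for the column: a DataFrame

print(type(as_series).__name__, as_series.shape)  # Series (3,)
print(type(as_frame).__name__, as_frame.shape)    # DataFrame (3, 1)
```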

### Select a single row as a Series

We can select a single row by providing a single integer as the row selection for `iloc`. We use an empty slice to select all of the columns. Because we are selecting a single row, a Series is returned. Just as with `loc`, the returned output can be confusing as the original horizontal row is now displayed vertically.

In [None]:
df.iloc[2, :]

To maintain the original orientation, we can select the row using a one-item list which returns a DataFrame.

In [None]:
df.iloc[[2], :]
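
To see why the Series version appears rotated, note that the column labels of the DataFrame become the index of the returned Series. The sketch below shows this on a hypothetical stand-in DataFrame (`demo` is an assumption, not the tutorial's data).

```python
import pandas as pd

# Hypothetical stand-in data, not the tutorial's sample_data.csv
demo = pd.DataFrame(
    {
        "state": ["NY", "TX", "FL", "AL"],
        "age": [30, 2, 12, 4],
        "score": [4.6, 8.3, 9.0, 3.3],
    },
    index=["Jane", "Niko", "Aaron", "Penelope"],
)

row_series = demo.iloc[2, :]   # single integer: a Series, displayed vertically
row_frame = demo.iloc[[2], :]  # one-item list: a DataFrame with one row

print(row_series.name)                               # Aaron
print(list(row_series.index) == list(demo.columns))  # True
print(row_frame.shape)                               # (1, 3)
```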

## Summary of `iloc`

The `iloc` indexer is analogous to `loc` but only uses **integer location** for selection. The official pandas documentation refers to it as selection by **position**.

* Uses only integer location
* Selects rows and columns simultaneously with `df.iloc[rows, cols]`
* Selection can be
    * a single integer
    * a list of integers
    * a slice of integers
* A comma separates row and column selections

## Exercises

Read in the movie dataset by executing the cell below and use it for the following exercises.

In [None]:
pd.set_option('display.max_columns', 50)
movie = pd.read_csv('input/movie.csv', index_col='title')
movie.head(3)