# Numeric Indexing of DataFrames

This lesson covers:

* Accessing specific elements in DataFrames using numeric indices

Accessing elements in a DataFrame is a common task. To begin this lesson,
clear the workspace, then set up a vector and a $5\times5$ array. The Series
and DataFrames built from them make it easy to determine which elements are
selected by a command.

Begin by creating:

* A 5-by-5 DataFrame `x_df` containing `np.arange(25).reshape((5,5))`.
* A 5-element Series `y_s` containing `np.arange(5)`.
* A 5-by-5 DataFrame `x_named` that is `x_df` with columns 'c0', 'c1', ...,
  'c4' and rows 'r0', 'r1', ..., 'r4'.
* A 5-element Series `y_named` with index 'r0', 'r1', ..., 'r4'. 

In [31]:
import numpy as np
import pandas as pd

x = np.arange(25).reshape((5,5))  
y = np.arange(5)


x_df = pd.DataFrame(x)
x_named = pd.DataFrame(x, index=['r0','r1','r2','r3','r4'],
                       columns=['c0','c1','c2','c3','c4'])
y_s = pd.Series(y)
y_named = pd.Series(y, index=['r0','r1','r2','r3','r4'])

print(f'x_df = \n{x_df}')
print(f'y_s = \n{y_s}')

print(f'x_named = \n{x_named}')
print(f'y_named = \n{y_named}')

x_df = 
    0   1   2   3   4
0   0   1   2   3   4
1   5   6   7   8   9
2  10  11  12  13  14
3  15  16  17  18  19
4  20  21  22  23  24
y_s = 
0    0
1    1
2    2
3    3
4    4
dtype: int32
x_named = 
    c0  c1  c2  c3  c4
r0   0   1   2   3   4
r1   5   6   7   8   9
r2  10  11  12  13  14
r3  15  16  17  18  19
r4  20  21  22  23  24
y_named = 
r0    0
r1    1
r2    2
r3    3
r4    4
dtype: int32


## Problem: Picking an Element out of a DataFrame

Using double index notation, select the (0,2) and the (2,0) element of
`x_named`.

In [32]:
print(x_named.iloc[0, 2])
print(x_named.iloc[2, 0])

2
10
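
As a brief illustration of what "numeric" means here, the sketch below rebuilds `x_named` and compares `iloc` with label-based `loc`. The reversed frame `x_reversed` is a hypothetical helper, not part of the lesson; it only shows that `iloc` follows position, not labels.

```python
import numpy as np
import pandas as pd

rows = [f'r{i}' for i in range(5)]
cols = [f'c{i}' for i in range(5)]
x_named = pd.DataFrame(np.arange(25).reshape((5, 5)), index=rows, columns=cols)

# Same cell addressed two ways: by position and by label
by_position = x_named.iloc[0, 2]
by_label = x_named.loc['r0', 'c2']
print(by_position, by_label)  # 2 2

# Reverse the row order (illustrative only): labels travel with the data,
# positions do not
x_reversed = x_named.iloc[::-1]
print(x_reversed.iloc[0, 2])        # 22, the first row is now r4
print(x_reversed.loc['r0', 'c2'])   # 2, the label still finds the same cell
```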


## Problem: Select Elements from Series

Select the 2nd element of `y_named`.

In [33]:
y_named.iloc[1:2]


r1    1
dtype: int32
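
The slice above returns a one-element Series. If only the value is needed, a scalar index can be used instead. A minimal sketch of the two forms, rebuilding `y_named` as in the setup cell:

```python
import numpy as np
import pandas as pd

y_named = pd.Series(np.arange(5), index=['r0', 'r1', 'r2', 'r3', 'r4'])

as_scalar = y_named.iloc[1]    # scalar index: returns the value itself
as_series = y_named.iloc[1:2]  # slice: returns a Series that keeps the label r1

print(as_scalar)
print(type(as_series).__name__, list(as_series.index))
```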

## Problem: Selecting Rows as Series

Select the 2nd row of `x_named` using the colon (:) operator.


In [34]:
x_named.iloc[1, :]


c0    5
c1    6
c2    7
c3    8
c4    9
Name: r1, dtype: int32

## Problem: Selecting Rows as DataFrames

1. Select the 2nd row of `x_named` using a slice so that the selection
   remains a DataFrame.
2. Repeat using a list of indices to retain the DataFrame. 


In [35]:
x_named.iloc[1:2, :]

    c0  c1  c2  c3  c4
r1   5   6   7   8   9


In [36]:
x_named.iloc[[1], :]


    c0  c1  c2  c3  c4
r1   5   6   7   8   9
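
The difference between the Series and DataFrame forms of a row selection is easiest to see from the type and shape of each result. A short sketch, assuming the same `x_named` as in the setup cell:

```python
import numpy as np
import pandas as pd

rows = [f'r{i}' for i in range(5)]
cols = [f'c{i}' for i in range(5)]
x_named = pd.DataFrame(np.arange(25).reshape((5, 5)), index=rows, columns=cols)

row_series = x_named.iloc[1, :]        # scalar row index: dimension is dropped
row_frame_slice = x_named.iloc[1:2, :]  # slice: stays two-dimensional
row_frame_list = x_named.iloc[[1], :]   # one-element list: stays two-dimensional

for result in (row_series, row_frame_slice, row_frame_list):
    print(type(result).__name__, result.shape)
```

The scalar index produces a Series of shape `(5,)`, while the slice and the list both produce a DataFrame of shape `(1, 5)`.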


## Problem: Selecting Entire Columns as Series
Select the 2nd column of `x_named` using the colon (:) operator. 

In [37]:
print(x_named.iloc[:, 1])

r0     1
r1     6
r2    11
r3    16
r4    21
Name: c1, dtype: int32


## Problem: Selecting Single Columns as DataFrames
Select the 2nd column of `x_named` so that the selection remains a DataFrame,
first using a slice and then using a list of indices.


In [38]:
x_named.iloc[:, 1:2]

    c1
r0   1
r1   6
r2  11
r3  16
r4  21


In [39]:
x_named.iloc[:, [1]]


    c1
r0   1
r1   6
r2  11
r3  16
r4  21


## Problem: Selecting Specific Columns
Select the 2nd and 3rd columns of `x_named` using a slice.

In [40]:
print(x_named.iloc[:, 1:3])

    c1  c2
r0   1   2
r1   6   7
r2  11  12
r3  16  17
r4  21  22
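
One detail worth keeping in mind with slices: `iloc` slices follow Python conventions and exclude the end position, while label slices with `loc` include the end label. A minimal sketch, rebuilding `x_named` as in the setup cell:

```python
import numpy as np
import pandas as pd

rows = [f'r{i}' for i in range(5)]
cols = [f'c{i}' for i in range(5)]
x_named = pd.DataFrame(np.arange(25).reshape((5, 5)), index=rows, columns=cols)

by_position = x_named.iloc[:, 1:3]      # positions 1 and 2; position 3 is excluded
by_label = x_named.loc[:, 'c1':'c2']    # labels c1 through c2; c2 is included

print(list(by_position.columns))
print(by_position.equals(by_label))
```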


## Problem: Select Specific Rows

Select the 2nd and 4th rows of `x_named` using a slice.  Repeat the 
selection using a list of integers.

In [41]:
x_named.iloc[1:4:2, :]

    c0  c1  c2  c3  c4
r1   5   6   7   8   9
r3  15  16  17  18  19


In [42]:
x_named.iloc[[1, 3], :]


    c0  c1  c2  c3  c4
r1   5   6   7   8   9
r3  15  16  17  18  19
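
A stepped slice and a list give the same result here because rows 1 and 3 are evenly spaced. A list is the more general tool: it can also pick unevenly spaced rows or change their order. The sketch below assumes the same `x_named` as in the setup cell; the selection `[3, 0]` is only an illustration.

```python
import numpy as np
import pandas as pd

rows = [f'r{i}' for i in range(5)]
cols = [f'c{i}' for i in range(5)]
x_named = pd.DataFrame(np.arange(25).reshape((5, 5)), index=rows, columns=cols)

with_slice = x_named.iloc[1:4:2, :]   # start at 1, stop before 4, step 2
with_list = x_named.iloc[[1, 3], :]   # explicit positions
print(with_slice.equals(with_list))

# A list can reorder rows, which a slice with a positive step cannot do
reordered = x_named.iloc[[3, 0], :]
print(list(reordered.index))
```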


## Problem: Select Arbitrary Rows and Columns

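Combine the previous selections to select the 2nd and 3rd columns and the 2nd
and 4th rows of `x_named`.

**Note**: This is the only important difference from NumPy indexing. When two
lists are passed to `DataFrame.iloc`, every listed row is combined with every
listed column, so the result is a block. NumPy instead pairs the two lists
element by element. Arbitrary row/column selection using `DataFrame.iloc` is
simpler but less flexible.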

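# Slice for the rows and slice for the columns
print(x_named.iloc[1:4:2, 1:3])
# Two lists: the block formed by rows 1, 3 and columns 1, 2
print(x_named.iloc[[1, 3], [1, 2]])
# List for the rows and slice for the columns
print(x_named.iloc[[1, 3], 1:3])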
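
To make the contrast with NumPy concrete, the sketch below applies the same pair of lists to the underlying array and to the DataFrame. It rebuilds `x` and `x_named` as in the setup cell; `np.ix_` is shown as one way to get the block selection in NumPy.

```python
import numpy as np
import pandas as pd

x = np.arange(25).reshape((5, 5))
rows = [f'r{i}' for i in range(5)]
cols = [f'c{i}' for i in range(5)]
x_named = pd.DataFrame(x, index=rows, columns=cols)

# NumPy pairs the lists: elements (1, 1) and (3, 2)
pairwise = x[[1, 3], [1, 2]]
print(pairwise)            # [ 6 17]

# NumPy needs np.ix_ to get the 2-by-2 block of rows 1, 3 and columns 1, 2
block_np = x[np.ix_([1, 3], [1, 2])]
print(block_np)

# iloc with two lists returns the block directly
block_df = x_named.iloc[[1, 3], [1, 2]]
print(block_df)
```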

## Problem: Mixed Selection

Select the columns c1 and c2 and rows 0, 2 and 4.

In [43]:
x_named[['c1', 'c2']].iloc[[0, 2, 4]]

    c1  c2
r0   1   2
r2  11  12
r4  21  22


## Problem: Mixed Selection 2

Select the rows r1 and r2 and columns 0, 2 and 4.

In [44]:
x_named.loc[['r1','r2']].iloc[:, [0, 2, 4]]

    c0  c2  c4
r1   5   7   9
r2  10  12  14
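
The two mixed selections chain one label-based step and one position-based step. A possible alternative, sketched here under the same setup, is to translate the positions into labels first and then use a single `loc` call. Indexing `x_named.index` or `x_named.columns` with a list of positions returns the matching labels.

```python
import numpy as np
import pandas as pd

rows = [f'r{i}' for i in range(5)]
cols = [f'c{i}' for i in range(5)]
x_named = pd.DataFrame(np.arange(25).reshape((5, 5)), index=rows, columns=cols)

# Columns by label, rows by position: convert row positions to row labels
row_labels = x_named.index[[0, 2, 4]]
mixed_1 = x_named.loc[row_labels, ['c1', 'c2']]

# Rows by label, columns by position: convert column positions to column labels
col_labels = x_named.columns[[0, 2, 4]]
mixed_2 = x_named.loc[['r1', 'r2'], col_labels]

print(mixed_1)
print(mixed_2)
```

Both approaches return the same frames for this data; the chained version may read more naturally, while the single `loc` call avoids creating an intermediate selection.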
