## Subsetting Data
### Working with pandas
*Curtis Miller*

In this notebook we will subset `Series` and `DataFrame`s in a variety of ways.

We start by creating some `Series` and `DataFrame`s to work with.

In [None]:
import pandas as pd
from pandas import Series, DataFrame
import numpy as np

In [None]:
srs = Series(np.arange(5),
             index=["alpha", "beta", "gamma", "delta", "epsilon"])
srs

In [None]:
srs[:2]    # Select the first two elements (positions 0 and 1)

In [None]:
srs[["beta", "delta"]]    # Select by a list of index labels

In [None]:
srs["beta":"delta"]     # Select everything BETWEEN (and
                        # including) beta and delta

In [None]:
srs[srs > 3]    # Select elements of srs greater than 3

In [None]:
srs > 3    # The Boolean Series that serves as the indexing object
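
Because the indexing object is just a Boolean `Series`, several conditions can be combined before subsetting. The following is a minimal sketch rather than part of the original notebook; the names `s`, `middle`, `ends`, and `not_big` are illustrative only.

```python
import numpy as np
import pandas as pd

# Same data as srs above, under a separate name so nothing is overwritten
s = pd.Series(np.arange(5),
              index=["alpha", "beta", "gamma", "delta", "epsilon"])

# & (and), | (or), and ~ (not) combine Boolean Series elementwise.
# The parentheses are needed because these operators bind more
# tightly than the comparisons.
middle = s[(s > 1) & (s < 4)]     # values strictly between 1 and 4
ends = s[(s < 1) | (s > 3)]       # values below 1 or above 3
not_big = s[~(s > 3)]             # everything that is NOT greater than 3

print(middle)
print(ends)
print(not_big)
```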

Consider the following `Series`. Notice that its index consists of the integers 0 through 4 in rearranged order.

In [None]:
srs2 = Series(["zero", "one", "two", "three", "four"],
              index=[3, 2, 4, 0, 1])

srs2

What will the following do?

In [None]:
srs2[2:4]    # Ambiguous: are 2 and 4 positions or index labels?

In [None]:
srs2.iloc[2:4]    # By position: the third and fourth elements

In [None]:
srs2.loc[2:4]    # By label: from label 2 through label 4, inclusive
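
To make the difference between the two accessors concrete, here is a small sketch that rebuilds the same data under the illustrative name `s2` and compares the results. It is an added example, not part of the original notebook.

```python
import pandas as pd

# Same data as srs2 above: integer labels in shuffled order
s2 = pd.Series(["zero", "one", "two", "three", "four"],
               index=[3, 2, 4, 0, 1])

# .iloc counts positions and excludes the end of the slice
by_position = s2.iloc[2:4]

# .loc looks up labels and includes both endpoints, following
# the order in which the labels appear in the index
by_label = s2.loc[2:4]

print(by_position)
print(by_label)
```

Spelling out `.iloc` or `.loc` removes the ambiguity, so the code no longer depends on how plain `[]` interprets integers.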

Now let's work with `DataFrame`s.

In [None]:
df = DataFrame(np.arange(21).reshape(7, 3),
               columns=['AAA', 'BBB', 'CCC'],
               index=["alpha", "beta", "gamma", "delta",
                      "epsilon", "zeta", "eta"])
df

In [None]:
df.AAA    # Attribute access to a single column

In [None]:
df['AAA']    # The same column, selected by name

In [None]:
df[['BBB', 'CCC']]    # A list of names selects several columns

In [None]:
df.iloc[1:3, 1:2]    # By position: rows 1-2, column 1 (ends excluded)

In [None]:
df.loc['beta':'delta', 'BBB':'CCC']    # By label: both endpoints included
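
The two cells above follow different endpoint rules, which the sketch below checks directly. It rebuilds the same data under the illustrative name `frame`; the other variable names are also assumptions made for this example.

```python
import numpy as np
import pandas as pd

# Same data as df above, under a separate name
frame = pd.DataFrame(np.arange(21).reshape(7, 3),
                     columns=["AAA", "BBB", "CCC"],
                     index=["alpha", "beta", "gamma", "delta",
                            "epsilon", "zeta", "eta"])

# Positional slices exclude the end: rows 1 and 2, column 1 only
by_position = frame.iloc[1:3, 1:2]

# Label slices include the end: beta through delta, BBB through CCC
by_label = frame.loc["beta":"delta", "BBB":"CCC"]

# A scalar column position (1 instead of 1:2) returns a Series
as_series = frame.iloc[1:3, 1]

print(by_position.shape, by_label.shape, type(as_series).__name__)
```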

In [None]:
df.iloc[:, 1:3]    # All rows, columns at positions 1 and 2

In [None]:
# Mixing: select columns by position, then rows by label
df.iloc[:, 1:3].loc[['alpha', 'gamma', 'zeta']]

In [None]:
# .copy() makes df2 independent of df, so later edits leave df untouched
df2 = df.iloc[:, 1:3].loc[['alpha', 'gamma', 'zeta']].copy()

df2
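
The `.copy()` call gives an independent object, so changing the subset should leave the source data untouched. The sketch below checks this on a rebuilt version of the same data; `frame` and `subset` are illustrative names, not part of the original notebook.

```python
import numpy as np
import pandas as pd

# Same data as df above, under a separate name
frame = pd.DataFrame(np.arange(21).reshape(7, 3),
                     columns=["AAA", "BBB", "CCC"],
                     index=["alpha", "beta", "gamma", "delta",
                            "epsilon", "zeta", "eta"])

# An explicit, independent copy of the selected rows and columns
subset = frame.iloc[:, 1:3].loc[["alpha", "gamma", "zeta"]].copy()

# Overwrite one cell of the copy (row alpha, column BBB)
subset.iloc[0, 0] = -1

print(subset.loc["alpha", "BBB"])   # value in the copy
print(frame.loc["alpha", "BBB"])    # value in the source frame
```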

Let's now look at changing the contents of `DataFrame`s.

In [None]:
# Replace a column with a Series; values are matched by index label
df2['CCC'] = Series({'alpha': 11, 'gamma': 18, 'zeta': 5})

df2
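
Assigning a `Series` to a column matches values by index label, not by order. The following sketch illustrates this with a `Series` that is out of order and missing one label; `demo` and `partial` are hypothetical names introduced for this example.

```python
import numpy as np
import pandas as pd

# Rebuild the same three-row subset used above, under a separate name
frame = pd.DataFrame(np.arange(21).reshape(7, 3),
                     columns=["AAA", "BBB", "CCC"],
                     index=["alpha", "beta", "gamma", "delta",
                            "epsilon", "zeta", "eta"])
demo = frame.iloc[:, 1:3].loc[["alpha", "gamma", "zeta"]].copy()

# Labels are out of order and "gamma" is absent
partial = pd.Series({"zeta": 5, "alpha": 11})

# Each value lands in the row with the matching label;
# rows with no matching label receive NaN
demo["CCC"] = partial

print(demo)
```

If a missing value is not wanted, the `Series` being assigned needs an entry for every row label of the `DataFrame`.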

In [None]:
df2.iloc[1, 1] = 2    # Set a single cell: second row, second column
df2

In [None]:
df2.iloc[:, 1] = 0    # Set every row of the second column to 0
df2