# Data Frame Selection

In [8]:
import pandas as pd

# Load data into dataframe
df = pd.read_csv("./data/iris-with-header.tsv", delimiter='\t')
df.head()

|   | SepalLengthCm | SepalWidthCm | PetalLengthCm | PetalWidthCm | Species |
|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | Iris-setosa |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | Iris-setosa |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | Iris-setosa |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | Iris-setosa |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | Iris-setosa |


## Get columns by column name

In [6]:
df['Species'].head()

0    Iris-setosa
1    Iris-setosa
2    Iris-setosa
3    Iris-setosa
4    Iris-setosa
Name: Species, dtype: object

In [7]:
df[['SepalLengthCm', 'Species']].head()

|   | SepalLengthCm | Species |
|---|---|---|
| 0 | 5.1 | Iris-setosa |
| 1 | 4.9 | Iris-setosa |
| 2 | 4.7 | Iris-setosa |
| 3 | 4.6 | Iris-setosa |
| 4 | 5.0 | Iris-setosa |
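
The two cells above differ in the type they return: a single label gives a `Series`, while a list of labels gives a `DataFrame`, even when the list holds one name. A minimal sketch, assuming a small hand-built frame (the values are copied from the first rows above, and the variable names `single` and `double` are only illustrative):

```python
import pandas as pd

# Small stand-in for the iris data so the example runs without the TSV file.
df = pd.DataFrame({
    "SepalLengthCm": [5.1, 4.9, 4.7],
    "Species": ["Iris-setosa", "Iris-setosa", "Iris-setosa"],
})

single = df["Species"]    # one label: returns a Series
double = df[["Species"]]  # list of labels: returns a DataFrame with one column

print(type(single).__name__)
print(type(double).__name__)
```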


## Get rows by index

In [17]:
df[0:1]

|   | SepalLengthCm | SepalWidthCm | PetalLengthCm | PetalWidthCm | Species |
|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | Iris-setosa |
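
A slice inside `[]` selects rows by position, and the end of the slice is excluded, which is why `df[0:1]` returns a single row. The sketch below uses a hypothetical frame with string labels to make it visible that the slice counts positions rather than looking up labels:

```python
import pandas as pd

# Hypothetical frame with a string index, so positions and labels differ.
df = pd.DataFrame(
    {"SepalLengthCm": [5.1, 4.9, 4.7, 4.6]},
    index=["a", "b", "c", "d"],
)

first_two = df[0:2]        # positions 0 and 1; position 2 is excluded
same_rows = df.iloc[0:2]   # the explicit positional equivalent

print(first_two)
```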


## Get elements by `iloc`, `iat`, `ix`, `loc`
[Pandas Documentation](http://pandas.pydata.org/pandas-docs/stable/indexing.html)

`.iloc` is purely **integer-location based** indexing for selection by position.

In [47]:
print(df.iloc[0]) 
print("")
print(df.iloc[0,4])

SepalLengthCm            5.1
SepalWidthCm             3.5
PetalLengthCm            1.4
PetalWidthCm             0.2
Species          Iris-setosa
Name: 0, dtype: object

Iris-setosa
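
Beyond single positions, `.iloc` also accepts slices and negative positions. A short sketch on a hand-built stand-in for the iris frame (values taken from the first rows shown earlier; the variable names are illustrative):

```python
import pandas as pd

# Stand-in for the first three iris rows.
df = pd.DataFrame({
    "SepalLengthCm": [5.1, 4.9, 4.7],
    "SepalWidthCm": [3.5, 3.0, 3.2],
    "PetalLengthCm": [1.4, 1.4, 1.3],
    "PetalWidthCm": [0.2, 0.2, 0.2],
    "Species": ["Iris-setosa", "Iris-setosa", "Iris-setosa"],
})

block = df.iloc[0:2, 0:2]     # first two rows, first two columns (end excluded)
last_cell = df.iloc[-1, -1]   # last row, last column
some_rows = df.iloc[[0, 2]]   # explicit list of row positions

print(block)
print(last_cell)
```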


`.loc` is primarily **label based**, but may also be used with a **boolean array**. `.loc` raises a `KeyError` when the items are not found.

In [52]:
print(df.loc[0])
print('')
print(df.loc[0, 'Species'])

SepalLengthCm            5.1
SepalWidthCm             3.5
PetalLengthCm            1.4
PetalWidthCm             0.2
Species          Iris-setosa
Name: 0, dtype: object

Iris-setosa
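
The boolean-array form and the `KeyError` behaviour can be sketched as follows. The frame is a small stand-in for the iris data, and the label `99` is simply a label assumed not to exist in it. Note also that label slices with `.loc` include both endpoints, unlike positional slices:

```python
import pandas as pd

# Stand-in for the first five iris rows.
df = pd.DataFrame({
    "SepalLengthCm": [5.1, 4.9, 4.7, 4.6, 5.0],
    "Species": ["Iris-setosa"] * 5,
})

# Boolean array: keep rows where the condition holds, and pick one column.
mask = df["SepalLengthCm"] >= 5.0
long_sepals = df.loc[mask, ["SepalLengthCm"]]

# Label slice: both endpoints are included.
rows_0_to_1 = df.loc[0:1]

# A label that is not in the index raises KeyError.
try:
    df.loc[99]
    missing_raises = False
except KeyError:
    missing_raises = True

print(long_sepals)
```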


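`.ix` supports **mixed integer** and **label** based access. It was deprecated in pandas 0.20 and removed in pandas 1.0, so the cell below only runs on older versions; prefer `.loc` and `.iloc` in new code.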

In [58]:
print(df.ix[0])
print('')
print(df.ix[0,0])
print('')
print(df.ix[0,'Species'])

SepalLengthCm            5.1
SepalWidthCm             3.5
PetalLengthCm            1.4
PetalWidthCm             0.2
Species          Iris-setosa
Name: 0, dtype: object

5.1

Iris-setosa
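
On pandas versions where `.ix` is unavailable, the same three lookups can be written with `.loc` and `.iloc`. This is a sketch on a stand-in frame, not a drop-in replacement for every `.ix` use; for a truly mixed lookup (row by position, column by label), one option is to translate the label to a position with `columns.get_loc`:

```python
import pandas as pd

# Stand-in for the first two iris rows.
df = pd.DataFrame({
    "SepalLengthCm": [5.1, 4.9],
    "Species": ["Iris-setosa", "Iris-setosa"],
})

first_row = df.loc[0]                 # replaces df.ix[0] (label lookup)
by_position = df.iloc[0, 0]           # replaces df.ix[0, 0]
by_label = df.loc[0, "Species"]       # replaces df.ix[0, 'Species']

# Mixed access: row by position, column by label.
mixed = df.iloc[0, df.columns.get_loc("Species")]

print(by_position)
print(by_label)
```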


## Get scalar values
Similar to `loc`, `at` provides **label based scalar** lookups, while `iat` provides **integer based** scalar lookups, analogous to `iloc`.

In [63]:
# TODO Series example using s.at
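
One possible Series example for the placeholder above, using a hypothetical Series `s` with string labels so that label and position lookups are easy to tell apart. Both accessors can also be used to assign a single value:

```python
import pandas as pd

# Hypothetical Series; the labels "a", "b", "c" are only for illustration.
s = pd.Series([5.1, 4.9, 4.7], index=["a", "b", "c"], name="SepalLengthCm")

value_by_label = s.at["b"]     # scalar lookup by label
value_by_position = s.iat[1]   # scalar lookup by integer position

s.at["c"] = 4.8                # scalar assignment by label

print(value_by_label)
print(value_by_position)
```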

In [62]:
# `iat` and `at` can be used with a DataFrame too
print(df.iat[0,0])
print('')
print(df.at[0,'Species'])

5.1

Iris-setosa
