## Data Selection in Series

A Series object acts in many ways like a one-dimensional NumPy array, and in many ways like a standard Python dictionary. Like a dictionary, it provides a mapping from a collection of keys to a collection of values.

In [3]:
import pandas as pd
data = pd.Series([0.25, 0.5, 0.75, 1.0],
                 index=['a', 'b', 'c', 'd'])
data

a    0.25
b    0.50
c    0.75
d    1.00
dtype: float64

In [6]:
data['b']

0.5
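
Because a Series also behaves like a one-dimensional NumPy array, array-style selection applies to it as well. The following is a minimal sketch that rebuilds the same `data` Series; the names `by_slice`, `by_mask`, and `by_list` are illustrative and do not appear in the original notebook.

```python
import pandas as pd

data = pd.Series([0.25, 0.5, 0.75, 1.0],
                 index=['a', 'b', 'c', 'd'])

# Slicing by explicit label: unlike ordinary Python slices,
# the end label ('c') is included in the result.
by_slice = data['a':'c']

# Boolean masking: keep only the values between 0.3 and 0.8.
by_mask = data[(data > 0.3) & (data < 0.8)]

# Fancy indexing: select several labels at once with a list.
by_list = data[['a', 'd']]
```

Each expression returns a new Series that keeps the labels of the selected entries.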

Dictionary-like Python expressions and methods can be used to examine the keys (the index) and the values.

In [5]:
'a' in data

True

In [7]:
data.keys()

Index(['a', 'b', 'c', 'd'], dtype='object')

In [8]:
list(data.items())

[('a', 0.25), ('b', 0.5), ('c', 0.75), ('d', 1.0)]

Series objects can also be modified with dictionary-like syntax. Assigning to a new index value extends the Series.

In [10]:
data['e'] = 1.25
data

a    0.25
b    0.50
c    0.75
d    1.00
e    1.25
dtype: float64
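
As with a dictionary, assignment behaves differently depending on whether the key already exists. A small sketch, assuming the same starting Series as above (the value `0.3` is an arbitrary illustration):

```python
import pandas as pd

data = pd.Series([0.25, 0.5, 0.75, 1.0],
                 index=['a', 'b', 'c', 'd'])

# Assigning to an existing label overwrites the stored value.
data['a'] = 0.3

# Assigning to a label that is not yet in the index appends a new entry.
data['e'] = 1.25
```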

Pandas provides special indexer attributes that explicitly expose particular indexing schemes:
 - loc: allows indexing and slicing that always references the explicit index (the labels)
 - iloc: allows indexing and slicing that always references the implicit Python-style index (integer positions starting at 0)
 - ix: a hybrid of the two that, for Series objects, was equivalent to standard []-based indexing; it was deprecated in pandas 0.20 and removed in pandas 1.0, so use loc or iloc instead

In [11]:
data = pd.Series(['a', 'b', 'c'], index=[1, 3, 5])
data

1    a
3    b
5    c
dtype: object

In [13]:
data.loc[1]

'a'

In [12]:
data.loc[1:3]

1    a
3    b
dtype: object

In [15]:
data.iloc[1]

'b'

In [16]:
data.iloc[1:3]

3    b
5    c
dtype: object
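
The two slices above differ in more than which index they consult: a `loc` slice includes its end label, while an `iloc` slice excludes its end position, as ordinary Python slices do. The sketch below puts the two side by side on the same Series; the names `by_label`, `by_position`, and `missing_label` are illustrative only.

```python
import pandas as pd

data = pd.Series(['a', 'b', 'c'], index=[1, 3, 5])

# loc works on labels: everything from label 1 through label 3, inclusive.
by_label = data.loc[1:3]

# iloc works on positions: positions 1 and 2 (position 3 is excluded).
by_position = data.iloc[1:3]

# loc never falls back to positions: 0 is a valid position but not a label,
# so this lookup raises KeyError.
missing_label = False
try:
    data.loc[0]
except KeyError:
    missing_label = True
```

Being explicit about `loc` or `iloc` is especially helpful with integer indexes like this one, where a bare number could plausibly mean either a label or a position.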

## Data Selection in DataFrame

A DataFrame acts in many ways like a two-dimensional array, and in many ways like a dictionary of Series objects that share the same index.

In [17]:
area = pd.Series({'California': 423967, 'Texas': 695662,
                  'New York': 141297, 'Florida': 170312,
                  'Illinois': 149995})
pop = pd.Series({'California': 38332521, 'Texas': 26448193,
                 'New York': 19651127, 'Florida': 19552860,
                 'Illinois': 12882135})
data = pd.DataFrame({'area':area, 'pop':pop})
data

              area       pop
California  423967  38332521
Texas       695662  26448193
New York    141297  19651127
Florida     170312  19552860
Illinois    149995  12882135
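
The dictionary analogy means that each column can be retrieved by its name. The following sketch rebuilds the same DataFrame and shows dictionary-style and attribute-style access; attribute-style access is not used in the original notebook and is included here only to point out a pitfall.

```python
import pandas as pd

area = pd.Series({'California': 423967, 'Texas': 695662,
                  'New York': 141297, 'Florida': 170312,
                  'Illinois': 149995})
pop = pd.Series({'California': 38332521, 'Texas': 26448193,
                 'New York': 19651127, 'Florida': 19552860,
                 'Illinois': 12882135})
data = pd.DataFrame({'area': area, 'pop': pop})

# Dictionary-style access returns the column as a Series.
area_column = data['area']

# Attribute-style access works when the column name is a valid identifier
# that does not collide with an existing DataFrame attribute.
same_column = data.area

# 'pop' does collide: data.pop is the DataFrame.pop() method, not the column.
pop_is_method = callable(data.pop)

# The column names play the role of the dictionary keys.
column_names = list(data.columns)
```

Dictionary-style access is the safer default, since it works for any column name.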


In [18]:
data['density'] = data['pop'] / data['area']
data

              area       pop     density
California  423967  38332521   90.413926
Texas       695662  26448193   38.018740
New York    141297  19651127  139.076746
Florida     170312  19552860  114.806121
Illinois    149995  12882135   85.883763
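
The two-dimensional array analogy means that the `loc` and `iloc` indexers introduced for Series accept a row selector and a column selector. A minimal sketch, assuming the same DataFrame as above; the threshold of 100 and the result names are illustrative.

```python
import pandas as pd

area = pd.Series({'California': 423967, 'Texas': 695662,
                  'New York': 141297, 'Florida': 170312,
                  'Illinois': 149995})
pop = pd.Series({'California': 38332521, 'Texas': 26448193,
                 'New York': 19651127, 'Florida': 19552860,
                 'Illinois': 12882135})
data = pd.DataFrame({'area': area, 'pop': pop})
data['density'] = data['pop'] / data['area']

# Positional selection: first three rows, first two columns.
first_rows = data.iloc[:3, :2]

# Label-based selection of the same block; both end labels are included.
by_label = data.loc[:'New York', :'pop']

# Combine a boolean mask on the rows with a list of column labels.
dense = data.loc[data['density'] > 100, ['pop', 'density']]

# Array-style operations such as transposition are also available.
transposed = data.T
```

Here `dense` keeps only the rows whose density exceeds 100, which for this data are New York and Florida.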
