# Part 1: Data Structures in Pandas

In [None]:
"""
----------------------------------------------------------------------
Filename : 01_basic_data_structs.py
Date     : 12th Dec, 2013
Author   : Jaidev Deshpande
Purpose  : To get started with basic data structures in Pandas
Libraries: Pandas 0.12 and its dependencies
----------------------------------------------------------------------
"""

pandas is an open source, BSD-licensed library providing high-performance, 
easy-to-use data structures and data analysis tools for the Python programming language.
http://pandas.pydata.org

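pandas provides several core objects (listed here for version 0.12, which this notebook targets; `Panel` and `TimeSeries` are no longer available in current releases):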

* Series
* DataFrame
* Panel
* TimeSeries

### Series and DataFrame

In [None]:
# imports
import pandas as pd
from math import pi

In [None]:
s = pd.Series(range(10))
print(s)

In [None]:
print(s[5])

A pandas `Series`, like a list, doesn't have to be homogeneous.

In [None]:
s = pd.Series(['foo', None, 3+4j])
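
To see what mixed contents mean in practice, the following minimal sketch compares the `dtype` of a homogeneous and a heterogeneous `Series`. The variable names `ints` and `mixed` are illustrative and not part of the original notebook.

```python
import pandas as pd

# A Series of integers is stored with an integer dtype.
ints = pd.Series(range(10))
print(ints.dtype)

# Mixed contents fall back to the generic `object` dtype,
# where each element is kept as an ordinary Python object.
mixed = pd.Series(['foo', None, 3+4j])
print(mixed.dtype)
print([type(x).__name__ for x in mixed])
```

The `object` dtype is flexible, but vectorized numeric operations generally work best on homogeneous data.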

The index of a Series can be arbitrary as well.

In [None]:
inds = ['bar', 1, (1, 2)]
s.index = inds
print(s['bar'], s[1], s[(1, 2)])
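
Because labels are arbitrary, a label and a position can disagree. The sketch below uses a hypothetical `Series` whose integer labels are stored in reverse order, to show how `.loc` (label-based) and `.iloc` (position-based) differ.

```python
import pandas as pd

# Hypothetical example: integer labels deliberately stored in reverse order.
s = pd.Series([10, 20, 30], index=[2, 1, 0])

by_label = s.loc[0]      # label 0 is attached to the last value
by_position = s.iloc[0]  # position 0 is the first value
print(by_label, by_position)
```

Using `.loc` and `.iloc` explicitly avoids ambiguity whenever the index itself contains integers.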

Multiple `Series` objects can be combined into a pandas `DataFrame`, which is similar to the `data.frame` object in R.

In [None]:
s1 = pd.Series(range(10))
s2 = pd.Series(range(10, 20))
df = pd.DataFrame({'A': s1, 'B': s2})
df.head()
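
When the `Series` being combined do not share the same index, pandas aligns them on their labels and fills the gaps with `NaN`. The following sketch illustrates this with two small, made-up `Series` (`left` and `right` are hypothetical names).

```python
import pandas as pd

left = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
right = pd.Series([10, 20, 30], index=['b', 'c', 'd'])

# The resulting index is the union of both indexes;
# labels missing from one Series become NaN in its column.
aligned = pd.DataFrame({'A': left, 'B': right})
print(aligned)
```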

Think of a pandas `DataFrame` as a `dict` of `Series`. Many operations that are valid on a Python `dict`, such as item assignment, item lookup and `del`, also work on a `DataFrame`.

In [None]:
# list() materializes the iterator that map() returns on Python 3
df['C'] = list(map(str, range(20, 30)))
df.head()

In [None]:
df['C']

In [None]:
del df['A']
print(df.head(10))

In [None]:
df['D'] = 0  # DataFrame.update only overwrites existing columns, so create 'D' first
df.update({'D': range(50, 60)})
df.head()
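
A few more `dict`-style operations are sketched below on a small stand-alone frame, together with one place where the analogy breaks down: unlike `dict.update`, `DataFrame.update` does not add new keys. The column name `'Z'` and the default value `'n/a'` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'B': list(range(10, 20)), 'C': list(range(20, 30))})

has_b = 'B' in df                # membership tests check column names
keys = list(df.keys())           # keys() returns the column labels
fallback = df.get('Z', 'n/a')    # get() returns the default for a missing column
print(has_b, keys, fallback)

c = df.pop('C')                  # pop() removes a column and returns it as a Series
print(list(df.columns), len(c))

# Unlike dict.update, DataFrame.update only overwrites existing columns,
# so the unknown column 'Z' is silently ignored.
df.update(pd.DataFrame({'Z': list(range(10))}))
print('Z' in df)
```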

### Index Objects

Index objects available in Pandas:

* `Index`        : The most general Pandas index, often created by default
* `Int64Index`   : Specialized index for integer values
* `MultiIndex`   : Hierarchical index
* `DatetimeIndex`: Nanosecond timestamps that can be used as indexes
* `PeriodIndex`  : Specialized indices for timespans

In [None]:
df.index
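
As a rough illustration of the index types listed above, the sketch below builds one of each kind that is still constructed the same way in current pandas releases. The labels and dates are made up for the example.

```python
import pandas as pd

# General-purpose index holding arbitrary labels.
generic = pd.Index(['a', 'b', 'c'])

# Hierarchical index built from (outer, inner) label pairs.
multi = pd.MultiIndex.from_tuples([('x', 1), ('x', 2), ('y', 1)])

# Timestamps at daily frequency.
dates = pd.date_range('2013-12-01', periods=3, freq='D')

# Monthly periods (time spans rather than points in time).
periods = pd.period_range('2013-01', periods=3, freq='M')

for idx in (generic, multi, dates, periods):
    print(type(idx).__name__)

# Any of these can label a Series or a DataFrame.
s = pd.Series([1.0, 2.0, 3.0], index=dates)
second = s.loc[pd.Timestamp('2013-12-02')]
print(second)
```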

### Exercise: Creating Series, DataFrames and indexing them

1. Create a random-valued NumPy array with shape (10, 10).
2. Convert it into a DataFrame.
3. Make sure the column names of this DataFrame are of type `str`.
4. Add one more column to the DataFrame. Note that `update` only overwrites columns that already exist, so the new column has to be created by assignment.
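
One possible approach to the exercise is sketched below; it is not the only solution. The names `arr`, `names`, `frame` and `'extra'` are illustrative, and the extra column is added by plain assignment because `DataFrame.update` only modifies columns that already exist.

```python
import numpy as np
import pandas as pd

# 1. A 10x10 array of uniform random values in [0, 1).
arr = np.random.rand(10, 10)

# 2-3. Wrap it in a DataFrame whose column names are strings.
names = [str(i) for i in range(arr.shape[1])]
frame = pd.DataFrame(arr, columns=names)

# 4. Add an eleventh column by assignment ('extra' is a hypothetical name).
frame['extra'] = np.random.rand(10)
print(frame.shape)
```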