# Pandas and time series

[`pandas`](http://pandas.pydata.org/) is a Python library for doing statistics and working with time series. Like `numpy`, `pandas` is not part of the standard library, but it comes bundled with [Anaconda](01_anaconda.ipynb). `pandas` is conventionally imported as

    import pandas as pd
    
The main data structure in `pandas` is the __DataFrame__, which is a collection of __Series__ that share a common index. A __Series__ is similar to a one-dimensional `numpy` __Array__, but it also carries an index that labels each value, a name, and some added functionality. A __DataFrame__ resembles the way data are stored in SQL databases or spreadsheets. If you have seen data frames in `R`, they are quite similar.

In [None]:
import pandas as pd
pd.__version__
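To make the relationship between a Series, a `numpy` array and a DataFrame concrete, here is a minimal sketch. The values and the names `prices`, `years` and `table` are made up for illustration and do not come from the course data.

```python
import numpy as np
import pandas as pd

# Hypothetical values, chosen only for illustration.
years = pd.Series([2001, 2002, 2003], name="Year")
prices = pd.Series([10.5, 12.0, 9.75], name="Price")

# A Series wraps a one-dimensional numpy array and adds an index and a name.
price_values = prices.to_numpy()
print(isinstance(price_values, np.ndarray))

# A DataFrame can be built from several Series that share an index.
table = pd.DataFrame({"Year": years, "Price": prices})
print(table)

# Each column of the DataFrame is again a Series.
column_type = type(table["Price"])
print(column_type)
```

Building the frame from a dictionary keeps the column names explicit, which is usually easier to read than relying on the `name` attribute of each Series.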

## Reading data with `pandas`

The `pandas` library comes with several functions for reading data in different formats. Try typing

    pd.read
    
and then hitting `<tab>` to see a list of `read` functions in `pandas`. Here we will use the `pd.read_csv` function for our examples. As with the `numpy` functions, `pandas` does all the file handling, so we only need to pass it a filename. The following CSV file is easily handled by the `pandas` CSV reader, although it contains missing data, funky quotes and a newline in the middle of the description field.

In [None]:
!cat data/pandas_simple.csv

In [None]:
df = pd.read_csv('data/pandas_simple.csv')
df
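The same behaviour can be tried without a file on disk. The sketch below feeds `pd.read_csv` a small in-memory CSV through `io.StringIO`; the text is a hypothetical stand-in with the same kinds of quirks, not the contents of `data/pandas_simple.csv`.

```python
import io
import pandas as pd

# Hypothetical CSV text with a missing price, doubled quotes
# and a newline inside a quoted field.
csv_text = '''Year,Price,Description
2001,10.5,"A plain description"
2002,,"A description with a
newline in the middle"
2003,12.0,"She said ""hello"""
'''

sample = pd.read_csv(io.StringIO(csv_text))

# The empty Price field is read as NaN.
missing_prices = sample["Price"].isna().sum()
print(sample.shape, missing_prices)

# Quoted fields keep their embedded newline and unescaped quotes.
print(repr(sample["Description"][1]))
print(sample["Description"][2])
```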

Individual columns of the data frame (i.e. Series) can be accessed by name, using either dot- or square bracket-notation.

In [None]:
df.Year

In [None]:
df['Price']
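The two notations return the same Series, but dot notation only works when the column name is a valid Python identifier that does not clash with an existing attribute. A small sketch, using a made-up frame with a hypothetical `Unit Price` column:

```python
import pandas as pd

# Hypothetical frame; "Unit Price" contains a space on purpose.
frame = pd.DataFrame({"Year": [2001, 2002], "Unit Price": [10.5, 12.0]})

# Dot and square-bracket notation give the same Series.
same_column = frame.Year.equals(frame["Year"])
print(same_column)

# A name with a space can only be reached with square brackets.
unit_price = frame["Unit Price"]
print(unit_price)
```

Square brackets are therefore the safer choice in code that has to work with arbitrary column names.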

A Series supports basic operations such as `min` and `median` directly.

In [None]:
df.Year.min()

In [None]:
df.Price.median()
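These reductions skip missing values by default, which matters for a column like `Price` that has gaps. A minimal sketch with made-up numbers (the Series `price` here is hypothetical, not the column from the file):

```python
import pandas as pd

# Hypothetical prices with one missing value.
price = pd.Series([10.5, None, 12.0, 9.0], name="Price")

lowest = price.min()      # smallest non-missing value
middle = price.median()   # median of the three non-missing values
counted = price.count()   # number of non-missing values

print(lowest, middle, counted)
```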

## Time Series

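`pandas` has good support for working with time series. Passing `index_col=0` and `parse_dates=True` to `pd.read_csv` turns the first column into a date index, and `asfreq` converts the data to a new frequency, here weekly, with `method='pad'` carrying the last known value forward.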

In [None]:
co2 = pd.read_csv('data/co2-ppm-mauna-loa-19651980.csv', index_col=0, parse_dates=True)
co2.head()

In [None]:
co2['CO2 (ppm) mauna loa, 1965-1980'].mean()

In [None]:
# '1W' is a weekly frequency; 'pad' fills each week with the last known value
weekly_co2 = co2.asfreq('1W', method='pad')
weekly_co2.head()
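To see what `asfreq` with `method='pad'` does, it helps to run it on a tiny series. The sketch below uses three made-up monthly values (hypothetical, not the Mauna Loa measurements) read from an in-memory CSV with the same `index_col` and `parse_dates` arguments.

```python
import io
import pandas as pd

# Hypothetical monthly readings; not the real Mauna Loa data.
csv_text = """Month,CO2
1965-01-01,319.0
1965-02-01,320.0
1965-03-01,321.0
"""

monthly = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)

# The first column has become a DatetimeIndex.
print(type(monthly.index))

# Convert to weekly frequency (weeks ending on Sunday). Each weekly
# date takes the most recent monthly value before it.
weekly = monthly.asfreq("W", method="pad")
print(weekly)
```

Note that `asfreq` only selects and fills values at the new dates; it does not average anything, so the weekly series repeats each monthly value until the next one is available.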

See the [`pandas` documentation](http://pandas.pydata.org/pandas-docs/stable/timeseries.html) for more information on time series.