# Python Pandas Time Series

You will learn how to manipulate and visualize time series data using pandas. You will become familiar with concepts such as upsampling, downsampling, and interpolation. You will practice using **method chaining** to filter your data efficiently and perform **time series analyses**. From stock prices to flight timings, time series data appears in a wide variety of domains, and being able to work with it effectively is an invaluable skill.

In [2]:
import pandas as pd

## Reading and slicing times

Use the `.head()` and `.info()` methods in the IPython Shell to inspect the three DataFrames created below. Then try to index each one with a datetime string. Which DataFrame lets you index and slice data by date using, for example, `df.loc['2010-Aug-01']`? (Answer: `df3`, the only one whose index is a `DatetimeIndex`.)

In [3]:
filename = 'data/timeseries1.csv'

df1 = pd.read_csv(filename)

df2 = pd.read_csv(filename, parse_dates=['Date'])

df3 = pd.read_csv(filename, index_col='Date', parse_dates=True)

In [4]:
df1.head()

,Temperature,DewPoint,Pressure,Date
0,46.2,37.5,1.0,20100101 00:00
1,44.6,37.1,1.0,20100101 01:00
2,44.1,36.9,1.0,20100101 02:00
3,43.8,36.9,1.0,20100101 03:00
4,43.5,36.8,1.0,20100101 04:00


In [5]:
df1.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8759 entries, 0 to 8758
Data columns (total 4 columns):
Temperature    8759 non-null float64
DewPoint       8759 non-null float64
Pressure       8759 non-null float64
Date           8759 non-null object
dtypes: float64(3), object(1)
memory usage: 273.8+ KB


In [10]:
df1.loc['2010-Aug-01']

KeyError: 'the label [2010-Aug-01] is not in the [index]'
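
The lookup fails because `df1` keeps the default `RangeIndex` and its `Date` column holds plain strings (`object` dtype). As a minimal sketch, assuming a small hypothetical frame named `raw` that mimics the `Date` strings shown by `df1.head()`, the column can be converted after loading with `pd.to_datetime`:

```python
import pandas as pd

# Hypothetical frame mimicking df1: Date holds plain strings (object dtype).
raw = pd.DataFrame({
    'Temperature': [46.2, 44.6],
    'Date': ['20100101 00:00', '20100101 01:00'],
})

# Convert with an explicit format that matches the strings shown above.
raw['Date'] = pd.to_datetime(raw['Date'], format='%Y%m%d %H:%M')

# The column is now datetime64, but the index is still a RangeIndex.
print(raw.dtypes)
```

Converting the column alone is not enough for date-string lookups with `.loc`; the datetimes also have to become the index.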

In [6]:
df2.head()

,Temperature,DewPoint,Pressure,Date
0,46.2,37.5,1.0,2010-01-01 00:00:00
1,44.6,37.1,1.0,2010-01-01 01:00:00
2,44.1,36.9,1.0,2010-01-01 02:00:00
3,43.8,36.9,1.0,2010-01-01 03:00:00
4,43.5,36.8,1.0,2010-01-01 04:00:00


In [7]:
df2.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8759 entries, 0 to 8758
Data columns (total 4 columns):
Temperature    8759 non-null float64
DewPoint       8759 non-null float64
Pressure       8759 non-null float64
Date           8759 non-null datetime64[ns]
dtypes: datetime64[ns](1), float64(3)
memory usage: 273.8 KB


In [11]:
df2.loc['2010-Aug-01']

KeyError: 'the label [2010-Aug-01] is not in the [index]'
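
`df2` has a proper `datetime64[ns]` column, but `.loc` looks up labels in the index, which is still a `RangeIndex`. The sketch below uses a hypothetical four-row sample (ISO-formatted timestamps and the names `sample` and `indexed` are assumptions for illustration) to show that promoting the column with `set_index` makes date-string lookup work:

```python
import io
import pandas as pd

# Hypothetical sample data; timestamps are ISO-formatted for simplicity.
csv_text = """Temperature,DewPoint,Pressure,Date
46.2,37.5,1.0,2010-01-01 00:00
44.6,37.1,1.0,2010-01-01 01:00
79.0,70.8,1.0,2010-08-01 00:00
77.4,71.2,1.0,2010-08-01 01:00
"""

# Parse the Date column but keep the default RangeIndex, as df2 does.
sample = pd.read_csv(io.StringIO(csv_text), parse_dates=['Date'])

# Promote the parsed column to the index; the result has a DatetimeIndex.
indexed = sample.set_index('Date')

# A date string now selects every row that falls on that day.
aug_first = indexed.loc['2010-08-01']
print(aug_first)
```

This reaches the same state as passing `index_col='Date', parse_dates=True` to `read_csv`, just in two steps.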

In [8]:
df3.head()

Date,Temperature,DewPoint,Pressure
2010-01-01 00:00:00,46.2,37.5,1.0
2010-01-01 01:00:00,44.6,37.1,1.0
2010-01-01 02:00:00,44.1,36.9,1.0
2010-01-01 03:00:00,43.8,36.9,1.0
2010-01-01 04:00:00,43.5,36.8,1.0


In [9]:
df3.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 8759 entries, 2010-01-01 00:00:00 to 2010-12-31 23:00:00
Data columns (total 3 columns):
Temperature    8759 non-null float64
DewPoint       8759 non-null float64
Pressure       8759 non-null float64
dtypes: float64(3)
memory usage: 273.7 KB


In [12]:
df3.loc['2010-Aug-01']

Date,Temperature,DewPoint,Pressure
2010-08-01 00:00:00,79.0,70.8,1.0
2010-08-01 01:00:00,77.4,71.2,1.0
2010-08-01 02:00:00,76.4,71.3,1.0
2010-08-01 03:00:00,75.7,71.4,1.0
2010-08-01 04:00:00,75.1,71.4,1.0
2010-08-01 05:00:00,74.6,71.3,1.0
2010-08-01 06:00:00,74.5,71.3,1.0
2010-08-01 07:00:00,76.0,72.3,1.0
2010-08-01 08:00:00,79.8,72.8,1.0
2010-08-01 09:00:00,83.3,72.1,1.0
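
With a `DatetimeIndex` in place, the same `.loc` syntax also accepts coarser strings and ranges. The following sketch illustrates this on a hypothetical hourly frame named `demo` (its values are just a counter, not the temperature data above):

```python
import pandas as pd

# Hypothetical hourly index: 72 hours starting 2010-07-31 00:00.
idx = pd.date_range('2010-07-31', periods=72, freq=pd.Timedelta(hours=1))
demo = pd.DataFrame({'Temperature': range(72)}, index=idx)

# A day-level string selects all rows on that day.
one_day = demo.loc['2010-08-01']

# A month-level string selects all rows in that month.
august = demo.loc['2010-08']

# A slice between two timestamps includes both endpoints.
span = demo.loc['2010-07-31 22:00':'2010-08-01 01:00']

print(len(one_day), len(august), len(span))
```

Unlike positional slicing, label-based slices on a `DatetimeIndex` include the end label.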
