# Time Series Data

Wikipedia defines a **time series** as follows:

> A time series is a series of data points indexed in time order. Most commonly, a time series is a sequence taken at successive equally spaced points in time. Thus it is a sequence of discrete-time data.
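
To make the definition concrete, here is a minimal sketch that builds a small, equally spaced daily series with `pd.date_range`. The values and the name `toy_series` are made up for illustration and are unrelated to the Bitcoin data used in the rest of this notebook.

```python
import pandas as pd

# Five consecutive days: equally spaced points in time
dates = pd.date_range(start='2014-04-15', periods=5, freq='D')

# Hypothetical values, indexed in time order
toy_series = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0], index=dates)

# Gap between consecutive observations (the first row has no predecessor)
spacing = toy_series.index.to_series().diff()

print(toy_series)
print(spacing.dropna().unique())
```

Every gap in `spacing` is one day, which is what "equally spaced" means here.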

In [31]:
import numpy as np
import pandas as pd

For this example we will use Bitcoin price data from [Quandl](https://www.quandl.com/data/BITFINEX/BTCUSD-BTC-USD-Exchange-Rate)…

In [38]:
df = pd.read_csv('BTCUSD.csv')
df.head()

Unnamed: 0,Date,High,Low,Mid,Last,Bid,Ask,Volume
0,2014-04-15,513.9,452.0,504.235,505.0,503.5,504.97,21013.584774
1,2014-04-16,547.0,495.0,537.5,538.0,537.0,538.0,29633.358705
2,2014-04-17,538.5,486.1,507.02,508.0,506.04,508.0,20709.783819
3,2014-04-18,509.0,474.25,483.77,482.75,482.75,484.79,10458.045243
4,2014-04-19,513.9899,473.83,505.01065,507.4999,502.5313,507.49,8963.618369


First, look at the data types of the columns…

In [39]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1343 entries, 0 to 1342
Data columns (total 8 columns):
Date      1343 non-null object
High      1343 non-null float64
Low       1343 non-null float64
Mid       1343 non-null float64
Last      1343 non-null float64
Bid       1343 non-null float64
Ask       1343 non-null float64
Volume    1343 non-null float64
dtypes: float64(7), object(1)
memory usage: 84.0+ KB


### Create your DateTime Object

Notice that the data type of the 'Date' column is object, which means pandas is treating the dates as plain strings. We have to convert the column to datetime in order to use the date-aware features of that data type…

In [40]:
df['Date'] = pd.to_datetime(df['Date'])

print(df.info())
df.head()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1343 entries, 0 to 1342
Data columns (total 8 columns):
Date      1343 non-null datetime64[ns]
High      1343 non-null float64
Low       1343 non-null float64
Mid       1343 non-null float64
Last      1343 non-null float64
Bid       1343 non-null float64
Ask       1343 non-null float64
Volume    1343 non-null float64
dtypes: datetime64[ns](1), float64(7)
memory usage: 84.0 KB
None


Unnamed: 0,Date,High,Low,Mid,Last,Bid,Ask,Volume
0,2014-04-15,513.9,452.0,504.235,505.0,503.5,504.97,21013.584774
1,2014-04-16,547.0,495.0,537.5,538.0,537.0,538.0,29633.358705
2,2014-04-17,538.5,486.1,507.02,508.0,506.04,508.0,20709.783819
3,2014-04-18,509.0,474.25,483.77,482.75,482.75,484.79,10458.045243
4,2014-04-19,513.9899,473.83,505.01065,507.4999,502.5313,507.49,8963.618369


Every value in the 'Date' column is now of data type **datetime64[ns]**. These values display in the format **yyyy-mm-dd** when no time of day is present; if the source data includes times, the time component (down to the second and finer) is stored and displayed as well.
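
As a small sketch of that behavior, the snippet below converts three hypothetical lists of strings (none of them taken from `BTCUSD.csv`). Date-only strings display as yyyy-mm-dd, while strings that carry a time keep it. The `format` argument is optional, and `errors='coerce'` turns unparseable entries into `NaT` instead of raising an error.

```python
import pandas as pd

# Hypothetical inputs, for illustration only
date_only = pd.Series(['2014-04-15', '2014-04-16'])
with_time = pd.Series(['2014-04-15 09:30:15', '2014-04-16 16:00:00'])
messy = pd.Series(['2014-04-15', 'not a date'])

d1 = pd.to_datetime(date_only, format='%Y-%m-%d')  # explicit format
d2 = pd.to_datetime(with_time)                     # time of day is preserved
d3 = pd.to_datetime(messy, errors='coerce')        # invalid entry becomes NaT

print(d1)
print(d2.dt.second.tolist())
print(d3.isna().tolist())
```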

### Make a DateTime Index

It is often useful (and common practice) to organize your time series data by making a datetime index.

In [41]:
df.set_index('Date', inplace = True)
df.head()

Date,High,Low,Mid,Last,Bid,Ask,Volume
2014-04-15,513.9,452.0,504.235,505.0,503.5,504.97,21013.584774
2014-04-16,547.0,495.0,537.5,538.0,537.0,538.0,29633.358705
2014-04-17,538.5,486.1,507.02,508.0,506.04,508.0,20709.783819
2014-04-18,509.0,474.25,483.77,482.75,482.75,484.79,10458.045243
2014-04-19,513.9899,473.83,505.01065,507.4999,502.5313,507.49,8963.618369
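
One reason a datetime index is useful is that rows can be selected with date strings. The sketch below uses a small hypothetical frame named `toy` (not the Bitcoin data) to show partial-string indexing and date slicing with `.loc`.

```python
import pandas as pd

# Hypothetical daily frame that spans the end of April and start of May
idx = pd.date_range('2014-04-28', periods=6, freq='D', name='Date')
toy = pd.DataFrame({'Last': [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]}, index=idx)

# Partial-string indexing: every row from May 2014
may_rows = toy.loc['2014-05']

# Slice between two dates; both endpoints are included
window = toy.loc['2014-04-29':'2014-05-01']

print(may_rows)
print(window)
```

Unlike positional slicing, a label-based date slice includes its end point, so `window` contains the row for 2014-05-01.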


## DateTime Index Properties

### Extract the Year, Month, and Day

In [42]:
# Years
df.index.year

Int64Index([2014, 2014, 2014, 2014, 2014, 2014, 2014, 2014, 2014, 2014,
            ...
            2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018],
           dtype='int64', name='Date', length=1343)

In [43]:
# Months
df.index.month

Int64Index([4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
            ...
            1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
           dtype='int64', name='Date', length=1343)

In [44]:
# Days
df.index.day

Int64Index([15, 16, 17, 18, 19, 20, 21, 22, 23, 24,
            ...
             9, 10, 11, 12, 13, 14, 15, 16, 17, 18],
           dtype='int64', name='Date', length=1343)
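
Year, month, and day are not the only components available. As a brief sketch on a hypothetical four-day index (not the Bitcoin data), a `DatetimeIndex` also exposes the day of the week, the day name, and the quarter.

```python
import pandas as pd

# Hypothetical index starting on the same date as the data above
idx = pd.date_range('2014-04-15', periods=4, freq='D', name='Date')

print(idx.dayofweek)    # integers, with Monday=0 and Sunday=6
print(idx.day_name())   # weekday names as strings
print(idx.quarter)      # calendar quarter, 1 through 4
```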

In [45]:
# Mean of each column by calendar month, pooled across all years
df.groupby(df.index.month).mean()

Date,High,Low,Mid,Last,Bid,Ask,Volume
1,2909.784807,2595.553474,2779.482045,2779.638705,2778.933364,2780.030727,41428.518619
2,572.033882,547.795647,563.479294,563.501882,563.368588,563.59,26636.774311
3,582.168506,551.378046,567.937011,567.957241,567.845287,568.028736,27884.727031
4,627.763212,605.818869,619.201542,619.239244,618.999273,619.403811,16184.122225
5,802.671815,748.315112,777.850524,777.987272,777.570413,778.130634,16744.835665
6,1053.681803,976.770152,1022.407966,1022.495427,1022.044746,1022.771186,27156.524772
7,1041.576371,979.127729,1015.089718,1015.090081,1014.862581,1015.316855,20917.726129
8,1392.25693,1306.895263,1373.430088,1373.685965,1373.100965,1373.759211,19000.295448
9,1404.939915,1304.459407,1360.449068,1360.51678,1360.183898,1360.714237,19430.891843
10,1698.852602,1606.640407,1663.861382,1663.847642,1663.654878,1664.067886,24067.901063
