# Week 14
# Time Series Data

Time series data is a data set whose observations are indexed by time. It is an important form of structured data in many fields, such as finance, economics, ecology, neuroscience, and physics.

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

## 1. Date and Time Data Types and Tools

In Python, the `datetime.datetime` class is widely used to represent date and time data.

In [None]:
from datetime import datetime

datetime.now()

In [None]:
datetime.now().year

In [None]:
datetime.now().day

In [None]:
datetime.now().month

We can use `datetime.timedelta` to represent the temporal difference between two `datetime` objects.

In [None]:
from datetime import timedelta

delta = timedelta(days=10)  # the first positional argument of timedelta is days

datetime.now() + delta

In [None]:
date1 = datetime(2019, 12, 12)
date2 = datetime.now()
# date1 is in the past, so the difference is a negative timedelta
date1 - date2
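
To make the result of a subtraction easier to inspect, here is a minimal sketch with two fixed dates (chosen only for illustration, so the output does not depend on when the cell is run). It shows the `days`, `seconds`, and `total_seconds()` views of a `timedelta`.

```python
from datetime import datetime, timedelta

# Fixed dates (illustrative values) so the result is reproducible.
start = datetime(2019, 12, 1, 8, 0)
end = datetime(2019, 12, 12, 20, 30)

elapsed = end - start            # subtracting two datetimes yields a timedelta
print(elapsed.days)              # whole days in the difference: 11
print(elapsed.seconds)           # seconds left over after the whole days: 45000
print(elapsed.total_seconds())   # the full difference in seconds: 995400.0

# A timedelta can combine several units and be added to a datetime.
step = timedelta(days=1, hours=12)
print(start + step)              # 2019-12-02 20:00:00
```

Note that `seconds` only holds the part of the difference that is shorter than one day, so `total_seconds()` is usually the safer choice for arithmetic.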

**Convert between string and datetime**

In [None]:
# datetime to string
date = datetime(2011, 1, 3, 23, 30, 45)
str(date)

In [None]:
# Convert to a custom format: "YYYY/MM/DD HH:MM, weekday name"
date.strftime("%Y/%m/%d %H:%M, %A")

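Datetime format codes:
- %Y: Four-digit year
- %y: Two-digit year
- %m: Two-digit month (01 - 12)
- %d: Two-digit day (01 - 31)
- %H: Hour on a 24-hour clock (00 - 23)
- %I: Hour on a 12-hour clock (01 - 12)
- %M: Two-digit minute (00 - 59)
- %S: Two-digit second
- %w: Weekday as a number (0 = Sunday, 6 = Saturday)
- %A: Full weekday name (for example, Monday)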

[More on this](https://docs.python.org/3/library/datetime.html)
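
The format codes listed above work in both directions. As a small sketch, the example below formats a `datetime` with `strftime` and then reads the string back with `datetime.strptime` from the standard library, using the same pattern. The variable names and the pattern itself are only illustrative.

```python
from datetime import datetime

stamp = datetime(2011, 1, 3, 23, 30, 45)

# Format with a pattern built from the codes listed above.
pattern = "%d.%m.%Y %H:%M:%S"
text = stamp.strftime(pattern)
print(text)                      # 03.01.2011 23:30:45

# Parse the string back into a datetime using the same pattern.
restored = datetime.strptime(text, pattern)
print(restored == stamp)         # True

# %w gives the weekday as a number; January 3, 2011 was a Monday.
print(stamp.strftime("%w"))      # 1
```

Unlike a general-purpose parser, `strptime` raises a `ValueError` when the string does not match the pattern, which can be useful when the input format is known in advance.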

In [None]:
# Exercise: convert date to "01/03/2011"



In [None]:
# Exercise: convert date to "01-03-2011 23:30"



**Parse a datetime string**

In [None]:
# String to datetime
from dateutil.parser import parse
parse("2011-01-03")

In [None]:
parse("Jan 31, 1997 10:45 PM")

In [None]:
# Many countries use the format "DD/MM/YYYY". We need to set dayfirst=True
# so that the day and month are recognized correctly.
parse("06/12/2011", dayfirst=True)

In [None]:
parse("06/12/2011")
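
The two calls above read the same string in two different ways. The sketch below puts the results side by side so the difference is easy to see. The variable names are illustrative.

```python
from dateutil.parser import parse

ambiguous = "06/12/2011"

month_first = parse(ambiguous)                 # default: the first field is the month
day_first = parse(ambiguous, dayfirst=True)    # the first field is the day

print(month_first.date())   # 2011-06-12
print(day_first.date())     # 2011-12-06

# When the first field cannot be a month, the parser treats it as the day.
print(parse("25/12/2011").date())   # 2011-12-25
```

Because the parser guesses silently, it is worth setting `dayfirst` explicitly whenever the source of the data is known.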

## 2. Time Series Basics

In [None]:
# Create a list of datetime objects
dates = [datetime(2011, 1, 2), datetime(2011, 1, 5),
         datetime(2011, 2, 7), datetime(2011, 2, 8),
         datetime(2011, 3, 10), datetime(2011, 3, 12)]
ts = pd.Series(np.random.randn(6), index=dates)
ts

In [None]:
# Select 01/05
ts['2011-01-05']

In [None]:
# Select by position; ts[1] with an integer is deprecated for non-integer indexes
ts.iloc[1]

In [None]:
ts['01/05/2011']

In [None]:
ts['20110105']

In [None]:
# Select a range of dates
ts['2011-02']

In [None]:
ts['2011-02-01':'2011-02-08'] # the end date is also included

In [None]:
ts['2011-02-01':]

In [None]:
ts[:"2011-03-10"]
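
Since `ts` holds random values, the selections above are hard to check by eye. Here is a minimal sketch with the same dates but the values 0 to 5, so each result can be verified. The name `demo` is illustrative, and `.loc` is used to make the label-based selection explicit.

```python
import pandas as pd

idx = pd.to_datetime(["2011-01-02", "2011-01-05", "2011-02-07",
                      "2011-02-08", "2011-03-10", "2011-03-12"])
demo = pd.Series(range(6), index=idx)

print(demo.loc["2011-02"])                    # partial string: all of February (values 2, 3)
print(demo.loc["2011-02-08":"2011-03-10"])    # both endpoints are included (values 3, 4)
print(demo.loc["2011-01-03":"2011-01-31"])    # endpoints need not exist in a sorted index (value 1)
print(demo.iloc[1])                           # position-based access (value 1)
```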

## 3. Date Ranges

In [None]:
# manually populate a list of dates
dates = [datetime(2011, 1, 2), datetime(2011, 3, 10), datetime(2011, 4, 1)]
# ts[dates] raises a KeyError because 2011-04-01 is not in the index,
# so we keep only the dates that are present
ts[ts.index.isin(dates)]

In [None]:
# Create a range of dates
daterange = pd.date_range('2011-01-01', periods=8)
print(daterange)

In [None]:
daterange = pd.date_range('2011-01-01', periods=5, freq='2D')
print(daterange)

In [None]:
# Lowercase "h" is the hourly alias; uppercase "H" is deprecated in recent pandas
daterange = pd.date_range("2011-01-01", periods=5, freq="10h")
print(daterange)

In [None]:
# Sample business days only
daterange = pd.date_range("2011-01-01", periods=10, freq="B")
print(daterange)

In [None]:
# Most of these dates are not in ts.index, so this raises a KeyError
ts[daterange]

In [None]:
ts[ts.index.isin(daterange)]
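
Filtering with `isin` keeps only the dates that exist in both places. If we instead want one row for every date in the range, `reindex` is one option: it fills the missing dates with `NaN`. The sketch below compares the two on a small series with made-up values. The names `demo`, `bdays`, `present`, and `aligned` are illustrative.

```python
import pandas as pd

# Four observations; 2011-01-08 is a Saturday.
idx = pd.to_datetime(["2011-01-03", "2011-01-05", "2011-01-08", "2011-01-10"])
demo = pd.Series([1.0, 2.0, 3.0, 4.0], index=idx)

# Six business days: Jan 3 to Jan 7, then Jan 10.
bdays = pd.date_range("2011-01-03", periods=6, freq="B")

present = demo[demo.index.isin(bdays)]   # drops the Saturday observation
aligned = demo.reindex(bdays)            # one row per business day, NaN where missing

print(present)
print(aligned)
```

Here `present` has three rows, while `aligned` has six rows, three of which are `NaN`.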

## 4. Shifting Data


In [None]:
prices = pd.DataFrame(np.random.rand(4) + 100,
                      index=pd.date_range('2019-11-01', periods=4),
                      columns=['Price'])
prices

In [None]:
prices - 100

In [None]:
# How to create a column storing yesterday's price?
for date in prices.index:
    yesterday = date - timedelta(days=1)
    if yesterday in prices.index:
        prices.loc[date, "Yesterday's Price"] = prices.loc[yesterday, "Price"]
prices

In [None]:
prices = pd.DataFrame(np.random.rand(4) + 100,
                      index=pd.date_range('2019-11-01', periods=4),
                      columns=['Price'])
prices_yesterday = prices.shift(1)
prices_yesterday

In [None]:
prices = pd.merge(prices, prices_yesterday, left_index=True, right_index=True,
                  suffixes=["Today", "Yesterday"])
prices
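
As a possible shortcut, the shifted values can also be assigned directly as a new column, which avoids the merge. The sketch below uses fixed prices (made up for illustration) so the effect of `shift` is easy to follow. The column names `Yesterday` and `Tomorrow` are illustrative.

```python
import pandas as pd

demo = pd.DataFrame({"Price": [100.0, 101.0, 99.5, 102.0]},
                    index=pd.date_range("2019-11-01", periods=4))

demo["Yesterday"] = demo["Price"].shift(1)    # values move down one row; first row becomes NaN
demo["Tomorrow"] = demo["Price"].shift(-1)    # values move up one row; last row becomes NaN
print(demo)
```

The index stays the same in both cases. Only the values move, so the first or last row has no value to receive.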

In [None]:
# Exercise: Compute the percent change between yesterday's and today's prices
# Formula: percent change = (today's price - yesterday's price) / yesterday's price

