# Chapter 11 - Time Series

## 11.1 - Date and Time Data Types and Tools

In [1]:
import datetime
from datetime import datetime as dt
import pandas as pd
import numpy as np

from dateutil.parser import parse

The Python standard library provides data types for date and time data and for date arithmetic. The most commonly used one is `datetime.datetime`.

A `datetime` object stores both the date and the time, down to the microsecond.

In [2]:
# Get current time
now = dt.now()
display(now)

# Get the attributes of the dt object using their names
print(now.year)
print(now.month)
print(now.day)
print()
print(now.hour)
print(now.minute)
print(now.second)

datetime.datetime(2019, 10, 9, 22, 47, 25, 66175)

2019
10
9

22
47
25


In [3]:
# Instantiate a dt variable just by using date
t1 = dt(2019, 1, 13)
display(t1)
print(t1.year, t1.month, t1.day)
print()
print(t1.hour, t1.minute, t1.second)

datetime.datetime(2019, 1, 13, 0, 0)

2019 1 13

0 0 0


In [4]:
# Instantiate a dt variable using date and time
t2 = dt(2019, 2, 23, 10, 0, 30)
display(t2)
print(t2.year, t2.month, t2.day)
print()
print(t2.hour, t2.minute, t2.second)

datetime.datetime(2019, 2, 23, 10, 0, 30)

2019 2 23

10 0 30
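
The microsecond resolution mentioned earlier can be seen by passing a seventh argument to the constructor. The following is a minimal sketch; the variable name `stamp` and the chosen values are illustrative and not part of the original notebook.

```python
from datetime import datetime as dt

# The seventh positional argument is the microsecond (0 to 999999)
stamp = dt(2019, 2, 23, 10, 0, 30, 250000)
print(stamp.microsecond)   # 250000

# Split the value into its calendar part and its clock part
print(stamp.date())        # 2019-02-23
print(stamp.time())        # 10:00:30.250000
```

`date()` and `time()` return `datetime.date` and `datetime.time` objects, which is handy when only one half of the timestamp is needed.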


The difference between two `datetime` objects is a `timedelta` object.

In [5]:
display(t1)
display(t2)

# timedelta will give the difference between 2 datetime objects
td = t2 - t1
display(td)
# timedelta stores days, seconds and microseconds;
# .seconds is only the remainder after the whole days
print(td.days, td.seconds)

datetime.datetime(2019, 1, 13, 0, 0)

datetime.datetime(2019, 2, 23, 10, 0, 30)

datetime.timedelta(41, 36030)

41 36030
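
Because `.seconds` holds only the remainder after the whole days, it is easy to misread it as the full duration. A short sketch, reusing the same two dates, shows how `total_seconds()` gives the full span as one number:

```python
from datetime import datetime as dt

t1 = dt(2019, 1, 13)
t2 = dt(2019, 2, 23, 10, 0, 30)
td = t2 - t1

# .seconds is only the part left over once whole days are removed
print(td.days, td.seconds)    # 41 36030

# total_seconds() folds days, seconds and microseconds into one float
print(td.total_seconds())     # 3578430.0
```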


In [6]:
# Use timedelta to add a time period to a datetime
display(t1)

# Add 20 days to a dt
display(t1 + datetime.timedelta(20))

datetime.datetime(2019, 1, 13, 0, 0)

datetime.datetime(2019, 2, 2, 0, 0)
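
A bare number passed to `timedelta` is interpreted as days. Keyword arguments make the unit explicit, and a `timedelta` can also be scaled and subtracted. The sketch below is illustrative; the names `later` and `earlier` are not from the original notebook.

```python
from datetime import datetime as dt, timedelta

t1 = dt(2019, 1, 13)

# Keyword arguments make the unit explicit
later = t1 + timedelta(days=20, hours=6)
print(later)       # 2019-02-02 06:00:00

# A timedelta can be multiplied, and subtracting it moves back in time
earlier = t1 - 2 * timedelta(weeks=1)
print(earlier)     # 2018-12-30 00:00:00
```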

Convert a `str` to a `datetime` using `strptime`. Conversely, convert a `datetime` to a `str` using `strftime`. Both take a format string, such as `'%Y-%m-%d'`, that describes the layout of the date and time.

In [7]:
# NOTE: strftime is "string format time", so dt -> str
print(t1.strftime('%Y-%m-%d %H:%M:%S'))
# NOTE: strptime is "string parse time", so str -> dt
print(dt.strptime('2019-01-28', '%Y-%m-%d'))

2019-01-13 00:00:00
2019-01-28 00:00:00
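
The format string has to match the input exactly. As a hedged illustration, the sketch below does a round trip with a day-first format (the format and the sample string are chosen for this example only) and shows that a mismatched string raises `ValueError`:

```python
from datetime import datetime as dt

# Round trip: str -> datetime -> str, using the same format both ways
fmt = '%d/%m/%Y %H:%M'
parsed = dt.strptime('28/01/2019 14:05', fmt)
print(parsed)                  # 2019-01-28 14:05:00
print(parsed.strftime(fmt))    # 28/01/2019 14:05

# A string that does not match the format raises ValueError
try:
    dt.strptime('2019-01-28', fmt)
except ValueError as err:
    print('could not parse:', err)
```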


Alternatively, convert a `str` to a `datetime` object using the `parse` function from `dateutil.parser`, which recognizes many common formats without a format string. `dateutil` is a dependency of `pandas`, so it is installed along with `pandas`.

In [8]:
# use dateutil.parser.parse from dateutil to obtain datetime 
# from multiple string formats
display(parse('2019-01-28'))
display(parse('4/16/2010'))
display(parse('2019.05.13 8:30'))

datetime.datetime(2019, 1, 28, 0, 0)

datetime.datetime(2010, 4, 16, 0, 0)

datetime.datetime(2019, 5, 13, 8, 30)
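
Since `parse` guesses the format, a string such as `6/12/2011` is ambiguous. A small sketch of how the `dayfirst` argument changes the interpretation (the sample string and variable names are illustrative):

```python
from dateutil.parser import parse

# By default an ambiguous date is read month first
month_first = parse('6/12/2011')
print(month_first)    # 2011-06-12 00:00:00

# dayfirst=True reads the same string as day/month/year
day_first = parse('6/12/2011', dayfirst=True)
print(day_first)      # 2011-12-06 00:00:00
```

When the input format is known in advance, `strptime` with an explicit format avoids this kind of guessing.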

To convert strings to a `DatetimeIndex`, use `pd.to_datetime()`.

In [9]:
# Use pd.to_datetime to convert strings to a DatetimeIndex
l1 = pd.to_datetime(['2019-10-01 8:30:00', '2019-10-01 9:30:00', '2019-12-01 9:30:00'])
display(l1)

DatetimeIndex(['2019-10-01 08:30:00', '2019-10-01 09:30:00',
               '2019-12-01 09:30:00'],
              dtype='datetime64[ns]', freq=None)
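
When the strings follow a layout that is not ISO-style, the `format` argument of `pd.to_datetime()` can state it explicitly, using the same codes as `strptime`. A minimal sketch, assuming day-first input strings made up for this example:

```python
import pandas as pd

# format= spells out the layout of these day-first strings
idx = pd.to_datetime(['01/10/2019 08:30', '01/12/2019 09:30'],
                     format='%d/%m/%Y %H:%M')
print(idx)

# Each element of a DatetimeIndex is a pandas Timestamp
print(idx[0])    # 2019-10-01 08:30:00
```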

The `pandas.to_datetime()` function parses a whole sequence of date strings in one call. It also handles missing values: `NaT` (Not a Time) is pandas's null value for timestamp data.

In [10]:
a = pd.to_datetime(['2019-10-01 8:30:00', 
                    None, '2019-10-01 9:30:00', 
                    '2019-12-01 9:30:00', np.nan])
display(a)
display(a.isnull())

DatetimeIndex(['2019-10-01 08:30:00',                 'NaT',
               '2019-10-01 09:30:00', '2019-12-01 09:30:00',
                               'NaT'],
              dtype='datetime64[ns]', freq=None)

array([False,  True, False, False,  True])
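
`NaT` is also useful for entries that are present but cannot be parsed. The sketch below assumes a made-up list `raw` containing one invalid string and uses `errors='coerce'` so that such entries become `NaT` instead of raising an exception:

```python
import pandas as pd

raw = ['2019-10-01 08:30:00', 'not a date', None]

# errors='coerce' turns unparseable entries into NaT instead of raising
cleaned = pd.to_datetime(raw, errors='coerce')
print(cleaned.isna())

# dropna() keeps only the valid timestamps
print(cleaned.dropna())
```

Without `errors='coerce'`, the invalid string would raise an error, so this option is a deliberate choice to trade strictness for robustness.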

**References:**

Python for Data Analysis, 2nd Edition, McKinney (2017)