# 11. Time Series

* fixed frequency: data points occur at regular intervals

ways to mark and refer to time series
* timestamps
* Fixed periods, e.g., Jan 2007, or 2010
* Intervals
    * start and end timestamp (? no interval)
    * periods are special cases of intervals (?)
* Experiment or elapsed time
    * start time, relative time
    
Notes
* timedelta indexes of pandas is not covered in the book

## 11.1 Date and Time Data Types and Tools

In [1]:
from datetime import datetime

now = datetime.now()

now

datetime.datetime(2021, 3, 27, 15, 18, 37, 907120)

In [3]:
now.year, now.month, now.day

(2021, 3, 27)

In [6]:
from datetime import timedelta

start = datetime(2011, 1, 7)
start + timedelta(12) # day

datetime.datetime(2011, 1, 19, 0, 0)

In [5]:
start - 2 * timedelta(12)

datetime.datetime(2010, 12, 14, 0, 0)

### Converting Between String and Datetime

In [7]:
stamp = datetime(2021, 3, 27)
str(stamp)

'2021-03-27 00:00:00'

In [10]:
# string-format-time
stamp.strftime('%Y-%m-%d')

'2021-03-27'

#### Datetime format specification (ISO C89 compatible)

|Type|Description|
|----|:----------------|
|%Y|Four-digit year|
|%y|Two-digit year|
|%m|Two-digit month [01, 12]|
|%d|Two-digit day [01, 31]|
|%H|Hour (24-hour clock) [00, 23]|
|%I|Hour (12-hour clock) [01, 12]|
|%M|Two-digit minute [00, 59]|
|%S|Second [00, 61] (seconds 60, 61 account for leap seconds)|
|%w|Weekday as integer [0 (Sunday), 6]|
|%U|Week number of the year [00, 53]; Sunday is considered the first day of the week, and days before the first Sunday of the year are “week 0”|
|%W|Week number of the year [00, 53]; Monday is considered the first day of the week, and days before the first Monday of the year are “week 0”|
|%z|UTC time zone offset as +HHMM or -HHMM; empty if time zone naive|
|%F|Shortcut for %Y-%m-%d (e.g., 2012-4-18)|
|%D|Shortcut for %m/%d/%y (e.g., 04/18/12)|

#### Locale-specific date formatting

|Type|Description|
|:---|:---|
|%a|Abbreviated weekday name|
|%A|Full weekday name|
|%b|Abbreviated month name|
|%B|Full month name|
|%c|Full date and time (e.g., ‘Tue 01 May 2012 04:20:57 PM’)|
|%p|Locale equivalent of AM or PM|
|%x|Locale-appropriate formatted date (e.g., in the United States, May 1, 2012 yields ’05/01/2012’)|
|%X|Locale-appropriate time (e.g., ’04:24:12 PM’)|

In [11]:
value = '2011-01-03'
# string-parse-time
datetime.strptime(value, '%Y-%M-%d')

datetime.datetime(2011, 1, 3, 0, 1)

In [12]:
datestrs = ['7/6/2011', '8/6/2011']
[datetime.strptime(x, '%m/%d/%Y') for x in datestrs]

[datetime.datetime(2011, 7, 6, 0, 0), datetime.datetime(2011, 8, 6, 0, 0)]

In [14]:
# 3rd party dateutil package that is installed with pandas
from dateutil.parser import parse

parse('2011-01-03')

datetime.datetime(2011, 1, 3, 0, 0)

In [15]:
parse('Jan 31, 1997 10:45 PM')

datetime.datetime(1997, 1, 31, 22, 45)

In [16]:
parse('6/12/2022', dayfirst=True)

datetime.datetime(2022, 12, 6, 0, 0)

In [18]:
import pandas as pd

datestrs = ['2011-07-06 12:00:00', '2011-08-06 00:00:00']

pd.to_datetime(datestrs)

DatetimeIndex(['2011-07-06 12:00:00', '2011-08-06 00:00:00'], dtype='datetime64[ns]', freq=None)

In [19]:
idx = pd.to_datetime(datestrs + [None])
idx

DatetimeIndex(['2011-07-06 12:00:00', '2011-08-06 00:00:00', 'NaT'], dtype='datetime64[ns]', freq=None)

In [20]:
idx[2] # Not a Time

NaT

In [21]:
pd.isnull(idx)

array([False, False,  True])

## 11.2 Time Series Basics