# Dates & Times in Pandas

* Generate sequences of fixed-frequency dates and time spans
* Conform or convert time series to a particular frequency
* Compute 'relative' dates based on various non-standard time increments (e.g. 5 business days before the last day of the year), or 'roll' dates backward and forward
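
The third point is not demonstrated in the cells that follow, so here is a minimal sketch of it using the offset classes in `pd.offsets`. The dates are illustrative, and `BDay` only skips weekends; it does not know about public holidays.

```python
import pandas as pd

# Last day of the year (illustrative date)
year_end = pd.Timestamp('2018-12-31')

# Five business days earlier; BDay skips Saturdays and Sundays only
five_bdays_before = year_end - pd.offsets.BDay(5)

# 'Roll' a date forward or backward to the nearest month end
some_day = pd.Timestamp('2018-07-10')
rolled_forward = pd.offsets.MonthEnd().rollforward(some_day)
rolled_back = pd.offsets.MonthEnd().rollback(some_day)

print(five_bdays_before, rolled_forward, rolled_back)
```

A date that already lies on the offset (here, a month end) is returned unchanged by both `rollforward` and `rollback`.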

In [1]:
import pandas as pd
import numpy as np

## Generate Series of Times

### `pandas.date_range` [Link](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.date_range.html)

In [2]:
# specify pd.date_range with start date, periods and frequency:
rng = pd.date_range('2018 Jul 1', periods = 10, freq = 'D')
rng

DatetimeIndex(['2018-07-01', '2018-07-02', '2018-07-03', '2018-07-04',
               '2018-07-05', '2018-07-06', '2018-07-07', '2018-07-08',
               '2018-07-09', '2018-07-10'],
              dtype='datetime64[ns]', freq='D')

In [3]:
# monthly frequency: 'M' yields month-end dates; use 'MS' for month-start dates (pandas 2.2+ prefers the alias 'ME' over 'M')
rng2 = pd.date_range('2018-01-01', periods=12, freq='M')
rng2

DatetimeIndex(['2018-01-31', '2018-02-28', '2018-03-31', '2018-04-30',
               '2018-05-31', '2018-06-30', '2018-07-31', '2018-08-31',
               '2018-09-30', '2018-10-31', '2018-11-30', '2018-12-31'],
              dtype='datetime64[ns]', freq='M')

In [4]:
# business-day frequency: weekends are skipped (note the jump from 2018-01-05 to 2018-01-08)
rng3 = pd.date_range('2018-01-01', periods=10, freq='B')
rng3

DatetimeIndex(['2018-01-01', '2018-01-02', '2018-01-03', '2018-01-04',
               '2018-01-05', '2018-01-08', '2018-01-09', '2018-01-10',
               '2018-01-11', '2018-01-12'],
              dtype='datetime64[ns]', freq='B')

In [5]:
#different example using a start and end date
rng4 = pd.date_range('1 January, 2018', '1 December, 2018', freq='MS')
rng4

DatetimeIndex(['2018-01-01', '2018-02-01', '2018-03-01', '2018-04-01',
               '2018-05-01', '2018-06-01', '2018-07-01', '2018-08-01',
               '2018-09-01', '2018-10-01', '2018-11-01', '2018-12-01'],
              dtype='datetime64[ns]', freq='MS')

By default, pandas parses ambiguous date strings month-first (US style), so '1/7/2018' is read as January 7, not July 1!

In [6]:
pd.date_range('1/7/2018', periods=2)

DatetimeIndex(['2018-01-07', '2018-01-08'], dtype='datetime64[ns]', freq='D')

## Generate Points in Time

### `pandas.Timestamp` [Link](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Timestamp.html)

The elements inside the date_range we have created above are pandas.Timestamps:

In [7]:
type(rng[0])

pandas._libs.tslibs.timestamps.Timestamp

You can also create Timestamps individually:

In [8]:
pd.Timestamp('2018-07-10')

Timestamp('2018-07-10 00:00:00')

In [9]:
# You can also add more details 
pd.Timestamp('2018-07-10 10')

Timestamp('2018-07-10 10:00:00')

In [10]:
# Or even more...
pd.Timestamp('2018-07-10 10:15')

Timestamp('2018-07-10 10:15:00')

You can go all the way down to nanosecond resolution:

In [11]:
pd.Timestamp('2018-07-10 10:15:15.123456789')

Timestamp('2018-07-10 10:15:15.123456789')

Pandas Timestamps have many useful attributes and methods, such as the `day_name()` method and the `quarter` attribute:

In [12]:
day = pd.Timestamp('2018-01-01')
day.day_name()

'Monday'

In [13]:
day.quarter

1
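
A few more of these are sketched below, reusing the same `day` value. The selection is only illustrative; the full list is in the `pandas.Timestamp` reference linked above.

```python
import pandas as pd

day = pd.Timestamp('2018-01-01')

# Plain attributes (no parentheses)
print(day.year, day.month, day.day)
print(day.dayofweek)       # Monday is 0, Sunday is 6
print(day.is_month_start)
print(day.is_leap_year)

# Methods (called with parentheses)
print(day.month_name())
```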

## Generate Differences in Time

### `pandas.Timedelta` [Link](https://pandas.pydata.org/pandas-docs/version/0.23.4/generated/pandas.Timedelta.html)

We can also specify time differences / timespans

In [14]:
pd.Timedelta('1 day')

Timedelta('1 days 00:00:00')

We can add these Timedeltas to Timestamps to get a new Timestamp:

In [15]:
pd.Timestamp('2018-01-01 10:10') + pd.Timedelta('1 day')

Timestamp('2018-01-02 10:10:00')

In [16]:
pd.Timestamp('2018-01-01 10:10') + pd.Timedelta('15 ns')

Timestamp('2018-01-01 10:10:00.000000015')

We can also add a Timedelta to a DatetimeIndex, which is a sequence of Timestamps. The Timedelta is then added to every element of the DatetimeIndex:

In [17]:
rng5 = pd.date_range('2018-01-01', periods=5, freq='D')
rng5

DatetimeIndex(['2018-01-01', '2018-01-02', '2018-01-03', '2018-01-04',
               '2018-01-05'],
              dtype='datetime64[ns]', freq='D')

In [18]:
rng5 + pd.Timedelta('1 day')

DatetimeIndex(['2018-01-02', '2018-01-03', '2018-01-04', '2018-01-05',
               '2018-01-06'],
              dtype='datetime64[ns]', freq='D')
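
The reverse direction also works: subtracting one Timestamp from another yields a Timedelta. A small sketch with illustrative values:

```python
import pandas as pd

start = pd.Timestamp('2018-01-01')
end = pd.Timestamp('2018-01-02 10:10')

# Timestamp - Timestamp gives a Timedelta
elapsed = end - start

print(elapsed)
print(elapsed.days)             # whole days only
print(elapsed.total_seconds())  # full duration in seconds
```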

## Generate Spans of Time

### `pandas.Period` [Link](https://pandas.pydata.org/pandas-docs/version/0.23.4/generated/pandas.Period.html)

A period of a month

In [19]:
pd.Period('2016-01')

Period('2016-01', 'M')

A period of a day

In [20]:
pd.Period('2016-01-01')

Period('2016-01-01', 'D')

An hour

In [21]:
pd.Period('2016-01-01 10')

Period('2016-01-01 10:00', 'H')

A minute

In [22]:
pd.Period('2016-01-01 10:10')

Period('2016-01-01 10:10', 'T')

And a second

In [23]:
pd.Period('2016-01-01 10:10:10')

Period('2016-01-01 10:10:10', 'S')

We can see that pandas.Period corresponds to a time span by comparing it to a pandas.Timestamp:

In [24]:
period = pd.Period('01/2018')
timestamp = pd.Timestamp('01/15/2018')

In [25]:
period.start_time < timestamp < period.end_time

True

You can also create multiple time periods in one expression

In [26]:
rng6 = pd.period_range('01/2018', periods=5, freq='M')
rng6

PeriodIndex(['2018-01', '2018-02', '2018-03', '2018-04', '2018-05'], dtype='period[M]', freq='M')

It's possible to combine frequencies. Suppose each step should advance by 25 hours. There are two ways to write that frequency:

In [27]:
p1 = pd.period_range('2018-01-01 10:10', freq = '25H', periods = 10)

In [28]:
p2 = pd.period_range('2018-01-01 10:10', freq = '1D1H', periods = 10)
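
To check that "25 hours" and "1 day plus 1 hour" describe the same step, here is a sketch that uses `pd.Timedelta` and `pd.date_range` instead of the period API above. Passing a Timedelta as `freq` is a substitution made for this illustration, not something the cells above do.

```python
import pandas as pd

# Two spellings of the same step size
step_a = pd.Timedelta('25h')
step_b = pd.Timedelta(days=1, hours=1)
same_step = step_a == step_b

# Use the step as a fixed frequency for a DatetimeIndex
stamps = pd.date_range('2018-01-01 10:10', periods=3, freq=step_a)

print(same_step)
print(stamps)
```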

You can convert Timestamps to Periods. In this example I test whether a timestamp falls within a given period by first converting the timestamp to a period:

In [29]:
timestamp.to_period('M') == period

True

## Indexing with Pandas Time Objects

We can use `pd.date_range` to obtain a DatetimeIndex, which contains Timestamps. We then use this index to create a `pd.Series`:

In [30]:
rng = pd.date_range('2018 Jul 1', periods = 10, freq = 'D')
rng
series = pd.Series(data = range(len(rng)), index = rng)
series

2018-07-01    0
2018-07-02    1
2018-07-03    2
2018-07-04    3
2018-07-05    4
2018-07-06    5
2018-07-07    6
2018-07-08    7
2018-07-09    8
2018-07-10    9
Freq: D, dtype: int64

In [31]:
type(series.index)

pandas.core.indexes.datetimes.DatetimeIndex
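
With a DatetimeIndex in place, values can be selected by date strings. The sketch below rebuilds the same series (using an ISO start date) and shows a single lookup and a slice; note that, unlike positional slicing, both endpoints of a label slice are included.

```python
import pandas as pd

rng = pd.date_range('2018-07-01', periods=10, freq='D')
series = pd.Series(data=range(len(rng)), index=rng)

# Look up one day by its date string
single = series.loc['2018-07-03']

# Slice a date range; the end label is inclusive
window = series.loc['2018-07-03':'2018-07-05']

print(single)
print(window)
```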

Of course we can also use time periods as the index when it makes more sense to have time spans as the index:

In [32]:
periods = [pd.Period('2018-01'), pd.Period('2018-02'), pd.Period('2018-03')]
ts = pd.Series(data = np.random.randn(len(periods)), index = periods)
ts

2018-01    1.091969
2018-02   -0.031044
2018-03   -0.629683
Freq: M, dtype: float64

In [33]:
type(ts.index)

pandas.core.indexes.period.PeriodIndex

A DatetimeIndex can be converted to a PeriodIndex and vice versa

In [34]:
ts = pd.Series(range(10), pd.date_range('07-10-18 8:00', periods = 10, freq = 'H'))
ts

2018-07-10 08:00:00    0
2018-07-10 09:00:00    1
2018-07-10 10:00:00    2
2018-07-10 11:00:00    3
2018-07-10 12:00:00    4
2018-07-10 13:00:00    5
2018-07-10 14:00:00    6
2018-07-10 15:00:00    7
2018-07-10 16:00:00    8
2018-07-10 17:00:00    9
Freq: H, dtype: int64

In [35]:
type(ts.index)

pandas.core.indexes.datetimes.DatetimeIndex

In [36]:
ts_period = ts.to_period()
ts_period

2018-07-10 08:00    0
2018-07-10 09:00    1
2018-07-10 10:00    2
2018-07-10 11:00    3
2018-07-10 12:00    4
2018-07-10 13:00    5
2018-07-10 14:00    6
2018-07-10 15:00    7
2018-07-10 16:00    8
2018-07-10 17:00    9
Freq: H, dtype: int64

In [37]:
type(ts_period.index)

pandas.core.indexes.period.PeriodIndex
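
The opposite conversion, from a PeriodIndex back to a DatetimeIndex, can be done with `to_timestamp()`. The sketch below uses a small daily series (not the hourly one above) to keep the example short; each period is mapped to its start time by default.

```python
import pandas as pd

daily = pd.Series(range(3), index=pd.date_range('2018-07-10', periods=3, freq='D'))

# DatetimeIndex -> PeriodIndex
as_periods = daily.to_period()

# PeriodIndex -> DatetimeIndex (start of each period)
back_to_stamps = as_periods.to_timestamp()

print(type(as_periods.index))
print(type(back_to_stamps.index))
```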

## Extras

### 1. How to create a pd.Timestamp / pd.DatetimeIndex with European style formatting (Day First, then Month then Year)?

Say I want the first of July 2018. The default month-first parsing does not give me the right result:

In [38]:
pd.Timestamp('01/07/2018')

Timestamp('2018-01-07 00:00:00')

You have to use the `pd.to_datetime()` function. If you use dates which start with the day first (i.e. European style), you can pass the `dayfirst` flag

In [39]:
pd.to_datetime('01/07/2018', dayfirst=True)

Timestamp('2018-07-01 00:00:00')

Same goes for a complete DatetimeIndex

In [40]:
pd.date_range(pd.to_datetime('01/07/2018', dayfirst=True), pd.to_datetime('05/07/2018', dayfirst=True), freq='D')

DatetimeIndex(['2018-07-01', '2018-07-02', '2018-07-03', '2018-07-04',
               '2018-07-05'],
              dtype='datetime64[ns]', freq='D')
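
As an alternative to `dayfirst`, `pd.to_datetime()` also accepts an explicit `format` string, which removes the ambiguity entirely. A minimal sketch, assuming the input always follows the day/month/year pattern:

```python
import pandas as pd

# Explicit format: day, then month, then four-digit year
single = pd.to_datetime('01/07/2018', format='%d/%m/%Y')

# The same format applied to a list gives a DatetimeIndex
many = pd.to_datetime(['01/07/2018', '15/07/2018'], format='%d/%m/%Y')

print(single)
print(many)
```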

### 2. How to generate a string representation of a pd.Timestamp in a desired format?

To specify the string representation of a timestamp or any other time object you can use `strftime()`:

In [41]:
pd.Timestamp('01-07-2019')

Timestamp('2019-01-07 00:00:00')

In [42]:
pd.Timestamp('01-07-2019').strftime('%d-%m-%Y')

'07-01-2019'
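
`strftime()` is also available on a whole DatetimeIndex, where it returns an index of strings. A short sketch with an illustrative format:

```python
import pandas as pd

rng = pd.date_range('2019-01-07', periods=3, freq='D')

# Format every element at once; the result holds plain strings
labels = rng.strftime('%d.%m.%Y')

print(list(labels))
```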