# Frequencies and Date Offsets

Frequencies in pandas are composed of a *base frequency* and a multiplier. Base frequencies are typically referred to by a string alias, like 'M' for monthly or 'H' for hourly. For each base frequency, there is an object defined, generally referred to as a *date offset*. For example, hourly frequency can be represented with the *Hour* class:

In [1]:
import pandas as pd
import numpy as np
from pandas import DataFrame, Series
from datetime import datetime
from pandas.tseries.offsets import Hour, Minute

In [2]:
hour = Hour()

In [5]:
hour

<Hour>

You can define a multiple of an offset by passing an integer:

In [6]:
four_hours = Hour(4)

In [7]:
four_hours

<4 * Hours>

In most applications, you would never need to explicitly create one of these objects, instead using a string alias like 'H' or '4H'. Putting an integer before the base frequency creates a multiple:

In [14]:
pd.date_range('1/1/2000', '1/3/2000', freq='4h')

DatetimeIndex(['2000-01-01 00:00:00', '2000-01-01 04:00:00',
               '2000-01-01 08:00:00', '2000-01-01 12:00:00',
               '2000-01-01 16:00:00', '2000-01-01 20:00:00',
               '2000-01-02 00:00:00', '2000-01-02 04:00:00',
               '2000-01-02 08:00:00', '2000-01-02 12:00:00',
               '2000-01-02 16:00:00', '2000-01-02 20:00:00',
               '2000-01-03 00:00:00'],
              dtype='datetime64[ns]', freq='4H')
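To make the link between aliases and offset objects concrete, here is a minimal sketch. It assumes `to_offset` from `pandas.tseries.frequencies` (not used elsewhere in this section) as the way to turn an alias into an offset object, and it uses the lowercase 'h' spelling, which recent pandas releases prefer over 'H'.

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset
from pandas.tseries.offsets import Hour

# Parse the string alias into an offset object and compare it
# with the explicitly constructed Hour(4).
offset_from_alias = to_offset('4h')
same_offset = offset_from_alias == Hour(4)

# Build the same index twice: once from the alias, once from the object.
by_alias = pd.date_range('2000-01-01', periods=4, freq='4h')
by_object = pd.date_range('2000-01-01', periods=4, freq=Hour(4))
same_index = by_alias.equals(by_object)

print(same_offset, same_index)
```

Either form can be passed as `freq`; the string alias is simply the shorter spelling.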

Many offsets can be combined together by addition:

In [15]:
Hour(2) + Minute(30)

<150 * Minutes>

Similarly, you can pass frequency strings like '2h30min', which will effectively be parsed to the same expression:

In [16]:
pd.date_range('1/1/2000', periods=5, freq='2h30min')

DatetimeIndex(['2000-01-01 00:00:00', '2000-01-01 02:30:00',
               '2000-01-01 05:00:00', '2000-01-01 07:30:00',
               '2000-01-01 10:00:00'],
              dtype='datetime64[ns]', freq='150T')
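As a quick check that the combined offset and the compound string behave the same way, the sketch below shifts one timestamp by each. The use of `to_offset` and the variable names are my own choices for illustration, not something the text above relies on.

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset
from pandas.tseries.offsets import Hour, Minute

# Offset built by adding two offset objects.
combined = Hour(2) + Minute(30)

# Offset parsed from the equivalent compound alias.
parsed = to_offset('2h30min')

# Adding either offset to a timestamp moves it forward by 2.5 hours.
start = pd.Timestamp('2000-01-01')
shifted_by_objects = start + combined
shifted_by_alias = start + parsed

print(shifted_by_objects, shifted_by_alias)
```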

Some frequencies describe points in time that are not evenly spaced. For example, 'M' (calendar month end) and 'BM' (last business/weekday of month) depend on the number of days in a month and, in the latter case, whether the month ends on a weekend or not. For lack of a better term, I call these *anchored* offsets.

In [18]:
pd.date_range('1/1/2000', periods=7, freq='B')

DatetimeIndex(['2000-01-03', '2000-01-04', '2000-01-05', '2000-01-06',
               '2000-01-07', '2000-01-10', '2000-01-11'],
              dtype='datetime64[ns]', freq='B')
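The month-end case can be sketched in the same way. This example uses the offset classes `MonthEnd` and `BMonthEnd` from `pandas.tseries.offsets` rather than the 'M' and 'BM' aliases, on the assumption that the classes are the more stable spelling across pandas versions. April 2000 is chosen because it ends on a Sunday.

```python
import pandas as pd
from pandas.tseries.offsets import MonthEnd, BMonthEnd

# Calendar month ends fall on different days of the month,
# so consecutive dates are not evenly spaced.
month_ends = pd.date_range('2000-01-01', periods=4, freq=MonthEnd())
days_of_month = [ts.day for ts in month_ends]  # [31, 29, 31, 30]

# 2000-04-30 is a Sunday, so the business month end is the Friday before.
mid_april = pd.Timestamp('2000-04-15')
calendar_end = mid_april + MonthEnd()   # 2000-04-30
business_end = mid_april + BMonthEnd()  # 2000-04-28

print(days_of_month, calendar_end.date(), business_end.date())
```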

## Week of month dates

One useful frequency class is "week of month", starting with *WOM*. This enables you to get dates like the third Friday of each month ('WOM-3FRI'); the example below uses 'WOM-4MON', the fourth Monday of each month:

In [24]:
rng = pd.date_range('1/1/2012', '9/1/2012', freq='WOM-4MON')

In [25]:
list(rng)

[Timestamp('2012-01-23 00:00:00', freq='WOM-4MON'),
 Timestamp('2012-02-27 00:00:00', freq='WOM-4MON'),
 Timestamp('2012-03-26 00:00:00', freq='WOM-4MON'),
 Timestamp('2012-04-23 00:00:00', freq='WOM-4MON'),
 Timestamp('2012-05-28 00:00:00', freq='WOM-4MON'),
 Timestamp('2012-06-25 00:00:00', freq='WOM-4MON'),
 Timestamp('2012-07-23 00:00:00', freq='WOM-4MON'),
 Timestamp('2012-08-27 00:00:00', freq='WOM-4MON')]
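For the third-Friday case mentioned in the text, a minimal sketch follows. It assumes the `WeekOfMonth` offset class, where `week` is zero-based (2 is the third week) and `weekday` counts from Monday as 0 (4 is Friday), and compares it against the 'WOM-3FRI' alias.

```python
import pandas as pd
from pandas.tseries.offsets import WeekOfMonth

# Third Friday of each month via the string alias.
third_fridays = pd.date_range('2012-01-01', '2012-09-01', freq='WOM-3FRI')

# The same range via the offset class: week=2 is the third week
# (zero-based), weekday=4 is Friday (Monday is 0).
third_fridays_obj = pd.date_range(
    '2012-01-01', '2012-09-01', freq=WeekOfMonth(week=2, weekday=4)
)

# Every date should be a Friday falling on the 15th through the 21st.
all_fridays = all(ts.weekday() == 4 for ts in third_fridays)
all_third_week = all(15 <= ts.day <= 21 for ts in third_fridays)

print(list(third_fridays.strftime('%Y-%m-%d')))
```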