#### Pandas Part 85: DateOffset Aliases

This notebook explores DateOffset aliases in pandas, which provide convenient shortcuts for commonly used DateOffset classes.

In [1]:
import pandas as pd
import numpy as np
from datetime import datetime, time, timedelta
from pandas.tseries.offsets import (
    BDay, BMonthEnd, BMonthBegin,
    BQuarterEnd, BQuarterBegin,
    BYearEnd, BYearBegin
)

##### 1. Introduction to DateOffset Aliases

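DateOffset aliases are shorthand strings that represent specific DateOffset classes. They are the frequency strings accepted by pandas time series functions such as `date_range()`, `resample()`, and `asfreq()`. Several of the aliases listed below were renamed in pandas 2.2 (for example, `M` became `ME` and `H` became `h`); the old spellings still work in that release but emit a `FutureWarning`.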

In [2]:
# Display some common DateOffset aliases
aliases = {
    'B': 'business day frequency (weekdays)',
    'D': 'calendar day frequency',
    'W': 'weekly frequency',
    'M': 'month end frequency',
    'BM': 'business month end frequency',
    'MS': 'month start frequency',
    'BMS': 'business month start frequency',
    'Q': 'quarter end frequency',
    'BQ': 'business quarter end frequency',
    'QS': 'quarter start frequency',
    'BQS': 'business quarter start frequency',
    'A': 'year end frequency',
    'BA': 'business year end frequency',
    'AS': 'year start frequency',
    'BAS': 'business year start frequency',
    'H': 'hourly frequency',
    'T': 'minutely frequency',
    'S': 'secondly frequency',
    'L': 'milliseconds',
    'U': 'microseconds',
    'N': 'nanoseconds'
}

print("Common DateOffset Aliases:")
for alias, description in aliases.items():
    print(f"{alias}: {description}")

Common DateOffset Aliases:
B: business day frequency (weekdays)
D: calendar day frequency
W: weekly frequency
M: month end frequency
BM: business month end frequency
MS: month start frequency
BMS: business month start frequency
Q: quarter end frequency
BQ: business quarter end frequency
QS: quarter start frequency
BQS: business quarter start frequency
A: year end frequency
BA: business year end frequency
AS: year start frequency
BAS: business year start frequency
H: hourly frequency
T: minutely frequency
S: secondly frequency
L: milliseconds
U: microseconds
N: nanoseconds
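
To check which offset class an alias resolves to, the string can be passed to `pandas.tseries.frequencies.to_offset`. The following is a minimal sketch, limited to aliases whose spelling has not changed across recent pandas versions; `stable_aliases` and `resolved` are names introduced here for illustration only.

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset

# Aliases chosen because their spelling is the same in older and newer pandas.
stable_aliases = ['B', 'D', 'W', 'MS', 'BMS', 'QS', 'BQS']

# Map each alias to the offset object pandas builds from it.
resolved = {alias: to_offset(alias) for alias in stable_aliases}

for alias, offset in resolved.items():
    print(f"{alias}: {type(offset).__name__}")
```

The same lookup is a quick way to confirm what a frequency string means before passing it to `date_range()` or `resample()`.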


##### 2. Business Day Alias (B)

The `B` alias represents business days, meaning Monday through Friday; it does not account for holidays. It corresponds to the `BDay` (`BusinessDay`) class.

In [3]:
# Create a BDay offset
bday = BDay()
print(f"BDay: {bday}")

# Apply to a datetime
dt = datetime(2023, 1, 1)  # Sunday
print(f"Original datetime: {dt} ({dt.strftime('%A')})")
print(f"After adding 1 business day: {dt + bday} ({(dt + bday).strftime('%A')})")

# Create a date range with business day frequency using the 'B' alias
dates = pd.date_range(start='2023-01-01', periods=5, freq='B')
print(f"\nDate range with business day frequency (B):")
for date in dates:
    print(f"{date} ({date.strftime('%A')})")

BDay: <BusinessDay>
Original datetime: 2023-01-01 00:00:00 (Sunday)
After adding 1 business day: 2023-01-02 00:00:00 (Monday)

Date range with business day frequency (B):
2023-01-02 00:00:00 (Monday)
2023-01-03 00:00:00 (Tuesday)
2023-01-04 00:00:00 (Wednesday)
2023-01-05 00:00:00 (Thursday)
2023-01-06 00:00:00 (Friday)


In [4]:
# Create a date range with 2 business days frequency
dates = pd.date_range(start='2023-01-01', periods=5, freq='2B')
print(f"Date range with 2 business days frequency (2B):")
for date in dates:
    print(f"{date} ({date.strftime('%A')})")

Date range with 2 business days frequency (2B):
2023-01-02 00:00:00 (Monday)
2023-01-04 00:00:00 (Wednesday)
2023-01-06 00:00:00 (Friday)
2023-01-10 00:00:00 (Tuesday)
2023-01-12 00:00:00 (Thursday)
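
Adding `BDay()` always moves the timestamp, even when it already falls on a business day. When the goal is only to snap a date onto a business day, the offset's `rollforward()` and `rollback()` methods may be a better fit. A small sketch, using a Saturday chosen for illustration:

```python
import pandas as pd
from pandas.tseries.offsets import BDay

bday = BDay()
saturday = pd.Timestamp('2023-01-07')
monday = pd.Timestamp('2023-01-09')

# is_on_offset reports whether a timestamp is itself a business day.
print(bday.is_on_offset(saturday))   # False

# rollforward / rollback snap to the nearest business day in that direction.
next_bday = bday.rollforward(saturday)
prev_bday = bday.rollback(saturday)
print(next_bday.day_name(), prev_bday.day_name())   # Monday Friday

# A timestamp that is already a business day is left unchanged by rollforward,
# whereas adding the offset moves it to the next business day.
unchanged = bday.rollforward(monday)
shifted = monday + bday
print(unchanged, shifted)
```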


##### 3. Business Month Aliases (BM and BMS)

The `BM` alias represents business month end, and `BMS` represents business month start. They are aliases for the `BMonthEnd` and `BMonthBegin` classes, respectively.

In [5]:
# Create BMonthEnd and BMonthBegin offsets
bmonth_end = BMonthEnd()
bmonth_begin = BMonthBegin()
print(f"BMonthEnd: {bmonth_end}")
print(f"BMonthBegin: {bmonth_begin}")

# Apply to a datetime
dt = datetime(2023, 1, 15)  # Middle of the month
print(f"\nOriginal datetime: {dt}")
print(f"After adding BMonthEnd: {dt + bmonth_end}")
print(f"After adding BMonthBegin: {dt + bmonth_begin}")

# Create date ranges with business month frequency using aliases
dates_bm = pd.date_range(start='2023-01-15', periods=5, freq='BM')
print(f"\nDate range with business month end frequency (BM):")
for date in dates_bm:
    print(f"{date} ({date.strftime('%A')})")

dates_bms = pd.date_range(start='2023-01-15', periods=5, freq='BMS')
print(f"\nDate range with business month start frequency (BMS):")
for date in dates_bms:
    print(f"{date} ({date.strftime('%A')})")

BMonthEnd: <BusinessMonthEnd>
BMonthBegin: <BusinessMonthBegin>

Original datetime: 2023-01-15 00:00:00
After adding BMonthEnd: 2023-01-31 00:00:00
After adding BMonthBegin: 2023-02-01 00:00:00

Date range with business month end frequency (BM):
2023-01-31 00:00:00 (Tuesday)
2023-02-28 00:00:00 (Tuesday)
2023-03-31 00:00:00 (Friday)
2023-04-28 00:00:00 (Friday)
2023-05-31 00:00:00 (Wednesday)

Date range with business month start frequency (BMS):
2023-02-01 00:00:00 (Wednesday)
2023-03-01 00:00:00 (Wednesday)
2023-04-03 00:00:00 (Monday)
2023-05-01 00:00:00 (Monday)
2023-06-01 00:00:00 (Thursday)


Note: on pandas 2.2 and later, the `date_range()` call with `freq='BM'` emits a `FutureWarning` because `BM` is deprecated in favor of `BME`. `BMS` is unchanged.
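
The business month offsets only differ from their calendar counterparts when the month boundary falls on a weekend. The sketch below compares the two around April 2023, where both boundaries do; it uses the offset classes directly rather than alias strings so that it does not depend on the pandas version.

```python
import pandas as pd
from pandas.tseries.offsets import MonthEnd, BMonthEnd, MonthBegin, BMonthBegin

mid_april = pd.Timestamp('2023-04-15')
cal_end = mid_april + MonthEnd()     # calendar month end: Sunday 2023-04-30
bus_end = mid_april + BMonthEnd()    # last business day: Friday 2023-04-28

mid_march = pd.Timestamp('2023-03-15')
cal_start = mid_march + MonthBegin()    # calendar month start: Saturday 2023-04-01
bus_start = mid_march + BMonthBegin()   # first business day: Monday 2023-04-03

for label, ts in [('MonthEnd', cal_end), ('BMonthEnd', bus_end),
                  ('MonthBegin', cal_start), ('BMonthBegin', bus_start)]:
    print(f"{label}: {ts.date()} ({ts.day_name()})")
```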


##### 4. Business Quarter Aliases (BQ and BQS)

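The `BQ` alias represents business quarter end, and `BQS` represents business quarter start. They correspond to the `BQuarterEnd` and `BQuarterBegin` classes, respectively. The defaults differ, however: `BQuarterBegin()` uses `startingMonth=3`, while the `BQS` alias produces quarters starting in January, April, July, and October. This is why adding `BQuarterBegin()` to 2023-02-15 below gives 2023-03-01, but the `BQS` date range starts on 2023-04-03.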

In [6]:
# Create BQuarterEnd and BQuarterBegin offsets
bquarter_end = BQuarterEnd()
bquarter_begin = BQuarterBegin()
print(f"BQuarterEnd: {bquarter_end}")
print(f"BQuarterBegin: {bquarter_begin}")

# Apply to a datetime
dt = datetime(2023, 2, 15)  # Middle of Q1
print(f"\nOriginal datetime: {dt}")
print(f"After adding BQuarterEnd: {dt + bquarter_end}")
print(f"After adding BQuarterBegin: {dt + bquarter_begin}")

# Create date ranges with business quarter frequency using aliases
dates_bq = pd.date_range(start='2023-01-15', periods=4, freq='BQ')
print(f"\nDate range with business quarter end frequency (BQ):")
for date in dates_bq:
    print(f"{date} ({date.strftime('%A')})")

dates_bqs = pd.date_range(start='2023-01-15', periods=4, freq='BQS')
print(f"\nDate range with business quarter start frequency (BQS):")
for date in dates_bqs:
    print(f"{date} ({date.strftime('%A')})")

BQuarterEnd: <BusinessQuarterEnd: startingMonth=3>
BQuarterBegin: <BusinessQuarterBegin: startingMonth=3>

Original datetime: 2023-02-15 00:00:00
After adding BQuarterEnd: 2023-03-31 00:00:00
After adding BQuarterBegin: 2023-03-01 00:00:00

Date range with business quarter end frequency (BQ):
2023-03-31 00:00:00 (Friday)
2023-06-30 00:00:00 (Friday)
2023-09-29 00:00:00 (Friday)
2023-12-29 00:00:00 (Friday)

Date range with business quarter start frequency (BQS):
2023-04-03 00:00:00 (Monday)
2023-07-03 00:00:00 (Monday)
2023-10-02 00:00:00 (Monday)
2024-01-01 00:00:00 (Monday)


Note: on pandas 2.2 and later, the `date_range()` call with `freq='BQ'` emits a `FutureWarning` because `BQ` is deprecated in favor of `BQE`. `BQS` is unchanged.
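
The output above shows `BQuarterBegin()` landing on 2023-03-01 while the `BQS` date range starts in April. A minimal sketch of why, assuming the `BQS` string resolves to an offset with `startingMonth=1` (which is what the date range output suggests):

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset
from pandas.tseries.offsets import BQuarterBegin

dt = pd.Timestamp('2023-02-15')

default_offset = BQuarterBegin()            # class default: startingMonth=3
alias_offset = to_offset('BQS')             # offset built from the alias string
explicit = BQuarterBegin(startingMonth=1)   # quarters start Jan, Apr, Jul, Oct

print(default_offset.startingMonth, alias_offset.startingMonth)
print(dt + default_offset)   # first business day of March
print(dt + alias_offset)     # first business day of April
```

Passing `startingMonth` explicitly when constructing the class avoids relying on either default.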


##### 5. Business Year Aliases (BA and BAS)

The `BA` alias represents business year end, and `BAS` represents business year start. They are aliases for the `BYearEnd` and `BYearBegin` classes, respectively.

In [7]:
# Create BYearEnd and BYearBegin offsets
byear_end = BYearEnd()
byear_begin = BYearBegin()
print(f"BYearEnd: {byear_end}")
print(f"BYearBegin: {byear_begin}")

# Apply to a datetime
dt = datetime(2023, 6, 15)  # Middle of the year
print(f"\nOriginal datetime: {dt}")
print(f"After adding BYearEnd: {dt + byear_end}")
print(f"After adding BYearBegin: {dt + byear_begin}")

# Create date ranges with business year frequency using aliases
dates_ba = pd.date_range(start='2020-06-15', periods=3, freq='BA')
print(f"\nDate range with business year end frequency (BA):")
for date in dates_ba:
    print(f"{date} ({date.strftime('%A')})")

dates_bas = pd.date_range(start='2020-06-15', periods=3, freq='BAS')
print(f"\nDate range with business year start frequency (BAS):")
for date in dates_bas:
    print(f"{date} ({date.strftime('%A')})")

BYearEnd: <BYearEnd: month=12>
BYearBegin: <BYearBegin: month=1>

Original datetime: 2023-06-15 00:00:00
After adding BYearEnd: 2023-12-29 00:00:00
After adding BYearBegin: 2024-01-01 00:00:00

Date range with business year end frequency (BA):
2020-12-31 00:00:00 (Thursday)
2021-12-31 00:00:00 (Friday)
2022-12-30 00:00:00 (Friday)

Date range with business year start frequency (BAS):
2021-01-01 00:00:00 (Friday)
2022-01-03 00:00:00 (Monday)
2023-01-02 00:00:00 (Monday)


Note: on pandas 2.2 and later, both `date_range()` calls emit a `FutureWarning` because `BA` and `BAS` are deprecated in favor of `BYE` and `BYS`.
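
The `month=12` and `month=1` values in the reprs above are parameters, so the same classes can describe a year that does not follow the calendar. The sketch below assumes a hypothetical fiscal year that ends in June and starts in July; the variable names are illustrative.

```python
import pandas as pd
from pandas.tseries.offsets import BYearEnd, BYearBegin

dt = pd.Timestamp('2023-06-15')

# Hypothetical fiscal year: ends in June, begins in July.
fiscal_end = BYearEnd(month=6)
fiscal_begin = BYearBegin(month=7)

end_date = dt + fiscal_end       # last business day of June 2023
begin_date = dt + fiscal_begin   # first business day of July 2023

print(f"{end_date.date()} ({end_date.day_name()})")
print(f"{begin_date.date()} ({begin_date.day_name()})")
```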


##### 6. Time-Based Aliases (H, T, S, L, U, N)

Pandas provides aliases for time-based DateOffset classes:
- `H`: Hour
- `T`: Minute (also accepted as `min`)
- `S`: Second
- `L`: Millisecond
- `U`: Microsecond
- `N`: Nanosecond

In [8]:
# Create date ranges with time-based frequency using aliases
dt = datetime(2023, 1, 1, 12, 0, 0)

# Hourly frequency
dates_h = pd.date_range(start=dt, periods=5, freq='H')
print(f"Date range with hourly frequency (H):")
for date in dates_h:
    print(f"{date}")

# Minutely frequency
dates_t = pd.date_range(start=dt, periods=5, freq='T')
print(f"\nDate range with minutely frequency (T):")
for date in dates_t:
    print(f"{date}")

# Secondly frequency
dates_s = pd.date_range(start=dt, periods=5, freq='S')
print(f"\nDate range with secondly frequency (S):")
for date in dates_s:
    print(f"{date}")

# Millisecond frequency
dates_l = pd.date_range(start=dt, periods=5, freq='L')
print(f"\nDate range with millisecond frequency (L):")
for date in dates_l:
    print(f"{date}")

Date range with hourly frequency (H):
2023-01-01 12:00:00
2023-01-01 13:00:00
2023-01-01 14:00:00
2023-01-01 15:00:00
2023-01-01 16:00:00

Date range with minutely frequency (T):
2023-01-01 12:00:00
2023-01-01 12:01:00
2023-01-01 12:02:00
2023-01-01 12:03:00
2023-01-01 12:04:00

Date range with secondly frequency (S):
2023-01-01 12:00:00
2023-01-01 12:00:01
2023-01-01 12:00:02
2023-01-01 12:00:03
2023-01-01 12:00:04

Date range with millisecond frequency (L):
2023-01-01 12:00:00
2023-01-01 12:00:00.001000
2023-01-01 12:00:00.002000
2023-01-01 12:00:00.003000
2023-01-01 12:00:00.004000


Note: on pandas 2.2 and later, each of these calls emits a `FutureWarning` because the uppercase aliases `H`, `T`, `S`, and `L` are deprecated in favor of `h`, `min`, `s`, and `ms`.
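
Because the spelling of the sub-daily aliases depends on the pandas version, one version-independent option is to pass the offset objects themselves as `freq`. A minimal sketch using the `Hour`, `Minute`, `Second`, and `Milli` classes from `pandas.tseries.offsets`; the `steps` dictionary is introduced here only to collect the results.

```python
import pandas as pd
from pandas.tseries.offsets import Hour, Minute, Second, Milli

start = pd.Timestamp('2023-01-01 12:00:00')

offsets = {
    'hourly': Hour(),
    'minutely': Minute(),
    'secondly': Second(),
    'millisecond': Milli(),
}

# Build a short range per offset and record the spacing between timestamps.
steps = {}
for label, offset in offsets.items():
    rng = pd.date_range(start=start, periods=3, freq=offset)
    steps[label] = rng[1] - rng[0]
    print(f"{label}: step = {steps[label]}")
```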


##### 7. Calendar Day vs. Business Day Aliases

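`D` steps through every calendar day, including weekends, while `B` skips Saturdays and Sundays. Starting from Sunday 2023-01-01, 10 calendar days and 7 business days both end on Tuesday 2023-01-10.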

In [9]:
# Calendar day frequency (D)
dates_d = pd.date_range(start='2023-01-01', periods=10, freq='D')
print(f"Date range with calendar day frequency (D):")
for date in dates_d:
    print(f"{date} ({date.strftime('%A')})")

# Business day frequency (B)
dates_b = pd.date_range(start='2023-01-01', periods=7, freq='B')
print(f"\nDate range with business day frequency (B):")
for date in dates_b:
    print(f"{date} ({date.strftime('%A')})")

Date range with calendar day frequency (D):
2023-01-01 00:00:00 (Sunday)
2023-01-02 00:00:00 (Monday)
2023-01-03 00:00:00 (Tuesday)
2023-01-04 00:00:00 (Wednesday)
2023-01-05 00:00:00 (Thursday)
2023-01-06 00:00:00 (Friday)
2023-01-07 00:00:00 (Saturday)
2023-01-08 00:00:00 (Sunday)
2023-01-09 00:00:00 (Monday)
2023-01-10 00:00:00 (Tuesday)

Date range with business day frequency (B):
2023-01-02 00:00:00 (Monday)
2023-01-03 00:00:00 (Tuesday)
2023-01-04 00:00:00 (Wednesday)
2023-01-05 00:00:00 (Thursday)
2023-01-06 00:00:00 (Friday)
2023-01-09 00:00:00 (Monday)
2023-01-10 00:00:00 (Tuesday)
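
The difference can also be seen by counting the dates each alias generates over the same window. The sketch below does this for January 2023. It also illustrates that `B` knows nothing about holidays; treating 2023-01-02 as a holiday is an assumption made here for illustration, handled with `CustomBusinessDay`.

```python
import pandas as pd
from pandas.tseries.offsets import CustomBusinessDay

start, end = '2023-01-01', '2023-01-31'

calendar_days = pd.date_range(start=start, end=end, freq='D')
business_days = pd.date_range(start=start, end=end, freq='B')
print(len(calendar_days), len(business_days))   # 31 22

# 'B' only skips weekends: every date falls on Monday (0) through Friday (4).
weekdays_only = bool((business_days.dayofweek < 5).all())
print(weekdays_only)                             # True

# Assumed holiday for illustration; 'B' alone would still include this date.
custom = CustomBusinessDay(holidays=['2023-01-02'])
custom_days = pd.date_range(start=start, end=end, freq=custom)
print(len(custom_days))                          # 21
```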


##### 8. Using Aliases in Time Series Analysis

In [10]:
# Create a time series with daily data
dates = pd.date_range(start='2023-01-01', periods=30, freq='D')
values = np.random.randn(30)
ts = pd.Series(values, index=dates)
print(f"Time series with daily data:")
print(ts.head())

# Resample to business day frequency
ts_b = ts.resample('B').mean()
print(f"\nResampled to business day frequency (B):")
print(ts_b.head())

# Resample to weekly frequency
ts_w = ts.resample('W').mean()
print(f"\nResampled to weekly frequency (W):")
print(ts_w.head())

# Resample to monthly frequency
ts_m = ts.resample('M').mean()
print(f"\nResampled to monthly frequency (M):")
print(ts_m.head())

Time series with daily data:
2023-01-01   -0.226162
2023-01-02   -0.583318
2023-01-03    1.403182
2023-01-04    0.834199
2023-01-05    0.119789
Freq: D, dtype: float64

Resampled to business day frequency (B):
2022-12-30   -0.226162
2023-01-02   -0.583318
2023-01-03    1.403182
2023-01-04    0.834199
2023-01-05    0.119789
Freq: B, dtype: float64

Resampled to weekly frequency (W):
2023-01-01   -0.226162
2023-01-08    0.263532
2023-01-15    0.373402
2023-01-22    0.217045
2023-01-29    0.044548
Freq: W-SUN, dtype: float64

Resampled to monthly frequency (M):
2023-01-31    0.212445
Freq: ME, dtype: float64


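Note: on pandas 2.2 and later, `resample('M')` emits a `FutureWarning` because `M` is deprecated in favor of `ME`, which is also why the output above reports `Freq: ME`.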
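
In the business day output above, the first label is Friday 2022-12-30 even though the data starts on Sunday 2023-01-01. With `resample('B')`, weekend observations appear to be grouped into the preceding Friday's bin. The sketch below makes this visible by replacing the random values with the integers 0 through 9, an assumption made so that the means can be checked by hand.

```python
import numpy as np
import pandas as pd

# Values 0..9 on 2023-01-01 (Sunday) through 2023-01-10 (Tuesday).
dates = pd.date_range(start='2023-01-01', periods=10, freq='D')
ts = pd.Series(np.arange(10, dtype=float), index=dates)

ts_b = ts.resample('B').mean()
print(ts_b)

# Sunday 2023-01-01 is labeled with the preceding Friday.
first_label = ts_b.index[0]

# Friday 2023-01-06 averages Friday, Saturday, and Sunday: (5 + 6 + 7) / 3.
friday_mean = ts_b.loc[pd.Timestamp('2023-01-06')]
print(first_label, friday_mean)
```

If weekend observations should be dropped instead of folded into Friday, filtering the series to weekdays before resampling is one option.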


##### 9. Combining Aliases with Multipliers

In [11]:
# Create date ranges with multipliers
# 2 business days
dates_2b = pd.date_range(start='2023-01-01', periods=5, freq='2B')
print(f"Date range with 2 business days frequency (2B):")
for date in dates_2b:
    print(f"{date} ({date.strftime('%A')})")

# 3 hours
dates_3h = pd.date_range(start='2023-01-01 00:00', periods=5, freq='3H')
print(f"\nDate range with 3 hours frequency (3H):")
for date in dates_3h:
    print(f"{date}")

# 15 minutes
dates_15t = pd.date_range(start='2023-01-01 00:00', periods=5, freq='15T')
print(f"\nDate range with 15 minutes frequency (15T):")
for date in dates_15t:
    print(f"{date}")

Date range with 2 business days frequency (2B):
2023-01-02 00:00:00 (Monday)
2023-01-04 00:00:00 (Wednesday)
2023-01-06 00:00:00 (Friday)
2023-01-10 00:00:00 (Tuesday)
2023-01-12 00:00:00 (Thursday)

Date range with 3 hours frequency (3H):
2023-01-01 00:00:00
2023-01-01 03:00:00
2023-01-01 06:00:00
2023-01-01 09:00:00
2023-01-01 12:00:00

Date range with 15 minutes frequency (15T):
2023-01-01 00:00:00
2023-01-01 00:15:00
2023-01-01 00:30:00
2023-01-01 00:45:00
2023-01-01 01:00:00


Note: on pandas 2.2 and later, `'3H'` and `'15T'` emit a `FutureWarning`; the current spellings are `'3h'` and `'15min'`.
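
A multiplier in a frequency string corresponds to the `n` attribute of the offset object, so the same frequency can be built by multiplying an offset by an integer. A brief sketch:

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset
from pandas.tseries.offsets import BDay, Hour

# '2B' parses to a BusinessDay offset with n=2.
two_bdays = to_offset('2B')
print(two_bdays.n)                 # 2
print(two_bdays == 2 * BDay())     # True
print(two_bdays == BDay(2))        # True

# An offset object can be passed as freq directly.
dates_3h = pd.date_range(start='2023-01-01 00:00', periods=3, freq=3 * Hour())
print(dates_3h[-1])                # 2023-01-01 06:00:00
```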


##### 10. Using Aliases with asfreq() Method

In [12]:
# Create a time series with daily data
dates = pd.date_range(start='2023-01-01', periods=10, freq='D')
values = np.random.randn(10)
ts = pd.Series(values, index=dates)
print(f"Time series with daily data:")
print(ts)

# Convert to business day frequency
ts_b = ts.asfreq('B')
print(f"\nConverted to business day frequency (B):")
print(ts_b)

# Convert to business day frequency with forward filling
ts_b_ffill = ts.asfreq('B', method='ffill')
print(f"\nConverted to business day frequency (B) with forward filling:")
print(ts_b_ffill)

Time series with daily data:
2023-01-01   -0.329065
2023-01-02   -0.699027
2023-01-03   -0.243415
2023-01-04    1.061025
2023-01-05    2.060354
2023-01-06    0.121644
2023-01-07    0.885435
2023-01-08   -0.680722
2023-01-09    0.881148
2023-01-10    0.324018
Freq: D, dtype: float64

Converted to business day frequency (B):
2023-01-02   -0.699027
2023-01-03   -0.243415
2023-01-04    1.061025
2023-01-05    2.060354
2023-01-06    0.121644
2023-01-09    0.881148
2023-01-10    0.324018
Freq: B, dtype: float64

Converted to business day frequency (B) with forward filling:
2023-01-02   -0.699027
2023-01-03   -0.243415
2023-01-04    1.061025
2023-01-05    2.060354
2023-01-06    0.121644
2023-01-09    0.881148
2023-01-10    0.324018
Freq: B, dtype: float64
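
The two outputs above are identical because every business day already exists in the daily index, so there is nothing to fill. Forward filling only matters when the target frequency introduces timestamps that the original series lacks. The sketch below goes the other way, from business days to calendar days, with integer values assumed for readability.

```python
import numpy as np
import pandas as pd

# Business-day series: 2023-01-02 (Monday) through 2023-01-10 (Tuesday).
bdates = pd.date_range(start='2023-01-02', periods=7, freq='B')
ts_b = pd.Series(np.arange(7, dtype=float), index=bdates)

saturday = pd.Timestamp('2023-01-07')

# Converting to calendar days adds the weekend, which has no data.
ts_d = ts_b.asfreq('D')
print(ts_d.loc[saturday])          # nan

# With method='ffill', the weekend takes Friday's value.
ts_d_ffill = ts_b.asfreq('D', method='ffill')
print(ts_d_ffill.loc[saturday])    # 4.0
```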
