## Pandas Tutorial 17: Time Series Analysis: Holidays
Time series analysis is central to financial data analysis, and Pandas provides robust support for it. This tutorial covers how to handle holidays when building business-day date ranges. Using `CustomBusinessDay` and `AbstractHolidayCalendar`, you can define custom holiday calendars. Pandas also ships a built-in `USFederalHolidayCalendar`, which doubles as a reference implementation when you write your own.

#### Topics covered:
* **Introduction**
* **Importing `USFederalHolidayCalendar` and Related Classes**
* **Creating a Custom Holiday Calendar in Pandas**
* **Using the `observance` Argument in Custom Calendars**
* **Setting a `weekmask` with `CustomBusinessDay()`**

This tutorial will guide you through efficiently handling holidays in time series data for custom business day analysis.

In [6]:
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import CustomBusinessDay
from pandas.tseries.holiday import AbstractHolidayCalendar, nearest_workday, Holiday
from datetime import datetime

In [7]:
df = pd.read_csv("aapl_no_dates.csv")
df.head()

Unnamed: 0,Open,High,Low,Close,Volume
0,153.17,153.33,152.22,153.18,16404088
1,153.58,155.45,152.89,155.45,27770715
2,154.34,154.45,153.46,153.93,25331662
3,153.9,155.81,153.78,154.45,26624926
4,155.02,155.98,154.48,155.37,21069647


In [8]:
# Generate a DatetimeIndex of business days (Monday to Friday) for July 1-21, 2017
rng = pd.date_range(start="7/1/2017",end="7/21/2017",freq='B')
rng

DatetimeIndex(['2017-07-03', '2017-07-04', '2017-07-05', '2017-07-06',
               '2017-07-07', '2017-07-10', '2017-07-11', '2017-07-12',
               '2017-07-13', '2017-07-14', '2017-07-17', '2017-07-18',
               '2017-07-19', '2017-07-20', '2017-07-21'],
              dtype='datetime64[ns]', freq='B')

Using the `'B'` frequency excludes weekends but **doesn't account for holidays** like July 4th, 2017. It only handles **weekends**, so holidays need to be managed separately.
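A quick way to check this claim is to test membership directly. The sketch below is illustrative only; the names `rng_b`, `july_4`, `has_july_4`, and `has_weekend` are not part of the original notebook.

```python
import pandas as pd

# Plain business-day range: weekends are dropped, holidays are not
rng_b = pd.date_range(start="2017-07-01", end="2017-07-21", freq="B")

july_4 = pd.Timestamp("2017-07-04")

# True if the holiday is still present in the range
has_july_4 = july_4 in rng_b

# dayofweek is 0 for Monday through 6 for Sunday, so >= 5 means a weekend
has_weekend = bool((rng_b.dayofweek >= 5).any())

print(july_4.day_name(), has_july_4, has_weekend)
```

July 4th, 2017 falls on a Tuesday, so the `'B'` frequency treats it like any other weekday.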

## Using `CustomBusinessDay` to generate a US holiday calendar frequency

In [9]:
us_cal = CustomBusinessDay(calendar=USFederalHolidayCalendar())

rng = pd.date_range(start="7/1/2017", end="7/23/2017", freq=us_cal)
rng

DatetimeIndex(['2017-07-03', '2017-07-05', '2017-07-06', '2017-07-07',
               '2017-07-10', '2017-07-11', '2017-07-12', '2017-07-13',
               '2017-07-14', '2017-07-17', '2017-07-18', '2017-07-19',
               '2017-07-20', '2017-07-21'],
              dtype='datetime64[ns]', freq='C')
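To see exactly which dates the calendar removes, you can list the holidays it reports for the window and compare the two ranges. This is a minimal sketch; `cal`, `holidays`, `rng_b`, `rng_c`, and `dropped` are names introduced here for illustration.

```python
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import CustomBusinessDay

cal = USFederalHolidayCalendar()

# Holidays the calendar reports inside the window
holidays = cal.holidays(start="2017-07-01", end="2017-07-23")

# Same window, with and without the holiday calendar
rng_b = pd.date_range(start="2017-07-01", end="2017-07-23", freq="B")
rng_c = pd.date_range(start="2017-07-01", end="2017-07-23",
                      freq=CustomBusinessDay(calendar=cal))

# Dates present in the plain business-day range but removed by the calendar
dropped = rng_b.difference(rng_c)

print(holidays)
print(dropped)
```

Both printed indexes should contain the same dates, which confirms that the calendar is the only thing separating the two ranges.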

In [11]:
df = df.head(len(rng))
df.set_index(rng,inplace=True)
df.head()

Unnamed: 0,Open,High,Low,Close,Volume
2017-07-03,153.17,153.33,152.22,153.18,16404088
2017-07-05,153.58,155.45,152.89,155.45,27770715
2017-07-06,154.34,154.45,153.46,153.93,25331662
2017-07-07,153.9,155.81,153.78,154.45,26624926
2017-07-10,155.02,155.98,154.48,155.37,21069647


You can **create a custom calendar** by subclassing `AbstractHolidayCalendar` and listing `Holiday` rules. The `USFederalHolidayCalendar` provided by Pandas is a useful model for building your own. You can view its implementation [here](https://github.com/pandas-dev/pandas/blob/master/pandas/tseries/holiday.py).

## AbstractHolidayCalendar

In [12]:
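class myCalendar(AbstractHolidayCalendar):
    # Each rule describes one recurring holiday
    rules = [
        Holiday('My Birthday', month=4, day=15),
    ]

my_bday = CustomBusinessDay(calendar=myCalendar())
# April 15, 2017 is a Saturday, so it is already excluded as a weekend
# and the result below matches a plain business-day range
pd.date_range('4/1/2017','4/30/2017',freq=my_bday)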

DatetimeIndex(['2017-04-03', '2017-04-04', '2017-04-05', '2017-04-06',
               '2017-04-07', '2017-04-10', '2017-04-11', '2017-04-12',
               '2017-04-13', '2017-04-14', '2017-04-17', '2017-04-18',
               '2017-04-19', '2017-04-20', '2017-04-21', '2017-04-24',
               '2017-04-25', '2017-04-26', '2017-04-27', '2017-04-28'],
              dtype='datetime64[ns]', freq='C')
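The `observance` argument of `Holiday` controls what happens when a holiday lands on a weekend. The sketch below passes `nearest_workday`, which moves a Saturday holiday to the preceding Friday and a Sunday holiday to the following Monday. The names `MyObservedCalendar`, `observed_dates`, and `rng_observed` are hypothetical and used only for this illustration.

```python
import pandas as pd
from pandas.tseries.holiday import AbstractHolidayCalendar, Holiday, nearest_workday
from pandas.tseries.offsets import CustomBusinessDay


class MyObservedCalendar(AbstractHolidayCalendar):
    # Same rule as before, but shifted to the nearest workday
    # whenever April 15 falls on a weekend
    rules = [
        Holiday("My Birthday", month=4, day=15, observance=nearest_workday),
    ]


# Dates the calendar actually observes in April 2017
observed_dates = MyObservedCalendar().holidays(start="2017-04-01", end="2017-04-30")

# Business days in April 2017 with the observed holiday removed
rng_observed = pd.date_range(
    "2017-04-01", "2017-04-30",
    freq=CustomBusinessDay(calendar=MyObservedCalendar()),
)

print(observed_dates)
print(len(rng_observed))
```

Because April 15, 2017 is a Saturday, the holiday is observed on Friday, April 14, and that date drops out of the range.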

## CustomBusinessDay

The weekend in Egypt is **Friday and Saturday**, so Sunday is a regular working day. You can model this schedule by passing a `weekmask` to `CustomBusinessDay`, as shown below.

In [14]:
egypt_weekdays = "Sun Mon Tue Wed Thu"
b = CustomBusinessDay(weekmask=egypt_weekdays)
pd.date_range(start="7/1/2017", periods=20, freq=b)

DatetimeIndex(['2017-07-02', '2017-07-03', '2017-07-04', '2017-07-05',
               '2017-07-06', '2017-07-09', '2017-07-10', '2017-07-11',
               '2017-07-12', '2017-07-13', '2017-07-16', '2017-07-17',
               '2017-07-18', '2017-07-19', '2017-07-20', '2017-07-23',
               '2017-07-24', '2017-07-25', '2017-07-26', '2017-07-27'],
              dtype='datetime64[ns]', freq='C')
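To confirm that the `weekmask` behaves as intended, you can inspect the weekday names in the generated range. This is an illustrative check; `rng_eg` and `day_names` are names introduced here.

```python
import pandas as pd
from pandas.tseries.offsets import CustomBusinessDay

egypt_weekdays = "Sun Mon Tue Wed Thu"
b = CustomBusinessDay(weekmask=egypt_weekdays)
rng_eg = pd.date_range(start="2017-07-01", periods=20, freq=b)

# Collect the distinct weekday names that appear in the range
day_names = set(rng_eg.day_name())
print(sorted(day_names))
```

Friday and Saturday should be absent from the printed names, while Sunday appears as a working day.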

You can also add specific holidays to this custom business day frequency with the `holidays` argument:

In [16]:
b = CustomBusinessDay(holidays=['2017-07-04', '2017-07-10'], weekmask=egypt_weekdays)
pd.date_range(start="7/1/2017", periods=20, freq=b)

DatetimeIndex(['2017-07-02', '2017-07-03', '2017-07-05', '2017-07-06',
               '2017-07-09', '2017-07-11', '2017-07-12', '2017-07-13',
               '2017-07-16', '2017-07-17', '2017-07-18', '2017-07-19',
               '2017-07-20', '2017-07-23', '2017-07-24', '2017-07-25',
               '2017-07-26', '2017-07-27', '2017-07-30', '2017-07-31'],
              dtype='datetime64[ns]', freq='C')

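You can also do date arithmetic with a custom business day offset. Here `b` is the Egyptian offset defined above, so adding one business day to Sunday, July 9, 2017 skips the July 10 holiday and lands on July 11.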

In [17]:
dt = datetime(2017,7,9)
df

Unnamed: 0,Open,High,Low,Close,Volume
2017-07-03,153.17,153.33,152.22,153.18,16404088
2017-07-05,153.58,155.45,152.89,155.45,27770715
2017-07-06,154.34,154.45,153.46,153.93,25331662
2017-07-07,153.9,155.81,153.78,154.45,26624926
2017-07-10,155.02,155.98,154.48,155.37,21069647
2017-07-11,155.25,155.54,154.4,154.99,21250798
2017-07-12,155.19,155.19,146.02,148.98,64882657
2017-07-13,145.74,146.09,142.51,145.42,72307330
2017-07-14,147.16,147.45,145.15,146.59,34165445
2017-07-17,147.5,147.5,143.84,145.16,31531232


In [18]:
dt + 1*b

Timestamp('2017-07-11 00:00:00')
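The same offset can be scaled or subtracted. The sketch below rebuilds `b` so that it runs on its own and uses `pd.Timestamp` instead of `datetime` for the starting date, which is a choice made here rather than something the notebook does. The names `next_two` and `previous` are illustrative.

```python
import pandas as pd
from pandas.tseries.offsets import CustomBusinessDay

# Sunday-to-Thursday working week with two extra holidays
b = CustomBusinessDay(holidays=["2017-07-04", "2017-07-10"],
                      weekmask="Sun Mon Tue Wed Thu")

dt = pd.Timestamp("2017-07-09")  # a Sunday, which is a working day here

# Two business days forward: July 10 is a holiday, so count July 11 and July 12
next_two = dt + 2 * b

# One business day back: July 8 (Sat) and July 7 (Fri) are the weekend
previous = dt - b

print(next_two, previous)
```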