
<h1 style="color:blue" align="center">Pandas Time Series Analysis Tutorial: Handling Holidays</h1>

In [None]:
import pandas as pd
df = pd.read_csv("aapl_no_dates3.csv")
df.head()

Unnamed: 0,Open,High,Low,Close,Adj Close,Volume
0,36.220001,36.325001,35.775002,35.875,34.054874,57111200
1,35.922501,36.197498,35.68,36.022499,34.194893,86278400
2,35.755001,35.875,35.602501,35.682499,33.872147,96515200
3,35.724998,36.1875,35.724998,36.044998,34.216255,76806800
4,36.0275,36.487499,35.842499,36.264999,34.425087,84362400


In [None]:
rng = pd.date_range(start="7/1/2017", end="7/21/2017", freq='B')
rng

DatetimeIndex(['2017-07-03', '2017-07-04', '2017-07-05', '2017-07-06',
               '2017-07-07', '2017-07-10', '2017-07-11', '2017-07-12',
               '2017-07-13', '2017-07-14', '2017-07-17', '2017-07-18',
               '2017-07-19', '2017-07-20', '2017-07-21'],
              dtype='datetime64[ns]', freq='B')

**The 'B' frequency does not help here: July 4th was a holiday, but 'B' only skips weekends, 
so 2017-07-04 still appears in the range above.**
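
A quick way to confirm this is to test membership directly. The snippet below is a minimal sketch that rebuilds the same range; the names `rng_b` and `july4` are introduced here only for illustration.

```python
import pandas as pd

# Same plain business-day range as above
rng_b = pd.date_range(start="7/1/2017", end="7/21/2017", freq="B")

july4 = pd.Timestamp("2017-07-04")
print(july4.day_name())   # Tuesday, so 'B' treats it as a working day
print(july4 in rng_b)     # True: the holiday is still in the index
```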

<h3 style="color:purple">Using CustomBusinessDay to generate US holidays calendar frequency</h3>

In [None]:
from pandas.tseries.holiday import USFederalHolidayCalendar     # July 4th is closed for Independence Day
from pandas.tseries.offsets import CustomBusinessDay

us_cal = CustomBusinessDay(calendar=USFederalHolidayCalendar())

rng = pd.date_range(start="7/1/2017",end="7/31/2017", freq=us_cal)
rng

DatetimeIndex(['2017-07-03', '2017-07-05', '2017-07-06', '2017-07-07',
               '2017-07-10', '2017-07-11', '2017-07-12', '2017-07-13',
               '2017-07-14', '2017-07-17', '2017-07-18', '2017-07-19',
               '2017-07-20', '2017-07-21', '2017-07-24', '2017-07-25',
               '2017-07-26', '2017-07-27', '2017-07-28', '2017-07-31'],
              dtype='datetime64[ns]', freq='C')
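
To see which dates the calendar actually removes, you can ask it for its holidays in a given window. This is a small sketch using the calendar's `holidays()` method; the variable names `cal` and `july_holidays` are illustrative.

```python
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar

cal = USFederalHolidayCalendar()

# Holidays the calendar knows about in July 2017
july_holidays = cal.holidays(start="2017-07-01", end="2017-07-31")
print(july_holidays)
```

For July 2017 this should list only 2017-07-04, which is exactly the date missing from the range above.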

In [None]:
df.set_index(rng,inplace=True)
df.head()

Unnamed: 0,Open,High,Low,Close,Adj Close,Volume
2017-07-03,36.220001,36.325001,35.775002,35.875,34.054874,57111200
2017-07-05,35.922501,36.197498,35.68,36.022499,34.194893,86278400
2017-07-06,35.755001,35.875,35.602501,35.682499,33.872147,96515200
2017-07-07,35.724998,36.1875,35.724998,36.044998,34.216255,76806800
2017-07-10,36.0275,36.487499,35.842499,36.264999,34.425087,84362400
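
Note that `set_index` only works when the index has exactly as many entries as the DataFrame has rows. If you do not know the end date in advance, one option is to build the range from `periods=len(df)` instead of `end=`. The sketch below assumes a small stand-in DataFrame named `prices` with made-up placeholder values, not the real AAPL data.

```python
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import CustomBusinessDay

# Hypothetical stand-in data: placeholder numbers, not real prices
prices = pd.DataFrame({"Close": [10.0, 10.5, 10.2, 10.8, 11.0]})

us_cal = CustomBusinessDay(calendar=USFederalHolidayCalendar())

# One date per row, so the lengths always match
idx = pd.date_range(start="7/1/2017", periods=len(prices), freq=us_cal)
prices = prices.set_index(idx)
print(prices)
```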


**You can define your own calendar by subclassing AbstractHolidayCalendar, as shown below. USFederalHolidayCalendar is the only calendar that ships with the pandas library, and it serves as an example for anyone who wants to write a custom calendar. Here is the link to the USFederalHolidayCalendar implementation:** https://github.com/pandas-dev/pandas/blob/master/pandas/tseries/holiday.py

<h3 style="color:purple">AbstractHolidayCalendar</h3>

In [None]:
from pandas.tseries.holiday import AbstractHolidayCalendar, nearest_workday, Holiday

class myCalendar(AbstractHolidayCalendar):
    rules = [
        # Optionally pass observance=nearest_workday to move a weekend holiday to the closest weekday
        Holiday('My Birth Day', month=4, day=17),
    ]
    
my_bday = CustomBusinessDay(calendar=myCalendar())
rng3 = pd.date_range('4/1/2017','4/30/2017',freq=my_bday)
rng3

DatetimeIndex(['2017-04-03', '2017-04-04', '2017-04-05', '2017-04-06',
               '2017-04-07', '2017-04-10', '2017-04-11', '2017-04-12',
               '2017-04-13', '2017-04-14', '2017-04-18', '2017-04-19',
               '2017-04-20', '2017-04-21', '2017-04-24', '2017-04-25',
               '2017-04-26', '2017-04-27', '2017-04-28'],
              dtype='datetime64[ns]', freq='C')
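
In 2017 the 17th of April fell on a Monday, so the observance rule makes no difference there. The sketch below shows what `observance=nearest_workday` does in a year where the date lands on a weekend; the class name `ObservedBirthdayCalendar` is made up for this example.

```python
import pandas as pd
from pandas.tseries.holiday import AbstractHolidayCalendar, Holiday, nearest_workday

class ObservedBirthdayCalendar(AbstractHolidayCalendar):
    rules = [
        # Saturday holidays move to Friday, Sunday holidays move to Monday
        Holiday("My Birth Day", month=4, day=17, observance=nearest_workday),
    ]

# April 17, 2016 was a Sunday
print(pd.Timestamp("2016-04-17").day_name())

observed = ObservedBirthdayCalendar().holidays(start="2016-04-01", end="2016-04-30")
print(observed)   # the observed date is Monday, 2016-04-18
```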

In [None]:
#df.set_index(rng3, inplace=True)
#df.head()

<h3 style="color:purple">CustomBusinessDay</h3>

**The weekend in Egypt is Friday and Saturday, and Sunday is a normal working day. You can handle this custom week schedule using
CustomBusinessDay with a weekmask, as shown below.**

In [None]:
egypt_weekdays = "Sun Mon Tue Wed Thu"

b = CustomBusinessDay(weekmask=egypt_weekdays)

pd.date_range(start="7/1/2017",periods=20,freq=b)

DatetimeIndex(['2017-07-02', '2017-07-03', '2017-07-04', '2017-07-05',
               '2017-07-06', '2017-07-09', '2017-07-10', '2017-07-11',
               '2017-07-12', '2017-07-13', '2017-07-16', '2017-07-17',
               '2017-07-18', '2017-07-19', '2017-07-20', '2017-07-23',
               '2017-07-24', '2017-07-25', '2017-07-26', '2017-07-27'],
              dtype='datetime64[ns]', freq='C')
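
One way to check the weekmask is to look at which day names appear in the generated range. This is a minimal sketch; `egypt_rng` and `day_names` are names chosen here for illustration.

```python
import pandas as pd
from pandas.tseries.offsets import CustomBusinessDay

b = CustomBusinessDay(weekmask="Sun Mon Tue Wed Thu")
egypt_rng = pd.date_range(start="7/1/2017", periods=20, freq=b)

# Only Sunday through Thursday should show up
day_names = set(egypt_rng.day_name())
print(day_names)
```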

**You can also add holidays to this custom business day frequency**

In [None]:
b = CustomBusinessDay(holidays=['2017-07-04', '2017-07-10'], weekmask=egypt_weekdays)

pd.date_range(start="7/1/2017",periods=20,freq=b)

DatetimeIndex(['2017-07-02', '2017-07-03', '2017-07-05', '2017-07-06',
               '2017-07-09', '2017-07-11', '2017-07-12', '2017-07-13',
               '2017-07-16', '2017-07-17', '2017-07-18', '2017-07-19',
               '2017-07-20', '2017-07-23', '2017-07-24', '2017-07-25',
               '2017-07-26', '2017-07-27', '2017-07-30', '2017-07-31'],
              dtype='datetime64[ns]', freq='C')
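
As a sanity check, the two listed holidays should be absent from the range while their neighbouring working days remain. The sketch below rebuilds the same offset; `rng_h` is an illustrative name.

```python
import pandas as pd
from pandas.tseries.offsets import CustomBusinessDay

b = CustomBusinessDay(holidays=["2017-07-04", "2017-07-10"],
                      weekmask="Sun Mon Tue Wed Thu")
rng_h = pd.date_range(start="7/1/2017", periods=20, freq=b)

print(pd.Timestamp("2017-07-04") in rng_h)   # False: listed holiday
print(pd.Timestamp("2017-07-10") in rng_h)   # False: listed holiday
print(pd.Timestamp("2017-07-09") in rng_h)   # True: Sunday is a working day
```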

**Mathematical operations on a date object using the custom business day**

In [None]:
from datetime import datetime
dt = datetime(2017,7,9)
dt

datetime.datetime(2017, 7, 9, 0, 0)

In [None]:
dt + 1*b

Timestamp('2017-07-11 00:00:00')
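
The result skips 2017-07-10 because it is one of the holidays passed to `b`. The same arithmetic works for larger steps and for subtraction. The sketch below uses a `pd.Timestamp` rather than a `datetime` object, and the names `plus_three` and `minus_one` are illustrative.

```python
import pandas as pd
from pandas.tseries.offsets import CustomBusinessDay

b = CustomBusinessDay(holidays=["2017-07-04", "2017-07-10"],
                      weekmask="Sun Mon Tue Wed Thu")
ts = pd.Timestamp("2017-07-09")   # a Sunday, which is a working day here

# Three working days ahead: skips the 10th (holiday), lands on Thursday the 13th
plus_three = ts + 3 * b
# One working day back: skips Saturday and Friday, lands on Thursday the 6th
minus_one = ts - b

print(plus_three, minus_one)
```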