# Chapter 6
## Working with Date and Time in Python

This chapter takes a practical and intuitive approach to an intimidating topic. You will learn how to deal with the complexity of dates and times in your time series data. The chapter illustrates practical use cases for handling time zones, custom holidays, and business days, as well as working with Unix epoch timestamps and UTC. The topic is presented in a fun and practical way so that you can apply it right away.

Here is the list of the recipes that we will cover in this chapter:

* Working with `DatetimeIndex`
* Providing a format argument to DateTime
* Working with Unix epoch timestamps
* Working with time deltas
* Converting DateTime with time zone information
* Working with date offsets
* Working with custom business days

# Technical Requirements 

In [1]:
import warnings
warnings.filterwarnings('ignore')

In [2]:
import pandas as pd
import numpy as np
import datetime as dt

In [3]:
pd.__version__

'2.3.3'

# Recipe 1: Working with `DatetimeIndex`
* In this recipe, you will explore Python's `datetime` module and learn about the pandas `Timestamp` and `DatetimeIndex` classes and the relationship between them.

#### Python's `datetime`, pandas `Timestamp`, and `pandas.to_datetime()`

In [4]:
dt.datetime

datetime.datetime

In [5]:
dt1 = dt.datetime(2024,1,1)
dt2 = pd.Timestamp('2024-1-1')
dt3 = pd.to_datetime('2024-1-1')

In [6]:
dt4 = pd.DatetimeIndex(['2024-1-1'])

In [7]:
print(dt4)

DatetimeIndex(['2024-01-01'], dtype='datetime64[ns]', freq=None)


In [8]:
print(dt1)
print(dt2)
print(dt3)

2024-01-01 00:00:00
2024-01-01 00:00:00
2024-01-01 00:00:00


In [9]:
print(type(dt1))
print(type(dt2))
print(type(dt3))
print(type(dt4))

<class 'datetime.datetime'>
<class 'pandas._libs.tslibs.timestamps.Timestamp'>
<class 'pandas._libs.tslibs.timestamps.Timestamp'>
<class 'pandas.core.indexes.datetimes.DatetimeIndex'>


In [10]:
# comparing different datetime representations
dt1 == dt2 == dt3

True

In [11]:
dt.datetime(2021,1,1) == pd.to_datetime('2021-1-1')

True

In [12]:
isinstance(dt2, dt.datetime)

True

In [13]:
isinstance(dt2, pd.Timestamp)

True

In [14]:
isinstance(dt1, pd.Timestamp)

False

In [15]:
isinstance(dt4[0], dt.datetime)

True

In [16]:
isinstance(dt4, pd.DatetimeIndex)

True

In [17]:
isinstance(pd.DatetimeIndex, dt.datetime)

False

In [18]:
issubclass(pd.Timestamp, dt.datetime)
# isinstance(dt1, pd.Timestamp)

True

####  `pandas.to_datetime()`

In [19]:
sample_dates = ['2024-01-01', '2024-01-02', '2024-01-03']
pd_dates = pd.to_datetime(sample_dates)
print(pd_dates)
print(type(pd_dates))

DatetimeIndex(['2024-01-01', '2024-01-02', '2024-01-03'], dtype='datetime64[ns]', freq=None)
<class 'pandas.core.indexes.datetimes.DatetimeIndex'>


In [20]:
isinstance(pd_dates[0], pd.Timestamp)

True

In [21]:
print(pd_dates[0])
print(type(pd_dates[0]))

2024-01-01 00:00:00
<class 'pandas._libs.tslibs.timestamps.Timestamp'>


In [22]:
dates = ['2024-01-01', # date str format %Y-%m-%d
         '2/1/2024', # date str format %m/%d/%Y
         '03-01-2024', # date str format %m-%d-%Y
         'April 1, 2024', # date str format %B %d, %Y
         '20240501', # date str format %Y%m%d
         np.datetime64('2024-07-01'), # numpy datetime64
         dt.datetime(2024, 8, 1), # python datetime
         pd.Timestamp(2024,9,1) # pandas Timestamp
         ]

In [23]:
# Using map directly
parsed_dates_map = pd.Series(list(map(pd.to_datetime, dates)))
print(parsed_dates_map)

0   2024-01-01
1   2024-02-01
2   2024-03-01
3   2024-04-01
4   2024-05-01
5   2024-07-01
6   2024-08-01
7   2024-09-01
dtype: datetime64[ns]


In [24]:
parsed_dates_map.info()

<class 'pandas.core.series.Series'>
RangeIndex: 8 entries, 0 to 7
Series name: None
Non-Null Count  Dtype         
--------------  -----         
8 non-null      datetime64[ns]
dtypes: datetime64[ns](1)
memory usage: 196.0 bytes


In [25]:
pd.DatetimeIndex(dates)

DatetimeIndex(['2024-01-01', '2024-02-01', '2024-03-01', '2024-04-01',
               '2024-05-01', '2024-07-01', '2024-08-01', '2024-09-01'],
              dtype='datetime64[ns]', freq=None)

In [26]:
parsed_dates = pd.Series(pd.DatetimeIndex(dates))
parsed_dates

0   2024-01-01
1   2024-02-01
2   2024-03-01
3   2024-04-01
4   2024-05-01
5   2024-07-01
6   2024-08-01
7   2024-09-01
dtype: datetime64[ns]

In [27]:
parsed_dates_map == parsed_dates

0    True
1    True
2    True
3    True
4    True
5    True
6    True
7    True
dtype: bool

In [28]:
parsed_dates = pd.DatetimeIndex(dates)

print(f'Name of Day : {parsed_dates.day_name()}')
print(f'Month : {parsed_dates.month}')
print(f'Month Name: {parsed_dates.month_name()}')
print(f'Year : {parsed_dates.year}')
print(f'Days in Month : {parsed_dates.days_in_month}')
print(f'Quarter : {parsed_dates.quarter}')
print(f'Is Quarter Start : {parsed_dates.is_quarter_start}')
print(f'Is Leap Year : {parsed_dates.is_leap_year}')
print(f'Is Month Start : {parsed_dates.is_month_start}')
print(f'Is Month End : {parsed_dates.is_month_end}')
print(f'Is Year Start : {parsed_dates.is_year_start}')

Name of Day : Index(['Monday', 'Thursday', 'Friday', 'Monday', 'Wednesday', 'Monday',
       'Thursday', 'Sunday'],
      dtype='object')
Month : Index([1, 2, 3, 4, 5, 7, 8, 9], dtype='int32')
Month Name: Index(['January', 'February', 'March', 'April', 'May', 'July', 'August',
       'September'],
      dtype='object')
Year : Index([2024, 2024, 2024, 2024, 2024, 2024, 2024, 2024], dtype='int32')
Days in Month : Index([31, 29, 31, 30, 31, 31, 31, 30], dtype='int32')
Quarter : Index([1, 1, 1, 2, 2, 3, 3, 3], dtype='int32')
Is Quarter Start : [ True False False  True False  True False False]
Is Leap Year : [ True  True  True  True  True  True  True  True]
Is Month Start : [ True  True  True  True  True  True  True  True]
Is Month End : [False False False False False False False False]
Is Year Start : [ True False False False False False False False]


## How it works

In [29]:
parsed_dates = pd.to_datetime(
                 dates,
                 errors='coerce'
                 )

print(parsed_dates)

DatetimeIndex(['2024-01-01',        'NaT',        'NaT',        'NaT',
                      'NaT', '2024-07-01', '2024-08-01', '2024-09-01'],
              dtype='datetime64[ns]', freq=None)


In [34]:
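# errors='ignore' returns the input unchanged when parsing fails
# (deprecated since pandas 2.2; the warning is suppressed in this notebook)
parsed_dates = pd.to_datetime(
                 dates,
                 errors='ignore'
                 )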

print(parsed_dates)

Index([       '2024-01-01',          '2/1/2024',        '03-01-2024',
           'April 1, 2024',          '20240501',          2024-07-01,
       2024-08-01 00:00:00, 2024-09-01 00:00:00],
      dtype='object')


In [35]:
# Expected warning below for inferring the format
example = pd.to_datetime(['something 2021', 'Jan 1, 2021'], errors='coerce')
example

DatetimeIndex(['NaT', '2021-01-01'], dtype='datetime64[ns]', freq=None)

In [36]:
len(example)

2

In [37]:
type(example)

pandas.core.indexes.datetimes.DatetimeIndex
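
The `NaT` values returned by the `errors='coerce'` call above appear because pandas 2.x infers a single format from the first element and applies it to every other element. As a minimal sketch, assuming pandas 2.0 or later, `format='mixed'` asks pandas to infer the format of each element individually. The `mixed_strings` list is a hypothetical sample, not part of the recipe.

```python
import pandas as pd

# Hypothetical sample: three strings, each written in a different layout
mixed_strings = ['2024-01-01', '2/1/2024', 'April 1, 2024']

# format='mixed' (pandas 2.0+) infers the format of each element on its own
parsed_mixed = pd.to_datetime(mixed_strings, format='mixed')
print(parsed_mixed)
```

Per-element inference is slower and can misread ambiguous strings such as `2/1/2024`, so an explicit `format` is usually the safer choice when the layout is known.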

## There is more

In [38]:
pd.date_range(start='2024-1-01', periods=3, freq='D')

DatetimeIndex(['2024-01-01', '2024-01-02', '2024-01-03'], dtype='datetime64[ns]', freq='D')

In [39]:
pd.date_range(start='2024-11-25',
               end='2024-12-02',
               freq='D')


DatetimeIndex(['2024-11-25', '2024-11-26', '2024-11-27', '2024-11-28',
               '2024-11-29', '2024-11-30', '2024-12-01', '2024-12-02'],
              dtype='datetime64[ns]', freq='D')

In [40]:
pd.date_range(start='2024-11-25',
              end='2024-12-01',
               periods=4)

DatetimeIndex(['2024-11-25', '2024-11-27', '2024-11-29', '2024-12-01'], dtype='datetime64[ns]', freq=None)

In [41]:
# start, periods, and freq together are a valid combination (no error)

pd.date_range(start='2024-11-25',
              periods=5,
               freq='D')

DatetimeIndex(['2024-11-25', '2024-11-26', '2024-11-27', '2024-11-28',
               '2024-11-29'],
              dtype='datetime64[ns]', freq='D')

In [42]:
pd.date_range(start='2024-11-25',
               end='2024-12-02')


DatetimeIndex(['2024-11-25', '2024-11-26', '2024-11-27', '2024-11-28',
               '2024-11-29', '2024-11-30', '2024-12-01', '2024-12-02'],
              dtype='datetime64[ns]', freq='D')

In [43]:
pd.date_range(start='2024-11-01',
               periods=3)

DatetimeIndex(['2024-11-01', '2024-11-02', '2024-11-03'], dtype='datetime64[ns]', freq='D')

In [44]:
# Raises ValueError: start and freq alone are not enough
try:
    pd.date_range(start='2024-11-25', freq='D')
except ValueError:
    print("Need additional parameters")


Need additional parameters
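
The `freq` argument accepts more than calendar days. As a short sketch related to the business-day recipes later in this chapter, `freq='B'` generates weekdays only. The variable name `biz_days` is illustrative.

```python
import pandas as pd

# freq='B' generates business days (Monday through Friday) only
biz_days = pd.date_range(start='2024-11-25', periods=6, freq='B')

print(biz_days)
print(biz_days.day_name())
```

Starting on Monday, November 25, 2024, the sixth business day skips the weekend and lands on Monday, December 2, 2024.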


In [45]:
df = pd.DataFrame(pd.date_range(start='2024-11-25',
               periods=5), columns=['Date'])

df['days_in_month'] = df['Date'].dt.days_in_month
df['day_name'] = df['Date'].dt.day_name()
df['month'] = df['Date'].dt.month
df['month_name'] = df['Date'].dt.month_name()
df['year'] = df['Date'].dt.year
df['quarter'] = df['Date'].dt.quarter
df['is_quarter_start'] = df['Date'].dt.is_quarter_start
df['is_leap_year'] = df['Date'].dt.is_leap_year
df['is_month_start'] = df['Date'].dt.is_month_start
df['is_month_end'] = df['Date'].dt.is_month_end
df['is_year_start'] = df['Date'].dt.is_year_start
df

Unnamed: 0,Date,days_in_month,day_name,month,month_name,year,quarter,is_quarter_start,is_leap_year,is_month_start,is_month_end,is_year_start
0,2024-11-25,30,Monday,11,November,2024,4,False,True,False,False,False
1,2024-11-26,30,Tuesday,11,November,2024,4,False,True,False,False,False
2,2024-11-27,30,Wednesday,11,November,2024,4,False,True,False,False,False
3,2024-11-28,30,Thursday,11,November,2024,4,False,True,False,False,False
4,2024-11-29,30,Friday,11,November,2024,4,False,True,False,False,False


# Recipe 2: Providing a format argument to DateTime

In [46]:
import pandas as pd
import datetime as dt

In [47]:
dt.datetime.strptime('1/1/2024', '%m/%d/%Y')

datetime.datetime(2024, 1, 1, 0, 0)

Using Python `datetime.strptime()`

In [48]:
dt.datetime.strptime('1/1/2024', '%m/%d/%Y').date()

datetime.date(2024, 1, 1)

In [49]:
dt.datetime.strptime('1 January, 2024', '%d %B, %Y').date()

datetime.date(2024, 1, 1)

In [50]:
dt.datetime.strptime('1-Jan-2024', '%d-%b-%Y').date()

datetime.date(2024, 1, 1)

In [51]:
dt.datetime.strptime('Monday, January 1, 2024', '%A, %B %d, %Y').date()

datetime.date(2024, 1, 1)

In [52]:
dt_1 = dt.datetime.strptime('1/1/2024', '%m/%d/%Y')
dt_1.__str__()
str(dt_1)

'2024-01-01 00:00:00'

In [53]:
print(dt_1.date())

2024-01-01


Using `pandas.to_datetime()`

In [54]:
pd.to_datetime('1/1/2024', format='%m/%d/%Y')

Timestamp('2024-01-01 00:00:00')

In [55]:
pd.to_datetime('1 January, 2024', format='%d %B, %Y')

Timestamp('2024-01-01 00:00:00')

In [56]:
pd.to_datetime('1-Jan-2024', format='%d-%b-%Y')

Timestamp('2024-01-01 00:00:00')

In [57]:
pd.to_datetime('Monday, January 1, 2024', format='%A, %B %d, %Y')

Timestamp('2024-01-01 00:00:00')

In [58]:
dt_2 = pd.to_datetime('1/1/2024', format='%m/%d/%Y')
print(dt_2)

2024-01-01 00:00:00


In [59]:
pd.to_datetime('1-Jan-2024')

Timestamp('2024-01-01 00:00:00')

In [60]:
pd.to_datetime('Monday, January 1, 2024')

Timestamp('2024-01-01 00:00:00')

In [61]:
pd.to_datetime('1/1/2024', format='%m/%d/%Y')
pd.to_datetime('1 January, 2024', format='%d %B, %Y')
pd.to_datetime('1-Jan-2024', format='%d-%b-%Y')
pd.to_datetime('Monday, January 1, 2024', format='%A, %B %d, %Y')

Timestamp('2024-01-01 00:00:00')

In [62]:
print(pd.to_datetime('1/1/2024'))
print(pd.to_datetime('1 January, 2024'))
print(pd.to_datetime('1-Jan-2024'))
print(pd.to_datetime('Monday, January 1, 2024'))

2024-01-01 00:00:00
2024-01-01 00:00:00
2024-01-01 00:00:00
2024-01-01 00:00:00
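
The calls above without `format` rely on pandas inferring the layout of each string. When the layout is known, passing `format` makes parsing strict: a string that does not match raises an error instead of being guessed. A minimal sketch follows, with `date_str` and `mismatch_raised` as illustrative names.

```python
import pandas as pd

date_str = '1-Jan-2024'

# The matching format parses cleanly
print(pd.to_datetime(date_str, format='%d-%b-%Y'))

# A mismatched format raises ValueError instead of guessing
try:
    pd.to_datetime(date_str, format='%m/%d/%Y')
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print(f'ValueError: {err}')
```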


In [63]:
str(dt_2)
dt_2.__str__()

'2024-01-01 00:00:00'

In [64]:
print(dt_2.date())

2024-01-01


In [65]:
dt_1 == dt_2

True

In [66]:
dt_1.date()

datetime.date(2024, 1, 1)

In [67]:
dt_2.date()

datetime.date(2024, 1, 1)

In [68]:
type(dt_1)
type(dt_2)

pandas._libs.tslibs.timestamps.Timestamp

In [69]:
isinstance(pd.DatetimeIndex, pd.Timestamp)

False

In [70]:
isinstance(dt_1, dt.datetime)

True

In [71]:
isinstance(dt_2, dt.datetime)

True

In [72]:
isinstance(dt_1, pd.Timestamp)

False

In [73]:
isinstance(dt_2, pd.Timestamp)

True

In [74]:
issubclass(pd.Timestamp, dt.datetime)

True

### Using `errors='ignore'`

In [77]:
pd.to_datetime(['something 2024', 'Jan 1, 2024'], 
               errors='ignore')

Index(['something 2024', 'Jan 1, 2024'], dtype='object')

### Transforming a pandas DataFrame to a time series DataFrame

In [79]:
pd.DataFrame({'missing types' : [pd.NaT, pd.NA, np.nan, None]})

Unnamed: 0,missing types
0,NaT
1,
2,
3,
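
pandas treats all four values in the DataFrame above as missing, even though they are different objects. As a small sketch, `isna()` can confirm this. The names `missing` and `is_missing` are illustrative.

```python
import numpy as np
import pandas as pd

# The same four missing-value markers, held in an object Series
missing = pd.Series([pd.NaT, pd.NA, np.nan, None], dtype='object')

# isna() flags each of them as missing
is_missing = missing.isna()
print(is_missing)
```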


## There is more

In [80]:
df = pd.DataFrame(
        {'Date': ['January 1, 2024', 'January 2, 2024', 'January 3, 2024'],
         'Sales': [23000, 19020, 21000]}
            )
df

Unnamed: 0,Date,Sales
0,"January 1, 2024",23000
1,"January 2, 2024",19020
2,"January 3, 2024",21000


In [81]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   Date    3 non-null      object
 1   Sales   3 non-null      int64 
dtypes: int64(1), object(1)
memory usage: 180.0+ bytes


In [82]:
df['Date'] = pd.to_datetime(df['Date'])
df.set_index('Date', inplace=True)
df.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 3 entries, 2024-01-01 to 2024-01-03
Data columns (total 1 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   Sales   3 non-null      int64
dtypes: int64(1)
memory usage: 48.0 bytes


# Recipe 3: Working with Unix epoch timestamps

In [83]:
import time
epoch_time = time.time()
print(epoch_time)
print(type(epoch_time))

1759857782.7619112
<class 'float'>


In [84]:
import pandas as pd
t = pd.to_datetime(1732048268.954201, unit='s')
print(t)

2024-11-19 20:31:08.954200983


In [85]:
t = pd.to_datetime(1732048268.954201, unit='s')
t

Timestamp('2024-11-19 20:31:08.954200983')

In [86]:
import datetime
datetime.datetime.fromtimestamp(1732048268.954201)

datetime.datetime(2024, 11, 19, 23, 31, 8, 954201)
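
Without a `tz` argument, `datetime.fromtimestamp()` returns the local time of the machine running the code, which is presumably why the result above (23:31) differs from the pandas result (20:31, which is UTC). The sketch below shows one way to get a time zone aware UTC value from both libraries. The integer epoch value is one that also appears in this recipe's DataFrame, and the variable names are illustrative.

```python
import datetime as dt
import pandas as pd

epoch_seconds = 1641110340  # integer Unix epoch value, in seconds

# Standard library: pass tz explicitly to avoid the machine's local time zone
py_utc = dt.datetime.fromtimestamp(epoch_seconds, tz=dt.timezone.utc)

# pandas: utc=True returns a tz-aware Timestamp in UTC
pd_utc = pd.to_datetime(epoch_seconds, unit='s', utc=True)

print(py_utc)
print(pd_utc)
```

Both values describe the same instant, so comparing them with `==` returns `True` regardless of where the code runs.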

In [87]:
new_t = t.tz_localize('UTC').tz_convert('US/Pacific')
new_t

Timestamp('2024-11-19 12:31:08.954200983-0800', tz='US/Pacific')

In [88]:
print(new_t)

2024-11-19 12:31:08.954200983-08:00


In [89]:
# Generate four epoch values, each one hour (3600 seconds) after the previous
org = 1641110340
new_time = [org + 3600 * (i + 1) for i in range(4)]
new_time

[1641113940, 1641117540, 1641121140, 1641124740]

In [90]:
new_time

[1641113940, 1641117540, 1641121140, 1641124740]

In [91]:
pd.to_datetime(1641124740, unit='s')

Timestamp('2022-01-02 11:59:00')

In [92]:
df = pd.DataFrame(
        {'unix_epoch': [1641110340,  1641196740, 1641283140, 1641369540],
                'Sales': [23000, 19020, 21000, 17030]}
                )
df

Unnamed: 0,unix_epoch,Sales
0,1641110340,23000
1,1641196740,19020
2,1641283140,21000
3,1641369540,17030


In [93]:
df['Date'] = pd.to_datetime(df['unix_epoch'], unit='s')
df['Date'] = df['Date'].dt.tz_localize('UTC').dt.tz_convert('US/Pacific')
df.set_index('Date', inplace=True)
print(df)

                           unix_epoch  Sales
Date                                        
2022-01-01 23:59:00-08:00  1641110340  23000
2022-01-02 23:59:00-08:00  1641196740  19020
2022-01-03 23:59:00-08:00  1641283140  21000
2022-01-04 23:59:00-08:00  1641369540  17030


In [94]:
print(df.index)

DatetimeIndex(['2022-01-01 23:59:00-08:00', '2022-01-02 23:59:00-08:00',
               '2022-01-03 23:59:00-08:00', '2022-01-04 23:59:00-08:00'],
              dtype='datetime64[ns, US/Pacific]', name='Date', freq=None)


In [95]:
df.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 4 entries, 2022-01-01 23:59:00-08:00 to 2022-01-04 23:59:00-08:00
Data columns (total 2 columns):
 #   Column      Non-Null Count  Dtype
---  ------      --------------  -----
 0   unix_epoch  4 non-null      int64
 1   Sales       4 non-null      int64
dtypes: int64(2)
memory usage: 96.0 bytes


In [96]:
df.index.date

array([datetime.date(2022, 1, 1), datetime.date(2022, 1, 2),
       datetime.date(2022, 1, 3), datetime.date(2022, 1, 4)], dtype=object)

## How it works

In [97]:
df = pd.DataFrame(
        {'unix_epoch': [1641113940, 1641117540, 1641121140, 1641124740],
                'count_shoppers': [12, 16, 18, 10]}
                )
df['Date'] = pd.to_datetime(df['unix_epoch'], unit='s')
df.set_index('Date', inplace=True)
print(df)

                     unix_epoch  count_shoppers
Date                                           
2022-01-02 08:59:00  1641113940              12
2022-01-02 09:59:00  1641117540              16
2022-01-02 10:59:00  1641121140              18
2022-01-02 11:59:00  1641124740              10


In [98]:
updated_df = df.tz_localize('UTC').tz_convert('US/Pacific')
print(updated_df)

                           unix_epoch  count_shoppers
Date                                                 
2022-01-02 00:59:00-08:00  1641113940              12
2022-01-02 01:59:00-08:00  1641117540              16
2022-01-02 02:59:00-08:00  1641121140              18
2022-01-02 03:59:00-08:00  1641124740              10


In [99]:
df.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 4 entries, 2022-01-02 08:59:00 to 2022-01-02 11:59:00
Data columns (total 2 columns):
 #   Column          Non-Null Count  Dtype
---  ------          --------------  -----
 0   unix_epoch      4 non-null      int64
 1   count_shoppers  4 non-null      int64
dtypes: int64(2)
memory usage: 96.0 bytes


In [100]:
df.index

DatetimeIndex(['2022-01-02 08:59:00', '2022-01-02 09:59:00',
               '2022-01-02 10:59:00', '2022-01-02 11:59:00'],
              dtype='datetime64[ns]', name='Date', freq=None)

In [101]:
t = pd.to_datetime(1635220133.855169, unit='s', origin='unix')
t

Timestamp('2021-10-26 03:48:53.855169058')

In [102]:
t = pd.to_datetime(45, unit='D', origin='2024-1-1')
t

Timestamp('2024-02-15 00:00:00')

## There is more 

In [103]:
df = pd.DataFrame(
        {'Date': pd.date_range('11-01-2024', periods=5),
        'order' : range(5)}
                 )
print(df)

        Date  order
0 2024-11-01      0
1 2024-11-02      1
2 2024-11-03      2
3 2024-11-04      3
4 2024-11-05      4


In [104]:
df['Unix Time'] = (df['Date'] -  pd.Timestamp("1970-01-01")) // pd.Timedelta("1s")
print(df)

        Date  order   Unix Time
0 2024-11-01      0  1730419200
1 2024-11-02      1  1730505600
2 2024-11-03      2  1730592000
3 2024-11-04      3  1730678400
4 2024-11-05      4  1730764800


In [105]:
# convert it back
pd.to_datetime(df['Unix Time'], unit='s')

0   2024-11-01
1   2024-11-02
2   2024-11-03
3   2024-11-04
4   2024-11-05
Name: Unix Time, dtype: datetime64[ns]
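
The subtraction-and-floor-division approach above works for any resolution. A commonly used alternative, sketched here under the assumption that the values are first cast to nanosecond resolution, converts the datetimes to their underlying integers and divides by the number of nanoseconds in a second. The names `dates` and `unix_seconds` are illustrative.

```python
import pandas as pd

dates = pd.Series(pd.date_range('2024-11-01', periods=3))

# Assumption: cast to datetime64[ns] so the integers are nanoseconds since the epoch
as_nanoseconds = dates.astype('datetime64[ns]').astype('int64')

# 10**9 nanoseconds per second
unix_seconds = as_nanoseconds // 10**9
print(unix_seconds)
```

This shortcut depends on knowing the resolution of the column, which is why the explicit cast is included. The `Timedelta` division used in the recipe avoids that dependency.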

# Recipe 4: Working with Time Deltas

In [106]:
import pandas as pd

In [107]:
df = pd.DataFrame(
        {       
        'item': ['item1', 'item2', 'item3', 'item4', 'item5', 'item6'],
        'purchase_dt': pd.date_range('2024-10-01', periods=6, freq='D', tz='UTC')
        }
)
df

Unnamed: 0,item,purchase_dt
0,item1,2024-10-01 00:00:00+00:00
1,item2,2024-10-02 00:00:00+00:00
2,item3,2024-10-03 00:00:00+00:00
3,item4,2024-10-04 00:00:00+00:00
4,item5,2024-10-05 00:00:00+00:00
5,item6,2024-10-06 00:00:00+00:00


In [108]:
df['expiration_dt'] = df['purchase_dt'] + pd.Timedelta(days=30)

In [109]:
df['extended_dt'] = df['purchase_dt'] +\
                pd.Timedelta('35 days 12 hours 30 minutes')
df

Unnamed: 0,item,purchase_dt,expiration_dt,extended_dt
0,item1,2024-10-01 00:00:00+00:00,2024-10-31 00:00:00+00:00,2024-11-05 12:30:00+00:00
1,item2,2024-10-02 00:00:00+00:00,2024-11-01 00:00:00+00:00,2024-11-06 12:30:00+00:00
2,item3,2024-10-03 00:00:00+00:00,2024-11-02 00:00:00+00:00,2024-11-07 12:30:00+00:00
3,item4,2024-10-04 00:00:00+00:00,2024-11-03 00:00:00+00:00,2024-11-08 12:30:00+00:00
4,item5,2024-10-05 00:00:00+00:00,2024-11-04 00:00:00+00:00,2024-11-09 12:30:00+00:00
5,item6,2024-10-06 00:00:00+00:00,2024-11-05 00:00:00+00:00,2024-11-10 12:30:00+00:00


In [110]:
df[['purchase_dt', 'expiration_dt', 'extended_dt']] = \
    df[['purchase_dt', 'expiration_dt', 'extended_dt']].apply(
    lambda x: x.dt.tz_convert('US/Pacific'))
df

Unnamed: 0,item,purchase_dt,expiration_dt,extended_dt
0,item1,2024-09-30 17:00:00-07:00,2024-10-30 17:00:00-07:00,2024-11-05 04:30:00-08:00
1,item2,2024-10-01 17:00:00-07:00,2024-10-31 17:00:00-07:00,2024-11-06 04:30:00-08:00
2,item3,2024-10-02 17:00:00-07:00,2024-11-01 17:00:00-07:00,2024-11-07 04:30:00-08:00
3,item4,2024-10-03 17:00:00-07:00,2024-11-02 17:00:00-07:00,2024-11-08 04:30:00-08:00
4,item5,2024-10-04 17:00:00-07:00,2024-11-03 16:00:00-08:00,2024-11-09 04:30:00-08:00
5,item6,2024-10-05 17:00:00-07:00,2024-11-04 16:00:00-08:00,2024-11-10 04:30:00-08:00
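
In the output above, the last two `expiration_dt` values show 16:00 rather than 17:00. Daylight saving time in the US ended on November 3, 2024, and a `Timedelta` measures elapsed time, so 30 days always means 30 x 24 hours. The sketch below contrasts this with `pd.DateOffset`, which moves by calendar days and keeps the wall-clock time. The variable names are illustrative.

```python
import pandas as pd

# 5 p.m. Pacific on the day before US daylight saving time ended in 2024
ts = pd.Timestamp('2024-11-02 17:00', tz='US/Pacific')

# Timedelta adds exactly 24 elapsed hours
plus_timedelta = ts + pd.Timedelta(days=1)

# DateOffset moves one calendar day and keeps the wall-clock time
plus_offset = ts + pd.DateOffset(days=1)

print(plus_timedelta)
print(plus_offset)
```

Which behavior is appropriate depends on the use case: elapsed durations suit `Timedelta`, while calendar-based rules such as "same time tomorrow" suit `DateOffset`.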


In [112]:
df['exp_ext_diff'] = (
         df['extended_dt'] - df['expiration_dt']
        )
df

Unnamed: 0,item,purchase_dt,expiration_dt,extended_dt,exp_ext_diff
0,item1,2024-09-30 17:00:00-07:00,2024-10-30 17:00:00-07:00,2024-11-05 04:30:00-08:00,5 days 12:30:00
1,item2,2024-10-01 17:00:00-07:00,2024-10-31 17:00:00-07:00,2024-11-06 04:30:00-08:00,5 days 12:30:00
2,item3,2024-10-02 17:00:00-07:00,2024-11-01 17:00:00-07:00,2024-11-07 04:30:00-08:00,5 days 12:30:00
3,item4,2024-10-03 17:00:00-07:00,2024-11-02 17:00:00-07:00,2024-11-08 04:30:00-08:00,5 days 12:30:00
4,item5,2024-10-04 17:00:00-07:00,2024-11-03 16:00:00-08:00,2024-11-09 04:30:00-08:00,5 days 12:30:00
5,item6,2024-10-05 17:00:00-07:00,2024-11-04 16:00:00-08:00,2024-11-10 04:30:00-08:00,5 days 12:30:00
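
The `exp_ext_diff` column holds `Timedelta` values. For reporting or modeling it is often more convenient to express them as plain numbers. As a minimal sketch using a stand-in Series with the same value shown above, a timedelta column can be divided by a unit `Timedelta` or read through the `.dt` accessor.

```python
import pandas as pd

# Stand-in for the exp_ext_diff column: every row is 5 days 12:30:00
diff = pd.Series([pd.Timedelta('5 days 12:30:00')] * 3)

# Dividing by a unit Timedelta yields floats in that unit
hours = diff / pd.Timedelta(hours=1)

# The .dt accessor exposes total seconds and the whole-day component
seconds = diff.dt.total_seconds()
whole_days = diff.dt.days

print(hours)
print(seconds)
print(whole_days)
```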


In [114]:
tds = pd.to_timedelta(['1 day', '5 days', '10 days 6 hours'])
tds

TimedeltaIndex(['1 days 00:00:00', '5 days 00:00:00', '10 days 06:00:00'], dtype='timedelta64[ns]', freq=None)

In [115]:
pd.to_timedelta(range(5), unit='W')

TimedeltaIndex(['0 days', '7 days', '14 days', '21 days', '28 days'], dtype='timedelta64[ns]', freq=None)

In [116]:
df['purchase_dt']

0   2024-09-30 17:00:00-07:00
1   2024-10-01 17:00:00-07:00
2   2024-10-02 17:00:00-07:00
3   2024-10-03 17:00:00-07:00
4   2024-10-04 17:00:00-07:00
5   2024-10-05 17:00:00-07:00
Name: purchase_dt, dtype: datetime64[ns, US/Pacific]

### Python datetime.timedelta vs pd.Timedelta

In [117]:
pd.to_timedelta('30 days')
pd.Timedelta(days=30)



Timedelta('30 days 00:00:00')

In [118]:
pd.Timedelta('35 days 12 hours 30 minutes')
pd.to_timedelta('35 days 12 hours 30 minutes')

Timedelta('35 days 12:30:00')

In [119]:
pd.Timedelta(2, unit="days")
pd.to_timedelta(2, unit="days")

Timedelta('2 days 00:00:00')

In [120]:
list_of_times = ["30 days", "2 hours", "1 day 4 hours 45 minutes"]
pd.to_timedelta(list_of_times)
# pd.Timedelta(list_of_times)

TimedeltaIndex(['30 days 00:00:00', '0 days 02:00:00', '1 days 04:45:00'], dtype='timedelta64[ns]', freq=None)

In [121]:
invalid_time = ["1 days", "invalid time"]
pd.to_timedelta(invalid_time, errors="coerce")

TimedeltaIndex(['1 days', NaT], dtype='timedelta64[ns]', freq=None)

In [122]:
import datetime as dt
import pandas as pd

In [123]:
dt.timedelta(days=1)

datetime.timedelta(days=1)

In [124]:
pd.Timedelta(days=1) == dt.timedelta(days=1)

True

In [125]:
dt_1 = pd.Timedelta(days=1)
dt_2 = dt.timedelta(days=1)
isinstance(pd.Timedelta, dt.timedelta)

False

In [126]:
isinstance(dt_1, dt.timedelta)

True

In [127]:
isinstance(dt_1, pd.Timedelta)

True

In [128]:
issubclass(pd.Timedelta, dt.timedelta)

True

In [129]:
issubclass(dt.timedelta, pd.Timedelta)

False

In [130]:
pd.Timedelta(days = 1, hours = 12)

Timedelta('1 days 12:00:00')

In [131]:
pd.Timedelta('10 us')

Timedelta('0 days 00:00:00.000010')

In [132]:
pd.Timedelta(days=1, hours=12, minutes=55)

Timedelta('1 days 12:55:00')

In [133]:
pd.Timedelta('1 day 12 hours 55 minutes')

Timedelta('1 days 12:55:00')

In [134]:
pd.Timedelta('1D 12h 55min')

Timedelta('1 days 12:55:00')

In [135]:
dt_1.min

Timedelta('-106752 days +00:12:43.145224193')

In [136]:
2*dt_1

Timedelta('2 days 00:00:00')

In [137]:
week_td = pd.Timedelta('1W')
pd.to_datetime('1 JAN 2022') + week_td

Timestamp('2022-01-08 00:00:00')

In [138]:
pd.to_datetime('1 JAN 2022') + 2*week_td

Timestamp('2022-01-15 00:00:00')

## There is more

In [139]:
import pandas as pd
df = pd.DataFrame(
        {       
        'item': ['item1', 'item2', 'item3', 'item4', 'item5', 'item6'],
        'purchase_dt': pd.date_range('2024-10-01', periods=6, freq='D', tz='UTC')
        }
)
df

Unnamed: 0,item,purchase_dt
0,item1,2024-10-01 00:00:00+00:00
1,item2,2024-10-02 00:00:00+00:00
2,item3,2024-10-03 00:00:00+00:00
3,item4,2024-10-04 00:00:00+00:00
4,item5,2024-10-05 00:00:00+00:00
5,item6,2024-10-06 00:00:00+00:00


In [140]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6 entries, 0 to 5
Data columns (total 2 columns):
 #   Column       Non-Null Count  Dtype              
---  ------       --------------  -----              
 0   item         6 non-null      object             
 1   purchase_dt  6 non-null      datetime64[ns, UTC]
dtypes: datetime64[ns, UTC](1), object(1)
memory usage: 228.0+ bytes


In [141]:
df['1 week'] = pd.Timedelta('1W')

In [142]:
df

Unnamed: 0,item,purchase_dt,1 week
0,item1,2024-10-01 00:00:00+00:00,7 days
1,item2,2024-10-02 00:00:00+00:00,7 days
2,item3,2024-10-03 00:00:00+00:00,7 days
3,item4,2024-10-04 00:00:00+00:00,7 days
4,item5,2024-10-05 00:00:00+00:00,7 days
5,item6,2024-10-06 00:00:00+00:00,7 days


In [143]:
df['1_week_more'] = df['purchase_dt'] + df['1 week']
df['1_week_less'] = df['purchase_dt'] - df['1 week']
df

Unnamed: 0,item,purchase_dt,1 week,1_week_more,1_week_less
0,item1,2024-10-01 00:00:00+00:00,7 days,2024-10-08 00:00:00+00:00,2024-09-24 00:00:00+00:00
1,item2,2024-10-02 00:00:00+00:00,7 days,2024-10-09 00:00:00+00:00,2024-09-25 00:00:00+00:00
2,item3,2024-10-03 00:00:00+00:00,7 days,2024-10-10 00:00:00+00:00,2024-09-26 00:00:00+00:00
3,item4,2024-10-04 00:00:00+00:00,7 days,2024-10-11 00:00:00+00:00,2024-09-27 00:00:00+00:00
4,item5,2024-10-05 00:00:00+00:00,7 days,2024-10-12 00:00:00+00:00,2024-09-28 00:00:00+00:00
5,item6,2024-10-06 00:00:00+00:00,7 days,2024-10-13 00:00:00+00:00,2024-09-29 00:00:00+00:00


In [144]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6 entries, 0 to 5
Data columns (total 5 columns):
 #   Column       Non-Null Count  Dtype              
---  ------       --------------  -----              
 0   item         6 non-null      object             
 1   purchase_dt  6 non-null      datetime64[ns, UTC]
 2   1 week       6 non-null      timedelta64[ns]    
 3   1_week_more  6 non-null      datetime64[ns, UTC]
 4   1_week_less  6 non-null      datetime64[ns, UTC]
dtypes: datetime64[ns, UTC](3), object(1), timedelta64[ns](1)
memory usage: 372.0+ bytes


In [145]:
pd.timedelta_range('1W 2 days', periods=5)

TimedeltaIndex(['9 days', '10 days', '11 days', '12 days', '13 days'], dtype='timedelta64[ns]', freq='D')

In [146]:
df = pd.DataFrame(
        {       
        'item': ['item1', 'item2', 'item3', 'item4', 'item5'],
        'purchase_dt': pd.date_range('2024-10-01', periods=5, freq='D', tz='UTC'),
        'time_deltas': pd.timedelta_range('1W 2 days 6 hours', periods=5)
        }

)

In [147]:
df

Unnamed: 0,item,purchase_dt,time_deltas
0,item1,2024-10-01 00:00:00+00:00,9 days 06:00:00
1,item2,2024-10-02 00:00:00+00:00,10 days 06:00:00
2,item3,2024-10-03 00:00:00+00:00,11 days 06:00:00
3,item4,2024-10-04 00:00:00+00:00,12 days 06:00:00
4,item5,2024-10-05 00:00:00+00:00,13 days 06:00:00


# Recipe 5: Converting DateTime with time zone information

In [148]:
import pandas as pd

In [149]:
df = pd.DataFrame(
        {       
        'Location': ['Los Angeles', 
                     'New York',
                     'Berlin', 
                     'New Delhi', 
                     'Moscow', 
                     'Tokyo', 
                     'Dubai'],
        'tz': ['US/Pacific', 
               'US/Eastern', 
               'Europe/Berlin', 
               'Asia/Kolkata', 
               'Europe/Moscow', 
               'Asia/Tokyo',
               'Asia/Dubai'],
        'visit_dt': pd.date_range(start='22:00',periods=7, freq='45min'),
        }).set_index('visit_dt')

In [150]:
df

Unnamed: 0_level_0,Location,tz
visit_dt,Unnamed: 1_level_1,Unnamed: 2_level_1
2025-10-07 22:00:00,Los Angeles,US/Pacific
2025-10-07 22:45:00,New York,US/Eastern
2025-10-07 23:30:00,Berlin,Europe/Berlin
2025-10-08 00:15:00,New Delhi,Asia/Kolkata
2025-10-08 01:00:00,Moscow,Europe/Moscow
2025-10-08 01:45:00,Tokyo,Asia/Tokyo
2025-10-08 02:30:00,Dubai,Asia/Dubai


In [151]:
df = df.tz_localize('UTC')
df

Unnamed: 0_level_0,Location,tz
visit_dt,Unnamed: 1_level_1,Unnamed: 2_level_1
2025-10-07 22:00:00+00:00,Los Angeles,US/Pacific
2025-10-07 22:45:00+00:00,New York,US/Eastern
2025-10-07 23:30:00+00:00,Berlin,Europe/Berlin
2025-10-08 00:15:00+00:00,New Delhi,Asia/Kolkata
2025-10-08 01:00:00+00:00,Moscow,Europe/Moscow
2025-10-08 01:45:00+00:00,Tokyo,Asia/Tokyo
2025-10-08 02:30:00+00:00,Dubai,Asia/Dubai


In [152]:
df_hq = df.tz_convert('Asia/Tokyo')
df_hq

Unnamed: 0_level_0,Location,tz
visit_dt,Unnamed: 1_level_1,Unnamed: 2_level_1
2025-10-08 07:00:00+09:00,Los Angeles,US/Pacific
2025-10-08 07:45:00+09:00,New York,US/Eastern
2025-10-08 08:30:00+09:00,Berlin,Europe/Berlin
2025-10-08 09:15:00+09:00,New Delhi,Asia/Kolkata
2025-10-08 10:00:00+09:00,Moscow,Europe/Moscow
2025-10-08 10:45:00+09:00,Tokyo,Asia/Tokyo
2025-10-08 11:30:00+09:00,Dubai,Asia/Dubai
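
The recipe calls `tz_localize('UTC')` before `tz_convert('Asia/Tokyo')` because conversion needs a starting time zone. The sketch below illustrates the difference on a single `Timestamp`: `tz_localize` attaches a time zone to a naive value without changing the clock reading, while `tz_convert` re-expresses an already aware value in another zone. The variable names are illustrative.

```python
import pandas as pd

naive = pd.Timestamp('2025-10-07 22:00:00')

# Converting a naive timestamp fails: there is no source time zone
try:
    naive.tz_convert('Asia/Tokyo')
    convert_failed = False
except TypeError as err:
    convert_failed = True
    print(f'TypeError: {err}')

# Localize first (attach UTC), then convert to the target zone
aware = naive.tz_localize('UTC')
tokyo = aware.tz_convert('Asia/Tokyo')

print(aware)
print(tokyo)
```

The localized and converted values represent the same instant, so `aware == tokyo` is `True` even though the displayed clock times differ.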


In [153]:
df['local_dt'] = df.index
df['local_dt'] = df.apply(lambda x: pd.Timestamp.tz_convert(x['local_dt'], x['tz']), axis=1)
df

Unnamed: 0_level_0,Location,tz,local_dt
visit_dt,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
2025-10-07 22:00:00+00:00,Los Angeles,US/Pacific,2025-10-07 15:00:00-07:00
2025-10-07 22:45:00+00:00,New York,US/Eastern,2025-10-07 18:45:00-04:00
2025-10-07 23:30:00+00:00,Berlin,Europe/Berlin,2025-10-08 01:30:00+02:00
2025-10-08 00:15:00+00:00,New Delhi,Asia/Kolkata,2025-10-08 05:45:00+05:30
2025-10-08 01:00:00+00:00,Moscow,Europe/Moscow,2025-10-08 04:00:00+03:00
2025-10-08 01:45:00+00:00,Tokyo,Asia/Tokyo,2025-10-08 10:45:00+09:00
2025-10-08 02:30:00+00:00,Dubai,Asia/Dubai,2025-10-08 06:30:00+04:00


In [154]:
df.apply(lambda x: pd.Timestamp.tz_convert(x['local_dt'], x['tz']), axis=1)

visit_dt
2025-10-07 22:00:00+00:00    2025-10-07 15:00:00-07:00
2025-10-07 22:45:00+00:00    2025-10-07 18:45:00-04:00
2025-10-07 23:30:00+00:00    2025-10-08 01:30:00+02:00
2025-10-08 00:15:00+00:00    2025-10-08 05:45:00+05:30
2025-10-08 01:00:00+00:00    2025-10-08 04:00:00+03:00
2025-10-08 01:45:00+00:00    2025-10-08 10:45:00+09:00
2025-10-08 02:30:00+00:00    2025-10-08 06:30:00+04:00
dtype: object

In [155]:
pd.to_datetime(df['local_dt'], utc=True)

visit_dt
2025-10-07 22:00:00+00:00   2025-10-07 22:00:00+00:00
2025-10-07 22:45:00+00:00   2025-10-07 22:45:00+00:00
2025-10-07 23:30:00+00:00   2025-10-07 23:30:00+00:00
2025-10-08 00:15:00+00:00   2025-10-08 00:15:00+00:00
2025-10-08 01:00:00+00:00   2025-10-08 01:00:00+00:00
2025-10-08 01:45:00+00:00   2025-10-08 01:45:00+00:00
2025-10-08 02:30:00+00:00   2025-10-08 02:30:00+00:00
Name: local_dt, dtype: datetime64[ns, UTC]

## There is more

In [159]:
df = pd.DataFrame(
        {       
        'Location': ['Los Angeles', 
                     'New York',
                     'Berlin', 
                     'New Delhi', 
                     'Moscow', 
                     'Tokyo', 
                     'Dubai'],
        'tz': ['US/Pacific', 
               'US/Eastern', 
               'Europe/Berlin', 
               'Asia/Kolkata', 
               'Europe/Moscow', 
               'Asia/Tokyo',
               'Asia/Dubai'],
        'visit_dt': pd.date_range(start='22:00',periods=7, freq='45min'),
        }).set_index('visit_dt').tz_localize('UTC').tz_convert('Asia/Tokyo')
df

visit_dt,Location,tz
2025-10-08 07:00:00+09:00,Los Angeles,US/Pacific
2025-10-08 07:45:00+09:00,New York,US/Eastern
2025-10-08 08:30:00+09:00,Berlin,Europe/Berlin
2025-10-08 09:15:00+09:00,New Delhi,Asia/Kolkata
2025-10-08 10:00:00+09:00,Moscow,Europe/Moscow
2025-10-08 10:45:00+09:00,Tokyo,Asia/Tokyo
2025-10-08 11:30:00+09:00,Dubai,Asia/Dubai
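
The chained `tz_localize('UTC').tz_convert('Asia/Tokyo')` call above performs two different operations. As a rough sketch (with an explicit date instead of today's date, which is an assumption made for reproducibility), `tz_localize` attaches a zone to naive timestamps without changing the clock values, while `tz_convert` keeps the instants and changes how they are displayed:

```python
import pandas as pd

naive = pd.date_range('2025-10-07 22:00', periods=2, freq='45min')

# tz_localize attaches a zone; the wall-clock values stay the same
utc = naive.tz_localize('UTC')

# tz_convert keeps the instants and changes the wall-clock view
tokyo = utc.tz_convert('Asia/Tokyo')

fmt = '%Y-%m-%d %H:%M'
print(utc.strftime(fmt).tolist())
print(tokyo.strftime(fmt).tolist())
print((utc == tokyo).all())   # same instants, different representation
```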


In [160]:
df['am/pm'] = df.index.strftime('%p')
df

visit_dt,Location,tz,am/pm
2025-10-08 07:00:00+09:00,Los Angeles,US/Pacific,AM
2025-10-08 07:45:00+09:00,New York,US/Eastern,AM
2025-10-08 08:30:00+09:00,Berlin,Europe/Berlin,AM
2025-10-08 09:15:00+09:00,New Delhi,Asia/Kolkata,AM
2025-10-08 10:00:00+09:00,Moscow,Europe/Moscow,AM
2025-10-08 10:45:00+09:00,Tokyo,Asia/Tokyo,AM
2025-10-08 11:30:00+09:00,Dubai,Asia/Dubai,AM


In [161]:
type(df.index)

pandas.core.indexes.datetimes.DatetimeIndex

In [162]:
df.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 7 entries, 2025-10-08 07:00:00+09:00 to 2025-10-08 11:30:00+09:00
Data columns (total 3 columns):
 #   Column    Non-Null Count  Dtype 
---  ------    --------------  ----- 
 0   Location  7 non-null      object
 1   tz        7 non-null      object
 2   am/pm     7 non-null      object
dtypes: object(3)
memory usage: 224.0+ bytes


# Recipe 6: Working with Date Offsets

In [326]:
import pandas as pd
import numpy as np

In [163]:
np.random.seed(10)
days = 16
df = pd.DataFrame(
        {       
        'prod_date': pd.date_range('2023-12-20', periods=days, freq='D'),
        'production' : np.random.randint(4, 20, days)
        }).set_index('prod_date')
print(df)


            production
prod_date             
2023-12-20          13
2023-12-21          17
2023-12-22           8
2023-12-23          19
2023-12-24           4
2023-12-25           5
2023-12-26          15
2023-12-27          16
2023-12-28          13
2023-12-29          17
2023-12-30           4
2023-12-31          17
2024-01-01           5
2024-01-02          14
2024-01-03          12
2024-01-04          13


In [164]:
df['day_name'] = df.index.day_name()

In [165]:
print(df)

            production   day_name
prod_date                        
2023-12-20          13  Wednesday
2023-12-21          17   Thursday
2023-12-22           8     Friday
2023-12-23          19   Saturday
2023-12-24           4     Sunday
2023-12-25           5     Monday
2023-12-26          15    Tuesday
2023-12-27          16  Wednesday
2023-12-28          13   Thursday
2023-12-29          17     Friday
2023-12-30           4   Saturday
2023-12-31          17     Sunday
2024-01-01           5     Monday
2024-01-02          14    Tuesday
2024-01-03          12  Wednesday
2024-01-04          13   Thursday


In [167]:
type(pd.offsets.BDay(0))

pandas._libs.tslibs.offsets.BusinessDay

In [168]:
df['adj_bus_date'] = df.index + pd.offsets.BDay(0)
df['adj_bus_dname'] = df['adj_bus_date'].dt.day_name()
print(df)

            production   day_name adj_bus_date adj_bus_dname
prod_date                                                   
2023-12-20          13  Wednesday   2023-12-20     Wednesday
2023-12-21          17   Thursday   2023-12-21      Thursday
2023-12-22           8     Friday   2023-12-22        Friday
2023-12-23          19   Saturday   2023-12-25        Monday
2023-12-24           4     Sunday   2023-12-25        Monday
2023-12-25           5     Monday   2023-12-25        Monday
2023-12-26          15    Tuesday   2023-12-26       Tuesday
2023-12-27          16  Wednesday   2023-12-27     Wednesday
2023-12-28          13   Thursday   2023-12-28      Thursday
2023-12-29          17     Friday   2023-12-29        Friday
2023-12-30           4   Saturday   2024-01-01        Monday
2023-12-31          17     Sunday   2024-01-01        Monday
2024-01-01           5     Monday   2024-01-01        Monday
2024-01-02          14    Tuesday   2024-01-02       Tuesday
2024-01-03          12  Wednesday   2024-01-03     Wednesday
2024-01-04          13   Thursday   2024-01-04      Thursday

In [169]:
df['adj_bus_date'] = df.index.map(
        lambda x: pd.offsets.BDay().rollforward(x))
df['adj_bus_dname'] = df['adj_bus_date'].dt.day_name()
print(df)

            production   day_name adj_bus_date adj_bus_dname
prod_date                                                   
2023-12-20          13  Wednesday   2023-12-20     Wednesday
2023-12-21          17   Thursday   2023-12-21      Thursday
2023-12-22           8     Friday   2023-12-22        Friday
2023-12-23          19   Saturday   2023-12-25        Monday
2023-12-24           4     Sunday   2023-12-25        Monday
2023-12-25           5     Monday   2023-12-25        Monday
2023-12-26          15    Tuesday   2023-12-26       Tuesday
2023-12-27          16  Wednesday   2023-12-27     Wednesday
2023-12-28          13   Thursday   2023-12-28      Thursday
2023-12-29          17     Friday   2023-12-29        Friday
2023-12-30           4   Saturday   2024-01-01        Monday
2023-12-31          17     Sunday   2024-01-01        Monday
2024-01-01           5     Monday   2024-01-01        Monday
2024-01-02          14    Tuesday   2024-01-02       Tuesday
2024-01-03          12  Wednesday   2024-01-03     Wednesday
2024-01-04          13   Thursday   2024-01-04      Thursday
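
Both approaches above move weekend dates to the following Monday and leave business days untouched. A small sketch on two individual dates (chosen here purely for illustration) shows how `BDay(0)`, `BDay(1)`, `rollforward`, and `rollback` differ:

```python
import pandas as pd

fri = pd.Timestamp('2023-12-22')   # Friday
sat = pd.Timestamp('2023-12-23')   # Saturday
bday = pd.offsets.BDay()

sat_forward = bday.rollforward(sat)     # next business day
sat_back = bday.rollback(sat)           # previous business day
fri_forward = bday.rollforward(fri)     # already a business day, so unchanged
fri_plus_zero = fri + pd.offsets.BDay(0)   # n=0 leaves a business day in place
fri_plus_one = fri + pd.offsets.BDay(1)    # n=1 always moves forward
sat_plus_zero = sat + pd.offsets.BDay(0)   # n=0 rolls a weekend date forward

for label, value in [('rollforward(sat)', sat_forward),
                     ('rollback(sat)', sat_back),
                     ('rollforward(fri)', fri_forward),
                     ('fri + BDay(0)', fri_plus_zero),
                     ('fri + BDay(1)', fri_plus_one),
                     ('sat + BDay(0)', sat_plus_zero)]:
    print(f'{label:18} {value.date()} {value.day_name()}')
```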

In [170]:
cols = ['adj_bus_date', 'adj_bus_dname']
df_weekdays = df.groupby(cols)[['production']].sum()
df_weekdays.reset_index(level='adj_bus_dname', inplace=True)
print(df_weekdays)

             adj_bus_dname  production
adj_bus_date                          
2023-12-20       Wednesday          13
2023-12-21        Thursday          17
2023-12-22          Friday           8
2023-12-25          Monday          28
2023-12-26         Tuesday          15
2023-12-27       Wednesday          16
2023-12-28        Thursday          13
2023-12-29          Friday          17
2024-01-01          Monday          26
2024-01-02         Tuesday          14
2024-01-03       Wednesday          12
2024-01-04        Thursday          13


In [171]:
df_weekdays['quarter_end'] = (df_weekdays.index + 
                             pd.offsets.QuarterEnd())
df_weekdays

adj_bus_date,adj_bus_dname,production,quarter_end
2023-12-20,Wednesday,13,2023-12-31
2023-12-21,Thursday,17,2023-12-31
2023-12-22,Friday,8,2023-12-31
2023-12-25,Monday,28,2023-12-31
2023-12-26,Tuesday,15,2023-12-31
2023-12-27,Wednesday,16,2023-12-31
2023-12-28,Thursday,13,2023-12-31
2023-12-29,Friday,17,2023-12-31
2024-01-01,Monday,26,2024-03-31
2024-01-02,Tuesday,14,2024-03-31
2024-01-03,Wednesday,12,2024-03-31
2024-01-04,Thursday,13,2024-03-31


In [172]:
df_weekdays['month_end'] = (df_weekdays.index + 
                           pd.offsets.MonthEnd())

print(df_weekdays)

             adj_bus_dname  production quarter_end  month_end
adj_bus_date                                                 
2023-12-20       Wednesday          13  2023-12-31 2023-12-31
2023-12-21        Thursday          17  2023-12-31 2023-12-31
2023-12-22          Friday           8  2023-12-31 2023-12-31
2023-12-25          Monday          28  2023-12-31 2023-12-31
2023-12-26         Tuesday          15  2023-12-31 2023-12-31
2023-12-27       Wednesday          16  2023-12-31 2023-12-31
2023-12-28        Thursday          13  2023-12-31 2023-12-31
2023-12-29          Friday          17  2023-12-31 2023-12-31
2024-01-01          Monday          26  2024-03-31 2024-01-31
2024-01-02         Tuesday          14  2024-03-31 2024-01-31
2024-01-03       Wednesday          12  2024-03-31 2024-01-31
2024-01-04        Thursday          13  2024-03-31 2024-01-31
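
One detail worth knowing about anchored offsets such as `MonthEnd` and `QuarterEnd`: adding them to a date that already sits on the anchor jumps to the next anchor, whereas `rollforward` leaves such a date unchanged. A brief sketch with two illustrative dates:

```python
import pandas as pd

mid_month = pd.Timestamp('2023-12-20')
on_month_end = pd.Timestamp('2023-12-31')

added_mid = mid_month + pd.offsets.MonthEnd()        # moves to the end of December
added_end = on_month_end + pd.offsets.MonthEnd()     # already on the anchor: jumps to January's end
rolled_end = pd.offsets.MonthEnd().rollforward(on_month_end)     # stays on 2023-12-31
rolled_qtr = pd.offsets.QuarterEnd().rollforward(on_month_end)   # also a quarter end, so it stays

print(added_mid.date(), added_end.date(), rolled_end.date(), rolled_qtr.date())
```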


## There is More

In [173]:
from pandas.tseries.holiday import (
    USFederalHolidayCalendar,
    AbstractHolidayCalendar,
    Holiday,
    nearest_workday
)
# CDay is an alias for CustomBusinessDay
from pandas.tseries.offsets import CustomBusinessDay, CDay

In [174]:
df = df_weekdays[['adj_bus_dname', 'production']].copy()
print(df)

             adj_bus_dname  production
adj_bus_date                          
2023-12-20       Wednesday          13
2023-12-21        Thursday          17
2023-12-22          Friday           8
2023-12-25          Monday          28
2023-12-26         Tuesday          15
2023-12-27       Wednesday          16
2023-12-28        Thursday          13
2023-12-29          Friday          17
2024-01-01          Monday          26
2024-01-02         Tuesday          14
2024-01-03       Wednesday          12
2024-01-04        Thursday          13


In [175]:
USFederalHolidayCalendar

pandas.tseries.holiday.USFederalHolidayCalendar

In [176]:
USFederalHolidayCalendar.rules

[Holiday: New Year's Day (month=1, day=1, observance=<function nearest_workday at 0x11207b2e0>),
 Holiday: Birthday of Martin Luther King, Jr. (month=1, day=1, offset=<DateOffset: weekday=MO(+3)>),
 Holiday: Washington's Birthday (month=2, day=1, offset=<DateOffset: weekday=MO(+3)>),
 Holiday: Memorial Day (month=5, day=31, offset=<DateOffset: weekday=MO(-1)>),
 Holiday: Juneteenth National Independence Day (month=6, day=19, observance=<function nearest_workday at 0x11207b2e0>),
 Holiday: Independence Day (month=7, day=4, observance=<function nearest_workday at 0x11207b2e0>),
 Holiday: Labor Day (month=9, day=1, offset=<DateOffset: weekday=MO(+1)>),
 Holiday: Columbus Day (month=10, day=1, offset=<DateOffset: weekday=MO(+2)>),
 Holiday: Veterans Day (month=11, day=11, observance=<function nearest_workday at 0x11207b2e0>),
 Holiday: Thanksgiving Day (month=11, day=1, offset=<DateOffset: weekday=TH(+4)>),
 Holiday: Christmas Day (month=12, day=25, observance=<function nearest_workday at 0x11207b2e0>)]

In [177]:
len(USFederalHolidayCalendar.rules)

11
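
The rules above are templates; calling `holidays()` on a calendar instance turns them into concrete dates. As a short sketch, the window below matches the date range of the production data in this recipe, and `return_name=True` is used to label each date:

```python
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar

cal = USFederalHolidayCalendar()

# Concrete holiday dates that fall inside the production data's date range
holidays = cal.holidays(start='2023-12-20', end='2024-01-04', return_name=True)
print(holidays)
```

These two dates are the ones a holiday-aware business-day offset skips for this data.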

In [178]:
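# Build the offset once and reuse it for every row
us_holiday_offset = CDay(calendar=USFederalHolidayCalendar())

def holiday_offset(date):
    # Roll a weekend or holiday date forward to the next US business day
    return us_holiday_offset.rollforward(date)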

df['adj_holiday_date'] = df.index.to_series().map(holiday_offset)
df['adj_holiday_dname'] = df['adj_holiday_date'].dt.day_name()
print(df)

             adj_bus_dname  production adj_holiday_date adj_holiday_dname
adj_bus_date                                                             
2023-12-20       Wednesday          13       2023-12-20         Wednesday
2023-12-21        Thursday          17       2023-12-21          Thursday
2023-12-22          Friday           8       2023-12-22            Friday
2023-12-25          Monday          28       2023-12-26           Tuesday
2023-12-26         Tuesday          15       2023-12-26           Tuesday
2023-12-27       Wednesday          16       2023-12-27         Wednesday
2023-12-28        Thursday          13       2023-12-28          Thursday
2023-12-29          Friday          17       2023-12-29            Friday
2024-01-01          Monday          26       2024-01-02           Tuesday
2024-01-02         Tuesday          14       2024-01-02           Tuesday
2024-01-03       Wednesday          12       2024-01-03         Wednesday
2024-01-04        Thursday          13       2024-01-04          Thursday
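
To see why 2023-12-25 moved to Tuesday even though it is a Monday, it can help to ask each offset whether the date counts as a business day. A minimal sketch using `is_on_offset` (the variable names are illustrative):

```python
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import CustomBusinessDay

us_bday = CustomBusinessDay(calendar=USFederalHolidayCalendar())
christmas = pd.Timestamp('2023-12-25')   # a Monday, but a US federal holiday

plain_ok = pd.offsets.BDay().is_on_offset(christmas)   # plain business day ignores holidays
custom_ok = us_bday.is_on_offset(christmas)            # holiday-aware offset rejects it
next_open = us_bday.rollforward(christmas)             # first valid business day on or after

print(plain_ok, custom_ok, next_open.date())
```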

In [179]:
custom_newyears = Holiday("Custom New Years", 
                   month=1, 
                   day=1, 
                   observance=nearest_workday)
custom_newyears

Holiday: Custom New Years (month=1, day=1, observance=<function nearest_workday at 0x11207b2e0>)

In [180]:
class MyHolidayCalendar(AbstractHolidayCalendar):
    rules = [
        Holiday("Cust New Years", month=1, day=1, observance=nearest_workday),
        Holiday("Cust Christmas Day", month=12, day=25, observance=nearest_workday)
    ]

mycal = MyHolidayCalendar()

In [181]:
# Build the offset once from the custom calendar and reuse it for every row
cust_holiday_offset = CDay(calendar=mycal)

def custom_holiday_offset(date):
    # Roll a weekend or holiday date forward to the next custom business day
    return cust_holiday_offset.rollforward(date)

df['cust_holiday_date'] = df.index.to_series().map(custom_holiday_offset)
df['cust_holiday_dname'] = df['cust_holiday_date'].dt.day_name()

df

adj_bus_date,adj_bus_dname,production,adj_holiday_date,adj_holiday_dname,cust_holiday_date,cust_holiday_dname
2023-12-20,Wednesday,13,2023-12-20,Wednesday,2023-12-20,Wednesday
2023-12-21,Thursday,17,2023-12-21,Thursday,2023-12-21,Thursday
2023-12-22,Friday,8,2023-12-22,Friday,2023-12-22,Friday
2023-12-25,Monday,28,2023-12-26,Tuesday,2023-12-26,Tuesday
2023-12-26,Tuesday,15,2023-12-26,Tuesday,2023-12-26,Tuesday
2023-12-27,Wednesday,16,2023-12-27,Wednesday,2023-12-27,Wednesday
2023-12-28,Thursday,13,2023-12-28,Thursday,2023-12-28,Thursday
2023-12-29,Friday,17,2023-12-29,Friday,2023-12-29,Friday
2024-01-01,Monday,26,2024-01-02,Tuesday,2024-01-02,Tuesday
2024-01-02,Tuesday,14,2024-01-02,Tuesday,2024-01-02,Tuesday
2024-01-03,Wednesday,12,2024-01-03,Wednesday,2024-01-03,Wednesday
2024-01-04,Thursday,13,2024-01-04,Thursday,2024-01-04,Thursday


In [182]:
nearest_workday(pd.to_datetime('2024-11-30'))

Timestamp('2024-11-29 00:00:00')

In [183]:
nearest_workday(pd.to_datetime('2024-12-1'))

Timestamp('2024-12-02 00:00:00')
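
`nearest_workday` is one of several observance functions in `pandas.tseries.holiday`. As a quick comparison (a sketch covering only three of them), the snippet below applies `nearest_workday`, `next_monday`, and `previous_friday` to the same Saturday and Sunday used above:

```python
import pandas as pd
from pandas.tseries.holiday import nearest_workday, next_monday, previous_friday

sat = pd.Timestamp('2024-11-30')   # Saturday
sun = pd.Timestamp('2024-12-01')   # Sunday

# Map each observance rule to its result for the Saturday and the Sunday
results = {
    fn.__name__: (fn(sat), fn(sun))
    for fn in (nearest_workday, next_monday, previous_friday)
}

for name, (sat_result, sun_result) in results.items():
    print(f'{name:16} Sat -> {sat_result.date()}   Sun -> {sun_result.date()}')
```

Which rule fits depends on how an organization observes holidays that land on a weekend.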

# Recipe 7: Working with custom business days

In [184]:
import pandas as pd
from pandas.tseries.holiday import AbstractHolidayCalendar, Holiday
from pandas.tseries.offsets import CustomBusinessDay

# Define a custom holiday calendar for Jordan's Independence Day and Labor Day holidays 
class JordanHolidayCalendar(AbstractHolidayCalendar):
    rules = [
        Holiday('Jordan Independence Day', month=5, day=25),
        Holiday('Jordan Labor Day', month=5, day=1)
    ]


In [185]:
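# Define the work week. The recipe targets Jordan's Sunday-to-Thursday week,
# which the day-name form expresses directly.
jo_mask = "Sun Mon Tue Wed Thu"
# Caution: this reassignment overrides the line above. The binary form starts
# on Monday, so "1111100" means Monday to Friday, and the outputs that follow
# reflect that. The Sunday-to-Thursday equivalent is "1111001".
jo_mask = "1111100"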

In [186]:
# Create a CustomBusinessDay object with Jordan's workdays and holidays
jo_busdays = CustomBusinessDay(
    holidays=JordanHolidayCalendar().holidays(start='2024-01-01',
                                             end='2025-12-31'), 
    weekmask=jo_mask
)

In [187]:
jo_busdays.weekmask

'1111100'
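
A weekmask can be written with day names or as a seven-character string of 0s and 1s. The binary form follows NumPy's `busdaycalendar` convention and starts on Monday. The sketch below compares the two notations; the variable names are illustrative and no holidays are applied, so only the weekly pattern is visible:

```python
import pandas as pd
from pandas.tseries.offsets import CustomBusinessDay

# Sunday-to-Thursday week, written two equivalent ways
sun_thu_names = CustomBusinessDay(weekmask="Sun Mon Tue Wed Thu")
sun_thu_bits = CustomBusinessDay(weekmask="1111001")   # positions are Mon..Sun

# Monday-to-Friday week, for contrast
mon_fri_bits = CustomBusinessDay(weekmask="1111100")

start = '2024-05-01'   # a Wednesday
names_days = pd.date_range(start, periods=5, freq=sun_thu_names).day_name().tolist()
bits_days = pd.date_range(start, periods=5, freq=sun_thu_bits).day_name().tolist()
mon_fri_days = pd.date_range(start, periods=5, freq=mon_fri_bits).day_name().tolist()

print(names_days)
print(bits_days)
print(mon_fri_days)
```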

In [188]:
JordanHolidayCalendar().rules

[Holiday: Jordan Independence Day (month=5, day=25, ),
 Holiday: Jordan Labor Day (month=5, day=1, )]

In [189]:
JordanHolidayCalendar().holidays(start='2024-01-01', 
                                 end='2025-12-31')

DatetimeIndex(['2024-05-01', '2024-05-25', '2025-05-01', '2025-05-25'], dtype='datetime64[ns]', freq=None)

In [190]:
JordanHolidayCalendar().holidays()

DatetimeIndex(['1970-05-01', '1970-05-25', '1971-05-01', '1971-05-25',
               '1972-05-01', '1972-05-25', '1973-05-01', '1973-05-25',
               '1974-05-01', '1974-05-25',
               ...
               '2196-05-01', '2196-05-25', '2197-05-01', '2197-05-25',
               '2198-05-01', '2198-05-25', '2199-05-01', '2199-05-25',
               '2200-05-01', '2200-05-25'],
              dtype='datetime64[ns]', length=462, freq=None)

In [191]:
JordanHolidayCalendar().rules[0].dates('2024-01-01', '2024-12-31')

DatetimeIndex(['2024-05-25'], dtype='datetime64[ns]', freq='<DateOffset: years=1>')

In [192]:
JordanHolidayCalendar().rules[1].dates('2024-01-01', '2024-12-31')

DatetimeIndex(['2024-05-01'], dtype='datetime64[ns]', freq='<DateOffset: years=1>')

In [193]:
len(jo_busdays.holidays)

4

In [194]:
jo_busdays

<CustomBusinessDay>

In [195]:
jo_busdays.holidays[0]

np.datetime64('2024-05-01')

In [196]:
jo_busdays.holidays[-1]

np.datetime64('2025-05-25')

In [197]:
days = 20
df = pd.DataFrame({'Date': pd.date_range('2024-5-1', periods=days, freq=jo_busdays)})
print(df)

         Date
0  2024-05-02
1  2024-05-03
2  2024-05-06
3  2024-05-07
4  2024-05-08
5  2024-05-09
6  2024-05-10
7  2024-05-13
8  2024-05-14
9  2024-05-15
10 2024-05-16
11 2024-05-17
12 2024-05-20
13 2024-05-21
14 2024-05-22
15 2024-05-23
16 2024-05-24
17 2024-05-27
18 2024-05-28
19 2024-05-29


In [198]:
df['Day_name'] = df.Date.dt.day_name()
print(df)

         Date   Day_name
0  2024-05-02   Thursday
1  2024-05-03     Friday
2  2024-05-06     Monday
3  2024-05-07    Tuesday
4  2024-05-08  Wednesday
5  2024-05-09   Thursday
6  2024-05-10     Friday
7  2024-05-13     Monday
8  2024-05-14    Tuesday
9  2024-05-15  Wednesday
10 2024-05-16   Thursday
11 2024-05-17     Friday
12 2024-05-20     Monday
13 2024-05-21    Tuesday
14 2024-05-22  Wednesday
15 2024-05-23   Thursday
16 2024-05-24     Friday
17 2024-05-27     Monday
18 2024-05-28    Tuesday
19 2024-05-29  Wednesday


### Custom Business Hours

In [199]:
JordanHolidayCalendar().rules

[Holiday: Jordan Independence Day (month=5, day=25, ),
 Holiday: Jordan Labor Day (month=5, day=1, )]

In [200]:
jo_busshours = pd.offsets.CustomBusinessHour(
    start="7:30",
    end="16:30",
    holidays=JordanHolidayCalendar().holidays(start='2024-01-01',
                                             end='2025-12-31'), 
    weekmask=jo_mask)

In [201]:
start = '2024-05-1'
end = '2024-05-29'

business_hours = pd.date_range(start=start, end=end, freq=jo_busshours)
print(business_hours[0:10])

DatetimeIndex(['2024-05-02 07:30:00', '2024-05-02 08:30:00',
               '2024-05-02 09:30:00', '2024-05-02 10:30:00',
               '2024-05-02 11:30:00', '2024-05-02 12:30:00',
               '2024-05-02 13:30:00', '2024-05-02 14:30:00',
               '2024-05-02 15:30:00', '2024-05-03 07:30:00'],
              dtype='datetime64[ns]', freq='cbh')


In [202]:
business_hours[0:10]

DatetimeIndex(['2024-05-02 07:30:00', '2024-05-02 08:30:00',
               '2024-05-02 09:30:00', '2024-05-02 10:30:00',
               '2024-05-02 11:30:00', '2024-05-02 12:30:00',
               '2024-05-02 13:30:00', '2024-05-02 14:30:00',
               '2024-05-02 15:30:00', '2024-05-03 07:30:00'],
              dtype='datetime64[ns]', freq='cbh')
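
Business-hour offsets also support arithmetic: when an addition runs past closing time, the remainder carries over to the next open day. The following is a sketch under stated assumptions: it uses a Sunday-to-Thursday week written with day names, no holidays, and a hypothetical offset named `cbh`.

```python
import pandas as pd

# 07:30 to 16:30, Sunday to Thursday, no holidays (assumptions for this sketch)
cbh = pd.offsets.CustomBusinessHour(
    start="07:30",
    end="16:30",
    weekmask="Sun Mon Tue Wed Thu")

# Hourly slots that open on Thursday, 2 May 2024
thursday_slots = pd.date_range('2024-05-02', '2024-05-03', freq=cbh)
print(len(thursday_slots), thursday_slots[0], thursday_slots[-1])

# 30 minutes remain on Thursday; the other 30 carry over to Sunday morning
late_thursday = pd.Timestamp('2024-05-02 16:00')
next_slot = late_thursday + cbh
print(next_slot, next_slot.day_name())
```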

In [203]:
# Localize to UTC
df['Date'] = df['Date'].dt.tz_localize('UTC')

# Convert to another time zone (e.g., Asia/Amman)
df['Date'] = df['Date'].dt.tz_convert('Asia/Amman')
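
Localizing to UTC first treats each naive date as midnight UTC, so the converted values show up three hours into the day in Amman. If the dates are already meant as local Amman dates, localizing directly may be the better fit. A small sketch of the difference, using two illustrative dates:

```python
import pandas as pd

dates = pd.Series(pd.to_datetime(['2024-05-02', '2024-05-05']))

# Option 1 (as above): treat the naive dates as UTC, then view them in Amman
via_utc = dates.dt.tz_localize('UTC').dt.tz_convert('Asia/Amman')

# Option 2: treat the naive dates as already being Amman local time
direct = dates.dt.tz_localize('Asia/Amman')

print(via_utc.dt.hour.tolist())   # local hour after converting from UTC
print(direct.dt.hour.tolist())    # local hour when localized directly
print((via_utc - direct).tolist())
```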

In [204]:
df

,Date,Day_name
0,2024-05-02 03:00:00+03:00,Thursday
1,2024-05-03 03:00:00+03:00,Friday
2,2024-05-06 03:00:00+03:00,Monday
3,2024-05-07 03:00:00+03:00,Tuesday
4,2024-05-08 03:00:00+03:00,Wednesday
5,2024-05-09 03:00:00+03:00,Thursday
6,2024-05-10 03:00:00+03:00,Friday
7,2024-05-13 03:00:00+03:00,Monday
8,2024-05-14 03:00:00+03:00,Tuesday
9,2024-05-15 03:00:00+03:00,Wednesday
