# Working with Dates and Times

In [2]:
import pandas as pd
import datetime as dt

## Review of Python's datetime Module
- The `datetime` module is built into the core Python programming language.
- The common alias for the `datetime` module is `dt`.
- A module is a Python source file; think of it as an internal library that Python loads on demand.
- The `datetime` module includes `date` and `datetime` classes for representing dates and datetimes.
- The `date` constructor requires arguments for year, month, and day; none of them has a default.
- The `datetime` constructor accepts arguments for year, month, day, hour, minute, and second. Python defaults to 0 for any missing time values (hour, minute, second).
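A quick check of which constructor arguments can be left out. This is a minimal sketch; the variable names `christmas` and `date_needs_all_parts` are made up for illustration.

```python
import datetime as dt

# datetime fills any omitted time component (hour, minute, second) with 0.
christmas = dt.datetime(2025, 12, 25)
print(christmas.hour, christmas.minute, christmas.second)

# date has no defaults: leaving out the day raises a TypeError.
try:
    dt.date(2025, 12)
    date_needs_all_parts = False
except TypeError:
    date_needs_all_parts = True
print(date_needs_all_parts)
```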

In [3]:
dt.date(2025, 12, 25)

datetime.date(2025, 12, 25)

In [5]:
someday = dt.date(2025, 12, 25)
someday.year

2025

In [6]:
someday.month

12

In [7]:
someday.day

25

In [8]:
dt.datetime(2025, 12, 25)

datetime.datetime(2025, 12, 25, 0, 0)

In [9]:
dt.datetime(2025, 12, 25, 8)

datetime.datetime(2025, 12, 25, 8, 0)

In [10]:
dt.datetime(2025, 12, 25, 8, 13)

datetime.datetime(2025, 12, 25, 8, 13)

In [11]:
dt.datetime(2025, 12, 25, 8, 13, 59)

datetime.datetime(2025, 12, 25, 8, 13, 59)

In [12]:
sometime = dt.datetime(2025, 12, 25, 8, 13, 59)
sometime.year

2025

In [13]:
sometime.month

12

In [14]:
sometime.day

25

In [15]:
sometime.hour

8

In [16]:
sometime.minute

13

In [17]:
sometime.second

59

## The Timestamp and DatetimeIndex Objects

- Pandas ships with several classes related to datetimes.
- The **Timestamp** is similar to Python's **datetime** object (but with expanded functionality).
- A **DatetimeIndex** is an index of **Timestamp** objects.
- The **Timestamp** constructor accepts a string, a **datetime** object, or equivalent arguments to the **datetime** class.
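The "expanded functionality" can be seen by comparing a **Timestamp** with a plain `datetime`. The sketch below uses an arbitrary date and shows only a few of the extra attributes.

```python
import datetime as dt
import pandas as pd

stamp = pd.Timestamp('2025-01-01 08:25:15')

# Timestamp subclasses datetime, so the familiar attributes are available.
is_datetime = isinstance(stamp, dt.datetime)
print(is_datetime, stamp.year, stamp.hour)

# Extras that a plain datetime object lacks.
print(stamp.day_name())       # weekday as a string
print(stamp.quarter)          # calendar quarter, 1 through 4
print(stamp.is_month_start)   # True when the date is the first of a month
```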

In [19]:
pd.Timestamp(2027, 3, 12)

Timestamp('2027-03-12 00:00:00')

In [20]:
pd.Timestamp(2027, 3, 12, 18, 23, 49)

Timestamp('2027-03-12 18:23:49')

In [21]:
pd.Timestamp(dt.date(2028, 10, 23))

Timestamp('2028-10-23 00:00:00')

In [22]:
pd.Timestamp(dt.datetime(2028, 10, 23, 15, 35))

Timestamp('2028-10-23 15:35:00')

In [23]:
pd.Timestamp('2025-01-01')

Timestamp('2025-01-01 00:00:00')

In [24]:
pd.Timestamp('2025/01/01')

Timestamp('2025-01-01 00:00:00')

In [25]:
pd.Timestamp('2025-01-01 08:25:15')

Timestamp('2025-01-01 08:25:15')

In [26]:
pd.Series(pd.Timestamp('2025-03-08 07:25:15'))

0   2025-03-08 07:25:15
dtype: datetime64[ns]

In [27]:
pd.Series(pd.Timestamp('2025-03-08 07:25:15')).iloc[0]

Timestamp('2025-03-08 07:25:15')

In [28]:
pd.DatetimeIndex(['2025-01-01', '2025-02-01', '2025-03-01'])

DatetimeIndex(['2025-01-01', '2025-02-01', '2025-03-01'], dtype='datetime64[ns]', freq=None)

In [30]:
index = pd.DatetimeIndex([
    dt.date(2026,1,10),
    dt.date(2026,2,10)
])
index

DatetimeIndex(['2026-01-10', '2026-02-10'], dtype='datetime64[ns]', freq=None)

In [31]:
index[0]

Timestamp('2026-01-10 00:00:00')

## Create a Range of Dates with the pd.date_range Function
- The `date_range` function generates and returns a **DatetimeIndex** holding a sequence of dates.
- The function requires 2 of the 3 following parameters: `start`, `end`, and `periods`.
- With `start` and `end`, Pandas will assume a daily frequency unless the `freq` parameter says otherwise.
- Every element within a **DatetimeIndex** is a **Timestamp**.
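Any two of the three parameters pin down the same range. The sketch below builds one week three ways and compares the results; the variable names are illustrative only.

```python
import pandas as pd

by_end = pd.date_range(start='2025-01-01', end='2025-01-07')
by_count = pd.date_range(start='2025-01-01', periods=7)
backwards = pd.date_range(end='2025-01-07', periods=7)

# All three calls describe the same seven daily dates.
same_dates = by_end.equals(by_count) and by_end.equals(backwards)
print(same_dates)

# Pulling out a single element gives a Timestamp.
print(type(by_end[0]).__name__)
```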

In [34]:
pd.date_range(start='2025-01-01', end='2025-01-07')
pd.date_range(start='2025-01-01', end='2025-01-07', freq='D')  # same as above

DatetimeIndex(['2025-01-01', '2025-01-02', '2025-01-03', '2025-01-04',
               '2025-01-05', '2025-01-06', '2025-01-07'],
              dtype='datetime64[ns]', freq='D')

In [35]:
pd.date_range(start='2025-01-01', end='2025-01-07', freq='2D')  # every 2 days

DatetimeIndex(['2025-01-01', '2025-01-03', '2025-01-05', '2025-01-07'], dtype='datetime64[ns]', freq='2D')

In [36]:
pd.date_range(start='2025-01-01', end='2025-01-07', freq='B')  # B is short for business day, i.e. Monday through Friday

DatetimeIndex(['2025-01-01', '2025-01-02', '2025-01-03', '2025-01-06',
               '2025-01-07'],
              dtype='datetime64[ns]', freq='B')

In [38]:
pd.date_range(start='2025-01-01', end='2025-01-31', freq='W')  # W is short for weekly; it anchors on Sunday by default (W-SUN), so every Sunday in the range is returned

DatetimeIndex(['2025-01-05', '2025-01-12', '2025-01-19', '2025-01-26'], dtype='datetime64[ns]', freq='W-SUN')
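To confirm which weekday a weekly frequency lands on, it can help to ask pandas directly. A small sketch using the same range as the cell above:

```python
import pandas as pd

start = pd.Timestamp('2025-01-01')
weekly = pd.date_range(start=start, end='2025-01-31', freq='W')

print(start.day_name())                   # weekday of the start date itself
print(weekly.freqstr)                     # the anchored alias that 'W' resolves to
print(list(weekly.day_name().unique()))   # weekday(s) of the generated dates
```

The start date only marks where the search begins; the generated dates fall on the anchor weekday of the frequency.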

In [40]:
pd.date_range(start='2025-01-01', end='2025-01-31', freq='W-FRI')  # weekly increments anchored on Friday instead of the default Sunday

DatetimeIndex(['2025-01-03', '2025-01-10', '2025-01-17', '2025-01-24',
               '2025-01-31'],
              dtype='datetime64[ns]', freq='W-FRI')

In [41]:
pd.date_range(start='2025-01-01', end='2025-01-31', freq='W-THU')

DatetimeIndex(['2025-01-02', '2025-01-09', '2025-01-16', '2025-01-23',
               '2025-01-30'],
              dtype='datetime64[ns]', freq='W-THU')

In [43]:
pd.date_range(start='2025-01-01', end='2025-01-31', freq='h')  # h is short for hourly

DatetimeIndex(['2025-01-01 00:00:00', '2025-01-01 01:00:00',
               '2025-01-01 02:00:00', '2025-01-01 03:00:00',
               '2025-01-01 04:00:00', '2025-01-01 05:00:00',
               '2025-01-01 06:00:00', '2025-01-01 07:00:00',
               '2025-01-01 08:00:00', '2025-01-01 09:00:00',
               ...
               '2025-01-30 15:00:00', '2025-01-30 16:00:00',
               '2025-01-30 17:00:00', '2025-01-30 18:00:00',
               '2025-01-30 19:00:00', '2025-01-30 20:00:00',
               '2025-01-30 21:00:00', '2025-01-30 22:00:00',
               '2025-01-30 23:00:00', '2025-01-31 00:00:00'],
              dtype='datetime64[ns]', length=721, freq='h')

In [44]:
pd.date_range(start='2025-01-01', end='2025-01-31', freq='6h')  # every 6 hours

DatetimeIndex(['2025-01-01 00:00:00', '2025-01-01 06:00:00',
               '2025-01-01 12:00:00', '2025-01-01 18:00:00',
               '2025-01-02 00:00:00', '2025-01-02 06:00:00',
               '2025-01-02 12:00:00', '2025-01-02 18:00:00',
               '2025-01-03 00:00:00', '2025-01-03 06:00:00',
               ...
               '2025-01-28 18:00:00', '2025-01-29 00:00:00',
               '2025-01-29 06:00:00', '2025-01-29 12:00:00',
               '2025-01-29 18:00:00', '2025-01-30 00:00:00',
               '2025-01-30 06:00:00', '2025-01-30 12:00:00',
               '2025-01-30 18:00:00', '2025-01-31 00:00:00'],
              dtype='datetime64[ns]', length=121, freq='6h')

In [63]:
pd.date_range(start='2000-01-01', end='2020-12-31', freq='24D 3h')  # one datetime every 24 days and 3 hours

DatetimeIndex(['2000-01-01 00:00:00', '2000-01-25 03:00:00',
               '2000-02-18 06:00:00', '2000-03-13 09:00:00',
               '2000-04-06 12:00:00', '2000-04-30 15:00:00',
               '2000-05-24 18:00:00', '2000-06-17 21:00:00',
               '2000-07-12 00:00:00', '2000-08-05 03:00:00',
               ...
               '2020-05-05 12:00:00', '2020-05-29 15:00:00',
               '2020-06-22 18:00:00', '2020-07-16 21:00:00',
               '2020-08-10 00:00:00', '2020-09-03 03:00:00',
               '2020-09-27 06:00:00', '2020-10-21 09:00:00',
               '2020-11-14 12:00:00', '2020-12-08 15:00:00'],
              dtype='datetime64[ns]', length=318, freq='579h')

In [46]:
pd.date_range(start='2025-01-01', end='2025-12-31', freq='ME')  # ME is short for month end

DatetimeIndex(['2025-01-31', '2025-02-28', '2025-03-31', '2025-04-30',
               '2025-05-31', '2025-06-30', '2025-07-31', '2025-08-31',
               '2025-09-30', '2025-10-31', '2025-11-30', '2025-12-31'],
              dtype='datetime64[ns]', freq='ME')

In [47]:
pd.date_range(start='2025-01-01', end='2025-12-31', freq='MS')  # MS is short for month start

DatetimeIndex(['2025-01-01', '2025-02-01', '2025-03-01', '2025-04-01',
               '2025-05-01', '2025-06-01', '2025-07-01', '2025-08-01',
               '2025-09-01', '2025-10-01', '2025-11-01', '2025-12-01'],
              dtype='datetime64[ns]', freq='MS')

In [49]:
pd.date_range(start='2025-01-01', end='2050-12-31', freq='YE')  # YE is short for year end

DatetimeIndex(['2025-12-31', '2026-12-31', '2027-12-31', '2028-12-31',
               '2029-12-31', '2030-12-31', '2031-12-31', '2032-12-31',
               '2033-12-31', '2034-12-31', '2035-12-31', '2036-12-31',
               '2037-12-31', '2038-12-31', '2039-12-31', '2040-12-31',
               '2041-12-31', '2042-12-31', '2043-12-31', '2044-12-31',
               '2045-12-31', '2046-12-31', '2047-12-31', '2048-12-31',
               '2049-12-31', '2050-12-31'],
              dtype='datetime64[ns]', freq='YE-DEC')

In [50]:
pd.date_range(start='2025-01-01', end='2050-12-31', freq='YS')  # YS is short for year start

DatetimeIndex(['2025-01-01', '2026-01-01', '2027-01-01', '2028-01-01',
               '2029-01-01', '2030-01-01', '2031-01-01', '2032-01-01',
               '2033-01-01', '2034-01-01', '2035-01-01', '2036-01-01',
               '2037-01-01', '2038-01-01', '2039-01-01', '2040-01-01',
               '2041-01-01', '2042-01-01', '2043-01-01', '2044-01-01',
               '2045-01-01', '2046-01-01', '2047-01-01', '2048-01-01',
               '2049-01-01', '2050-01-01'],
              dtype='datetime64[ns]', freq='YS-JAN')

In [52]:
pd.date_range(start='2025-09-09', freq='D', periods=25)  # periods: how many dates to generate

DatetimeIndex(['2025-09-09', '2025-09-10', '2025-09-11', '2025-09-12',
               '2025-09-13', '2025-09-14', '2025-09-15', '2025-09-16',
               '2025-09-17', '2025-09-18', '2025-09-19', '2025-09-20',
               '2025-09-21', '2025-09-22', '2025-09-23', '2025-09-24',
               '2025-09-25', '2025-09-26', '2025-09-27', '2025-09-28',
               '2025-09-29', '2025-09-30', '2025-10-01', '2025-10-02',
               '2025-10-03'],
              dtype='datetime64[ns]', freq='D')

In [54]:
pd.date_range(start='2025-09-09', freq='3D', periods=10)  # one date every 3 days, 10 dates in total

DatetimeIndex(['2025-09-09', '2025-09-12', '2025-09-15', '2025-09-18',
               '2025-09-21', '2025-09-24', '2025-09-27', '2025-09-30',
               '2025-10-03', '2025-10-06'],
              dtype='datetime64[ns]', freq='3D')

In [55]:
pd.date_range(start='2025-09-09', freq='B', periods=10)  # Get business days only, and collect 10 business days in total

DatetimeIndex(['2025-09-09', '2025-09-10', '2025-09-11', '2025-09-12',
               '2025-09-15', '2025-09-16', '2025-09-17', '2025-09-18',
               '2025-09-19', '2025-09-22'],
              dtype='datetime64[ns]', freq='B')

In [57]:
pd.date_range(end='2025-10-31', freq='D', periods=20)  # count back from the end date to collect 20 days in total

DatetimeIndex(['2025-10-12', '2025-10-13', '2025-10-14', '2025-10-15',
               '2025-10-16', '2025-10-17', '2025-10-18', '2025-10-19',
               '2025-10-20', '2025-10-21', '2025-10-22', '2025-10-23',
               '2025-10-24', '2025-10-25', '2025-10-26', '2025-10-27',
               '2025-10-28', '2025-10-29', '2025-10-30', '2025-10-31'],
              dtype='datetime64[ns]', freq='D')

In [58]:
pd.date_range(end='2025-10-31', freq='B', periods=20)  # go backwards to collect 20 business days

DatetimeIndex(['2025-10-06', '2025-10-07', '2025-10-08', '2025-10-09',
               '2025-10-10', '2025-10-13', '2025-10-14', '2025-10-15',
               '2025-10-16', '2025-10-17', '2025-10-20', '2025-10-21',
               '2025-10-22', '2025-10-23', '2025-10-24', '2025-10-27',
               '2025-10-28', '2025-10-29', '2025-10-30', '2025-10-31'],
              dtype='datetime64[ns]', freq='B')

In [60]:
pd.date_range(end='2025-10-31', freq='W-FRI', periods=20)  # count back from the end date to collect 20 Fridays

DatetimeIndex(['2025-06-20', '2025-06-27', '2025-07-04', '2025-07-11',
               '2025-07-18', '2025-07-25', '2025-08-01', '2025-08-08',
               '2025-08-15', '2025-08-22', '2025-08-29', '2025-09-05',
               '2025-09-12', '2025-09-19', '2025-09-26', '2025-10-03',
               '2025-10-10', '2025-10-17', '2025-10-24', '2025-10-31'],
              dtype='datetime64[ns]', freq='W-FRI')

## The dt Attribute
- The `dt` attribute exposes a `DatetimeProperties` object with attributes/methods for working with datetimes. It is similar to the `str` attribute for string methods.
- The `DatetimeProperties` object has attributes like `day`, `month`, and `year` that return the corresponding value for each date in the **Series**.
- The `day_name` method returns the written day of the week.
- Attributes like `is_month_end` and `is_quarter_start` return Boolean **Series**.
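The sketch below applies a few of these `dt` attributes to three hand-picked dates, so the results are easy to verify by eye. The `summary` and `month_ends` names are illustrative.

```python
import pandas as pd

dates = pd.Series(pd.to_datetime(['2025-01-31', '2025-02-15', '2025-04-01']))

# Collect several dt results side by side.
summary = pd.DataFrame({
    'date': dates,
    'weekday': dates.dt.day_name(),
    'month_end': dates.dt.is_month_end,
    'quarter_start': dates.dt.is_quarter_start,
})
print(summary)

# A Boolean Series from dt works directly as a filter.
month_ends = dates[dates.dt.is_month_end]
print(month_ends)
```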

In [66]:
bunch_of_dates = pd.Series(pd.date_range(start='2000-01-01', end='2020-12-31', freq='24D 3h'))
bunch_of_dates.head()

0   2000-01-01 00:00:00
1   2000-01-25 03:00:00
2   2000-02-18 06:00:00
3   2000-03-13 09:00:00
4   2000-04-06 12:00:00
dtype: datetime64[ns]

In [67]:
bunch_of_dates.dt.day

0       1
1      25
2      18
3      13
4       6
       ..
313     3
314    27
315    21
316    14
317     8
Length: 318, dtype: int32

In [68]:
bunch_of_dates.dt.month

0       1
1       1
2       2
3       3
4       4
       ..
313     9
314     9
315    10
316    11
317    12
Length: 318, dtype: int32

In [69]:
bunch_of_dates.dt.year

0      2000
1      2000
2      2000
3      2000
4      2000
       ... 
313    2020
314    2020
315    2020
316    2020
317    2020
Length: 318, dtype: int32

In [70]:
bunch_of_dates.dt.hour

0       0
1       3
2       6
3       9
4      12
       ..
313     3
314     6
315     9
316    12
317    15
Length: 318, dtype: int32

In [71]:
bunch_of_dates.dt.day_of_year

0        1
1       25
2       49
3       73
4       97
      ... 
313    247
314    271
315    295
316    319
317    343
Length: 318, dtype: int32

In [73]:
bunch_of_dates.dt.day_name()

0       Saturday
1        Tuesday
2         Friday
3         Monday
4       Thursday
         ...    
313     Thursday
314       Sunday
315    Wednesday
316     Saturday
317      Tuesday
Length: 318, dtype: object

In [74]:
bunch_of_dates.dt.is_month_end

0      False
1      False
2      False
3      False
4      False
       ...  
313    False
314    False
315    False
316    False
317    False
Length: 318, dtype: bool

In [75]:
bunch_of_dates.dt.is_month_start

0       True
1      False
2      False
3      False
4      False
       ...  
313    False
314    False
315    False
316    False
317    False
Length: 318, dtype: bool

In [76]:
bunch_of_dates.dt.is_quarter_start

0       True
1      False
2      False
3      False
4      False
       ...  
313    False
314    False
315    False
316    False
317    False
Length: 318, dtype: bool

In [77]:
bunch_of_dates[bunch_of_dates.dt.is_quarter_start]

0     2000-01-01 00:00:00
106   2007-01-01 06:00:00
212   2014-01-01 12:00:00
299   2019-10-01 09:00:00
dtype: datetime64[ns]

## Selecting Rows from a DataFrame with a DatetimeIndex
- The `iloc` accessor is available for index position-based extraction.
- The `loc` accessor accepts strings or **Timestamps** to extract by index label/value. Note that Python's `datetime.date` objects will not work as labels (`datetime.datetime` objects are accepted).
- Use slice syntax to extract a sequence of dates; both endpoints are included. The `truncate` method is an alternative.
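The same selection techniques can be tried without the CSV file. The sketch below uses a hypothetical stand-in DataFrame (`prices`, with made-up values) indexed by eight business days.

```python
import pandas as pd

# Hypothetical stand-in for the stock data: 8 business days of made-up prices.
days = pd.date_range(start='2023-10-02', periods=8, freq='B')
prices = pd.DataFrame(
    {'Close': [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0]},
    index=days,
)

# A string label and a Timestamp label select the same row.
by_string = prices.loc['2023-10-04', 'Close']
by_stamp = prices.loc[pd.Timestamp(2023, 10, 4), 'Close']
print(by_string, by_stamp)

# Label slices include both endpoints; truncate keeps the same rows.
window = prices.loc['2023-10-04':'2023-10-06']
same_window = prices.truncate(before='2023-10-04', after='2023-10-06')
print(window)
print(window.equals(same_window))
```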

In [81]:
stocks = pd.read_csv('ibm.csv', parse_dates=['Date'], index_col='Date').sort_index()  # IBM stock data
stocks

Date,Open,High,Low,Close,Volume
1962-01-02,5.04610,5.04610,4.98716,4.98716,5.935630e+05
1962-01-03,4.98716,5.03292,4.98716,5.03292,4.451750e+05
1962-01-04,5.03292,5.03292,4.98052,4.98052,3.995136e+05
1962-01-05,4.97389,4.97389,4.87511,4.88166,5.593215e+05
1962-01-08,4.88166,4.88166,4.75059,4.78972,8.332738e+05
...,...,...,...,...,...
2023-10-05,140.90000,141.70000,140.19000,141.52000,3.223910e+06
2023-10-06,141.40000,142.94000,140.11000,142.03000,3.511347e+06
2023-10-09,142.30000,142.40000,140.68000,142.20000,2.354396e+06
2023-10-10,142.60000,143.41500,141.72000,142.11000,3.015784e+06


In [82]:
stocks.iloc[300]

Open           3.561240
High           3.574410
Low            3.554500
Close          3.561240
Volume    536491.781438
Name: 1963-03-12 00:00:00, dtype: float64

In [84]:
stocks.loc['2023-10-11']

Open          142.51
High          143.34
Low           142.14
Close         143.23
Volume    2511459.00
Name: 2023-10-11 00:00:00, dtype: float64

In [85]:
stocks.loc[pd.Timestamp(2023, 10, 11)]

Open          142.51
High          143.34
Low           142.14
Close         143.23
Volume    2511459.00
Name: 2023-10-11 00:00:00, dtype: float64

In [86]:
stocks.loc['2023-10-01': '2023-10-11']

Date,Open,High,Low,Close,Volume
2023-10-02,140.04,141.45,139.86,140.8,3275461.0
2023-10-03,140.87,141.64,140.0,140.39,3284421.0
2023-10-04,140.37,141.2004,139.99,141.07,2637779.0
2023-10-05,140.9,141.7,140.19,141.52,3223910.0
2023-10-06,141.4,142.94,140.11,142.03,3511347.0
2023-10-09,142.3,142.4,140.68,142.2,2354396.0
2023-10-10,142.6,143.415,141.72,142.11,3015784.0
2023-10-11,142.51,143.34,142.14,143.23,2511459.0


In [87]:
stocks.loc[pd.Timestamp(2023, 10, 1): pd.Timestamp(2023, 10, 11)]

Date,Open,High,Low,Close,Volume
2023-10-02,140.04,141.45,139.86,140.8,3275461.0
2023-10-03,140.87,141.64,140.0,140.39,3284421.0
2023-10-04,140.37,141.2004,139.99,141.07,2637779.0
2023-10-05,140.9,141.7,140.19,141.52,3223910.0
2023-10-06,141.4,142.94,140.11,142.03,3511347.0
2023-10-09,142.3,142.4,140.68,142.2,2354396.0
2023-10-10,142.6,143.415,141.72,142.11,3015784.0
2023-10-11,142.51,143.34,142.14,143.23,2511459.0


In [88]:
stocks.truncate('2023-10-01', '2023-10-11')  # same as above

Date,Open,High,Low,Close,Volume
2023-10-02,140.04,141.45,139.86,140.8,3275461.0
2023-10-03,140.87,141.64,140.0,140.39,3284421.0
2023-10-04,140.37,141.2004,139.99,141.07,2637779.0
2023-10-05,140.9,141.7,140.19,141.52,3223910.0
2023-10-06,141.4,142.94,140.11,142.03,3511347.0
2023-10-09,142.3,142.4,140.68,142.2,2354396.0
2023-10-10,142.6,143.415,141.72,142.11,3015784.0
2023-10-11,142.51,143.34,142.14,143.23,2511459.0


In [89]:
stocks.loc['2023-10-11', 'Close']

143.23

In [90]:
stocks.loc['2023-10-11', 'High': 'Close']

High     143.34
Low      142.14
Close    143.23
Name: 2023-10-11 00:00:00, dtype: float64

In [91]:
stocks.loc[pd.Timestamp(2023, 10, 1): pd.Timestamp(2023, 10, 11), 'High': 'Close']

Date,High,Low,Close
2023-10-02,141.45,139.86,140.8
2023-10-03,141.64,140.0,140.39
2023-10-04,141.2004,139.99,141.07
2023-10-05,141.7,140.19,141.52
2023-10-06,142.94,140.11,142.03
2023-10-09,142.4,140.68,142.2
2023-10-10,143.415,141.72,142.11
2023-10-11,143.34,142.14,143.23


## The DateOffset Object
- A **DateOffset** object adds time to a **Timestamp** to arrive at a new **Timestamp**.
- The **DateOffset** constructor accepts `days`, `weeks`, `months`, `years` parameters, and more.
- We can pass a **DateOffset** object to the `freq` parameter of the `pd.date_range` function.
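A **DateOffset** follows the calendar rather than adding a fixed number of seconds. The sketch below contrasts it with `pd.Timedelta`, which does not appear elsewhere in this section and is used here only for comparison.

```python
import pandas as pd

jan_31 = pd.Timestamp('2024-01-31')

# One calendar month later; February has no 31st, so the result is clipped
# to the last day of February.
plus_month = jan_31 + pd.DateOffset(months=1)

# A fixed 30-day duration ignores month lengths.
plus_30_days = jan_31 + pd.Timedelta(days=30)

# One calendar year after a leap day lands on February 28.
leap_plus_year = pd.Timestamp('2024-02-29') + pd.DateOffset(years=1)

print(plus_month, plus_30_days, leap_plus_year)
```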

In [92]:
stocks = pd.read_csv('ibm.csv', parse_dates=['Date'], index_col='Date').sort_index()
stocks.head()

Date,Open,High,Low,Close,Volume
1962-01-02,5.0461,5.0461,4.98716,4.98716,593562.955237
1962-01-03,4.98716,5.03292,4.98716,5.03292,445175.034277
1962-01-04,5.03292,5.03292,4.98052,4.98052,399513.586679
1962-01-05,4.97389,4.97389,4.87511,4.88166,559321.480565
1962-01-08,4.88166,4.88166,4.75059,4.78972,833273.771393


In [93]:
stocks.index

DatetimeIndex(['1962-01-02', '1962-01-03', '1962-01-04', '1962-01-05',
               '1962-01-08', '1962-01-09', '1962-01-10', '1962-01-11',
               '1962-01-12', '1962-01-15',
               ...
               '2023-09-28', '2023-09-29', '2023-10-02', '2023-10-03',
               '2023-10-04', '2023-10-05', '2023-10-06', '2023-10-09',
               '2023-10-10', '2023-10-11'],
              dtype='datetime64[ns]', name='Date', length=15546, freq=None)

In [94]:
pd.DateOffset(days=5)

<DateOffset: days=5>

In [95]:
stocks.index + pd.DateOffset(days=5)  # shift each date forward by 5 days

DatetimeIndex(['1962-01-07', '1962-01-08', '1962-01-09', '1962-01-10',
               '1962-01-13', '1962-01-14', '1962-01-15', '1962-01-16',
               '1962-01-17', '1962-01-20',
               ...
               '2023-10-03', '2023-10-04', '2023-10-07', '2023-10-08',
               '2023-10-09', '2023-10-10', '2023-10-11', '2023-10-14',
               '2023-10-15', '2023-10-16'],
              dtype='datetime64[ns]', name='Date', length=15546, freq=None)

In [96]:
stocks.index - pd.DateOffset(days=5)  # shift each date back by 5 days

DatetimeIndex(['1961-12-28', '1961-12-29', '1961-12-30', '1961-12-31',
               '1962-01-03', '1962-01-04', '1962-01-05', '1962-01-06',
               '1962-01-07', '1962-01-10',
               ...
               '2023-09-23', '2023-09-24', '2023-09-27', '2023-09-28',
               '2023-09-29', '2023-09-30', '2023-10-01', '2023-10-04',
               '2023-10-05', '2023-10-06'],
              dtype='datetime64[ns]', name='Date', length=15546, freq=None)

In [97]:
stocks.index + pd.DateOffset(months=5)  # shift each date forward by 5 months

DatetimeIndex(['1962-06-02', '1962-06-03', '1962-06-04', '1962-06-05',
               '1962-06-08', '1962-06-09', '1962-06-10', '1962-06-11',
               '1962-06-12', '1962-06-15',
               ...
               '2024-02-28', '2024-02-29', '2024-03-02', '2024-03-03',
               '2024-03-04', '2024-03-05', '2024-03-06', '2024-03-09',
               '2024-03-10', '2024-03-11'],
              dtype='datetime64[ns]', name='Date', length=15546, freq=None)

In [98]:
stocks.index - pd.DateOffset(months=5)  # shift each date back by 5 months

DatetimeIndex(['1961-08-02', '1961-08-03', '1961-08-04', '1961-08-05',
               '1961-08-08', '1961-08-09', '1961-08-10', '1961-08-11',
               '1961-08-12', '1961-08-15',
               ...
               '2023-04-28', '2023-04-29', '2023-05-02', '2023-05-03',
               '2023-05-04', '2023-05-05', '2023-05-06', '2023-05-09',
               '2023-05-10', '2023-05-11'],
              dtype='datetime64[ns]', name='Date', length=15546, freq=None)

In [99]:
stocks.index + pd.DateOffset(years=1)  # shift each date forward by 1 year

DatetimeIndex(['1963-01-02', '1963-01-03', '1963-01-04', '1963-01-05',
               '1963-01-08', '1963-01-09', '1963-01-10', '1963-01-11',
               '1963-01-12', '1963-01-15',
               ...
               '2024-09-28', '2024-09-29', '2024-10-02', '2024-10-03',
               '2024-10-04', '2024-10-05', '2024-10-06', '2024-10-09',
               '2024-10-10', '2024-10-11'],
              dtype='datetime64[ns]', name='Date', length=15546, freq=None)

In [100]:
stocks.index - pd.DateOffset(years=1)  # shift each date back by 1 year

DatetimeIndex(['1961-01-02', '1961-01-03', '1961-01-04', '1961-01-05',
               '1961-01-08', '1961-01-09', '1961-01-10', '1961-01-11',
               '1961-01-12', '1961-01-15',
               ...
               '2022-09-28', '2022-09-29', '2022-10-02', '2022-10-03',
               '2022-10-04', '2022-10-05', '2022-10-06', '2022-10-09',
               '2022-10-10', '2022-10-11'],
              dtype='datetime64[ns]', name='Date', length=15546, freq=None)

In [102]:
stocks.index + pd.DateOffset(hours=7)  # shift each date forward by 7 hours

DatetimeIndex(['1962-01-02 07:00:00', '1962-01-03 07:00:00',
               '1962-01-04 07:00:00', '1962-01-05 07:00:00',
               '1962-01-08 07:00:00', '1962-01-09 07:00:00',
               '1962-01-10 07:00:00', '1962-01-11 07:00:00',
               '1962-01-12 07:00:00', '1962-01-15 07:00:00',
               ...
               '2023-09-28 07:00:00', '2023-09-29 07:00:00',
               '2023-10-02 07:00:00', '2023-10-03 07:00:00',
               '2023-10-04 07:00:00', '2023-10-05 07:00:00',
               '2023-10-06 07:00:00', '2023-10-09 07:00:00',
               '2023-10-10 07:00:00', '2023-10-11 07:00:00'],
              dtype='datetime64[ns]', name='Date', length=15546, freq=None)

In [103]:
stocks.index - pd.DateOffset(hours=7)  # shift each date back by 7 hours

DatetimeIndex(['1962-01-01 17:00:00', '1962-01-02 17:00:00',
               '1962-01-03 17:00:00', '1962-01-04 17:00:00',
               '1962-01-07 17:00:00', '1962-01-08 17:00:00',
               '1962-01-09 17:00:00', '1962-01-10 17:00:00',
               '1962-01-11 17:00:00', '1962-01-14 17:00:00',
               ...
               '2023-09-27 17:00:00', '2023-09-28 17:00:00',
               '2023-10-01 17:00:00', '2023-10-02 17:00:00',
               '2023-10-03 17:00:00', '2023-10-04 17:00:00',
               '2023-10-05 17:00:00', '2023-10-08 17:00:00',
               '2023-10-09 17:00:00', '2023-10-10 17:00:00'],
              dtype='datetime64[ns]', name='Date', length=15546, freq=None)

In [106]:
stocks.index + pd.DateOffset(years=1, months=3, days=2, hours=6, minutes=30, seconds=20)  # shift each date forward by 1 year, 3 months, 2 days, 6 hours, 30 minutes, and 20 seconds

DatetimeIndex(['1963-04-04 06:30:20', '1963-04-05 06:30:20',
               '1963-04-06 06:30:20', '1963-04-07 06:30:20',
               '1963-04-10 06:30:20', '1963-04-11 06:30:20',
               '1963-04-12 06:30:20', '1963-04-13 06:30:20',
               '1963-04-14 06:30:20', '1963-04-17 06:30:20',
               ...
               '2024-12-30 06:30:20', '2024-12-31 06:30:20',
               '2025-01-04 06:30:20', '2025-01-05 06:30:20',
               '2025-01-06 06:30:20', '2025-01-07 06:30:20',
               '2025-01-08 06:30:20', '2025-01-11 06:30:20',
               '2025-01-12 06:30:20', '2025-01-13 06:30:20'],
              dtype='datetime64[ns]', name='Date', length=15546, freq=None)

In [128]:
# Find the IBM price on one specific day of every year (say, every May 5 since 1990)
pd.date_range(start='1990-05-05', end='2023-10-11', freq=pd.DateOffset(years=1))

DatetimeIndex(['1990-05-05', '1991-05-05', '1992-05-05', '1993-05-05',
               '1994-05-05', '1995-05-05', '1996-05-05', '1997-05-05',
               '1998-05-05', '1999-05-05', '2000-05-05', '2001-05-05',
               '2002-05-05', '2003-05-05', '2004-05-05', '2005-05-05',
               '2006-05-05', '2007-05-05', '2008-05-05', '2009-05-05',
               '2010-05-05', '2011-05-05', '2012-05-05', '2013-05-05',
               '2014-05-05', '2015-05-05', '2016-05-05', '2017-05-05',
               '2018-05-05', '2019-05-05', '2020-05-05', '2021-05-05',
               '2022-05-05', '2023-05-05'],
              dtype='datetime64[ns]', freq='<DateOffset: years=1>')

In [129]:
doublefivedays = pd.date_range(start='1990-05-05', end='2023-10-11', freq=pd.DateOffset(years=1))
doublefivedays

DatetimeIndex(['1990-05-05', '1991-05-05', '1992-05-05', '1993-05-05',
               '1994-05-05', '1995-05-05', '1996-05-05', '1997-05-05',
               '1998-05-05', '1999-05-05', '2000-05-05', '2001-05-05',
               '2002-05-05', '2003-05-05', '2004-05-05', '2005-05-05',
               '2006-05-05', '2007-05-05', '2008-05-05', '2009-05-05',
               '2010-05-05', '2011-05-05', '2012-05-05', '2013-05-05',
               '2014-05-05', '2015-05-05', '2016-05-05', '2017-05-05',
               '2018-05-05', '2019-05-05', '2020-05-05', '2021-05-05',
               '2022-05-05', '2023-05-05'],
              dtype='datetime64[ns]', freq='<DateOffset: years=1>')

In [130]:
stocks[stocks.index.isin(doublefivedays)]  # years in which May 5 was not a trading day (for example, a weekend) have no matching row

Date,Open,High,Low,Close,Volume
1992-05-05,15.1395,15.3815,15.1181,15.3639,12449050.0
1993-05-05,8.02285,8.10278,7.99689,8.08092,8236866.0
1994-05-05,9.38934,9.40788,9.31127,9.35031,6338064.0
1995-05-05,15.4293,15.4869,15.2058,15.243,18597180.0
1997-05-05,26.6573,27.2155,26.6164,27.1843,18897300.0
1998-05-05,38.2015,38.7138,38.2015,38.5685,8208253.0
1999-05-05,69.4408,69.5676,67.8656,69.4408,12894680.0
2000-05-05,69.8077,71.78,69.7247,70.6324,7145766.0
2003-05-05,57.2944,57.8214,56.8494,56.8494,11611210.0
2004-05-05,58.9701,58.9701,58.1542,58.4499,6700136.0
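As a cross-check on the `isin` approach, the same rows can be found by comparing the month and day of the index directly. This is a sketch on a hypothetical stand-in DataFrame (`prices`, business days only, made-up values), not the IBM data.

```python
import pandas as pd

# Hypothetical stand-in for the stock data: business days only, made-up values.
days = pd.date_range(start='2019-01-01', end='2023-12-31', freq='B')
prices = pd.DataFrame({'Close': range(len(days))}, index=days)

# Approach 1: build the target dates, then keep rows whose label is among them.
targets = pd.date_range(start='2019-05-05', end='2023-12-31', freq=pd.DateOffset(years=1))
via_isin = prices[prices.index.isin(targets)]

# Approach 2: compare the month and day components of the index.
via_parts = prices[(prices.index.month == 5) & (prices.index.day == 5)]

print(via_isin)
print(via_isin.equals(via_parts))
```

May 5, 2019 was a Sunday, so it is absent from the business-day index and from both results.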


## Specialized Date Offsets
- Pandas nests more specialized date offsets in `pd.tseries.offsets`.
- These offsets add a different amount of time to each date, moving it to a calendar boundary (for example, month end, quarter end, or year begin).
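One detail worth checking is what happens to a date that already sits on the boundary. The sketch below uses two arbitrary dates; `rollforward` is an offset method that leaves on-boundary dates unchanged.

```python
import pandas as pd
from pandas.tseries.offsets import MonthBegin, MonthEnd

mid_month = pd.Timestamp('2025-01-15')
on_boundary = pd.Timestamp('2025-01-31')

# Adding MonthEnd moves a date to the next month end strictly after it.
print(mid_month + MonthEnd())      # end of the same month
print(on_boundary + MonthEnd())    # already a month end, so it jumps to February's end

# rollforward keeps a date that is already on the boundary.
print(MonthEnd().rollforward(on_boundary))

# Subtracting MonthBegin moves back to the first day of the month.
print(mid_month - MonthBegin())
```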

In [131]:
stocks = pd.read_csv('ibm.csv', parse_dates=['Date'], index_col='Date').sort_index()
stocks.head()

Date,Open,High,Low,Close,Volume
1962-01-02,5.0461,5.0461,4.98716,4.98716,593562.955237
1962-01-03,4.98716,5.03292,4.98716,5.03292,445175.034277
1962-01-04,5.03292,5.03292,4.98052,4.98052,399513.586679
1962-01-05,4.97389,4.97389,4.87511,4.88166,559321.480565
1962-01-08,4.88166,4.88166,4.75059,4.78972,833273.771393


In [133]:
stocks.tail()

Date,Open,High,Low,Close,Volume
2023-10-05,140.9,141.7,140.19,141.52,3223910.0
2023-10-06,141.4,142.94,140.11,142.03,3511347.0
2023-10-09,142.3,142.4,140.68,142.2,2354396.0
2023-10-10,142.6,143.415,141.72,142.11,3015784.0
2023-10-11,142.51,143.34,142.14,143.23,2511459.0


In [134]:
# Roll each date forward to the last day of its month (a date already on a month end moves to the next month end)
stocks.index + pd.tseries.offsets.MonthEnd()  # tseries is pandas' time series module

DatetimeIndex(['1962-01-31', '1962-01-31', '1962-01-31', '1962-01-31',
               '1962-01-31', '1962-01-31', '1962-01-31', '1962-01-31',
               '1962-01-31', '1962-01-31',
               ...
               '2023-09-30', '2023-09-30', '2023-10-31', '2023-10-31',
               '2023-10-31', '2023-10-31', '2023-10-31', '2023-10-31',
               '2023-10-31', '2023-10-31'],
              dtype='datetime64[ns]', name='Date', length=15546, freq=None)

In [135]:
stocks.index - pd.tseries.offsets.MonthEnd()

DatetimeIndex(['1961-12-31', '1961-12-31', '1961-12-31', '1961-12-31',
               '1961-12-31', '1961-12-31', '1961-12-31', '1961-12-31',
               '1961-12-31', '1961-12-31',
               ...
               '2023-08-31', '2023-08-31', '2023-09-30', '2023-09-30',
               '2023-09-30', '2023-09-30', '2023-09-30', '2023-09-30',
               '2023-09-30', '2023-09-30'],
              dtype='datetime64[ns]', name='Date', length=15546, freq=None)

In [136]:
stocks.index + pd.tseries.offsets.QuarterEnd()

DatetimeIndex(['1962-03-31', '1962-03-31', '1962-03-31', '1962-03-31',
               '1962-03-31', '1962-03-31', '1962-03-31', '1962-03-31',
               '1962-03-31', '1962-03-31',
               ...
               '2023-09-30', '2023-09-30', '2023-12-31', '2023-12-31',
               '2023-12-31', '2023-12-31', '2023-12-31', '2023-12-31',
               '2023-12-31', '2023-12-31'],
              dtype='datetime64[ns]', name='Date', length=15546, freq=None)

In [137]:
stocks.index - pd.tseries.offsets.QuarterEnd()

DatetimeIndex(['1961-12-31', '1961-12-31', '1961-12-31', '1961-12-31',
               '1961-12-31', '1961-12-31', '1961-12-31', '1961-12-31',
               '1961-12-31', '1961-12-31',
               ...
               '2023-06-30', '2023-06-30', '2023-09-30', '2023-09-30',
               '2023-09-30', '2023-09-30', '2023-09-30', '2023-09-30',
               '2023-09-30', '2023-09-30'],
              dtype='datetime64[ns]', name='Date', length=15546, freq=None)

In [138]:
stocks.index + pd.tseries.offsets.QuarterBegin()

DatetimeIndex(['1962-03-01', '1962-03-01', '1962-03-01', '1962-03-01',
               '1962-03-01', '1962-03-01', '1962-03-01', '1962-03-01',
               '1962-03-01', '1962-03-01',
               ...
               '2023-12-01', '2023-12-01', '2023-12-01', '2023-12-01',
               '2023-12-01', '2023-12-01', '2023-12-01', '2023-12-01',
               '2023-12-01', '2023-12-01'],
              dtype='datetime64[ns]', name='Date', length=15546, freq=None)

In [139]:
stocks.index + pd.tseries.offsets.QuarterBegin(startingMonth=1)

DatetimeIndex(['1962-04-01', '1962-04-01', '1962-04-01', '1962-04-01',
               '1962-04-01', '1962-04-01', '1962-04-01', '1962-04-01',
               '1962-04-01', '1962-04-01',
               ...
               '2023-10-01', '2023-10-01', '2024-01-01', '2024-01-01',
               '2024-01-01', '2024-01-01', '2024-01-01', '2024-01-01',
               '2024-01-01', '2024-01-01'],
              dtype='datetime64[ns]', name='Date', length=15546, freq=None)

In [140]:
stocks.index - pd.tseries.offsets.QuarterBegin(startingMonth=1)

DatetimeIndex(['1962-01-01', '1962-01-01', '1962-01-01', '1962-01-01',
               '1962-01-01', '1962-01-01', '1962-01-01', '1962-01-01',
               '1962-01-01', '1962-01-01',
               ...
               '2023-07-01', '2023-07-01', '2023-10-01', '2023-10-01',
               '2023-10-01', '2023-10-01', '2023-10-01', '2023-10-01',
               '2023-10-01', '2023-10-01'],
              dtype='datetime64[ns]', name='Date', length=15546, freq=None)
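
The quarter outputs above differ because `QuarterBegin` defaults to `startingMonth=3`, so its quarters begin on the first of March, June, September, and December. Passing `startingMonth=1` switches to calendar quarters. The sketch below repeats the same arithmetic on a single **Timestamp**; the date `1962-01-02` is a hypothetical value chosen for illustration, not a row taken from `stocks`.

```python
import pandas as pd

# Hypothetical date chosen for illustration (an assumption, not a row from `stocks`).
day = pd.Timestamp('1962-01-02')

# Default startingMonth=3: quarters begin Mar 1, Jun 1, Sep 1, Dec 1.
default_quarter = day + pd.tseries.offsets.QuarterBegin()
# Timestamp('1962-03-01 00:00:00')

# startingMonth=1: quarters begin Jan 1, Apr 1, Jul 1, Oct 1.
calendar_quarter = day + pd.tseries.offsets.QuarterBegin(startingMonth=1)
# Timestamp('1962-04-01 00:00:00')

# Subtracting rolls back to the most recent quarter start.
previous_quarter = day - pd.tseries.offsets.QuarterBegin(startingMonth=1)
# Timestamp('1962-01-01 00:00:00')
```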

In [141]:
stocks.index + pd.tseries.offsets.YearEnd()

DatetimeIndex(['1962-12-31', '1962-12-31', '1962-12-31', '1962-12-31',
               '1962-12-31', '1962-12-31', '1962-12-31', '1962-12-31',
               '1962-12-31', '1962-12-31',
               ...
               '2023-12-31', '2023-12-31', '2023-12-31', '2023-12-31',
               '2023-12-31', '2023-12-31', '2023-12-31', '2023-12-31',
               '2023-12-31', '2023-12-31'],
              dtype='datetime64[ns]', name='Date', length=15546, freq=None)

In [142]:
stocks.index + pd.tseries.offsets.YearBegin()

DatetimeIndex(['1963-01-01', '1963-01-01', '1963-01-01', '1963-01-01',
               '1963-01-01', '1963-01-01', '1963-01-01', '1963-01-01',
               '1963-01-01', '1963-01-01',
               ...
               '2024-01-01', '2024-01-01', '2024-01-01', '2024-01-01',
               '2024-01-01', '2024-01-01', '2024-01-01', '2024-01-01',
               '2024-01-01', '2024-01-01'],
              dtype='datetime64[ns]', name='Date', length=15546, freq=None)

## Timedeltas
- A **Timedelta** is a pandas object that represents a duration (an amount of time).
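- Subtracting one **Timestamp** from another yields a **Timedelta** object. Likewise, subtracting one datetime **Series** from another yields a **Series** of **Timedelta** values.
- The **Timedelta** constructor accepts keyword arguments for units of time (such as `days`, `hours`, and `minutes`) as well as string descriptions (such as `'5 minutes'`).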

In [143]:
pd.Timestamp('2023-03-31') - pd.Timestamp('2023-03-20')

Timedelta('11 days 00:00:00')

In [144]:
pd.Timestamp('2023-03-31 12:30:48') - pd.Timestamp('2023-03-20 19:25:59')

Timedelta('10 days 17:04:49')

In [145]:
pd.Timestamp('2023-03-20 19:25:59') - pd.Timestamp('2023-03-31 12:30:48')

Timedelta('-11 days +06:55:11')
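
The negative result above can look odd at first: pandas keeps only the days component negative and expresses the remainder as a positive time, so `-11 days +06:55:11` is the same duration as `-(10 days 17:04:49)`. A minimal sketch that checks this, reusing the two timestamps from the cells above (the variable names are illustrative):

```python
import pandas as pd

earlier = pd.Timestamp('2023-03-20 19:25:59')
later = pd.Timestamp('2023-03-31 12:30:48')

negative = earlier - later  # Timedelta('-11 days +06:55:11')

# The days component is negative; the time-of-day part is added back on.
negative.days             # -11

# abs() recovers the positive duration from the earlier cell.
abs(negative)             # Timedelta('10 days 17:04:49')

# total_seconds() collapses the duration into a single signed number.
negative.total_seconds()  # -925489.0
```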

In [147]:
pd.Timedelta(days=3, hours=2, minutes=5)  # create a time duration

Timedelta('3 days 02:05:00')

In [148]:
pd.Timedelta('5 minutes')

Timedelta('0 days 00:05:00')

In [149]:
pd.Timedelta('3 days 2 hours 5 minutes')

Timedelta('3 days 02:05:00')
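
As a quick check on the two constructor styles, the sketch below confirms that the keyword form and the string form produce equal **Timedelta** objects, and shows that a **Timedelta** can be added to a **Timestamp** to shift it forward. The variable names and the date `2023-03-20` are illustrative.

```python
import pandas as pd

from_keywords = pd.Timedelta(days=3, hours=2, minutes=5)
from_string = pd.Timedelta('3 days 2 hours 5 minutes')

# Both constructor styles describe the same duration.
same_duration = from_keywords == from_string  # True

# Adding a Timedelta to a Timestamp moves it forward by that duration.
shifted = pd.Timestamp('2023-03-20') + from_keywords
# Timestamp('2023-03-23 02:05:00')
```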

In [164]:
ecommerce = pd.read_csv('ecommerce.csv', index_col='ID', parse_dates=['order_date', 'delivery_date'], date_format='%m/%d/%y')
ecommerce.head()

ID,order_date,delivery_date
1,1998-05-24,1999-02-05
2,1992-04-22,1998-03-06
4,1991-02-10,1992-08-26
5,1992-07-21,1997-11-20
7,1993-09-02,1998-06-10


In [165]:
ecommerce['Delivery Time'] = ecommerce['delivery_date'] - ecommerce['order_date']
ecommerce.head()

ID,order_date,delivery_date,Delivery Time
1,1998-05-24,1999-02-05,257 days
2,1992-04-22,1998-03-06,2144 days
4,1991-02-10,1992-08-26,563 days
5,1992-07-21,1997-11-20,1948 days
7,1993-09-02,1998-06-10,1742 days


In [166]:
ecommerce['If It Took Twice As Long'] = ecommerce['delivery_date'] + ecommerce['Delivery Time']
ecommerce.head()

ID,order_date,delivery_date,Delivery Time,If It Took Twice As Long
1,1998-05-24,1999-02-05,257 days,1999-10-20
2,1992-04-22,1998-03-06,2144 days,2004-01-18
4,1991-02-10,1992-08-26,563 days,1994-03-12
5,1992-07-21,1997-11-20,1948 days,2003-03-22
7,1993-09-02,1998-06-10,1742 days,2003-03-18


In [168]:
ecommerce['Delivery Time'].max()

Timedelta('3583 days 00:00:00')

In [169]:
ecommerce['Delivery Time'].min()

Timedelta('8 days 00:00:00')

In [170]:
ecommerce['Delivery Time'].mean()

Timedelta('1217 days 22:53:53.532934128')

In [171]:
ecommerce['Delivery Time'].median()

Timedelta('998 days 00:00:00')
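
Because a **Series** of **Timedelta** values supports comparisons and the `.dt` accessor, durations can also be filtered or converted to plain integers. The sketch below is a self-contained illustration under stated assumptions: it rebuilds the first three rows shown above in a small hypothetical DataFrame named `orders` rather than reading `ecommerce.csv`, and the one-year cutoff is arbitrary.

```python
import pandas as pd

# Hypothetical stand-in for the first three rows of `ecommerce` (an assumption).
orders = pd.DataFrame({
    'order_date': pd.to_datetime(['1998-05-24', '1992-04-22', '1991-02-10']),
    'delivery_date': pd.to_datetime(['1999-02-05', '1998-03-06', '1992-08-26']),
})

# Subtracting one datetime Series from another yields a Series of Timedeltas.
orders['Delivery Time'] = orders['delivery_date'] - orders['order_date']

# Compare against a Timedelta to keep deliveries longer than 365 days.
slow = orders[orders['Delivery Time'] > pd.Timedelta(days=365)]

# The .dt accessor exposes the whole-day component as integers.
days = orders['Delivery Time'].dt.days  # 257, 2144, 563
```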