# Working with Dates and Times

In [2]:
import pandas as pd
import datetime as dt

## Review of Python's datetime Module
- The `datetime` module is built into the core Python programming language.
- The common alias for the `datetime` module is `dt`.
- A module is a Python source file; think of it like an internal library that Python loads on demand.
- The `datetime` module includes `date` and `datetime` classes for representing dates and datetimes.
- The `date` constructor accepts arguments for year, month, and day. All three are required; there are no defaults.
- The `datetime` constructor accepts arguments for year, month, day, hour, minute, second, and microsecond. The date parts are required; Python defaults to 0 for any missing time values.

In [19]:
someday= dt.date(2025, 1, 28)

print(someday.day)
print(someday.month)
print(someday.year)
print('-'*10, end= '\n\n')

dt.datetime(2025, 1, 28)
dt.datetime(2025, 1, 28, 12, 9)
dt.datetime(2025, 1, 28, 12, 9, 43, 11)

somedatetime= dt.datetime(2025, 1, 28, 12, 9, 43, 11)
print(somedatetime.year)
print(somedatetime.month)
print(somedatetime.day)
print(somedatetime.hour)
print(somedatetime.minute)
print(somedatetime.second)
print(somedatetime.microsecond)

28
1
2025
----------

2025
1
28
12
9
43
11
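
A quick check of how the two constructors treat missing arguments may help here. This is a minimal sketch; the variable names `date_error` and `midnight` are only illustrative.

```python
import datetime as dt

# date() has no defaults: leaving out the day raises a TypeError.
try:
    dt.date(2025, 1)
    date_error = None
except TypeError as exc:
    date_error = exc

# datetime() requires year, month, and day; omitted time parts default to 0.
midnight = dt.datetime(2025, 1, 28)

print(type(date_error).__name__)
print(midnight.hour, midnight.minute, midnight.second, midnight.microsecond)
```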


## The Timestamp and DatetimeIndex Objects

- Pandas ships with several classes related to datetimes.
- The **Timestamp** is similar to Python's **datetime** object (but with expanded functionality).
- A **DatetimeIndex** is an index of **Timestamp** objects.
- The **Timestamp** constructor accepts a string, a **datetime** object, or equivalent arguments to the **datetime** class.

In [28]:
pd.Timestamp(2024, 3, 12)
pd.Timestamp(2025, 1, 31, 13, 47, 23)
pd.Timestamp(dt.date(2015, 11, 10))
pd.Timestamp(dt.datetime(2018, 10, 15, 16, 12, 48))
pd.Timestamp('2023-12-25')
pd.Timestamp('05/11/2001') # MM/dd/yyyy
pd.Timestamp('2024-10-29 15:42:33')

Timestamp('2024-10-29 15:42:33')

In [33]:
pd.Series([pd.Timestamp('2024-10-29 15:42:33')]).iloc[0]

Timestamp('2024-10-29 15:42:33')

In [37]:
pd.DatetimeIndex(['2005-10-12', '2010-11-24', '2029-06-23'])

index= pd.DatetimeIndex(['2005-10-12', '2010-11-24', '2029-06-23'])
index

DatetimeIndex(['2005-10-12', '2010-11-24', '2029-06-23'], dtype='datetime64[ns]', freq=None)

In [39]:
type(index)
type(index[0])

pandas._libs.tslibs.timestamps.Timestamp

- A **DatetimeIndex** behaves like a sequence of **Timestamp** objects; the same values can also be stored in an ordinary **Series**.
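
The relationship between a **DatetimeIndex** and a **Series** of timestamps can be sketched as follows. The names `labels`, `as_series`, and `parsed` are illustrative, and `pd.to_datetime` is used here as one common way to parse a list of date strings.

```python
import pandas as pd

labels = ['2005-10-12', '2010-11-24', '2029-06-23']
index = pd.DatetimeIndex(labels)

# Wrapping the index in a Series keeps a datetime dtype;
# each element comes back as a Timestamp.
as_series = pd.Series(index)

# pd.to_datetime parses a list of strings into a DatetimeIndex as well.
parsed = pd.to_datetime(labels)

print(as_series.dtype)
print(type(as_series.iloc[0]))
print(type(parsed))
```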

## Create a Range of Dates with the pd.date_range Function
- The `date_range` function generates and returns a **DatetimeIndex** holding a sequence of dates.
- The function requires 2 of the 3 following parameters: `start`, `end`, and `periods` (with `freq` either left at its default or passed explicitly).
- With only `start` and `end`, pandas assumes a daily frequency (`freq='D'`).
- Every element within a **DatetimeIndex** is a **Timestamp**.

In [63]:
pd.date_range(start= '2025-01-01', end= '2025-01-07')
pd.date_range(start= '2025-01-01', end= '2025-01-07', freq= 'D') # same result as the line above ('D' is the default)
pd.date_range(start= '2025-01-01', end= '2025-01-07', freq= '2d') # every second day
pd.date_range(start= '2014-11-20', end= '2014-12-31', freq= 'B') # business days (Monday-Friday)
pd.date_range(start= '2025-01-01', end= '2025-01-31', freq= 'W') # weekly, one date per week, each falling on a Sunday
pd.date_range(start= '2025-01-01', end= '2025-01-31', freq= 'W-TUE') # weekly, one date per week, each falling on a Tuesday

pd.date_range(start= '2024-01-01', end= '2024-01-31', freq='H') # hourly values; pandas >= 2.2 deprecates 'H' in favor of 'h'
pd.date_range(start= '2024-01-01', end= '2024-01-31', freq='6H') # every 6 hours

pd.date_range(start= '2025-01-01', end= '2025-12-31', freq= 'M') # monthly, always the last day of each month; pandas >= 2.2 prefers 'ME'
pd.date_range(start= '2025-01-01', end= '2025-12-31', freq= 'MS') # monthly, always the first day of each month
pd.date_range(start= '2025-01-01', end= '2050-12-31', freq='A') # annually, always the last day of each year; pandas >= 2.2 prefers 'YE'
pd.date_range(start= '2025-01-01', end= '2050-12-31', freq='AS-FEB') # annually, always February 1st; pandas >= 2.2 prefers 'YS-FEB'

pd.date_range(start='2025-01-28', freq= 'D', periods= 25) # 25 calendar days, beginning at the start date
pd.date_range(start= '2025-01-28', freq='B', periods= 50) # 50 business days, beginning at the start date
pd.date_range(end= '2025-01-28', freq='D', periods= 28) # 28 calendar days, counting backwards so the last one is the end date

DatetimeIndex(['2025-01-01', '2025-01-02', '2025-01-03', '2025-01-04',
               '2025-01-05', '2025-01-06', '2025-01-07', '2025-01-08',
               '2025-01-09', '2025-01-10', '2025-01-11', '2025-01-12',
               '2025-01-13', '2025-01-14', '2025-01-15', '2025-01-16',
               '2025-01-17', '2025-01-18', '2025-01-19', '2025-01-20',
               '2025-01-21', '2025-01-22', '2025-01-23', '2025-01-24',
               '2025-01-25', '2025-01-26', '2025-01-27', '2025-01-28'],
              dtype='datetime64[ns]', freq='D')
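
Two further behaviors of `date_range` are worth checking: passing `start`, `end`, and `periods` together yields evenly spaced dates, and passing too few parameters raises an error. A minimal sketch, with `evenly` and `range_error` as illustrative names:

```python
import pandas as pd

# start + end + periods: pandas spaces the dates evenly between the endpoints.
evenly = pd.date_range(start='2025-01-01', end='2025-01-07', periods=3)
print(evenly)

# start alone is not enough to define a range, so pandas raises a ValueError.
try:
    pd.date_range(start='2025-01-01')
    range_error = None
except ValueError as exc:
    range_error = exc

print(type(range_error).__name__)
```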

## The dt Attribute
- The `dt` attribute reveals a `DatetimeProperties` object with attributes/methods for working with datetimes. It is similar to the `str` attribute for string methods.
- The `DatetimeProperties` object has attributes like `day`, `month`, and `year` to reveal information about each date in the **Series**.
- The `day_name` method returns the written day of the week.
- Attributes like `is_month_end` and `is_quarter_start` return Boolean **Series**.

In [69]:
bunch_of_dates= pd.Series(pd.date_range(start= '2000-01-01', end= '2020-12-31', freq= '24D 4.5H'))
bunch_of_dates

0     2000-01-01 00:00:00
1     2000-01-25 04:30:00
2     2000-02-18 09:00:00
3     2000-03-13 13:30:00
4     2000-04-06 18:00:00
              ...        
313   2020-09-22 16:30:00
314   2020-10-16 21:00:00
315   2020-11-10 01:30:00
316   2020-12-04 06:00:00
317   2020-12-28 10:30:00
Length: 318, dtype: datetime64[ns]

In [78]:
bunch_of_dates.dt.day
bunch_of_dates.dt.month
bunch_of_dates.dt.year
bunch_of_dates.dt.hour
bunch_of_dates.dt.day_of_year
bunch_of_dates.dt.day_of_week.map({0: 'Monday', 1: 'Tuesday', 2: 'Wednesday', 3: 'Thursday', 4: 'Friday', 5: 'Saturday', 6: 'Sunday'}) # pandas counts Monday as 0 and Sunday as 6
bunch_of_dates.dt.day_name()

0      Saturday
1       Tuesday
2        Friday
3        Monday
4      Thursday
         ...   
313     Tuesday
314      Friday
315     Tuesday
316      Friday
317      Monday
Length: 318, dtype: object

In [82]:
bunch_of_dates.dt.day_of_week

0      5
1      1
2      4
3      0
4      3
      ..
313    1
314    4
315    1
316    4
317    0
Length: 318, dtype: int32

In [81]:
sorted(bunch_of_dates.dt.day_of_week.unique())

[0, 1, 2, 3, 4, 5, 6]
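
The Boolean attributes mentioned above are easiest to see on a small **Series**. The following is a sketch with hand-picked, hypothetical dates; `dates` and `month_end_dates` are illustrative names.

```python
import pandas as pd

dates = pd.Series(pd.to_datetime(['2024-01-01', '2024-01-31', '2024-04-01', '2024-06-15']))

# Each attribute returns one Boolean per date.
print(dates.dt.is_month_end)       # True only for 2024-01-31
print(dates.dt.is_quarter_start)   # True for 2024-01-01 and 2024-04-01

# A Boolean Series can be used directly as a filter.
month_end_dates = dates[dates.dt.is_month_end]
print(month_end_dates)
```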

## Selecting Rows from a DataFrame with a DatetimeIndex
- The `iloc` accessor is available for index position-based extraction.
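- The `loc` accessor accepts strings or **Timestamps** to extract by index label/value. Python's `datetime.datetime` objects are accepted as well, but plain `datetime.date` objects may not match reliably, so converting to a **Timestamp** first is the safer choice.
- Use slice syntax inside `loc` (for example, `df.loc['2025-01-03':'2025-01-05']`) to extract a sequence of dates; both endpoints are included. The `truncate` method is an alternative.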
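
The accessors above can be sketched on a small, made-up **DataFrame**. The frame `df`, its `value` column, and the dates are hypothetical and exist only for illustration.

```python
import pandas as pd

# Hypothetical data: ten consecutive days with a counter column.
idx = pd.date_range(start='2025-01-01', periods=10, freq='D')
df = pd.DataFrame({'value': range(10)}, index=idx)

first_row = df.iloc[0]                                # by position
by_string = df.loc['2025-01-03']                      # by label, as a string
by_timestamp = df.loc[pd.Timestamp('2025-01-03')]     # by label, as a Timestamp

# Label slices on a DatetimeIndex include both endpoints.
window = df.loc['2025-01-03':'2025-01-05']

# truncate drops everything before/after the given dates.
truncated = df.truncate(before='2025-01-03', after='2025-01-05')

print(window)
```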

## The DateOffset Object
- A **DateOffset** object adds time to a **Timestamp** to arrive at a new **Timestamp**.
- The **DateOffset** constructor accepts `days`, `weeks`, `months`, `years` parameters, and more.
- We can pass a **DateOffset** object to the `freq` parameter of the `pd.date_range` function.
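
A short sketch of **DateOffset** arithmetic follows. The starting date is arbitrary and the variable names are illustrative.

```python
import pandas as pd

ts = pd.Timestamp('2025-01-31')

plus_days = ts + pd.DateOffset(days=5)      # calendar-aware addition
plus_month = ts + pd.DateOffset(months=1)   # February has no 31st, so the day is clipped
minus_year = ts - pd.DateOffset(years=1)    # offsets can be subtracted too

# A DateOffset also works as the freq of pd.date_range.
every_two_weeks = pd.date_range(start='2025-01-01', periods=4, freq=pd.DateOffset(weeks=2))

print(plus_days, plus_month, minus_year)
print(every_two_weeks)
```

Note that adding one month to January 31 lands on the last day of February rather than spilling into March.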

## Specialized Date Offsets
- Pandas nests more specialized date offsets in `pd.tseries.offsets`.
- These offsets add a different amount of time to each date, moving it forward to the next anchor point (for example, month end, quarter end, or year begin).
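
As an illustration, the sketch below applies three of the offsets in `pd.tseries.offsets` to a small **Series** of hypothetical dates. Each date moves by a different amount because each is a different distance from the next anchor.

```python
import pandas as pd
from pandas.tseries.offsets import MonthEnd, QuarterEnd, YearBegin

dates = pd.Series(pd.to_datetime(['2025-01-15', '2025-02-10', '2025-11-30']))

# Roll each date forward to the next month end. A date that already sits
# on a month end (2025-11-30) moves to the following month end.
month_ends = dates + MonthEnd()

# Same idea for quarter ends (March, June, September, December by default)
# and for the start of the next year.
quarter_ends = dates + QuarterEnd()
year_starts = dates + YearBegin()

print(month_ends)
print(quarter_ends)
print(year_starts)
```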

## Timedeltas
- A **Timedelta** is a pandas object that represents a duration (an amount of time).
- Subtracting two **Timestamp** objects yields a **Timedelta** object. The same applies element-wise when subtracting one datetime **Series** from another, which produces a **Series** of **Timedelta** values.
- The **Timedelta** constructor accepts parameters for time as well as string descriptions.
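
The points above can be sketched as follows. The `shipped` and `ordered` Series hold made-up dates, and all variable names are illustrative.

```python
import pandas as pd

# Subtracting two Timestamps gives a Timedelta.
gap = pd.Timestamp('2025-03-31') - pd.Timestamp('2025-01-01')

# The constructor accepts keyword arguments or a string description.
from_kwargs = pd.Timedelta(days=3, hours=12)
from_string = pd.Timedelta('3 days 12:00:00')

# Subtracting one datetime Series from another works element-wise.
shipped = pd.Series(pd.to_datetime(['2025-01-05', '2025-01-10']))
ordered = pd.Series(pd.to_datetime(['2025-01-01', '2025-01-03']))
delivery_time = shipped - ordered

# A Timedelta can be added back to a Timestamp.
later = pd.Timestamp('2025-01-01') + from_kwargs

print(gap)
print(from_kwargs == from_string)
print(delivery_time)
print(later)
```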