# Working with Dates and Times

In [1]:
import pandas as pd
import datetime as dt

## Review of Python's datetime Module
- The `datetime` module is built into the core Python programming language.
- The common alias for the `datetime` module is `dt`.
- A module is a Python source file; think of it like an internal library that Python loads on demand.
- The `datetime` module includes `date` and `datetime` classes for representing dates and datetimes.
- The `date` constructor accepts arguments for year, month, and day. All three are required.
- The `datetime` constructor accepts arguments for year, month, day, hour, minute, and second. The year, month, and day are required; Python defaults to 0 for any missing time values.

In [2]:
day1 = dt.date(2018,11,3)

day1.day

3

In [3]:
datetime1 = dt.datetime(2018,11,3,8,0,9)

datetime1.minute

0
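As a quick check of which arguments are optional, the sketch below builds a `datetime` without any time components and then tries to build a `date` with a missing day. The variable name `midnight` is only illustrative.

```python
import datetime as dt

# Only year, month, and day are required; the time components default to 0.
midnight = dt.datetime(2018, 11, 3)
print(midnight.hour, midnight.minute, midnight.second)  # 0 0 0

# The date constructor has no defaults, so a missing day raises a TypeError.
try:
    dt.date(2018, 11)
except TypeError as error:
    print("TypeError:", error)
```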

## The Timestamp and DatetimeIndex Objects

- Pandas ships with several classes related to datetimes.
- The **Timestamp** is similar to Python's **datetime** object (but with expanded functionality).
- A **DatetimeIndex** is an index of **Timestamp** objects.
- The **Timestamp** constructor accepts a string, a **datetime** object, or equivalent arguments to the **datetime** class.

In [4]:
pd.Timestamp(2012,3,12,12,30,14)

Timestamp('2012-03-12 12:30:14')
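The three input styles listed above should all produce the same value. Here is a minimal sketch that compares them; the variable names are illustrative only.

```python
import datetime as dt
import pandas as pd

# Three ways to describe the same moment in time.
from_string = pd.Timestamp("2012-03-12 12:30:14")
from_datetime = pd.Timestamp(dt.datetime(2012, 3, 12, 12, 30, 14))
from_parts = pd.Timestamp(2012, 3, 12, 12, 30, 14)

print(from_string == from_datetime == from_parts)  # True

# A Timestamp offers extras that a plain datetime lacks.
print(from_string.day_name())  # Monday
print(from_string.quarter)     # 1
```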

## Create Range of Dates with pd.date_range Function
- The `date_range` function generates and returns a **DatetimeIndex** holding a sequence of dates.
- The function requires 2 of the 3 following parameters: `start`, `end`, and `periods`.
- With `start` and `end`, Pandas will assume a daily frequency (`freq="D"`).
- Every element within a **DatetimeIndex** is a **Timestamp**.

In [5]:
# ISO 8601 strings (YYYY-MM-DD) avoid any day/month ambiguity
pd.date_range(start="2024-01-25", end="2024-02-25")

DatetimeIndex(['2024-01-25', '2024-01-26', '2024-01-27', '2024-01-28',
               '2024-01-29', '2024-01-30', '2024-01-31', '2024-02-01',
               '2024-02-02', '2024-02-03', '2024-02-04', '2024-02-05',
               '2024-02-06', '2024-02-07', '2024-02-08', '2024-02-09',
               '2024-02-10', '2024-02-11', '2024-02-12', '2024-02-13',
               '2024-02-14', '2024-02-15', '2024-02-16', '2024-02-17',
               '2024-02-18', '2024-02-19', '2024-02-20', '2024-02-21',
               '2024-02-22', '2024-02-23', '2024-02-24', '2024-02-25'],
              dtype='datetime64[ns]', freq='D')
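The other parameter combinations can be sketched as follows. The dates reuse the range above, and the `"W-MON"` frequency (every Monday) is just one example of a non-daily alias.

```python
import pandas as pd

# start + periods: five consecutive days (freq defaults to daily).
five_days = pd.date_range(start="2024-01-25", periods=5)
print(five_days[-1])  # 2024-01-29 00:00:00

# end + periods: the five days up to and including the end date.
last_five = pd.date_range(end="2024-02-25", periods=5)
print(last_five[0])  # 2024-02-21 00:00:00

# start + end + freq: every Monday between the two dates.
mondays = pd.date_range(start="2024-01-25", end="2024-02-25", freq="W-MON")
print(list(mondays.strftime("%Y-%m-%d")))
# ['2024-01-29', '2024-02-05', '2024-02-12', '2024-02-19']
```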

## The dt Attribute
- The `dt` attribute of a datetime **Series** returns a `DatetimeProperties` object with attributes/methods for working with datetimes. It is similar to the `str` attribute for string methods.
- The `DatetimeProperties` object has attributes like `day`, `month`, and `year` to reveal information about each date in the **Series**.
- The `day_name` method returns the written day of the week.
- Attributes like `is_month_end` and `is_quarter_start` return Boolean **Series**.

In [9]:
dates = pd.Series(pd.date_range(start="2000-01-01", end="2020-01-01", freq="24D 3h"))

In [12]:
dates.dt.is_month_end

0      False
1      False
2      False
3      False
4      False
       ...  
298    False
299    False
300    False
301    False
302    False
Length: 303, dtype: bool
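Because the series above rarely lands on a month end, a smaller hand-picked sample makes the accessors easier to see. This is a minimal sketch; the three dates in `sample` are chosen for illustration.

```python
import pandas as pd

sample = pd.Series(pd.to_datetime(["2024-01-01", "2024-02-29", "2024-04-01"]))

print(sample.dt.day.tolist())               # [1, 29, 1]
print(sample.dt.month.tolist())             # [1, 2, 4]
print(sample.dt.day_name().tolist())        # ['Monday', 'Thursday', 'Monday']
print(sample.dt.is_month_end.tolist())      # [False, True, False]
print(sample.dt.is_quarter_start.tolist())  # [True, False, True]
```

Note that `day_name` is a method (it needs parentheses), while `day`, `month`, and the `is_...` flags are plain attributes.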

## Selecting Rows from a DataFrame with a DatetimeIndex
- The `iloc` accessor is available for index position-based extraction.
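- The `loc` accessor accepts strings or **Timestamps** to extract by index label/value. Python's `datetime.datetime` objects are also recognized, but plain `datetime.date` objects are not reliably supported, so prefer strings or **Timestamps**.
- Use slice syntax to extract a sequence of dates; with `loc`, both endpoints are included. The `truncate` method is an alternative.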

In [19]:
ibm = pd.read_csv("ibm.csv", parse_dates=["Date"], index_col="Date").sort_index()
ibm

| Date | Open | High | Low | Close | Volume |
|---|---|---|---|---|---|
| 1962-01-02 | 5.04610 | 5.04610 | 4.98716 | 4.98716 | 5.935630e+05 |
| 1962-01-03 | 4.98716 | 5.03292 | 4.98716 | 5.03292 | 4.451750e+05 |
| 1962-01-04 | 5.03292 | 5.03292 | 4.98052 | 4.98052 | 3.995136e+05 |
| 1962-01-05 | 4.97389 | 4.97389 | 4.87511 | 4.88166 | 5.593215e+05 |
| 1962-01-08 | 4.88166 | 4.88166 | 4.75059 | 4.78972 | 8.332738e+05 |
| ... | ... | ... | ... | ... | ... |
| 2023-10-05 | 140.90000 | 141.70000 | 140.19000 | 141.52000 | 3.223910e+06 |
| 2023-10-06 | 141.40000 | 142.94000 | 140.11000 | 142.03000 | 3.511347e+06 |
| 2023-10-09 | 142.30000 | 142.40000 | 140.68000 | 142.20000 | 2.354396e+06 |
| 2023-10-10 | 142.60000 | 143.41500 | 141.72000 | 142.11000 | 3.015784e+06 |


In [20]:
ibm.loc["1962-01-02":"1964-01-02"]

| Date | Open | High | Low | Close | Volume |
|---|---|---|---|---|---|
| 1962-01-02 | 5.04610 | 5.04610 | 4.98716 | 4.98716 | 5.935630e+05 |
| 1962-01-03 | 4.98716 | 5.03292 | 4.98716 | 5.03292 | 4.451750e+05 |
| 1962-01-04 | 5.03292 | 5.03292 | 4.98052 | 4.98052 | 3.995136e+05 |
| 1962-01-05 | 4.97389 | 4.97389 | 4.87511 | 4.88166 | 5.593215e+05 |
| 1962-01-08 | 4.88166 | 4.88166 | 4.75059 | 4.78972 | 8.332738e+05 |
| ... | ... | ... | ... | ... | ... |
| 1963-12-26 | 4.23794 | 4.28400 | 4.23794 | 4.27746 | 7.876144e+05 |
| 1963-12-27 | 4.30352 | 4.34315 | 4.30352 | 4.30352 | 1.643723e+06 |
| 1963-12-30 | 4.32343 | 4.36266 | 4.32343 | 4.34949 | 2.613974e+06 |
| 1963-12-31 | 4.36266 | 4.44171 | 4.36266 | 4.42171 | 2.648217e+06 |
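To compare the extraction options side by side without needing `ibm.csv`, here is a minimal sketch on a small made-up DataFrame. The `prices` data is hypothetical and not taken from the IBM dataset.

```python
import pandas as pd

# Hypothetical closing prices on five consecutive days (not from ibm.csv).
prices = pd.DataFrame(
    {"Close": [10.0, 10.5, 11.0, 10.8, 11.2]},
    index=pd.date_range(start="2020-01-01", periods=5, name="Date"),
)

# iloc extracts by position.
first_row = prices.iloc[0]

# loc extracts by label, given as a string or a Timestamp.
close_by_string = prices.loc["2020-01-03", "Close"]
close_by_timestamp = prices.loc[pd.Timestamp("2020-01-03"), "Close"]

# A loc slice includes both endpoints; truncate selects the same rows.
window = prices.loc["2020-01-02":"2020-01-04"]
truncated = prices.truncate(before="2020-01-02", after="2020-01-04")

print(len(window))               # 3
print(window.equals(truncated))  # True
```

`truncate` expects a sorted index, which is one reason the `ibm` DataFrame above is loaded with `sort_index()`.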


## The DateOffset Object
- A **DateOffset** object adds time to a **Timestamp** to arrive at a new **Timestamp**.
- The **DateOffset** constructor accepts `days`, `weeks`, `months`, `years` parameters, and more.
- We can pass a **DateOffset** object to the `freq` parameter of the `pd.date_range` function.

In [24]:
offset = pd.DateOffset(days=5)
ibm.index + offset  # returns a new DatetimeIndex shifted by 5 days

DatetimeIndex(['1962-01-07', '1962-01-08', '1962-01-09', '1962-01-10',
               '1962-01-13', '1962-01-14', '1962-01-15', '1962-01-16',
               '1962-01-17', '1962-01-20',
               ...
               '2023-10-03', '2023-10-04', '2023-10-07', '2023-10-08',
               '2023-10-09', '2023-10-10', '2023-10-11', '2023-10-14',
               '2023-10-15', '2023-10-16'],
              dtype='datetime64[ns]', name='Date', length=15546, freq=None)

In [29]:
pd.date_range(start="2000-07-20", end="2004-07-21", freq=pd.DateOffset(years=1))

DatetimeIndex(['2000-07-20', '2001-07-20', '2002-07-20', '2003-07-20',
               '2004-07-20'],
              dtype='datetime64[ns]', freq='<DateOffset: years=1>')
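A **DateOffset** also works on a single **Timestamp**, and it follows the calendar rather than adding a fixed number of hours. The sketch below uses an arbitrary month-end date to show how a one-month offset is clipped to the last valid day.

```python
import pandas as pd

start = pd.Timestamp("2024-01-31")

print(start + pd.DateOffset(days=5))    # 2024-02-05 00:00:00
# There is no February 31, so the result is clipped to the end of February.
print(start + pd.DateOffset(months=1))  # 2024-02-29 00:00:00
# Subtraction moves backwards in time.
print(start - pd.DateOffset(years=1))   # 2023-01-31 00:00:00
```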

## Specialized Date Offsets
- Pandas nests more specialized date offsets in `pd.tseries.offsets`.
- These offsets move each date by a different amount of time, for example to the next month end, quarter end, or year start.
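As a minimal sketch of this behavior, the example below applies three of the specialized offsets to a few arbitrary dates. The `sample_dates` values are illustrative, and `QuarterEnd` is used with its default calendar quarters (ending in March, June, September, and December).

```python
import pandas as pd
from pandas.tseries.offsets import MonthEnd, QuarterEnd, YearBegin

sample_dates = pd.Series(pd.to_datetime(["2024-01-15", "2024-02-10", "2024-11-20"]))

def as_text(series):
    """Format a datetime Series as plain YYYY-MM-DD strings for printing."""
    return series.dt.strftime("%Y-%m-%d").tolist()

# Each date moves forward by a different number of days.
print(as_text(sample_dates + MonthEnd()))    # ['2024-01-31', '2024-02-29', '2024-11-30']
print(as_text(sample_dates + QuarterEnd()))  # ['2024-03-31', '2024-03-31', '2024-12-31']
print(as_text(sample_dates + YearBegin()))   # ['2025-01-01', '2025-01-01', '2025-01-01']

# A date that already sits on a month end rolls to the next month end.
print(pd.Timestamp("2024-01-31") + MonthEnd())  # 2024-02-29 00:00:00
```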

## Timedeltas
- A **Timedelta** is a pandas object that represents a duration (an amount of time).
- Subtracting one **Timestamp** from another yields a **Timedelta** object. The same applies when subtracting one datetime **Series** from another, which yields a **Series** of **Timedelta** values.
- The **Timedelta** constructor accepts parameters for time as well as string descriptions.
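The points above can be sketched as follows. The timestamps and the `shipped`/`delivered` names are made up for illustration.

```python
import pandas as pd

shipped = pd.Timestamp("2024-03-01 08:00")
delivered = pd.Timestamp("2024-03-04 20:30")

# Subtracting two Timestamps yields a Timedelta.
duration = delivered - shipped
print(duration)                  # 3 days 12:30:00
print(duration.days)             # 3
print(duration.total_seconds())  # 304200.0

# The same duration built from keyword arguments and from a string.
from_keywords = pd.Timedelta(days=3, hours=12, minutes=30)
from_string = pd.Timedelta("3 days 12:30:00")
print(duration == from_keywords == from_string)  # True

# Subtracting one datetime Series from another yields a Series of Timedeltas.
starts = pd.Series(pd.to_datetime(["2024-01-01", "2024-01-10"]))
ends = pd.Series(pd.to_datetime(["2024-01-05", "2024-01-31"]))
print((ends - starts).dt.days.tolist())  # [4, 21]
```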