
# **Pandas Datetime**



## *Creating datetime objects in pandas*

In this lesson, we will learn:
+ the different datetime formats in pandas.
+ how to create datetime objects using the `to_datetime()` function.
+ how to set a datetime object as the index of a pandas DataFrame.

### **Understanding datetime formats in pandas**
Pandas can parse datetime values from a variety of input formats. The `%`-style entries in the list are `strftime` format codes, which can be passed to `to_datetime()` through its `format` argument.

+ ISO 8601 format: `YYYY-MM-DDTHH:MM:SS.ssssss`
+ Unix timestamp: the number of seconds since January 1, 1970 (also known as "epoch time")
+ Python datetime object: a `datetime` object from Python's built-in `datetime` module
+ `%Y-%m-%d %H:%M:%S`: the ISO 8601 layout with a space instead of `T` between the date and the time. It includes the year, month, day, hour, minute, and second in the form `YYYY-MM-DD HH:MM:SS`. Example: `'2022-01-01 10:30:00'`.
+ `%Y/%m/%d %H:%M:%S`: a forward slash-separated date. It includes the year, month, day, hour, minute, and second in the form `YYYY/MM/DD HH:MM:SS`. Example: `'2022/01/01 10:30:00'`.
+ `%m/%d/%Y %H:%M:%S`: a month/day/year date. It includes the month, day, year, hour, minute, and second in the form `MM/DD/YYYY HH:MM:SS`. Example: `'01/01/2022 10:30:00'`.
+ `%d-%b-%Y %H:%M:%S`: a day-month-year date with an abbreviated month name. It includes the day, abbreviated month name, year, hour, minute, and second in the form `DD-MMM-YYYY HH:MM:SS`. Example: `'01-Jan-2022 10:30:00'`.
+ `%A, %B %d, %Y %I:%M:%S %p`: a written-out date with the time on a 12-hour clock. It includes the day of the week, full month name, day, year, hour, minute, second, and AM/PM marker in the form `Weekday, Month DD, YYYY HH:MM:SS AM/PM`. Example: `'Saturday, January 01, 2022 10:30:00 AM'`.

These are just a few examples of datetime formats in pandas. The format of the datetime value will depend on the data source and the specific requirements of the analysis or application.

Pandas provides a variety of datetime functions to manipulate and transform datetime values, allowing users to work with datetime data in a flexible and powerful way.
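
As a rough illustration of how the formats listed above might be parsed, the sketch below passes each example string to `pd.to_datetime()` together with its format string. The `format` and `unit` arguments belong to `to_datetime()`; the variable names (`samples`, `parsed`, `unix_ts`, `py_ts`) are made up for this example, and parsing day and month names assumes an English locale.

```python
import pandas as pd
from datetime import datetime

# Format strings and example values taken from the list above
samples = {
    '%Y-%m-%d %H:%M:%S': '2022-01-01 10:30:00',
    '%Y/%m/%d %H:%M:%S': '2022/01/01 10:30:00',
    '%m/%d/%Y %H:%M:%S': '01/01/2022 10:30:00',
    '%d-%b-%Y %H:%M:%S': '01-Jan-2022 10:30:00',
    '%A, %B %d, %Y %I:%M:%S %p': 'Saturday, January 01, 2022 10:30:00 AM',
}

# Parse each string with its explicit format
parsed = {fmt: pd.to_datetime(text, format=fmt) for fmt, text in samples.items()}
for fmt, ts in parsed.items():
    print(fmt, '->', ts)

# Unix timestamp: seconds since January 1, 1970
unix_ts = pd.to_datetime(1640995200, unit='s')
print(unix_ts)

# Python datetime object from the built-in datetime module
py_ts = pd.to_datetime(datetime(2022, 1, 1, 10, 30))
print(py_ts)
```

All five strings describe the same moment, so they should parse to the same `Timestamp`.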







### **Creating datetime objects using `to_datetime()`**
The `to_datetime()` function in pandas creates datetime objects from a variety of inputs. A single value returns a `Timestamp`, and a list of values returns a `DatetimeIndex`.



In [1]:
import pandas as pd

# create a datetime object from a string in ISO 8601 format:
date_str = '2022-03-24T12:30:00.000000'
date_obj = pd.to_datetime(date_str)
print(date_obj)

2022-03-24 12:30:00


In [2]:
# Convert a list of strings to datetime values
date_list = ['2022-01-01', '2022-01-02', '2022-01-03']
date_objs = pd.to_datetime(date_list)
print(date_objs)

DatetimeIndex(['2022-01-01', '2022-01-02', '2022-01-03'], dtype='datetime64[ns]', freq=None)
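
Real data often contains values that cannot be parsed. One possible way to handle them, sketched here with an invented list, is the `errors='coerce'` option of `to_datetime()`, which turns unparseable entries into `NaT` (the pandas marker for a missing datetime) instead of raising an error. The names `raw_dates`, `clean_dates`, and `n_missing` are hypothetical.

```python
import pandas as pd

# Hypothetical input: the last entry is not a valid date
raw_dates = ['2022-01-01', '2022-01-02', 'not a date']

# errors='coerce' replaces values that cannot be parsed with NaT
clean_dates = pd.to_datetime(raw_dates, errors='coerce')
print(clean_dates)

# Count how many entries failed to parse
n_missing = clean_dates.isna().sum()
print(n_missing)
```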


### **Setting a datetime object as the index of a pandas DataFrame**

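Once we have created datetime objects using `to_datetime()`, we can use them as the index of a pandas DataFrame. The cell below passes them as the `index` argument when constructing the DataFrame; when the dates are already stored in a column, the `set_index()` method does the same job.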


In [3]:
dates = ['2022-03-24', '2022-03-25', '2022-03-26', '2022-03-27', '2022-03-28']
data = [10, 20, 30, 40, 50]

df = pd.DataFrame({'data': data}, index=pd.to_datetime(dates))
print(df)

            data
2022-03-24    10
2022-03-25    20
2022-03-26    30
2022-03-27    40
2022-03-28    50
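
For the case where the dates arrive as a column rather than as a separate list, a minimal sketch using `set_index()` might look like this. The DataFrame and the names `df_col`, `df_indexed`, and `value_on_25th` are invented for illustration.

```python
import pandas as pd

# Hypothetical DataFrame where the dates start out as a column of strings
df_col = pd.DataFrame({
    'date': ['2022-03-24', '2022-03-25', '2022-03-26'],
    'data': [10, 20, 30],
})

# Convert the string column to datetime values
df_col['date'] = pd.to_datetime(df_col['date'])

# Move the datetime column into the index
df_indexed = df_col.set_index('date')
print(df_indexed)

# Rows can now be looked up by their timestamp
value_on_25th = df_indexed.loc[pd.Timestamp('2022-03-25'), 'data']
print(value_on_25th)
```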


## *Working with datetime objects in pandas*
In this lesson, we will learn how to manipulate datetime objects in pandas, including:
+ common operations such as extracting year, month, and day values.
+ resampling time series data using the `resample()` method.

### **Extracting values from datetime objects**

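One of the most common operations when working with datetime objects is extracting specific values such as year, month, and day. A single `Timestamp` exposes these directly as attributes such as `.year`; for a Series of datetime values, the same fields are available through the `dt` accessor.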


In [4]:
date_str = '2023-09-18T12:30:00.000000'
date_obj = pd.to_datetime(date_str)

print(date_obj.year)
print(date_obj.month)
print(date_obj.day)
print(date_obj.weekday())  # day of the week as an integer: Monday is 0, Sunday is 6

2023
9
18
0
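
The `dt` accessor applies to a Series of datetime values rather than to a single `Timestamp`. A small sketch, with an invented Series named `dates` and a summary table named `parts`, could be:

```python
import pandas as pd

# Hypothetical Series of datetime values
dates = pd.Series(pd.to_datetime(['2023-09-18', '2023-12-25', '2024-02-29']))

# The .dt accessor exposes datetime fields for every element at once
parts = pd.DataFrame({
    'year': dates.dt.year,
    'month': dates.dt.month,
    'day': dates.dt.day,
    'weekday': dates.dt.weekday,  # Monday is 0, Sunday is 6
})
print(parts)
```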


### **Resampling time series data**
Another common operation when working with time series is resampling. We can use the `resample()` method in pandas to aggregate data to a lower frequency (downsampling) or interpolate it to a higher one (upsampling). The frequency strings used in this section, such as `'W'` and `'MS'`, are described in the pandas time series user guide: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#dateoffset-objects


In [5]:
import pandas as pd
import numpy as np

# Create a DataFrame with a datetime index and random values
np.random.seed(123)
date_range = pd.date_range(start='2022-01-01', end='2022-03-31', freq='D')
df = pd.DataFrame({'value': np.random.rand(len(date_range))}, index=date_range)
print(df.head())
print("\n\n")

               value
2022-01-01  0.696469
2022-01-02  0.286139
2022-01-03  0.226851
2022-01-04  0.551315
2022-01-05  0.719469





In [6]:
# Resample to weekly frequency (each bin is labeled by the Sunday that ends the week) and calculate the mean
weekly_mean = df.resample('W').mean()
print(weekly_mean)

               value
2022-01-02  0.491304
2022-01-09  0.581038
2022-01-16  0.442662
2022-01-23  0.518516
2022-01-30  0.453024
2022-02-06  0.373542
2022-02-13  0.544438
2022-02-20  0.548338
2022-02-27  0.531382
2022-03-06  0.683295
2022-03-13  0.405466
2022-03-20  0.404292
2022-03-27  0.630915
2022-04-03  0.275986


In [7]:
# Resample to monthly frequency and calculate the sum
monthly_sum = df.resample('MS').sum()
print(monthly_sum)


                value
2022-01-01  15.041394
2022-02-01  14.767261
2022-03-01  15.096258
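
The two cells above aggregate to a lower frequency. For the opposite direction, a minimal sketch of upsampling with interpolation is shown here, assuming a small invented series (`sparse`) observed every second day. It uses the `interpolate()` method of the resampler, which fills values linearly by default.

```python
import pandas as pd

# Hypothetical series observed every second day
sparse = pd.Series(
    [0.0, 2.0, 4.0],
    index=pd.date_range(start='2022-01-01', periods=3, freq='2D'),
)

# Upsample to daily frequency; the new days are filled by linear interpolation
daily = sparse.resample('D').interpolate()
print(daily)
```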


### *DateOffset*

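`DateOffset` is a pandas class that represents a calendar-aware duration such as "one month" or "one year". Unlike a fixed-length `Timedelta`, the amount of time it adds depends on the date it is applied to.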

#### **Creating a DateOffset object**
You can create a `DateOffset` object by passing the number of days, months, or years as keyword arguments. Here are some examples:


In [8]:
import pandas as pd

# Create a DateOffset object representing 1 day
dof = pd.DateOffset(days=1)
print(dof)


<DateOffset: days=1>


In [9]:
# Create a DateOffset object representing 2 months
dof = pd.DateOffset(months=2)
print(dof)

# Create a DateOffset object representing 1 year
dof = pd.DateOffset(years=1)
print(dof)

<DateOffset: months=2>
<DateOffset: years=1>


#### **Applying a DateOffset to a date or datetime**

You can apply a DateOffset object to a date or datetime object to shift the date or time by the specified duration. Here are some examples:


In [10]:
import pandas as pd

# Create a Timestamp for January 1, 2022
date = pd.to_datetime('2022-01-01')
print(date)

# Add 1 day to the date
new_date = date + pd.DateOffset(days=1)
print(new_date)

2022-01-01 00:00:00
2022-01-02 00:00:00


In [11]:
# Subtract 2 months from the date
new_date = date - pd.DateOffset(months=2)
print(new_date)

2021-11-01 00:00:00


In [12]:
# Add 1 year to the datetime
# (the variable is named dt_obj to avoid shadowing Python's built-in datetime module)
dt_obj = pd.to_datetime('2022-01-01 12:00:00')
new_datetime = dt_obj + pd.DateOffset(years=1)
print(new_datetime)


2023-01-01 12:00:00
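
The examples above all start from the first day of a month, where calendar shifts are unambiguous. To illustrate how a `DateOffset` behaves at month boundaries, and how that differs from a fixed-length `pd.Timedelta`, here is a short sketch. The variable names (`start`, `by_offset`, `by_timedelta`, `leap`) are invented for this example.

```python
import pandas as pd

start = pd.Timestamp('2022-01-31')

# A DateOffset of one month follows the calendar; because February 31
# does not exist, the result is clipped to the last day of February
by_offset = start + pd.DateOffset(months=1)
print(by_offset)

# A Timedelta is a fixed span of time, here exactly 30 days
by_timedelta = start + pd.Timedelta(days=30)
print(by_timedelta)

# Adding a year to a leap day lands on February 28 of the following year
leap = pd.Timestamp('2020-02-29') + pd.DateOffset(years=1)
print(leap)
```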
