# Pandas Datetime Module

| Function/Attribute | Purpose | Example |
| :--- | :--- | :--- |
| **`pd.to_datetime()`** | **Converts** a column (Series) containing strings, integers, or lists into the proper Pandas `datetime64[ns]` type. **This is the critical first step.** | `df['Date'] = pd.to_datetime(df['Date_str'])` |
| **`.dt` Accessor** | Provides access to date/time components (attributes) *after* a Series is converted to `datetime64`. | `df['Month'] = df['Date'].dt.month` |
| **`.dt.year`, `.dt.day_name()`, etc.** | Extracts specific components (year, month, day name, hour, etc.) from a datetime Series. | `df['Hour'] = df['Date'].dt.hour` |
| **`df.resample()`** | **Changes the time frequency** of your data (e.g., from daily data to monthly averages). Requires the datetime column to be the index or specified using `on`. | `df.resample('ME').mean()` (use `'M'` on pandas older than 2.2) |
| **`pd.date_range()`** | **Generates a fixed-frequency DatetimeIndex** (a sequence of dates/times). Useful for creating new time axes or indices. | `idx = pd.date_range(start='2025-01-01', periods=5, freq='D')` |
| **`pd.Timestamp()`** | Creates a **single point in time** (scalar value). Useful for defining fixed boundaries. | `end_time = pd.Timestamp('2025-12-31')` |
| **`pd.Timedelta()`** | Represents a **duration** or difference between two points in time. | `time_diff = pd.Timedelta(hours=4)` |
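
The first rows of the table can be tried end to end on a tiny DataFrame. This is a minimal sketch under stated assumptions: the data and the column names (`Date_str`, `Date`, `Month`, `Weekday`, `Plus_4h`) are made up for illustration and do not come from the data sets used later in this notebook.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
df = pd.DataFrame({"Date_str": ["2025-01-15", "2025-02-20", "2025-03-25"]})

# Step 1: convert the strings to a proper datetime column.
df["Date"] = pd.to_datetime(df["Date_str"])

# Step 2: the .dt accessor only works once the column is datetime-typed.
df["Month"] = df["Date"].dt.month
df["Weekday"] = df["Date"].dt.day_name()

# Step 3: a Timedelta shifts every timestamp by a fixed duration.
df["Plus_4h"] = df["Date"] + pd.Timedelta(hours=4)

print(df.dtypes)
print(df)
```

Calling `.dt` on the original `Date_str` column would raise an `AttributeError`, which is why the conversion comes first.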

In [None]:
import pandas as pd
import datetime as dt
import yfinance as yf
pd.set_option('display.float_format', lambda x: '%.2f' % x) # show floats with 2 decimal places so the tables are easier to read



In [80]:
bigmac.head() # assumes `bigmac` was loaded in an earlier cell that is not shown here

Unnamed: 0,Date,Country,Price in US Dollars
0,2016-01-01,Argentina,2.39
1,2016-01-01,Australia,3.74
2,2016-01-01,Brazil,3.35
3,2016-01-01,Britain,4.22
4,2016-01-01,Canada,4.14


In [81]:
# make up a date
someday = dt.date(2010, 1, 20)
print('date:', str(someday))
print('year:', someday.year)
print('month:', someday.month)
print('day:', someday.day)

date: 2010-01-20
year: 2010
month: 1
day: 20


In [82]:
someday

datetime.date(2010, 1, 20)

In [83]:
str(someday)

'2010-01-20'

In [84]:
sometime = dt.datetime(2010, 1, 10, 17, 13, 57) # year, month, day, hour, minute, second
print('datetime:', str(sometime))
print('year:', sometime.year)
print('month:', sometime.month)
print('day:', sometime.day)
print('hour:', sometime.hour)
print('minute:', sometime.minute)
print('second:', sometime.second)

datetime: 2010-01-10 17:13:57
year: 2010
month: 1
day: 10
hour: 17
minute: 13
second: 57
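
Beyond reading attributes, the standard-library objects also support arithmetic and formatting. The sketch below reuses the `someday` and `sometime` values from the cells above; the variable names `gap` and `midnight` are introduced here for illustration only.

```python
import datetime as dt

someday = dt.date(2010, 1, 20)
sometime = dt.datetime(2010, 1, 10, 17, 13, 57)

# A date has no time part, so promote it to midnight before subtracting.
midnight = dt.datetime.combine(someday, dt.time())
gap = midnight - sometime  # subtracting two datetimes gives a timedelta

print(gap)                  # elapsed time between the two moments
print(gap.days)             # whole days in the gap
print(someday.isoformat())  # same text as str(someday)
print(someday.weekday())    # Monday is 0, Sunday is 6
```

Subtracting a `date` from a `datetime` directly raises a `TypeError`, which is why the date is combined with a time first.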


## The `pandas Timestamp` Object

In [85]:
print(pd.Timestamp("2015-03-31")) # ISO 8601 format
print(pd.Timestamp("2015/03/31")) # alternative separator
print(pd.Timestamp("March 31, 2015")) # long format
print(pd.Timestamp("31 March 2015")) # long format, day first
print(pd.Timestamp("2015, 3, 31")) # with commas
print(pd.Timestamp("3/31/2015")) # US format
print(pd.Timestamp("31/3/2015")) # non-US format
print(pd.Timestamp("2015-03-31 18:13:29")) # with time in 24-hour format
print(pd.Timestamp("2015-03-31 6:13:29 PM")) # with time in 12-hour format

2015-03-31 00:00:00
2015-03-31 00:00:00
2015-03-31 00:00:00
2015-03-31 00:00:00
2015-03-31 00:00:00
2015-03-31 00:00:00
2015-03-31 00:00:00
2015-03-31 18:13:29
2015-03-31 18:13:29
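
The strings above all parse to the same day because 31 cannot be a month. When both numbers could be a month, pandas reads the string month-first unless told otherwise. The sketch below shows that case and some basic `Timestamp` arithmetic; the example date `3/4/2015` and the variable names are illustrative assumptions.

```python
import pandas as pd

# "3/4/2015" is ambiguous: it is read month-first by default.
month_first = pd.Timestamp("3/4/2015")
day_first = pd.to_datetime("3/4/2015", dayfirst=True)
explicit = pd.to_datetime("3/4/2015", format="%d/%m/%Y")

print(month_first)  # March 4
print(day_first)    # April 3
print(explicit)     # April 3, with the format spelled out

# Timestamps support arithmetic with Timedelta objects.
start = pd.Timestamp("2015-03-31 18:13:29")
later = start + pd.Timedelta(hours=6)
print(later)          # crosses midnight into April 1
print(later - start)  # the difference is a Timedelta
```

Spelling out `format=` is the safer choice when the layout of the input is known in advance.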


## The `pandas DatetimeIndex` Object

In [86]:
dates = ["2016/01/02", "2016/04/12", "2009/09/07"] # list of date strings
pd.DatetimeIndex(dates) # convert to DatetimeIndex

DatetimeIndex(['2016-01-02', '2016-04-12', '2009-09-07'], dtype='datetime64[ns]', freq=None)

In [87]:
dates = [dt.date(2016, 1, 10), dt.date(1994, 6, 13), dt.date(2003, 12, 29)] # list of date objects
dtIndex = pd.DatetimeIndex(dates) # convert to DatetimeIndex
dtIndex

DatetimeIndex(['2016-01-10', '1994-06-13', '2003-12-29'], dtype='datetime64[ns]', freq=None)
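
A `DatetimeIndex` can also be generated with `pd.date_range()` and then used to select rows by date. This is a small sketch with made-up values; the `sales` Series is hypothetical.

```python
import pandas as pd

# Five consecutive days starting on 1 January 2025.
idx = pd.date_range(start="2025-01-01", periods=5, freq="D")
sales = pd.Series([10, 20, 30, 40, 50], index=idx)  # hypothetical values

print(idx)

# Label-based slices on a DatetimeIndex include both endpoints.
print(sales.loc["2025-01-02":"2025-01-04"])

# Index attributes can be used to build boolean filters.
print(sales[sales.index.day_name() == "Friday"])
```

Note that a `DatetimeIndex` exposes attributes such as `day_name()` directly, without the `.dt` accessor that a Series needs.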

## The `pd.to_datetime()` Function

In [88]:
print(pd.to_datetime("2015-01-01")) # single date string
print(pd.to_datetime(dt.date(2015, 1, 1))) # date object
print(pd.to_datetime(dt.datetime(2015, 1, 1, 14, 35, 20))) # datetime object
print(pd.to_datetime(["2015-01-01", "2014-02-08", "2016-02-08", "2017-07-07"])) # list of date strings

2015-01-01 00:00:00
2015-01-01 00:00:00
2015-01-01 14:35:20
DatetimeIndex(['2015-01-01', '2014-02-08', '2016-02-08', '2017-07-07'], dtype='datetime64[ns]', freq=None)
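
Real columns often contain values that are not dates, or dates in a non-ISO layout. The sketch below shows two options that `pd.to_datetime()` offers for those cases, `errors="coerce"` and `format=`; the input Series are invented for illustration.

```python
import pandas as pd

# Hypothetical input with one unparseable entry.
raw = pd.Series(["2015-01-01", "not a date", "2016-02-08"])

# errors="coerce" turns unparseable values into NaT instead of raising.
parsed = pd.to_datetime(raw, errors="coerce")
print(parsed)
print(parsed.isna().sum())  # number of values that failed to parse

# format= states the layout explicitly, here day-month-year.
european = pd.to_datetime(pd.Series(["31-03-2015", "01-04-2015"]), format="%d-%m-%Y")
print(european)
```

Counting the `NaT` values after a coerced conversion is a quick way to see how much of a column failed to parse.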


## Import a Financial Data Set with the `yfinance` Library, Which Avoids the Permission Errors Associated with `pandas_datareader`

In [89]:
# Download stock data for a given company and date range
company = "AAPL"
start = dt.datetime(2020, 1, 1) # January 1, 2020, built with the datetime module
end = dt.datetime(2025, 1, 1) # January 1, 2025, built with the datetime module

# Use yfinance.download() 
# The function returns a DataFrame directly
stocks = yf.download(company, start=start, end=end)

stocks.tail()

[*********************100%***********************]  1 of 1 completed


Price,Close,High,Low,Open,Volume
Ticker,AAPL,AAPL,AAPL,AAPL,AAPL
Date,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2
2024-12-24,257.04,257.05,254.14,254.34,23234700
2024-12-26,257.85,258.93,256.47,257.03,27237100
2024-12-27,254.44,257.54,251.92,256.67,42355300
2024-12-30,251.06,252.36,249.62,251.09,35557500
2024-12-31,249.29,252.14,248.31,251.3,39480700


In [90]:
stocks.columns = stocks.columns.droplevel(1) # remove the 'Ticker' multi index level
stocks.head()

Price,Close,High,Low,Open,Volume
Date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
2020-01-02,72.47,72.53,71.22,71.48,135480400
2020-01-03,71.76,72.52,71.54,71.7,146322800
2020-01-06,72.34,72.37,70.63,70.89,118387200
2020-01-07,72.0,72.6,71.78,72.35,108872000
2020-01-08,73.15,73.46,71.7,71.7,132079200


**Use the `.resample()` method**

In [91]:
Monthlystock = stocks.resample('ME').mean() # resample to month-end frequency, taking the mean of each column ('ME' replaces the 'M' alias deprecated in pandas 2.2)
Monthlystock.head() # the mean Close, High, Low, and Open make sense, but for Volume a monthly total is more informative than a mean


Price,Close,High,Low,Open,Volume
Date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
2020-01-31,75.26,75.84,74.51,75.08,139731923.81
2020-02-29,75.24,76.21,74.14,75.01,158909431.58
2020-03-31,63.47,65.23,61.64,63.14,285457836.36
2020-04-30,65.88,66.7,64.84,65.74,155490438.1
2020-05-31,75.12,75.92,74.25,74.86,140296800.0


In [94]:
Monthlystock = stocks.resample('ME').agg({
    # Column: function(s) to apply
    'Close': ['min', 'max'], # A list applies several functions to one column; a repeated 'Close' key would silently replace the earlier entry
    'Open': 'mean',          # Average opening price for the month
    'High': 'mean',          # Average daily high for the month (use 'max' for the monthly high)
    'Low': 'max',            # Highest daily low for the month (use 'min' for the monthly low)
    'Volume': 'sum',         # Total volume traded for the month
})
Monthlystock.head() # Volume is now a monthly total rather than a mean

Price,Close,Close,Open,High,Low,Volume
Unnamed: 0_level_1,min,max,mean,mean,max,sum
Date,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2
2020-01-31,71.76,78.26,75.08,75.84,77.54,2934370400
2020-02-29,66.11,79.13,75.01,76.21,78.2,3019279200
2020-03-31,54.26,73.22,63.14,65.23,70.89,6280072400
2020-04-30,58.26,71.06,65.74,66.7,69.74,3265299200
2020-05-31,69.91,77.42,74.86,75.92,76.76,2805936000
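
Passing a list of functions for one column produces two-level column labels, as the output above shows. The sketch below reproduces that on a tiny synthetic frame and flattens the labels into single strings. The data, the helper `month_end_resample`, and the flattened names such as `Close_min` are all assumptions made for illustration; the helper falls back to the older `'M'` alias only so the example also runs on pandas versions before 2.2.

```python
import pandas as pd

# Synthetic prices; the values are invented for illustration.
prices = pd.DataFrame(
    {"Close": [10.0, 12.0, 20.0, 26.0], "Volume": [100, 200, 300, 400]},
    index=pd.to_datetime(["2020-01-02", "2020-01-03", "2020-02-03", "2020-02-04"]),
)


def month_end_resample(frame):
    """Hypothetical helper: use 'ME' where available, else the older 'M'."""
    try:
        return frame.resample("ME")
    except ValueError:
        return frame.resample("M")


monthly = month_end_resample(prices).agg(
    {"Close": ["min", "max", "mean"], "Volume": "sum"}
)

# Join the two header levels, e.g. ('Close', 'min') becomes 'Close_min'.
monthly.columns = ["_".join(col) for col in monthly.columns]
print(monthly)
```

Flat column names are easier to select, for example `monthly["Close_mean"]` instead of `monthly[("Close", "mean")]`.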


In [95]:
Annualstock = stocks.resample('YE').agg({
    # Column: function(s) to apply
    'Close': ['min', 'max'], # A list applies several functions to one column; a repeated 'Close' key would silently replace the earlier entry
    'Open': 'mean',          # Average opening price for the year
    'High': 'mean',          # Average daily high for the year (use 'max' for the yearly high)
    'Low': 'max',            # Highest daily low for the year (use 'min' for the yearly low)
    'Volume': 'sum',         # Total volume traded for the year
})
Annualstock.head() # one row per calendar year, labeled with the year-end date

Price,Close,Close,Open,High,Low,Volume
Unnamed: 0_level_1,min,max,mean,mean,max,sum
Date,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2
2020-12-31,54.26,133.06,92.44,93.8,130.77,39863855600
2021-12-31,113.44,176.62,137.6,139.05,174.86,22812206100
2022-12-31,124.17,178.27,152.06,154.12,175.44,22065504500
2023-12-31,123.16,196.26,170.22,171.8,195.16,14805886900
2024-12-31,163.66,257.85,205.59,207.53,256.47,14390908500


Next chapter: [Data Visualization](9.DataVisualization.ipynb)