# 18 - Working with Dates and Times

## Introduction

Working with dates and times is crucial in data engineering. You'll often need to filter data by date ranges, calculate time differences, and format dates. This notebook covers the `datetime` module from Python's standard library.

## What You'll Learn

- Creating date and time objects
- Formatting dates
- Parsing dates from strings
- Calculating time differences
- Working with date ranges


## Getting Current Date and Time

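The `datetime` module provides the `date`, `time`, and `datetime` classes for working with dates and times. `datetime.now()` and `date.today()` return the current local date and time, so the output depends on when the cell runs.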


In [1]:
from datetime import datetime, date, time

# Get current date and time
now = datetime.now()
print("Current date and time:", now)

# Get just the date
today = date.today()
print("Today's date:", today)

# Get just the time
current_time = now.time()
print("Current time:", current_time)


Current date and time: 2025-12-26 00:30:01.451900
Today's date: 2025-12-26
Current time: 00:30:01.451900


## Creating Specific Dates and Times

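You can create `date`, `datetime`, and `time` objects for specific values by passing the components in order from largest to smallest (year, month, day, then hour, minute, second).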


In [2]:
# Create a specific date
specific_date = date(2024, 1, 15)
print("Specific date:", specific_date)

# Create a specific datetime
specific_datetime = datetime(2024, 1, 15, 14, 30, 0)
print("Specific datetime:", specific_datetime)

# Create a specific time
specific_time = time(14, 30, 0)
print("Specific time:", specific_time)


Specific date: 2024-01-15
Specific datetime: 2024-01-15 14:30:00
Specific time: 14:30:00
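
Separate `date` and `time` values can also be joined into a single `datetime`, and invalid components are rejected at construction time. The following is a minimal sketch of both behaviours; the variable names (`meeting_date`, `meeting_time`, `meeting`) are illustrative and not part of the original notebook.

```python
from datetime import date, datetime, time

# Combine a separate date and time into one datetime
meeting_date = date(2024, 1, 15)
meeting_time = time(14, 30, 0)
meeting = datetime.combine(meeting_date, meeting_time)
print("Combined:", meeting)

# Split a datetime back into its date and time parts
print("Date part:", meeting.date())
print("Time part:", meeting.time())

# Out-of-range components raise ValueError (February never has 30 days)
try:
    date(2024, 2, 30)
except ValueError as exc:
    print("Invalid date:", exc)
```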


## Formatting Dates

You can format dates into strings using the `strftime()` method. Each `%` code in the format string is replaced by one component of the date or time.


In [3]:
# Format datetime
now = datetime.now()

# Different formats
print("Default format:", now)
print("Formatted (YYYY-MM-DD):", now.strftime("%Y-%m-%d"))
print("Formatted (DD/MM/YYYY):", now.strftime("%d/%m/%Y"))
print("Formatted (Month Day, Year):", now.strftime("%B %d, %Y"))
print("Formatted (with time):", now.strftime("%Y-%m-%d %H:%M:%S"))


Default format: 2025-12-26 00:30:01.461074
Formatted (YYYY-MM-DD): 2025-12-26
Formatted (DD/MM/YYYY): 26/12/2025
Formatted (Month Day, Year): December 26, 2025
Formatted (with time): 2025-12-26 00:30:01
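
For machine-readable output, `isoformat()` produces an ISO 8601 string without needing a format string. The sketch below uses a fixed datetime so the result is reproducible; the compact format in the last line is only one possible choice for something like a file or partition name, not a convention from the original notebook.

```python
from datetime import datetime

# Fixed value so the output does not depend on when the cell runs
stamp = datetime(2024, 1, 15, 14, 30, 0)

# ISO 8601 text, with "T" separating the date and the time
iso_text = stamp.isoformat()
print("ISO 8601:", iso_text)

# The same text built explicitly with strftime
explicit_text = stamp.strftime("%Y-%m-%dT%H:%M:%S")
print("Via strftime:", explicit_text)

# A compact form with no separators (illustrative choice)
compact_text = stamp.strftime("%Y%m%d_%H%M%S")
print("Compact:", compact_text)
```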


## Parsing Dates from Strings

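You can convert strings to datetime objects using `datetime.strptime()`, passing a format string that matches the input. The result is always a `datetime`, so a date-only string is parsed with the time set to midnight.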


In [4]:
# Parse string to datetime
date_string = "2024-01-15"
parsed_date = datetime.strptime(date_string, "%Y-%m-%d")
print("Parsed date:", parsed_date)

# Parse with time
datetime_string = "2024-01-15 14:30:00"
parsed_datetime = datetime.strptime(datetime_string, "%Y-%m-%d %H:%M:%S")
print("Parsed datetime:", parsed_datetime)


Parsed date: 2024-01-15 00:00:00
Parsed datetime: 2024-01-15 14:30:00
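
`strptime()` raises `ValueError` when the string does not match the format, which matters when input data mixes several formats. One possible way to handle this is to try a list of candidate formats in turn. This is a sketch only: `parse_date` and `CANDIDATE_FORMATS` are hypothetical names, and the list of formats is an assumption about what the data might contain.

```python
from datetime import datetime

# Hypothetical list of formats the input might use
CANDIDATE_FORMATS = ["%Y-%m-%d", "%d/%m/%Y", "%Y-%m-%d %H:%M:%S"]

def parse_date(text):
    """Return a datetime for the first matching format, or None."""
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            # This format did not match; try the next one
            continue
    return None

for raw in ["2024-01-15", "15/01/2024", "2024-01-15 14:30:00", "not a date"]:
    print(repr(raw), "->", parse_date(raw))
```

Returning `None` keeps the loop going over bad rows; raising an error instead would be the stricter choice when malformed dates should stop the pipeline.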


## Calculating Time Differences

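Subtracting one `date` or `datetime` from another returns a `timedelta`, and adding a `timedelta` to a date shifts it forward or backward. Plain `time` objects do not support subtraction.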


In [5]:
from datetime import timedelta

# Calculate difference between dates
date1 = date(2024, 1, 1)
date2 = date(2024, 1, 15)
difference = date2 - date1
print("Days difference:", difference.days)

# Add or subtract days
future_date = date1 + timedelta(days=30)
print("30 days from date1:", future_date)

# Calculate time difference
datetime1 = datetime(2024, 1, 1, 10, 0, 0)
datetime2 = datetime(2024, 1, 1, 14, 30, 0)
time_diff = datetime2 - datetime1
print("Time difference:", time_diff)
print("Hours:", time_diff.total_seconds() / 3600)


Days difference: 14
30 days from date1: 2024-01-31
Time difference: 4:30:00
Hours: 4.5
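
When a difference spans more than one day, the `days` and `seconds` attributes of a `timedelta` hold separate parts of the duration, while `total_seconds()` covers the whole span. The sketch below illustrates the distinction with made-up start and end values.

```python
from datetime import datetime, timedelta

start = datetime(2024, 1, 1, 10, 0, 0)
end = datetime(2024, 1, 3, 14, 30, 0)
elapsed = end - start

# .seconds holds only the part left over after whole days
print("Elapsed:", elapsed)
print("days attribute:", elapsed.days)
print("seconds attribute:", elapsed.seconds)
print("total_seconds():", elapsed.total_seconds())

# Break the leftover seconds into hours and minutes
hours, remainder = divmod(elapsed.seconds, 3600)
minutes = remainder // 60
print(f"{elapsed.days} days, {hours} hours, {minutes} minutes")

# The same duration written directly as a timedelta
print("Matches:", elapsed == timedelta(days=2, hours=4, minutes=30))
```

Using `elapsed.seconds` where `elapsed.total_seconds()` was intended silently drops the whole days, which is a common source of wrong durations.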


## Extracting Date Components

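You can extract individual components from a datetime object through attributes such as `year`, `month`, and `hour`. The day of the week comes from the `weekday()` method rather than an attribute.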


In [6]:
# Extract components
now = datetime.now()
print("Year:", now.year)
print("Month:", now.month)
print("Day:", now.day)
print("Hour:", now.hour)
print("Minute:", now.minute)
print("Second:", now.second)
print("Weekday (0=Monday, 6=Sunday):", now.weekday())


Year: 2025
Month: 12
Day: 26
Hour: 0
Minute: 30
Second: 1
Weekday (0=Monday, 6=Sunday): 4
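
The weekday number is often used to separate weekdays from weekends, and `isoweekday()` and `isocalendar()` offer ISO-style numbering. The sketch below uses the same date as the output above, fixed so the result is reproducible; the variable names are illustrative.

```python
from datetime import date

d = date(2025, 12, 26)

# weekday() counts from Monday = 0; isoweekday() counts from Monday = 1
print("weekday():", d.weekday())
print("isoweekday():", d.isoweekday())

# Saturday and Sunday are 5 and 6 under weekday()
is_weekend = d.weekday() >= 5
print("Is weekend:", is_weekend)

# isocalendar() gives the ISO year, week number, and weekday
iso_year, iso_week, iso_day = d.isocalendar()
print("ISO year/week/day:", iso_year, iso_week, iso_day)
```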


## Practical Example: Date Filtering

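In data engineering, you often need to filter data by date ranges. Because `date` objects support comparison operators, a chained comparison selects the dates that fall within an inclusive range.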


In [7]:
# Example: Filter dates within a range
dates = [
    date(2024, 1, 5),
    date(2024, 1, 15),
    date(2024, 2, 10),
    date(2024, 3, 20)
]

start_date = date(2024, 1, 1)
end_date = date(2024, 2, 28)

# Filter dates in range
filtered_dates = [d for d in dates if start_date <= d <= end_date]
print("Dates in range:", filtered_dates)


Dates in range: [datetime.date(2024, 1, 5), datetime.date(2024, 1, 15), datetime.date(2024, 2, 10)]
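
The same comparison works on records that carry a date, and a range of consecutive dates can be generated with `timedelta`. The following is a sketch under stated assumptions: `date_range` is a hypothetical helper, and the `orders` list of `(order_date, amount)` tuples is made-up sample data.

```python
from datetime import date, timedelta

def date_range(start, end):
    """Yield each date from start to end, inclusive."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)

# Generate one week of consecutive dates
week = list(date_range(date(2024, 1, 1), date(2024, 1, 7)))
print("Days generated:", len(week))
print("First and last:", week[0], week[-1])

# Hypothetical records: (order_date, amount)
orders = [
    (date(2024, 1, 2), 120.0),
    (date(2024, 1, 5), 75.5),
    (date(2024, 1, 9), 40.0),
]

# Keep only the orders that fall inside the generated week
in_week = [(d, amt) for d, amt in orders if week[0] <= d <= week[-1]]
total = sum(amt for _, amt in in_week)
print("Orders in range:", in_week)
print("Total:", total)
```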


## Key Points to Remember

- Use the `datetime` module for working with dates and times
- `strftime()` formats a datetime as a string
- `strptime()` parses a string into a datetime
- Use `timedelta` for date arithmetic
- Date handling is crucial in data engineering for filtering and aggregating time-series data
- PySpark has similar date functions for DataFrame operations
