# Week 5: Date & Time Operations - Part 2: Advanced DateTime Analysis
**Business Scenario**: NaijaCommerce Delivery Performance & Customer Analytics
**Focus**: Date arithmetic, time series analysis, and business intelligence

## Learning Objectives
- Perform date arithmetic using timedelta operations
- Calculate business metrics like delivery times and customer lifetime
- Use rolling windows and period-over-period comparisons
- Create time-based cohort analysis for business insights

## 📚 Setup and Data Loading
Let's start with our imports and recreate our dataset from Part 1.

In [None]:
# Import required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime, timedelta
import warnings
warnings.filterwarnings('ignore')

# Set display options
pd.set_option('display.max_columns', None)
plt.style.use('seaborn-v0_8')
sns.set_palette("husl")

print("📦 All libraries imported successfully!")

In [None]:
# Recreate our sample dataset
np.random.seed(42)
n_orders = 15000
start_date = datetime(2017, 1, 1)
end_date = datetime(2018, 12, 31)

# Generate realistic e-commerce data
orders_df = pd.DataFrame({
    'order_id': [f'order_{i:06d}' for i in range(n_orders)],
    'customer_id': [f'customer_{np.random.randint(1, 8000):06d}' for _ in range(n_orders)],
    'order_purchase_timestamp': pd.date_range(start=start_date, end=end_date, periods=n_orders),
    'order_status': np.random.choice(['delivered', 'shipped', 'cancelled'], n_orders, p=[0.82, 0.13, 0.05])
})

# Add realistic delivery and approval timestamps
approval_delays = np.random.exponential(scale=1.2, size=n_orders)
delivery_delays = np.random.exponential(scale=8, size=n_orders) + approval_delays

orders_df['order_approved_at'] = orders_df['order_purchase_timestamp'] + pd.to_timedelta(approval_delays, unit='D')
orders_df['order_delivered_customer_date'] = orders_df['order_purchase_timestamp'] + pd.to_timedelta(delivery_delays, unit='D')
orders_df['order_estimated_delivery_date'] = orders_df['order_purchase_timestamp'] + pd.to_timedelta(10, unit='D')  # Standard 10-day estimate

# Set delivery dates to NaT (pandas' missing datetime) for non-delivered orders
mask = orders_df['order_status'] != 'delivered'
orders_df.loc[mask, 'order_delivered_customer_date'] = pd.NaT

# Create order items with pricing
delivered_orders = orders_df[orders_df['order_status'] == 'delivered'].copy()
order_items = []
for _, order in delivered_orders.iterrows():
    n_items = np.random.choice([1, 2, 3], p=[0.7, 0.25, 0.05])  # Most orders have 1 item
    for item_id in range(n_items):
        order_items.append({
            'order_id': order['order_id'],
            'order_item_id': item_id + 1,
            'price': np.random.exponential(scale=50) + 10,  # Minimum price of 10 plus an exponential component (mean ≈ 60, long right tail)
            'product_id': f'product_{np.random.randint(1, 1000):04d}'
        })

order_items_df = pd.DataFrame(order_items)

print(f"📊 Created dataset with {len(orders_df):,} orders and {len(order_items_df):,} items")
print(f"📅 Date range: {orders_df['order_purchase_timestamp'].min().date()} to {orders_df['order_purchase_timestamp'].max().date()}")
print(f"🎯 Delivered orders: {(orders_df['order_status'] == 'delivered').sum():,}")

## ⏰ Part 1: Date Arithmetic with Timedelta
Learn to calculate delivery times, customer age, and other business metrics using date arithmetic.
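
As a warm-up, here is a minimal sketch of the "customer lifetime" idea on a small, made-up table (`toy_orders` is hypothetical, not the notebook's dataset). It assumes lifetime means the number of days between a customer's first and last purchase.

```python
import pandas as pd

# Hypothetical toy orders, only for illustration
toy_orders = pd.DataFrame({
    "customer_id": ["c1", "c1", "c2", "c2", "c2"],
    "order_purchase_timestamp": pd.to_datetime([
        "2017-01-05", "2017-03-06", "2017-02-01", "2017-02-15", "2018-02-01",
    ]),
})

# First purchase, last purchase, and order count per customer
lifetime = toy_orders.groupby("customer_id")["order_purchase_timestamp"].agg(
    first_purchase="min", last_purchase="max", n_orders="count"
)

# Subtracting two datetime columns yields a timedelta column;
# .dt.days extracts the whole-day component
lifetime["lifetime_days"] = (
    lifetime["last_purchase"] - lifetime["first_purchase"]
).dt.days

print(lifetime)
```

The same pattern should carry over to `orders_df` by grouping on its `customer_id` column.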

### 1.1 Basic Date Arithmetic - Calculating Delivery Times

In [None]:
# Calculate delivery performance metrics
# This replicates our SQL date arithmetic from Thursday

delivered_orders = orders_df[orders_df['order_status'] == 'delivered'].copy()

# Calculate time differences (equivalent to SQL date subtraction)
delivered_orders['total_delivery_time'] = (
    delivered_orders['order_delivered_customer_date'] - delivered_orders['order_purchase_timestamp']
)
delivered_orders['approval_time'] = (
    delivered_orders['order_approved_at'] - delivered_orders['order_purchase_timestamp']
)
delivered_orders['fulfillment_time'] = (
    delivered_orders['order_delivered_customer_date'] - delivered_orders['order_approved_at']
)

# Extract whole days from each timedelta (similar to SQL EXTRACT(DAY FROM interval)); partial days are dropped
delivered_orders['delivery_days'] = delivered_orders['total_delivery_time'].dt.days
delivered_orders['approval_days'] = delivered_orders['approval_time'].dt.days
delivered_orders['fulfillment_days'] = delivered_orders['fulfillment_time'].dt.days

# Display sample results
sample_metrics = delivered_orders[[
    'order_id', 'order_purchase_timestamp', 'order_delivered_customer_date',
    'delivery_days', 'approval_days', 'fulfillment_days'
]].head(10)

print("⏱️ Delivery Performance Metrics (Python equivalent of SQL date arithmetic):")
print(sample_metrics)
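
Because `.dt.days` keeps only the whole-day component, a delivery that took 12 hours is reported as 0 days. The sketch below, on two made-up timestamps, shows one way to get fractional days when that precision matters.

```python
import pandas as pd

# Two hypothetical orders: one delivered after 2.5 days, one after 12 hours
purchased = pd.Series(pd.to_datetime(["2017-01-01 08:00", "2017-01-01 08:00"]))
delivered = pd.Series(pd.to_datetime(["2017-01-03 20:00", "2017-01-01 20:00"]))
elapsed = delivered - purchased

# Whole days only: the fractional part is discarded
whole_days = elapsed.dt.days

# Fractional days: total seconds divided by the seconds in one day
fractional_days = elapsed.dt.total_seconds() / 86400

# Dividing by a one-day Timedelta gives the same fractional result
fractional_days_alt = elapsed / pd.Timedelta(days=1)

print(pd.DataFrame({"whole_days": whole_days, "fractional_days": fractional_days}))
```

Which version to report is a business decision; averages computed from whole days will sit somewhat below averages computed from fractional days.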

In [None]:
# Business KPI: Delivery Performance Summary
delivery_summary = {
    'Total Delivered Orders': len(delivered_orders),
    'Average Delivery Days': delivered_orders['delivery_days'].mean().round(2),
    'Median Delivery Days': delivered_orders['delivery_days'].median(),
    'Average Approval Days': delivered_orders['approval_days'].mean().round(2),
    'Orders Delivered ≤ 7 Days': (delivered_orders['delivery_days'] <= 7).sum(),
    'Fast Delivery Rate (%)': ((delivered_orders['delivery_days'] <= 7).sum() / len(delivered_orders) * 100).round(2)
}

print("📊 DELIVERY PERFORMANCE DASHBOARD")
print("=" * 35)
for key, value in delivery_summary.items():
    print(f"{key}: {value}")

# Create visualization
plt.figure(figsize=(12, 5))

plt.subplot(1, 2, 1)
delivered_orders['delivery_days'].hist(bins=30, alpha=0.7, color='skyblue')
plt.axvline(delivered_orders['delivery_days'].mean(), color='red', linestyle='--', label=f'Mean: {delivered_orders["delivery_days"].mean():.1f} days')
plt.xlabel('Delivery Days')
plt.ylabel('Number of Orders')
plt.title('Distribution of Delivery Times')
plt.legend()

plt.subplot(1, 2, 2)
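delivery_categories = pd.cut(delivered_orders['delivery_days'],
                           bins=[0, 7, 14, 21, float('inf')],
                           labels=['≤7 days', '8-14 days', '15-21 days', '>21 days'],
                           include_lowest=True)  # keep 0-day deliveries in the first bin
# sort_index() keeps the bars in category order so each colour matches its label
delivery_categories.value_counts().sort_index().plot(kind='bar', color=['green', 'orange', 'red', 'darkred'])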
plt.title('Delivery Performance Categories')
plt.xticks(rotation=45)
plt.ylabel('Number of Orders')

plt.tight_layout()
plt.show()
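
One subtlety when binning day counts with `pd.cut`: by default each bin excludes its left edge, so a value equal to the lowest edge falls outside every bin. The sketch below uses a small made-up series to show the effect and how `include_lowest=True` changes it.

```python
import pandas as pd

# Hypothetical delivery durations in whole days, including a same-day delivery (0)
days = pd.Series([0, 3, 7, 8, 30])
labels = ["≤7 days", "8-14 days", "15-21 days", ">21 days"]
bins = [0, 7, 14, 21, float("inf")]

# Default: bins are (0, 7], (7, 14], ... so the value 0 is not assigned to any bin
default_cut = pd.cut(days, bins=bins, labels=labels)

# include_lowest=True makes the first bin include its left edge
inclusive_cut = pd.cut(days, bins=bins, labels=labels, include_lowest=True)

print("Unassigned with default settings:", default_cut.isna().sum())
# sort_index() lists the categories in bin order rather than by frequency
print(inclusive_cut.value_counts().sort_index())
```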

### 1.2 Working with Timedelta Objects for Business Rules

In [None]:
# Create business rules using timedelta operations
# Equivalent to SQL INTERVAL operations

# Add business deadlines and classifications
delivered_orders['expected_delivery'] = delivered_orders['order_purchase_timestamp'] + pd.Timedelta(days=7)
delivered_orders['warranty_expiry'] = delivered_orders['order_delivered_customer_date'] + pd.Timedelta(days=90)
delivered_orders['order_age'] = datetime.now() - delivered_orders['order_purchase_timestamp']

# Business classification based on delivery performance
def classify_delivery_performance(delivery_days):
    if delivery_days <= 7:
        return 'Fast Delivery'
    elif delivery_days <= 14:
        return 'Standard Delivery'
    else:
        return 'Slow Delivery'

delivered_orders['delivery_classification'] = delivered_orders['delivery_days'].apply(
    classify_delivery_performance
)

# Late delivery analysis
delivered_orders['is_late'] = (
    delivered_orders['order_delivered_customer_date'] > delivered_orders['order_estimated_delivery_date']
)
delivered_orders['delay_days'] = (
    delivered_orders['order_delivered_customer_date'] - delivered_orders['order_estimated_delivery_date']
).dt.days
delivered_orders['delay_days'] = delivered_orders['delay_days'].clip(lower=0)  # Early and on-time deliveries count as 0 days of delay

# Business insights
performance_summary = delivered_orders['delivery_classification'].value_counts()
late_delivery_rate = delivered_orders['is_late'].mean() * 100
avg_delay = delivered_orders[delivered_orders['is_late']]['delay_days'].mean()

print("🚚 DELIVERY PERFORMANCE ANALYSIS")
print("=" * 35)
print(performance_summary)
print(f"\n📈 Late Delivery Rate: {late_delivery_rate:.1f}%")
print(f"📊 Average Delay (when late): {avg_delay:.1f} days")

# Sample of classified orders
sample_classified = delivered_orders[[
    'order_id', 'delivery_days', 'delivery_classification', 'is_late', 'delay_days'
]].head(10)

print("\n📋 Sample Delivery Classifications:")
print(sample_classified)
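
A row-by-row `apply` is easy to read, but the same thresholds can also be expressed as a vectorized rule. The following is a sketch using `np.select` on a made-up series; the cut-offs and labels mirror the classification function above.

```python
import numpy as np
import pandas as pd

# Hypothetical delivery durations in days
delivery_days = pd.Series([2, 7, 8, 14, 15, 40])

# Conditions are checked in order; the first one that is True wins
conditions = [delivery_days <= 7, delivery_days <= 14]
choices = ["Fast Delivery", "Standard Delivery"]

classification = pd.Series(
    np.select(conditions, choices, default="Slow Delivery"),
    index=delivery_days.index,
)

print(pd.DataFrame({"delivery_days": delivery_days, "classification": classification}))
```

On tens of thousands of rows the vectorized form usually runs faster, although for a dataset of this size either approach is fine.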

## 📈 Part 2: Advanced Time Series Analysis
Implement sophisticated temporal analysis techniques for business intelligence.

### 2.1 Period-over-Period Comparisons using shift()

In [None]:
# Create monthly revenue analysis with growth calculations
# Python equivalent of SQL LAG window functions

# First, merge orders with order items to get revenue data
# (items exist only for delivered orders, so the inner join keeps delivered orders only)
order_revenue = orders_df.merge(order_items_df, on='order_id', how='inner')

# Monthly revenue analysis
monthly_revenue = order_revenue.groupby(
    order_revenue['order_purchase_timestamp'].dt.to_period('M')
).agg({
    'order_id': 'nunique',
    'price': 'sum',
    'customer_id': 'nunique'
}).rename(columns={
    'order_id': 'total_orders',
    'price': 'total_revenue',
    'customer_id': 'unique_customers'
})

# Calculate period-over-period changes (equivalent to SQL LAG)
monthly_revenue['prev_month_orders'] = monthly_revenue['total_orders'].shift(1)
monthly_revenue['prev_month_revenue'] = monthly_revenue['total_revenue'].shift(1)

# Calculate growth rates
monthly_revenue['orders_mom_growth'] = (
    (monthly_revenue['total_orders'] - monthly_revenue['prev_month_orders']) / 
    monthly_revenue['prev_month_orders'] * 100
).round(2)

monthly_revenue['revenue_mom_growth'] = (
    (monthly_revenue['total_revenue'] - monthly_revenue['prev_month_revenue']) / 
    monthly_revenue['prev_month_revenue'] * 100
).round(2)

# Using pandas pct_change() for easier growth calculation
monthly_revenue['orders_growth_pct'] = (monthly_revenue['total_orders'].pct_change() * 100).round(2)
monthly_revenue['revenue_growth_pct'] = (monthly_revenue['total_revenue'].pct_change() * 100).round(2)

print("📊 Month-over-Month Growth Analysis (Python equivalent of SQL LAG functions):")
print(monthly_revenue[['total_orders', 'total_revenue', 'orders_growth_pct', 'revenue_growth_pct']].head(12))
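
To see that the manual `shift()` formula and `pct_change()` agree, here is a small worked example on four made-up monthly revenue figures.

```python
import pandas as pd

# Hypothetical monthly revenue, indexed by month
revenue = pd.Series(
    [100.0, 110.0, 99.0, 120.0],
    index=pd.period_range("2017-01", periods=4, freq="M"),
    name="total_revenue",
)

# Manual growth: (current - previous) / previous * 100
previous = revenue.shift(1)
manual_growth = (revenue - previous) / previous * 100

# Built-in shortcut for the same calculation
builtin_growth = revenue.pct_change() * 100

print(pd.DataFrame({
    "revenue": revenue,
    "manual_growth_pct": manual_growth.round(2),
    "builtin_growth_pct": builtin_growth.round(2),
}))
```

The first month has no predecessor, so both columns are `NaN` there, just as `LAG` returns `NULL` for the first row in SQL.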

### 2.2 Rolling Windows and Moving Averages

In [None]:
# Create daily order trends with rolling averages
# Python equivalent of SQL window functions

# Daily order counts
daily_orders = orders_df.groupby(
    orders_df['order_purchase_timestamp'].dt.date
)['order_id'].count().reset_index()

daily_orders.columns = ['order_date', 'daily_orders']
daily_orders['order_date'] = pd.to_datetime(daily_orders['order_date'])
daily_orders = daily_orders.sort_values('order_date').reset_index(drop=True)

# Calculate rolling averages (equivalent to SQL ROWS BETWEEN 6 PRECEDING AND CURRENT ROW for the 7-day window)
daily_orders['rolling_7d_avg'] = daily_orders['daily_orders'].rolling(window=7, min_periods=1).mean().round(2)
daily_orders['rolling_30d_avg'] = daily_orders['daily_orders'].rolling(window=30, min_periods=1).mean().round(2)

# Calculate rolling statistics
daily_orders['rolling_7d_std'] = daily_orders['daily_orders'].rolling(window=7, min_periods=1).std().round(2)
daily_orders['rolling_7d_max'] = daily_orders['daily_orders'].rolling(window=7, min_periods=1).max()
daily_orders['rolling_7d_min'] = daily_orders['daily_orders'].rolling(window=7, min_periods=1).min()

# Day-over-day changes
daily_orders['daily_change'] = daily_orders['daily_orders'].diff()
daily_orders['daily_change_pct'] = daily_orders['daily_orders'].pct_change() * 100

print("📊 Daily Order Trends with Rolling Statistics:")
print(daily_orders[['order_date', 'daily_orders', 'rolling_7d_avg', 'rolling_30d_avg', 'daily_change']].head(15))

# Visualization
plt.figure(figsize=(15, 8))

plt.subplot(2, 1, 1)
plt.plot(daily_orders['order_date'], daily_orders['daily_orders'], alpha=0.3, label='Daily Orders', color='lightblue')
plt.plot(daily_orders['order_date'], daily_orders['rolling_7d_avg'], label='7-Day Average', color='blue')
plt.plot(daily_orders['order_date'], daily_orders['rolling_30d_avg'], label='30-Day Average', color='red')
plt.title('Daily Orders with Moving Averages')
plt.ylabel('Number of Orders')
plt.legend()
plt.grid(alpha=0.3)

plt.subplot(2, 1, 2)
plt.plot(daily_orders['order_date'], daily_orders['daily_change'], color='green', alpha=0.7)
plt.axhline(y=0, color='black', linestyle='--', alpha=0.5)
plt.title('Day-over-Day Change in Orders')
plt.xlabel('Date')
plt.ylabel('Change in Orders')
plt.grid(alpha=0.3)

plt.tight_layout()
plt.show()

# Business insights
avg_daily_orders = daily_orders['daily_orders'].mean()
volatility = daily_orders['daily_orders'].std()
trend_direction = 'Increasing' if daily_orders['rolling_30d_avg'].iloc[-1] > daily_orders['rolling_30d_avg'].iloc[-30] else 'Decreasing'

print(f"\n📈 BUSINESS INSIGHTS:")
print(f"Average Daily Orders: {avg_daily_orders:.1f}")
print(f"Daily Volatility (Std Dev): {volatility:.1f}")
print(f"30-Day Trend: {trend_direction}")
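
An integer window such as `rolling(window=7)` counts rows, not calendar days, which matches "7 days" only when every date is present. When dates can be missing, a time-based window may be closer to the intent. The sketch below contrasts the two on a made-up series with a one-week gap.

```python
import pandas as pd

# Hypothetical daily counts with a gap between 3 Jan and 10 Jan
daily = pd.Series(
    [10, 20, 30, 40],
    index=pd.to_datetime(["2017-01-01", "2017-01-02", "2017-01-03", "2017-01-10"]),
)

# Row-based: always averages the last 3 rows, regardless of the dates
row_based = daily.rolling(window=3, min_periods=1).mean()

# Time-based: averages only rows within the last 3 calendar days (needs a datetime index)
time_based = daily.rolling("3D", min_periods=1).mean()

print(pd.DataFrame({"orders": daily, "row_based": row_based, "time_based": time_based}))
```

On 10 January the row-based average still includes 2 and 3 January, while the time-based average only sees that day's value.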

### 2.3 Seasonal Analysis and Patterns

In [None]:
# Comprehensive seasonal analysis
# Python equivalent of our SQL seasonal pattern detection

# Add temporal features for seasonal analysis
order_revenue['month'] = order_revenue['order_purchase_timestamp'].dt.month
order_revenue['quarter'] = order_revenue['order_purchase_timestamp'].dt.quarter
order_revenue['year'] = order_revenue['order_purchase_timestamp'].dt.year
order_revenue['month_name'] = order_revenue['order_purchase_timestamp'].dt.month_name()
order_revenue['day_of_week'] = order_revenue['order_purchase_timestamp'].dt.dayofweek
order_revenue['day_name'] = order_revenue['order_purchase_timestamp'].dt.day_name()

# Monthly seasonal analysis
monthly_seasonal = order_revenue.groupby('month').agg({
    'order_id': 'nunique',
    'price': 'sum',
    'customer_id': 'nunique'
}).rename(columns={
    'order_id': 'total_orders',
    'price': 'total_revenue',
    'customer_id': 'unique_customers'
})

# Calculate seasonal indices (each month relative to the average month, where 100 = average)
yearly_avg_orders = monthly_seasonal['total_orders'].mean()
yearly_avg_revenue = monthly_seasonal['total_revenue'].mean()

monthly_seasonal['seasonal_index_orders'] = (monthly_seasonal['total_orders'] / yearly_avg_orders * 100).round(2)
monthly_seasonal['seasonal_index_revenue'] = (monthly_seasonal['total_revenue'] / yearly_avg_revenue * 100).round(2)

# Add month names
monthly_seasonal['month_name'] = pd.to_datetime(monthly_seasonal.index, format='%m').month_name()

print("🌿 SEASONAL PATTERN ANALYSIS")
print("=" * 40)
print(monthly_seasonal[['month_name', 'total_orders', 'total_revenue', 'seasonal_index_orders', 'seasonal_index_revenue']])

# Identify peak and low seasons
peak_month_orders = monthly_seasonal['seasonal_index_orders'].idxmax()
low_month_orders = monthly_seasonal['seasonal_index_orders'].idxmin()
peak_month_revenue = monthly_seasonal['seasonal_index_revenue'].idxmax()

print(f"\n📊 SEASONAL INSIGHTS:")
print(f"Peak Order Month: {monthly_seasonal.loc[peak_month_orders, 'month_name']} (Index: {monthly_seasonal.loc[peak_month_orders, 'seasonal_index_orders']})")
print(f"Low Order Month: {monthly_seasonal.loc[low_month_orders, 'month_name']} (Index: {monthly_seasonal.loc[low_month_orders, 'seasonal_index_orders']})")
print(f"Peak Revenue Month: {monthly_seasonal.loc[peak_month_revenue, 'month_name']} (Index: {monthly_seasonal.loc[peak_month_revenue, 'seasonal_index_revenue']})")

# Visualization
fig, axes = plt.subplots(2, 2, figsize=(15, 10))

# Monthly orders
axes[0, 0].bar(monthly_seasonal['month_name'], monthly_seasonal['total_orders'], color='skyblue')
axes[0, 0].set_title('Orders by Month')
axes[0, 0].tick_params(axis='x', rotation=45)

# Monthly revenue
axes[0, 1].bar(monthly_seasonal['month_name'], monthly_seasonal['total_revenue'], color='lightgreen')
axes[0, 1].set_title('Revenue by Month')
axes[0, 1].tick_params(axis='x', rotation=45)

# Seasonal indices
axes[1, 0].plot(monthly_seasonal.index, monthly_seasonal['seasonal_index_orders'], marker='o', color='blue')
axes[1, 0].axhline(y=100, color='red', linestyle='--', alpha=0.7, label='Average (100)')
axes[1, 0].set_title('Seasonal Index - Orders')
axes[1, 0].set_xlabel('Month')
axes[1, 0].legend()
axes[1, 0].grid(alpha=0.3)

# Day of week analysis
dow_analysis = order_revenue.groupby('day_name')['order_id'].nunique().reindex(
    ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
)
axes[1, 1].bar(dow_analysis.index, dow_analysis.values, color='orange')
axes[1, 1].set_title('Orders by Day of Week')
axes[1, 1].tick_params(axis='x', rotation=45)

plt.tight_layout()
plt.show()
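
The seasonal index is simple arithmetic: each month's total divided by the average monthly total, times 100. A tiny worked example with made-up numbers:

```python
import pandas as pd

# Hypothetical order totals for three months (keys are month numbers)
monthly_orders = pd.Series({1: 80, 2: 100, 3: 120}, name="total_orders")

# Average month = (80 + 100 + 120) / 3 = 100
average_month = monthly_orders.mean()

# Index of 100 means "an average month"; 120 means 20% above average
seasonal_index = (monthly_orders / average_month * 100).round(2)

print(seasonal_index)
```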

## 👥 Part 3: Customer Cohort Analysis
Advanced time-based customer analytics for understanding retention and lifetime value.

In [None]:
# Customer cohort analysis based on first purchase month
# Python equivalent of our SQL cohort analysis

# Identify first purchase date for each customer
customer_first_purchase = orders_df.groupby('customer_id')['order_purchase_timestamp'].min().reset_index()
customer_first_purchase.columns = ['customer_id', 'first_purchase_date']
customer_first_purchase['cohort_month'] = customer_first_purchase['first_purchase_date'].dt.to_period('M')

# Merge back with all orders to track customer activity
orders_with_cohort = orders_df.merge(customer_first_purchase, on='customer_id')
orders_with_cohort['order_period'] = orders_with_cohort['order_purchase_timestamp'].dt.to_period('M')

# Calculate months since first purchase
orders_with_cohort['months_since_first_purchase'] = (
    orders_with_cohort['order_period'] - orders_with_cohort['cohort_month']
).apply(lambda x: x.n)

# Create cohort table
cohort_data = orders_with_cohort.groupby(['cohort_month', 'months_since_first_purchase'])['customer_id'].nunique().reset_index()
cohort_data.columns = ['cohort_month', 'period_number', 'active_customers']

# Calculate cohort sizes (customers in each cohort)
cohort_sizes = customer_first_purchase.groupby('cohort_month')['customer_id'].nunique()

# Merge cohort sizes
cohort_data = cohort_data.merge(
    cohort_sizes.reset_index().rename(columns={'customer_id': 'cohort_size'}),
    on='cohort_month'
)

# Calculate retention rates
cohort_data['retention_rate'] = (cohort_data['active_customers'] / cohort_data['cohort_size'] * 100).round(2)

# Keep periods 0-12 for cohorts that start in January 2017 or later
cohort_analysis = cohort_data[
    (cohort_data['period_number'] <= 12) & 
    (cohort_data['cohort_month'] >= '2017-01')
]

print("👥 CUSTOMER COHORT ANALYSIS")
print("=" * 35)
print("Sample cohort retention data:")
print(cohort_analysis[cohort_analysis['cohort_month'] == '2017-01'].head(12))

# Create cohort retention matrix
cohort_matrix = cohort_analysis.pivot(index='cohort_month', columns='period_number', values='retention_rate')

print("\n📊 Cohort Retention Matrix (first 6 cohorts, periods 0-6):")
print(cohort_matrix.iloc[:6, :7])  # Show first 6 cohorts, first 7 periods

# Visualization
plt.figure(figsize=(15, 8))
sns.heatmap(cohort_matrix.iloc[:8, :13], annot=True, fmt='.1f', cmap='Blues', 
           cbar_kws={'label': 'Retention Rate (%)'}, vmin=0, vmax=100)
plt.title('Customer Cohort Retention Heatmap')
plt.xlabel('Period Number (Months Since First Purchase)')
plt.ylabel('Cohort Month')
plt.tight_layout()
plt.show()

# Key insights
avg_month_1_retention = cohort_matrix[1].mean()
avg_month_6_retention = cohort_matrix[6].mean()
avg_month_12_retention = cohort_matrix[12].mean()

print(f"\n📈 COHORT INSIGHTS:")
print(f"Average Month 1 Retention: {avg_month_1_retention:.1f}%")
print(f"Average Month 6 Retention: {avg_month_6_retention:.1f}%")
print(f"Average Month 12 Retention: {avg_month_12_retention:.1f}%")
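
The "months since first purchase" step can also be computed with plain year and month arithmetic, which avoids calling `.apply` on period differences. This is a sketch on a made-up table (`toy` is hypothetical), not a replacement for the cohort code above.

```python
import pandas as pd

# Hypothetical orders for two customers
toy = pd.DataFrame({
    "customer_id": ["a", "a", "a", "b", "b"],
    "order_purchase_timestamp": pd.to_datetime(
        ["2017-01-15", "2017-02-03", "2018-01-20", "2017-02-10", "2017-04-01"]
    ),
})

# transform("min") broadcasts each customer's first purchase date to all of their rows
first_purchase = toy.groupby("customer_id")["order_purchase_timestamp"].transform("min")

# Calendar-month difference: 12 * year difference + month difference
order_ts = toy["order_purchase_timestamp"]
toy["months_since_first_purchase"] = (
    (order_ts.dt.year - first_purchase.dt.year) * 12
    + (order_ts.dt.month - first_purchase.dt.month)
)

print(toy)
```

Like the period-based version, this counts calendar-month boundaries crossed, so 15 January to 3 February counts as 1 month.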

## 🌍 Nigerian Business Context Analysis
Applying our temporal analysis to Nigerian market conditions.

In [None]:
# Nigerian-specific temporal analysis
# Adapting Brazilian data patterns to Nigerian business context

# Define Nigerian seasons and holidays
def get_nigerian_season(month):
    if month in [5, 6, 7, 8, 9, 10]:
        return 'Rainy Season'
    else:
        return 'Dry Season'

def get_nigerian_business_period(month):
    if month == 12:
        return 'Christmas Season'
    elif month == 10:
        return 'Independence Month'
    elif month == 9:
        return 'Back-to-School'
    elif month in [4, 5]:
        return 'Easter/Salary Season'
    else:
        return 'Regular Period'

# Apply Nigerian context to our data
nigeria_analysis = order_revenue.copy()
nigeria_analysis['nigerian_season'] = nigeria_analysis['month'].apply(get_nigerian_season)
nigeria_analysis['business_period'] = nigeria_analysis['month'].apply(get_nigerian_business_period)

# Seasonal impact on delivery performance
seasonal_delivery = delivered_orders.copy()
seasonal_delivery['nigerian_season'] = seasonal_delivery['order_purchase_timestamp'].dt.month.apply(get_nigerian_season)

seasonal_performance = seasonal_delivery.groupby('nigerian_season').agg({
    'delivery_days': ['mean', 'median', 'count'],
    'is_late': 'mean'
})

seasonal_performance.columns = ['avg_delivery_days', 'median_delivery_days', 'total_orders', 'late_delivery_rate']
# Convert the late share to a percentage before rounding so precision is not lost
seasonal_performance['late_delivery_rate'] = seasonal_performance['late_delivery_rate'] * 100
seasonal_performance = seasonal_performance.round(2)

print("🌍 NIGERIAN SEASONAL DELIVERY ANALYSIS")
print("=" * 40)
print(seasonal_performance)

# Business period revenue analysis
business_period_analysis = nigeria_analysis.groupby('business_period').agg({
    'order_id': 'nunique',
    'price': 'sum'
}).rename(columns={
    'order_id': 'total_orders',
    'price': 'total_revenue'
})

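# Calculate seasonal multipliers for business planning
# Periods cover different numbers of calendar months ('Regular Period' spans 7, Christmas only 1),
# so compare orders per calendar month rather than raw totals
months_in_period = nigeria_analysis.groupby('business_period')['month'].nunique()
avg_orders_per_month = business_period_analysis['total_orders'].sum() / months_in_period.sum()
business_period_analysis['orders_per_month'] = (
    business_period_analysis['total_orders'] / months_in_period
).round(1)
business_period_analysis['demand_multiplier'] = (
    business_period_analysis['total_orders'] / months_in_period / avg_orders_per_month
).round(2)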

# Add business recommendations
def get_business_recommendation(multiplier):
    if multiplier >= 1.2:
        return 'Increase inventory 20%+'
    elif multiplier >= 1.1:
        return 'Increase inventory 10%+'
    elif multiplier <= 0.8:
        return 'Reduce inventory 20%'
    else:
        return 'Maintain normal levels'

business_period_analysis['recommendation'] = business_period_analysis['demand_multiplier'].apply(
    get_business_recommendation
)

print("\n🎯 NIGERIAN BUSINESS PERIOD ANALYSIS")
print("=" * 40)
print(business_period_analysis)

# Visualization
fig, axes = plt.subplots(2, 2, figsize=(15, 10))

# Seasonal delivery performance
seasonal_performance['avg_delivery_days'].plot(kind='bar', ax=axes[0, 0], color=['skyblue', 'orange'])
axes[0, 0].set_title('Average Delivery Days by Nigerian Season')
axes[0, 0].set_ylabel('Days')
axes[0, 0].tick_params(axis='x', rotation=45)

# Business period demand
business_period_analysis['demand_multiplier'].plot(kind='bar', ax=axes[0, 1], color='lightgreen')
axes[0, 1].axhline(y=1, color='red', linestyle='--', alpha=0.7, label='Average Demand')
axes[0, 1].set_title('Demand Multiplier by Business Period')
axes[0, 1].set_ylabel('Multiplier')
axes[0, 1].legend()
axes[0, 1].tick_params(axis='x', rotation=45)

# Monthly pattern with Nigerian context
monthly_nigeria = nigeria_analysis.groupby('month')['order_id'].nunique()
colors = ['red' if month in [12, 10, 9] else 'skyblue' for month in monthly_nigeria.index]
monthly_nigeria.plot(kind='bar', ax=axes[1, 0], color=colors)
axes[1, 0].set_title('Monthly Orders (Red = Nigerian Holiday Months)')
axes[1, 0].set_xlabel('Month')
axes[1, 0].set_ylabel('Orders')

# Seasonal order distribution
seasonal_orders = nigeria_analysis.groupby('nigerian_season')['order_id'].nunique()
seasonal_orders.plot(kind='pie', ax=axes[1, 1], autopct='%1.1f%%')
axes[1, 1].set_title('Order Distribution by Nigerian Season')
axes[1, 1].set_ylabel('')

plt.tight_layout()
plt.show()

print("\n🎯 KEY NIGERIAN BUSINESS INSIGHTS:")
print(f"• Rainy season shows {'longer' if seasonal_performance.loc['Rainy Season', 'avg_delivery_days'] > seasonal_performance.loc['Dry Season', 'avg_delivery_days'] else 'shorter'} delivery times")
print(f"• Christmas season shows {business_period_analysis.loc['Christmas Season', 'demand_multiplier']:.1f}x normal demand")
print(f"• Independence month shows {business_period_analysis.loc['Independence Month', 'demand_multiplier']:.1f}x normal demand")
print(f"• Recommendation for Christmas: {business_period_analysis.loc['Christmas Season', 'recommendation']}")
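
Month-to-label rules like the ones above can also be written as a lookup table and applied with `Series.map`. The sketch below reuses the same month assignments as `get_nigerian_business_period`; the dictionary name `period_by_month` is made up for this example.

```python
import pandas as pd

# Same month-to-period rules as the function above, expressed as a dictionary
period_by_month = {
    12: "Christmas Season",
    10: "Independence Month",
    9: "Back-to-School",
    4: "Easter/Salary Season",
    5: "Easter/Salary Season",
}

# Hypothetical month numbers
months = pd.Series([1, 4, 5, 9, 10, 12])

# Months missing from the dictionary map to NaN, which fillna turns into the default label
business_period = months.map(period_by_month).fillna("Regular Period")

print(pd.DataFrame({"month": months, "business_period": business_period}))
```

A dictionary keeps the calendar rules in one place, which can make them easier to review with business stakeholders.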

## 🎯 SQL vs Python DateTime Comparison
Summary of equivalent operations and when to use each tool.

In [None]:
# Comparison table of SQL vs Python datetime operations
comparison_data = {
    'Operation': [
        'Extract Month',
        'Extract Year',
        'Date Truncation',
        'Date Arithmetic',
        'Period Comparison',
        'Rolling Average',
        'Date Formatting',
        'Null Handling',
        'Timezone Conversion',
        'Seasonal Analysis'
    ],
    'SQL Approach': [
        'EXTRACT(MONTH FROM date)',
        'EXTRACT(YEAR FROM date)',
        "DATE_TRUNC('month', date)",
        'date1 - date2',
        'LAG(value) OVER (ORDER BY date)',
        'AVG(value) OVER (ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)',
        "TO_CHAR(date, 'YYYY-MM-DD')",
        'WHERE date IS NOT NULL',
        "AT TIME ZONE 'UTC'",
        'CASE WHEN EXTRACT(MONTH...) THEN...'
    ],
    'Python Pandas Approach': [
        'df["date"].dt.month',
        'df["date"].dt.year',
        'df["date"].dt.to_period("M")',
        'df["date1"] - df["date2"]',
        'df["value"].shift(1)',
        'df["value"].rolling(7).mean()',
        'df["date"].dt.strftime("%Y-%m-%d")',
        'df["date"].notna()',
        'df["date"].dt.tz_convert("UTC")',
        'df["month"].apply(seasonal_function)'
    ],
    'Best Use Case': [
        'Both equally effective',
        'Both equally effective',
        'Python better for analysis',
        'Both handle well',
        'Python more intuitive',
        'Python much better',
        'Python more flexible',
        'Both handle well',
        'Python more comprehensive',
        'Python better for complex logic'
    ]
}

comparison_df = pd.DataFrame(comparison_data)
print("🔄 SQL vs PYTHON DATETIME OPERATIONS COMPARISON")
print("=" * 60)
print(comparison_df.to_string(index=False))

print("\n🎯 WHEN TO USE EACH TOOL:")
print("\nSQL DateTime Operations:")
print("✅ Database-level filtering and aggregation")
print("✅ Large dataset initial processing")
print("✅ Simple date extractions and basic arithmetic")
print("✅ Integration with existing database workflows")

print("\nPython DateTime Operations:")
print("✅ Complex time series analysis")
print("✅ Advanced statistical operations")
print("✅ Data visualization and dashboards")
print("✅ Machine learning and forecasting")
print("✅ Flexible data manipulation and transformation")

print("\n💡 BEST PRACTICE:")
print("Use SQL for initial data extraction and basic filtering,")
print("then use Python for advanced analysis and visualization!")
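
Two rows of the comparison table deserve a concrete illustration: date truncation and timezone conversion. The sketch below uses two made-up timestamps and assumes they were recorded in Lagos local time (an assumption for this example; the notebook's data carries no timezone). Note that `tz_convert` only works on timezone-aware values, so naive timestamps need `tz_localize` first.

```python
import pandas as pd

# Hypothetical naive timestamps (no timezone attached)
ts = pd.Series(pd.to_datetime(["2017-03-15 09:30", "2017-12-31 23:30"]))

# Date truncation to the month, similar to DATE_TRUNC('month', date)
month_start = ts.dt.to_period("M").dt.to_timestamp()

# Assumption: the timestamps are Lagos local time, so declare that first...
lagos = ts.dt.tz_localize("Africa/Lagos")

# ...then convert to UTC (Lagos is UTC+1 all year)
utc = lagos.dt.tz_convert("UTC")

print(pd.DataFrame({"original": ts, "month_start": month_start, "utc": utc}))
```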

## 📝 Summary and Next Steps

### 🎯 Key Achievements
You've now mastered Python datetime operations equivalent to Thursday's SQL analysis:

1. **Date Component Extraction**: Using `.dt` accessor for business reporting
2. **Date Arithmetic**: Calculating delivery times and business metrics
3. **Time Series Analysis**: Rolling windows, period comparisons, and trend analysis
4. **Seasonal Analysis**: Identifying business patterns and seasonal trends
5. **Cohort Analysis**: Understanding customer retention and lifetime value
6. **Nigerian Context**: Adapting analysis for local business conditions

### 🔄 SQL + Python Integration
The real power comes from combining both tools:
- **Extract** data efficiently with SQL
- **Analyze** deeply with Python
- **Visualize** insights with Python libraries
- **Store** results back to database with SQL
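
The extract, analyze, and store steps can be sketched end to end with an in-memory SQLite database. The table names (`orders`, `monthly_revenue`) and the sample rows are made up for illustration; a production database would use its own connection and schema.

```python
import sqlite3
import pandas as pd

# Set up a throwaway in-memory database with a hypothetical orders table
conn = sqlite3.connect(":memory:")
pd.DataFrame({
    "order_id": ["o1", "o2", "o3"],
    "order_purchase_timestamp": ["2017-01-05", "2017-01-20", "2017-02-02"],
    "price": [50.0, 70.0, 30.0],
}).to_sql("orders", conn, index=False)

# Extract: filter in SQL and parse the timestamp column on the way in
extracted = pd.read_sql_query(
    "SELECT order_id, order_purchase_timestamp, price FROM orders WHERE price >= 40",
    conn,
    parse_dates=["order_purchase_timestamp"],
)

# Analyze: monthly revenue in pandas (month stored as text so SQLite can hold it)
month = extracted["order_purchase_timestamp"].dt.to_period("M").astype(str)
monthly = (
    extracted.groupby(month)["price"]
    .sum()
    .rename_axis("month")
    .reset_index(name="total_revenue")
)

# Store: write the result back as a new table, then read it to confirm
monthly.to_sql("monthly_revenue", conn, index=False, if_exists="replace")
stored = pd.read_sql_query("SELECT * FROM monthly_revenue", conn)
conn.close()

print(stored)
```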

### 🚀 Upcoming Applications
These datetime skills will be essential for:
- **Google Looker Studio**: Creating time-based dashboards
- **Streamlit**: Building interactive temporal analytics apps
- **Advanced Analytics**: Time series forecasting and trend prediction
- **Business Intelligence**: Automated reporting and KPI tracking

### 📊 Your Next Challenge
In the exercises, you'll apply these concepts to real business scenarios, comparing your Python results with Thursday's SQL findings!