# Function 9: Analyze Temporal Patterns 📈

**🤖 AI-Enhanced Learning: Time Series Analysis with Pandas**

In this notebook, you'll learn how to build the `analyze_temporal_patterns()` function using AI assistance. This function analyzes temporal patterns in environmental data, calculating trends, seasonal patterns, and statistical summaries over time periods.

## 🎯 What This Function Does
- Converts the date column to pandas datetime objects
- Calculates overall statistics and monthly (seasonal) patterns
- Groups values by day of week and by station
- Compares yearly averages to estimate a long-term trend when the data spans multiple years
- Returns the results as a dictionary and prints a summary report

## 🤖 AI Learning Objectives
By the end of this notebook, you will:
1. **Use Copilot CHAT** to understand pandas datetime functionality
2. **Use AGENT mode** to implement groupby operations and time resampling
3. **Use EDIT mode** to add trend analysis and statistical calculations
4. **Learn about** datetime indexing, time series analysis, and temporal aggregation

## 🔧 Function Signature
```python
def analyze_temporal_patterns(df, date_column='date', value_column='temperature', 
                            groupby_column='station_id'):
    """
    Args:
        df (pandas.DataFrame): Environmental data with datetime information
        date_column (str): Name of date/datetime column
        value_column (str): Name of value column to analyze over time
        groupby_column (str): Column to group by for pattern analysis
    
    Returns:
        dict: Temporal analysis results with trends, patterns, and statistics
    """
```

## 🚀 Let's Discover Time Patterns!


## 🚀 Step 1: Import Libraries and Prepare Data

Let's start by importing pandas and preparing time series data:


In [None]:
import pandas as pd
import numpy as np
from datetime import datetime, timedelta

print(f"✅ Pandas version: {pd.__version__}")

# Load our environmental data
df = pd.read_csv('../data/temperature_readings.csv')

# Create sample temporal data if date column doesn't exist
if 'date' not in df.columns:
    # Create a range of dates for demonstration
    start_date = datetime(2023, 1, 1)
    dates = [start_date + timedelta(days=i) for i in range(len(df))]
    df['date'] = dates
    print("📅 Created sample date column for temporal analysis")

print(f"📊 Data shape: {df.shape}")
print(f"🗓️  Date range: {df['date'].min()} to {df['date'].max()}")
print(f"📈 Sample data with dates:")
print(df.head())


## 📅 Step 2: Understanding Pandas Datetime Operations

**Temporal pattern analysis** is crucial for environmental data because:

- **Seasonal patterns**: Temperature varies by month/season
- **Daily cycles**: Weather patterns change throughout the day
- **Long-term trends**: Climate change over years
- **Anomaly detection**: Unusual weather events

### 🕰️ **Key Pandas Datetime Functions:**

1. **`pd.to_datetime()`**: Convert strings to datetime objects
2. **`.dt.hour/.dt.month/.dt.year`**: Extract time components
3. **`.groupby(df['date'].dt.month)`**: Group by time periods
4. **`.resample()`**: Aggregate data over time periods  
5. **`.rolling()`**: Calculate moving averages
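
Items 4 and 5 can be tried on a small synthetic series. The sketch below is only an illustration: the 90 daily readings, the `demo` frame, and the 7-day window are assumptions made up for this example, not part of the assignment dataset.

```python
import numpy as np
import pandas as pd

# Synthetic daily readings (assumed data, for illustration only)
dates = pd.date_range("2023-01-01", periods=90, freq="D")
demo = pd.DataFrame({"date": dates, "temperature": np.linspace(0.0, 17.8, 90)})

# .resample() works on a DatetimeIndex, so move the dates into the index
indexed = demo.set_index("date")

# Aggregate daily values into monthly means ("MS" labels each bin by month start)
monthly_mean = indexed["temperature"].resample("MS").mean()

# 7-day moving average; the first 6 values are NaN because the window is incomplete
rolling_7d = indexed["temperature"].rolling(window=7).mean()

print(monthly_mean)
print(rolling_7d.dropna().head())
```

`.resample()` reduces the series to one value per period, while `.rolling()` keeps one value per original row, which is why the two results have different lengths.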


In [None]:
# Step 3: Convert date column to datetime and explore temporal components
print("📅 Converting date column to pandas datetime")

# Convert date column to datetime if it's not already
df['date'] = pd.to_datetime(df['date'])

# Extract temporal components
df['year'] = df['date'].dt.year
df['month'] = df['date'].dt.month
df['day'] = df['date'].dt.day
df['weekday'] = df['date'].dt.day_name()

print(f"✅ Date conversion complete!")
print(f"📊 Data types:")
print(df[['date', 'year', 'month', 'day', 'weekday', 'temperature']].dtypes)

print(f"\n🗓️ Temporal components sample:")
print(df[['date', 'year', 'month', 'day', 'weekday', 'temperature']].head())


## 📊 Step 4: Analyzing Temporal Patterns

Let's analyze different types of temporal patterns in our environmental data:


In [None]:
# Example 1: Monthly patterns - seasonal analysis
print("🌅 Analyzing monthly temperature patterns")

monthly_stats = df.groupby('month')['temperature'].agg(['mean', 'min', 'max', 'count'])
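# Label each row by its actual month number so gaps or a non-January start stay correct
month_names = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
               'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
monthly_stats['month_name'] = [month_names[m - 1] for m in monthly_stats.index]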

print("📊 Monthly temperature statistics:")
print(monthly_stats)

# Find warmest and coolest months
warmest_month = monthly_stats['mean'].idxmax()
coolest_month = monthly_stats['mean'].idxmin()

print(f"\n🔥 Warmest month: {warmest_month} ({monthly_stats.loc[warmest_month, 'mean']:.1f}°C average)")
print(f"❄️  Coolest month: {coolest_month} ({monthly_stats.loc[coolest_month, 'mean']:.1f}°C average)")
print(f"🌡️  Temperature range: {monthly_stats['mean'].max() - monthly_stats['mean'].min():.1f}°C")


In [None]:
# Step 5: Building the Complete Temporal Analysis Function
def analyze_temporal_patterns(df, date_column='date', value_column='temperature', groupby_column='station_id'):
    """
    Analyze temporal patterns in environmental data.
    
    Args:
        df (pandas.DataFrame): Environmental data with datetime information
        date_column (str): Name of date/datetime column
        value_column (str): Name of value column to analyze over time  
        groupby_column (str): Column to group by for pattern analysis
    
    Returns:
        dict: Temporal analysis results with trends, patterns, and statistics
    """
    print("=" * 50)
    print("ANALYZING TEMPORAL PATTERNS")
    print("=" * 50)
    
    # Validate inputs
    if date_column not in df.columns:
        print(f"❌ Error: Date column '{date_column}' not found")
        return {}
        
    if value_column not in df.columns:
        print(f"❌ Error: Value column '{value_column}' not found")
        return {}
    
    # Make a copy and ensure date column is datetime
    analysis_df = df.copy()
    analysis_df[date_column] = pd.to_datetime(analysis_df[date_column])
    
    print(f"📅 Date range: {analysis_df[date_column].min()} to {analysis_df[date_column].max()}")
    print(f"📊 Analyzing {len(analysis_df)} records")
    
    # Extract temporal components
    analysis_df['year'] = analysis_df[date_column].dt.year
    analysis_df['month'] = analysis_df[date_column].dt.month
    analysis_df['day_of_week'] = analysis_df[date_column].dt.dayofweek
    analysis_df['day_name'] = analysis_df[date_column].dt.day_name()
    
    results = {}
    
    # 1. Overall statistics
    results['overall_stats'] = {
        'mean': analysis_df[value_column].mean(),
        'min': analysis_df[value_column].min(),
        'max': analysis_df[value_column].max(),
        'std': analysis_df[value_column].std(),
        'count': len(analysis_df)
    }
    
    print(f"\n📈 Overall {value_column} statistics:")
    print(f"   Mean: {results['overall_stats']['mean']:.2f}")
    print(f"   Range: {results['overall_stats']['min']:.2f} to {results['overall_stats']['max']:.2f}")
    print(f"   Std Dev: {results['overall_stats']['std']:.2f}")
    
    # 2. Monthly patterns
    monthly_patterns = analysis_df.groupby('month')[value_column].agg(['mean', 'min', 'max', 'count'])
    results['monthly_patterns'] = monthly_patterns.to_dict('index')
    
    warmest_month = monthly_patterns['mean'].idxmax()
    coolest_month = monthly_patterns['mean'].idxmin()
    
    results['seasonal_summary'] = {
        'warmest_month': int(warmest_month),
        'warmest_temp': float(monthly_patterns.loc[warmest_month, 'mean']),
        'coolest_month': int(coolest_month),
        'coolest_temp': float(monthly_patterns.loc[coolest_month, 'mean']),
        'seasonal_range': float(monthly_patterns['mean'].max() - monthly_patterns['mean'].min())
    }
    
    print(f"\n🌅 Seasonal patterns:")
    print(f"   Warmest month: {warmest_month} ({results['seasonal_summary']['warmest_temp']:.1f}°C)")
    print(f"   Coolest month: {coolest_month} ({results['seasonal_summary']['coolest_temp']:.1f}°C)")
    print(f"   Seasonal range: {results['seasonal_summary']['seasonal_range']:.1f}°C")
    
    # 3. Day of week patterns
    if analysis_df['day_of_week'].nunique() > 1:
        dow_patterns = analysis_df.groupby(['day_of_week', 'day_name'])[value_column].mean().reset_index()
        results['day_of_week_patterns'] = dow_patterns.to_dict('records')
        
        print(f"\n📅 Day of week patterns:")
        for _, row in dow_patterns.iterrows():
            print(f"   {row['day_name']}: {row[value_column]:.2f}")
    
    # 4. Station-specific patterns (if groupby column exists and has multiple values)
    if (groupby_column in analysis_df.columns and 
        analysis_df[groupby_column].nunique() > 1):
        
        station_patterns = analysis_df.groupby(groupby_column)[value_column].agg(['mean', 'min', 'max', 'count'])
        results['station_patterns'] = station_patterns.to_dict('index')
        
        # Find stations with extreme values
        hottest_station = station_patterns['mean'].idxmax()
        coolest_station = station_patterns['mean'].idxmin()
        
        results['station_summary'] = {
            'hottest_station': str(hottest_station),
            'hottest_temp': float(station_patterns.loc[hottest_station, 'mean']),
            'coolest_station': str(coolest_station),
            'coolest_temp': float(station_patterns.loc[coolest_station, 'mean'])
        }
        
        print(f"\n🌡️  Station patterns:")
        print(f"   Hottest station: {hottest_station} ({results['station_summary']['hottest_temp']:.1f}°C)")
        print(f"   Coolest station: {coolest_station} ({results['station_summary']['coolest_temp']:.1f}°C)")
    
    # 5. Temporal trends (if multiple years)
    unique_years = analysis_df['year'].unique()
    if len(unique_years) > 1:
        yearly_trends = analysis_df.groupby('year')[value_column].mean()
        results['yearly_trends'] = yearly_trends.to_dict()
        
        # Calculate trend direction
        first_year_avg = yearly_trends.iloc[0]
        last_year_avg = yearly_trends.iloc[-1]
        trend_change = last_year_avg - first_year_avg
        
        results['trend_analysis'] = {
            'first_year': int(yearly_trends.index[0]),
            'last_year': int(yearly_trends.index[-1]),
            'first_year_avg': float(first_year_avg),
            'last_year_avg': float(last_year_avg),
            'total_change': float(trend_change),
            # A change of exactly zero is neither warming nor cooling
            'trend_direction': ('warming' if trend_change > 0
                                else 'cooling' if trend_change < 0
                                else 'stable')
        }
        
        print(f"\n📊 Long-term trends:")
        print(f"   {yearly_trends.index[0]}: {first_year_avg:.2f}°C")
        print(f"   {yearly_trends.index[-1]}: {last_year_avg:.2f}°C")
        print(f"   Change: {trend_change:+.2f}°C ({results['trend_analysis']['trend_direction']})")
    
    print(f"\n✅ Temporal analysis complete!")
    
    return results

# Test the function
print("🧪 Testing temporal analysis function:")
temporal_results = analyze_temporal_patterns(df, 'date', 'temperature', 'station_id')

print(f"\n📋 Results summary:")
print(f"Number of analysis categories: {len(temporal_results)}")
for key in temporal_results.keys():
    print(f"  - {key}")


## 🎯 **Your Assignment Task**

### **✅ STEP-BY-STEP INSTRUCTIONS:**

#### **1. COPY YOUR WORKING FUNCTION**
- **FROM**: The complete function in the cell above
- **TO**: `src/pandas_basics.py` 
- **REPLACE**: All the TODO comments in the `analyze_temporal_patterns()` function

#### **2. TEST YOUR IMPLEMENTATION:**
```bash
# Test just this function
uv run pytest tests/test_pandas_basics.py::test_analyze_temporal_patterns -v
```

#### **3. ⚠️ COMMON MISTAKES TO AVOID:**
- ❌ **Forgetting `pd.to_datetime()`** for date conversion
- ❌ **Not handling missing columns** gracefully
- ❌ **Accessing `.dt` on non-datetime columns** (raises `AttributeError`)
- ✅ **Do validate data types** before datetime operations
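
One way to guard against these mistakes is to check the column and its dtype before touching `.dt`. This is a minimal sketch, not the assignment's required approach: `ensure_datetime` is a hypothetical helper, the `raw` frame is made-up data, and raising `KeyError` is a design choice that differs from the notebook function, which prints an error and returns `{}`.

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype

# Made-up raw data: dates arrive as strings, and one of them is malformed
raw = pd.DataFrame({
    "date": ["2023-01-01", "2023-01-02", "not a date"],
    "temperature": [1.5, 2.0, 2.5],
})

def ensure_datetime(frame, column):
    """Hypothetical helper: return a copy of `frame` with `column` as datetime."""
    if column not in frame.columns:
        raise KeyError(f"Column '{column}' not found")
    out = frame.copy()
    if not is_datetime64_any_dtype(out[column]):
        # errors="coerce" turns unparseable values into NaT instead of raising
        out[column] = pd.to_datetime(out[column], errors="coerce")
    return out

clean = ensure_datetime(raw, "date")
print(clean["date"].dtype)
print(f"Unparseable dates: {clean['date'].isna().sum()}")

# .dt is safe now because the column is guaranteed to be datetime
print(clean["date"].dt.month)
```

Coercing bad values to `NaT` keeps the analysis running, but the count of unparseable dates is worth reporting so that silent data loss does not go unnoticed.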

---

## 🔑 Key Learning Points

- **`pd.to_datetime()`** converts strings to datetime objects for temporal analysis
- **`.dt.year/.dt.month`** extracts specific time components from datetime columns
- **Temporal grouping** reveals seasonal patterns and trends in environmental data
- **Multiple time scales** (yearly, monthly, daily) show different patterns
- **Trend analysis** identifies long-term changes in data over time
- **Dictionary results** organize complex analysis results for easy access
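
The function in this notebook estimates the trend by subtracting the first yearly average from the last one. As an alternative sketch (not what `analyze_temporal_patterns()` does), a least-squares line through the yearly means uses every year and is less sensitive to an unusual first or last year. The yearly values below are made up for illustration.

```python
import numpy as np
import pandas as pd

# Made-up yearly means, for illustration only
yearly_means = pd.Series(
    [14.1, 14.3, 14.2, 14.6, 14.8],
    index=[2019, 2020, 2021, 2022, 2023],
)

# Use years since the first observation as x to keep the fit well conditioned
years_since_start = yearly_means.index.to_numpy(dtype=float) - yearly_means.index[0]

# Fit value = slope * years_since_start + intercept (deg=1 is a straight line)
slope, intercept = np.polyfit(years_since_start, yearly_means.to_numpy(), deg=1)

# First-versus-last difference, the approach used in the notebook function
endpoint_change = yearly_means.iloc[-1] - yearly_means.iloc[0]

print(f"Fitted slope: {slope:+.3f} per year")
print(f"First-to-last change: {endpoint_change:+.2f}")
```

The slope is expressed per year, while the endpoint difference covers the whole period, so the two numbers are related but not directly comparable.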

## 🚀 Congratulations!

You've completed the final function in the pandas assignment! 🎉

**Final steps:**
1. **Test all functions**: `uv run pytest tests/ -v`
2. **Complete your reflection**: Write `AI_LEARNING_REFLECTION.md` 
3. **Submit your assignment**: Push to your repository

**Remember: You've learned professional-grade pandas skills that are used daily by environmental scientists, GIS professionals, and data analysts worldwide! 🌍📊**
