# 📊 Part 2: Monthly Climatology Calculation

## Climate Model Validation Workshop
**Converting Daily Data to Monthly Climatology**  
**Study Area:** Amman-Zarqa Basin, Jordan  
**Models:** 6 RICCAR Climate Models  
**Period:** 1990-2014

---

## 📚 Learning Objectives
By the end of this notebook, you will:
- ✅ Convert daily temperature data to monthly climatology
- ✅ Understand seasonal temperature patterns
- ✅ Compare climate models' seasonal cycles
- ✅ Create publication-quality climatology visualizations
- ✅ Prepare data for model validation

---

## ⚙️ Setup: Load Libraries and Check Data

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import os
import glob

print("📚 Libraries loaded successfully!")

# Check if Part 1 data exists
part1_path = "workshop_output/Part1_Station_Extraction"
if os.path.exists(part1_path):
    files_count = len(glob.glob(os.path.join(part1_path, "*.csv")))
    print(f"✅ Found {files_count} CSV files from Part 1")
    print(f"📁 Data location: {part1_path}")
else:
    print("❌ Part 1 data not found. Please run Part 1 first!")

## 📂 Load Part 1 Results and Prepare for Processing

In [None]:
# Setup paths and station info
INPUT_PATH = "workshop_output/Part1_Station_Extraction"
OUTPUT_PATH = "workshop_output/Part2_Monthly_Climatology"

# Create output directory
os.makedirs(OUTPUT_PATH, exist_ok=True)

# Station information
stations = {
    'AL0019': 'Amman Airport',
    'AL0035': 'Zarqa Station', 
    'AL0059': 'Russeifa Station'
}

print(f"🎯 PART 2: MONTHLY CLIMATOLOGY CALCULATION")
print(f"📁 Input: {INPUT_PATH}")
print(f"📁 Output: {OUTPUT_PATH}")
print(f"📍 Stations: {list(stations.keys())}")

## 📊 Calculate Monthly Climatology from Daily Data

In [None]:
def calculate_monthly_climatology():
    """Calculate monthly climatology from daily temperature files"""
    
    print(f"📊 CALCULATING MONTHLY CLIMATOLOGY")
    print("-" * 35)
    
    # Find all daily temperature files
    temp_files = glob.glob(os.path.join(INPUT_PATH, "*_daily_temps.csv"))
    
    if not temp_files:
        print("❌ No daily temperature files found!")
        return None
    
    print(f"📁 Found {len(temp_files)} daily temperature files")
    
    # Storage for climatology data
    climatology_data = []
    
    # Process each file
    for file_path in temp_files:
        filename = os.path.basename(file_path)
        # Expected name pattern: <model>_<station>_daily_temps.csv
        model_name, station_id = filename.split('_')[:2]
        
        print(f"  🔄 Processing {model_name} - {station_id}")
        
        # Read daily data
        df = pd.read_csv(file_path)
        df['Date'] = pd.to_datetime(df['Date'])
        df['Month'] = df['Date'].dt.month
        
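        # Calculate monthly climatology: pool each calendar month across all years.
        # Mean_Temp is the climatological mean; Min_Temp, Max_Temp and Std_Temp
        # describe the spread of the daily values within that calendar month.
        monthly_clim = df.groupby('Month')['Temperature_C'].agg(
            Mean_Temp='mean',
            Min_Temp='min',
            Max_Temp='max',
            Std_Temp='std'
        ).round(2)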
        
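        # Add month names, looked up from the month number so that a file
        # missing some calendar months does not raise a length-mismatch error
        month_names = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                      'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
        monthly_clim['Month_Name'] = [month_names[m - 1] for m in monthly_clim.index]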
        
        # Add identification columns
        monthly_clim['Model'] = model_name
        monthly_clim['Station'] = station_id
        monthly_clim['Station_Name'] = stations[station_id]
        monthly_clim = monthly_clim.reset_index()
        
        climatology_data.append(monthly_clim)
        
        print(f"    ✅ Calculated climatology for {len(monthly_clim)} months")
    
    # Combine all climatology data
    all_climatology = pd.concat(climatology_data, ignore_index=True)
    
    print(f"\n✅ Monthly climatology calculated for {len(climatology_data)} model-station combinations")
    
    return all_climatology

# Execute climatology calculation
climatology_df = calculate_monthly_climatology()

if climatology_df is not None:
    print(f"\n📊 Climatology Summary:")
    print(f"  Total records: {len(climatology_df)}")
    print(f"  Models: {climatology_df['Model'].nunique()}")
    print(f"  Stations: {climatology_df['Station'].nunique()}")
    print(f"  Temperature range: {climatology_df['Mean_Temp'].min():.1f}°C to {climatology_df['Mean_Temp'].max():.1f}°C")
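
To see what the grouping step does in isolation, here is a minimal sketch on synthetic data. The two-year series, the cosine-shaped seasonal cycle, and the names `toy_daily` and `toy_clim` are all assumptions made for illustration; they are not workshop data. The point is that a climatology pools every January together, every February together, and so on, regardless of year.

```python
import numpy as np
import pandas as pd

# Hypothetical daily series covering two full years (not workshop data)
dates = pd.date_range("1990-01-01", "1991-12-31", freq="D")

# Assumed seasonal cycle: coldest in mid-January, warmest in mid-July
day_of_year = dates.dayofyear.to_numpy()
temps = 17.0 - 9.0 * np.cos(2 * np.pi * (day_of_year - 15) / 365.25)

toy_daily = pd.DataFrame({"Date": dates, "Temperature_C": temps})
toy_daily["Month"] = toy_daily["Date"].dt.month

# Group by calendar month only, so both Januaries land in the same group
toy_clim = toy_daily.groupby("Month")["Temperature_C"].agg(
    Mean_Temp="mean", Min_Temp="min", Max_Temp="max", Std_Temp="std"
).round(2)

print(toy_clim)
```

Each of the 12 rows summarizes the daily values from both years, which is why the result has 12 rows rather than 24.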

## 💾 Save Climatology Results

In [None]:
def save_climatology_results(climatology_df):
    """Save climatology results to files"""
    
    print(f"💾 SAVING CLIMATOLOGY RESULTS")
    print("-" * 30)
    
    # Save complete climatology
    complete_file = os.path.join(OUTPUT_PATH, "monthly_climatology_all_models.xlsx")
    climatology_df.to_excel(complete_file, index=False)
    print(f"✅ Complete climatology saved: {os.path.basename(complete_file)}")
    
    # Create model-specific files
    for model in climatology_df['Model'].unique():
        model_data = climatology_df[climatology_df['Model'] == model]
        model_file = os.path.join(OUTPUT_PATH, f"climatology_{model}.csv")
        model_data.to_csv(model_file, index=False)
    
    print(f"✅ Individual model files saved: {len(climatology_df['Model'].unique())} files")
    
    # Create summary table (wide format)
    summary_data = []
    for model in climatology_df['Model'].unique():
        for station in climatology_df['Station'].unique():
            data = climatology_df[(climatology_df['Model'] == model) & 
                                (climatology_df['Station'] == station)]
            
            if not data.empty:
                row = {'Model': model, 'Station': station, 'Station_Name': stations[station]}
                for _, month_data in data.iterrows():
                    row[month_data['Month_Name']] = month_data['Mean_Temp']
                summary_data.append(row)
    
    summary_df = pd.DataFrame(summary_data)
    summary_file = os.path.join(OUTPUT_PATH, "climatology_summary_table.xlsx")
    summary_df.to_excel(summary_file, index=False)
    print(f"✅ Summary table saved: {os.path.basename(summary_file)}")
    
    return summary_df

# Save the results
if climatology_df is not None:
    summary_table = save_climatology_results(climatology_df)
    print(f"\n📁 All results saved to: {OUTPUT_PATH}")
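
The wide summary table above is built with nested loops. As a possible alternative, the same long-to-wide reshaping can be sketched with `pivot_table`. The data and the names `toy_long`, `toy_wide`, `ModelA`, and `ModelB` are hypothetical placeholders, not workshop outputs.

```python
import pandas as pd

month_names = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
               "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Hypothetical long-format climatology: two models at one station
toy_long = pd.DataFrame({
    "Model": ["ModelA"] * 12 + ["ModelB"] * 12,
    "Station": ["AL0019"] * 24,
    "Month_Name": month_names * 2,
    "Mean_Temp": [8.0 + m for m in range(12)] + [9.0 + m for m in range(12)],
})

# One row per model-station pair, one column per month
toy_wide = (
    toy_long.pivot_table(index=["Model", "Station"],
                         columns="Month_Name", values="Mean_Temp")
    .reindex(columns=month_names)   # pivot sorts columns alphabetically
    .reset_index()
)
toy_wide.columns.name = None

print(toy_wide)
```

The `reindex` call matters: without it the month columns come out in alphabetical order (Apr, Aug, Dec, ...) rather than calendar order.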

## 📈 Create Monthly Climatology Visualization

In [None]:
def create_climatology_visualization(climatology_df):
    """Create comprehensive climatology visualization"""
    
    print(f"📈 CREATING CLIMATOLOGY VISUALIZATION")
    print("-" * 40)
    
    # Model colors
    model_colors = {
        'CMCC': '#1f77b4', 'CNRM': '#ff7f0e', 'EC-Earth3': '#2ca02c',
        'IPSL': '#d62728', 'MPI': '#9467bd', 'NorESM2': '#8c564b'
    }
    
    # Create subplots for each station
    fig, axes = plt.subplots(1, 3, figsize=(15, 5))
    
    stations_order = ['AL0019', 'AL0035', 'AL0059']
    
    for i, station_id in enumerate(stations_order):
        ax = axes[i]
        
        # Plot each model for this station
        for model in climatology_df['Model'].unique():
            data = climatology_df[(climatology_df['Model'] == model) & 
                                (climatology_df['Station'] == station_id)]
            
            if not data.empty:
                months = data['Month'].values
                temps = data['Mean_Temp'].values
                
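                # Use the assigned color when the model is listed; otherwise
                # fall back to matplotlib's default color cycle
                style = {'color': model_colors[model]} if model in model_colors else {}
                ax.plot(months, temps, 'o-', label=model,
                        linewidth=2, markersize=6, alpha=0.8, **style)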
        
        # Formatting
        ax.set_title(f'{station_id}\n({stations[station_id]})', fontsize=12, fontweight='bold')
        ax.set_xlabel('Month', fontsize=10)
        ax.set_ylabel('Temperature (°C)', fontsize=10)
        ax.set_xticks(range(1, 13))
        ax.set_xticklabels(['J', 'F', 'M', 'A', 'M', 'J', 'J', 'A', 'S', 'O', 'N', 'D'])
        ax.grid(True, alpha=0.3)
        
        # Add legend only to the last subplot
        if i == 2:
            ax.legend(bbox_to_anchor=(1.05, 1), loc='upper left', fontsize=8)
    
    # Main title
    fig.suptitle('Monthly Temperature Climatology by Station\n1990-2014 Average (All Models)', 
                 fontsize=14, fontweight='bold', y=0.98)
    
    plt.tight_layout()
    plt.subplots_adjust(top=0.85, right=0.85)
    
    plt.show()
    print(f"✅ Climatology visualization complete!")

# Create the visualization
if climatology_df is not None:
    create_climatology_visualization(climatology_df)
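
When comparing seasonal cycles across models, it can help to put numbers on the agreement rather than judge it only by eye. The sketch below computes a multi-model mean and the min-to-max spread per station and month. The values and the names `toy`, `ensemble`, and `ModelA` to `ModelC` are hypothetical.

```python
import pandas as pd

# Hypothetical monthly means for three models at one station, two months only
toy = pd.DataFrame({
    "Model": ["ModelA", "ModelB", "ModelC"] * 2,
    "Station": ["AL0019"] * 6,
    "Month": [1, 1, 1, 7, 7, 7],
    "Mean_Temp": [7.0, 8.0, 9.0, 25.0, 26.0, 30.0],
})

# Collapse the model dimension: one row per station and month
ensemble = (
    toy.groupby(["Station", "Month"])["Mean_Temp"]
    .agg(Ensemble_Mean="mean", Ensemble_Min="min", Ensemble_Max="max")
    .reset_index()
)
ensemble["Spread"] = ensemble["Ensemble_Max"] - ensemble["Ensemble_Min"]

print(ensemble)
```

In this toy case the models disagree more in July (spread of 5.0°C) than in January (2.0°C). On a plot, the min and max columns could be shaded with matplotlib's `ax.fill_between` around the ensemble mean line.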

## 🌡️ Calculate Seasonal Temperature Summary

In [None]:
def calculate_seasonal_summary(climatology_df):
    """Calculate seasonal temperature summary"""
    
    print(f"🌡️ CALCULATING SEASONAL SUMMARY")
    print("-" * 30)
    
    # Define seasons
    seasons = {
        'Winter': [12, 1, 2],
        'Spring': [3, 4, 5],
        'Summer': [6, 7, 8],
        'Autumn': [9, 10, 11]
    }
    
    seasonal_data = []
    
    for model in climatology_df['Model'].unique():
        for station in climatology_df['Station'].unique():
            data = climatology_df[(climatology_df['Model'] == model) & 
                                (climatology_df['Station'] == station)]
            
            if not data.empty:
                row = {
                    'Model': model,
                    'Station': station,
                    'Station_Name': stations[station]
                }
                
                for season, months in seasons.items():
                    season_temps = data[data['Month'].isin(months)]['Mean_Temp']
                    row[f'{season}_Temp'] = round(season_temps.mean(), 2)
                
                # Annual average (unweighted mean of the 12 monthly means)
                row['Annual_Temp'] = round(data['Mean_Temp'].mean(), 2)
                
                seasonal_data.append(row)
    
    seasonal_df = pd.DataFrame(seasonal_data)
    
    # Save seasonal summary
    seasonal_file = os.path.join(OUTPUT_PATH, "seasonal_temperature_summary.xlsx")
    seasonal_df.to_excel(seasonal_file, index=False)
    print(f"✅ Seasonal summary saved: {os.path.basename(seasonal_file)}")
    
    # Display summary statistics
    print(f"\n🌡️ Seasonal Temperature Ranges:")
    for season in ['Winter', 'Spring', 'Summer', 'Autumn']:
        col_name = f'{season}_Temp'
        min_temp = seasonal_df[col_name].min()
        max_temp = seasonal_df[col_name].max()
        print(f"  {season}: {min_temp:.1f}°C to {max_temp:.1f}°C")
    
    return seasonal_df

# Calculate seasonal summary
if climatology_df is not None:
    seasonal_summary = calculate_seasonal_summary(climatology_df)
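
The seasonal values above are plain averages of three monthly means, so a 28-day February counts as much as a 31-day January. The sketch below compares that with a day-weighted average. The temperatures and the names `toy`, `unweighted`, and `weighted` are hypothetical, and treating February as 28.25 days is an assumption that averages over leap years.

```python
import pandas as pd

month_to_season = {12: "Winter", 1: "Winter", 2: "Winter",
                   3: "Spring", 4: "Spring", 5: "Spring",
                   6: "Summer", 7: "Summer", 8: "Summer",
                   9: "Autumn", 10: "Autumn", 11: "Autumn"}

# Assumption: February averaged over leap and non-leap years
days_in_month = {1: 31, 2: 28.25, 3: 31, 4: 30, 5: 31, 6: 30,
                 7: 31, 8: 31, 9: 30, 10: 31, 11: 30, 12: 31}

# Hypothetical monthly climatology for one model at one station
toy = pd.DataFrame({
    "Month": list(range(1, 13)),
    "Mean_Temp": [8.0, 9.5, 12.5, 17.0, 21.5, 25.0,
                  27.0, 27.0, 24.5, 20.0, 14.0, 9.5],
})
toy["Season"] = toy["Month"].map(month_to_season)
toy["Days"] = toy["Month"].map(days_in_month)

# Unweighted: every month counts equally (the approach used above)
unweighted = toy.groupby("Season")["Mean_Temp"].mean()

# Weighted: every day counts equally
toy["Temp_x_Days"] = toy["Mean_Temp"] * toy["Days"]
sums = toy.groupby("Season")[["Temp_x_Days", "Days"]].sum()
weighted = sums["Temp_x_Days"] / sums["Days"]

print(pd.DataFrame({"Unweighted": unweighted, "Weighted": weighted}).round(3))
```

For this toy data the two approaches differ by only a few hundredths of a degree, which suggests the simpler unweighted mean is usually adequate for a workshop-level comparison.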

## 👀 Preview Results and Data Quality Check

In [None]:
# Display sample climatology data
if climatology_df is not None:
    print("📋 SAMPLE CLIMATOLOGY DATA")
    print("=" * 30)
    
    # Show first few rows
    print("\n🔍 First 10 records:")
    display(climatology_df[['Model', 'Station', 'Month_Name', 'Mean_Temp', 'Min_Temp', 'Max_Temp']].head(10))
    
    # Show summary by model
    print("\n📊 Average Temperature by Model (across all stations and months):")
    model_avg = climatology_df.groupby('Model')['Mean_Temp'].mean().round(2)
    for model, temp in model_avg.items():
        print(f"  {model}: {temp:.2f}°C")
    
    # Show summary by station
    print("\n📍 Average Temperature by Station (across all models and months):")
    station_avg = climatology_df.groupby(['Station', 'Station_Name'])['Mean_Temp'].mean().round(2)
    for (station_id, station_name), temp in station_avg.items():
        print(f"  {station_id} ({station_name}): {temp:.2f}°C")
    
    # Monthly temperature range
    print(f"\n🌡️ Monthly Temperature Patterns:")
    monthly_avg = climatology_df.groupby('Month_Name')['Mean_Temp'].agg(['min', 'max', 'mean']).round(1)
    month_order = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
    monthly_avg = monthly_avg.reindex(month_order)
    
    for month, row in monthly_avg.iterrows():
        print(f"  {month}: {row['min']:.1f}°C to {row['max']:.1f}°C (avg: {row['mean']:.1f}°C)")
else:
    print("❌ No climatology data available for preview")
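
A quality check worth adding is whether every model-station pair really has all 12 months, since the record count alone does not show where a gap is. The helper `find_incomplete_groups` below is a hypothetical sketch, not part of the workshop code, and the toy data deliberately omits December for one model.

```python
import pandas as pd

def find_incomplete_groups(clim, expected_months=12):
    """Return model-station pairs with fewer distinct months than expected."""
    counts = clim.groupby(["Model", "Station"])["Month"].nunique()
    return counts[counts < expected_months]

# Hypothetical climatology: ModelB is missing December
toy = pd.DataFrame({
    "Model": ["ModelA"] * 12 + ["ModelB"] * 11,
    "Station": ["AL0019"] * 23,
    "Month": list(range(1, 13)) + list(range(1, 12)),
    "Mean_Temp": [15.0] * 23,
})

incomplete = find_incomplete_groups(toy)
print(incomplete)
```

An empty result means every pair is complete. Applied to the real data, the call would be `find_incomplete_groups(climatology_df)`, assuming the column names used earlier in this notebook.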

## 🎯 Part 2 Summary and Key Findings

In [None]:
# Final summary
if climatology_df is not None:
    print("🎯 PART 2 SUMMARY")
    print("=" * 20)
    print(f"✅ Monthly climatology calculated for {len(climatology_df['Model'].unique())} models")
    print(f"✅ {len(climatology_df['Station'].unique())} stations processed")
    print(f"✅ 12 months × {len(climatology_df['Model'].unique())} models × {len(climatology_df['Station'].unique())} stations = {len(climatology_df)} records")
    print(f"📁 Results saved to: {OUTPUT_PATH}")
    
    print(f"\n📊 Temperature Ranges (across all models and stations):")
    print(f"  Lowest monthly mean: {climatology_df['Mean_Temp'].min():.1f}°C")
    print(f"  Highest monthly mean: {climatology_df['Mean_Temp'].max():.1f}°C")
    print(f"  Difference between them: {climatology_df['Mean_Temp'].max() - climatology_df['Mean_Temp'].min():.1f}°C")
    
    print(f"\n🎓 Key Learning Points:")
    print(f"  • Monthly climatology reveals seasonal temperature patterns")
    print(f"  • All models show similar seasonal cycles (summer peak, winter minimum)")
    print(f"  • Small but consistent differences between models")
    print(f"  • Spatial temperature gradient maintained across seasons")
    print(f"  • Data ready for validation against observations")
    
    print(f"\n📁 Generated Files:")
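    output_files = [
        "monthly_climatology_all_models.xlsx",
        "climatology_summary_table.xlsx", 
        "seasonal_temperature_summary.xlsx",
        # Report the actual number of model files written, not a fixed count
        f"climatology_[MODEL].csv ({climatology_df['Model'].nunique()} files)"
    ]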
    for file in output_files:
        print(f"  📄 {file}")
    
    print(f"\n➡️ Ready for Part 3: Station Data Processing")
else:
    print("❌ Part 2 could not be completed. Please check Part 1 results.")

## 💾 Download Results (Optional)

In [None]:
# Optional download for participants
print("📁 Part 2 Results Available for Download:")
print("✅ Monthly climatology data: monthly_climatology_all_models.xlsx")
print("✅ Summary table: climatology_summary_table.xlsx")
print("✅ Seasonal summary: seasonal_temperature_summary.xlsx")

download_choice = input("\nDownload climatology results? (y/n): ")

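if download_choice.strip().lower() == 'y':
    try:
        # google.colab exists only inside Google Colab, so import it here
        # where a failure can be caught
        from google.colab import files
        
        # Download main files
        files.download(f"{OUTPUT_PATH}/monthly_climatology_all_models.xlsx")
        files.download(f"{OUTPUT_PATH}/climatology_summary_table.xlsx")
        files.download(f"{OUTPUT_PATH}/seasonal_temperature_summary.xlsx")
        
        print("✅ Files downloaded successfully!")
    except ImportError:
        print("❌ google.colab is not available; downloads only work in Google Colab")
        print(f"Files are still available in: {OUTPUT_PATH}")
    except Exception as e:
        print(f"❌ Download error: {e}")
        print("Files are still available in the Colab session")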
else:
    print("📝 Files remain available in your Colab session for Parts 3 and 4")

## 🚀 Next Steps

**Excellent work!** You've successfully completed Part 2 of the Climate Model Validation Workshop.

### What you accomplished:
- ✅ Converted daily temperature data to monthly climatology
- ✅ Calculated seasonal temperature patterns for all models
- ✅ Created visualizations showing seasonal cycles
- ✅ Generated summary tables for further analysis
- ✅ Identified temperature ranges and model differences

### Key Insights:
- **Seasonal Patterns**: All models capture the expected seasonal cycle
- **Model Agreement**: Generally good agreement between models
- **Spatial Gradient**: Temperature differences between stations are consistent
- **Data Quality**: Complete monthly climatology for validation

### Ready for Part 3:
**Part 3: Station Data Processing**
- Load observed temperature data from weather stations
- Calculate station climatology for the same period
- Prepare observational data for model validation

---
📧 **Questions?** Contact the workshop instructor  
🔗 **Repository:** https://github.com/MoawiahHussien/climate-model-validation-workshop