# Temporal Raster Averaging Utility - Demo

This notebook demonstrates the `temporal_raster_utils` module for computing temporal statistics (mean, sum, max, etc.) from monthly raster files.

**Features:**
- ✅ Automatic month detection from filenames
- ✅ Strict validation (requires 12 complete months)
- ✅ Multiple statistics (mean, median, sum, min, max, std)
- ✅ Custom periods (seasonal or any user-defined set of months)
- ✅ Spatial consistency checking
- ✅ Jupyter-friendly with progress bars

**Example Data:** Togo ODIAC CO2 emissions (2022)

## Setup

In [None]:
import sys
from pathlib import Path
import rasterio
from rasterio.plot import show
import matplotlib.pyplot as plt
import numpy as np

# Add src to path
project_root = Path.cwd().parent
sys.path.insert(0, str(project_root / "src"))

from geoworkflow.utils.temporal_raster_utils import compute_temporal_average

# Define paths
DATA_DIR = project_root.parent / "data" / "ghg" / "odiac" / "countries" / "TGO"
OUTPUT_DIR = DATA_DIR / "temporal_analysis"
OUTPUT_DIR.mkdir(exist_ok=True)

print("✓ Setup complete")
print(f"Data directory: {DATA_DIR}")
print(f"Output directory: {OUTPUT_DIR}")

---

## Example 1: Annual Mean

Compute the annual mean from 12 monthly TIFFs.

In [None]:
result = compute_temporal_average(
    input_dir=DATA_DIR,
    output_path=OUTPUT_DIR / "tgo_annual_mean.tif",
    statistic="mean",
    overwrite=True
)

print(f"\n✓ {result['message']}")
print(f"Output file: {result['output_file'].name}")
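
Conceptually, a temporal mean is a per-pixel average over the stack of monthly rasters, skipping nodata. The sketch below illustrates that idea with plain NumPy on tiny arrays. It is not the module's implementation: `temporal_mean` and the nodata value `-9999.0` are assumptions made for illustration.

```python
import numpy as np

NODATA = -9999.0  # assumed nodata value for this sketch


def temporal_mean(monthly_arrays, nodata=NODATA):
    """Per-pixel mean over equally shaped 2-D arrays, ignoring nodata pixels."""
    # Stack to shape (months, rows, cols) and mask nodata cells.
    stack = np.ma.masked_equal(np.stack(monthly_arrays), nodata)
    # Average along the time axis; a pixel stays masked only if every month is nodata.
    mean = stack.mean(axis=0)
    return mean.filled(nodata)


jan = np.array([[1.0, 2.0], [NODATA, NODATA]])
feb = np.array([[3.0, 4.0], [5.0, NODATA]])
annual = temporal_mean([jan, feb])
print(annual)
```

A pixel that is valid in only some months is averaged over those months alone, so it is worth checking how many months contributed before comparing pixels.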

### Visualize the Annual Mean

In [None]:
# Load and visualize the annual mean
with rasterio.open(OUTPUT_DIR / "tgo_annual_mean.tif") as src:
    data = src.read(1)
    
    # Mask nodata
    data_masked = np.ma.masked_equal(data, src.nodata)
    
    print("Raster info:")
    print(f"  Shape: {src.shape}")
    print(f"  CRS: {src.crs}")
    print(f"  Min: {data_masked.min():.2f}")
    print(f"  Max: {data_masked.max():.2f}")
    print(f"  Mean: {data_masked.mean():.2f}")
    
    # Plot
    fig, ax = plt.subplots(figsize=(12, 8))
    im = ax.imshow(data_masked, cmap='YlOrRd', interpolation='nearest')
    plt.colorbar(im, ax=ax, label='CO₂ Emissions (tonne C/km²/month)', shrink=0.6)
    ax.set_title('Togo: Annual Mean CO₂ Emissions (2022)', fontsize=14, fontweight='bold')
    ax.axis('off')
    plt.tight_layout()
    plt.show()
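
`np.ma.masked_equal(data, src.nodata)` only works as intended when the file declares a finite nodata value. If `src.nodata` is NaN, the equality test never matches, and if it is `None` there is nothing to compare against. A more defensive helper might look like the sketch below; `mask_nodata` is a hypothetical name, not part of `rasterio` or the module.

```python
import numpy as np


def mask_nodata(data, nodata):
    """Mask non-finite cells and, if a finite nodata value is declared, those cells too."""
    data = np.asarray(data)
    # NaN and inf are always treated as invalid.
    mask = ~np.isfinite(data)
    # Only compare against nodata when it is a real, finite number.
    if nodata is not None and not np.isnan(nodata):
        mask |= data == nodata
    return np.ma.masked_array(data, mask=mask)


band = np.array([[1.0, np.nan], [-9999.0, 4.0]])
print(mask_nodata(band, -9999.0))
print(mask_nodata(band, None))
```

Alternatively, `src.read(1, masked=True)` asks rasterio to return a masked array built from the dataset's own nodata and mask information.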

---

## Example 2: Multiple Statistics

Compute several temporal statistics in one loop: mean, sum, max, and standard deviation.

In [None]:
statistics = ['mean', 'sum', 'max', 'std']

print("Computing multiple statistics...\n")

for stat in statistics:
    result = compute_temporal_average(
        input_dir=DATA_DIR,
        output_path=OUTPUT_DIR / f"tgo_annual_{stat}.tif",
        statistic=stat,
        verbose=False,
        overwrite=True
    )
    print(f"✓ {stat}: {result['output_file'].name}")

print("\nAll statistics computed successfully!")
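
One way to picture the `statistic` argument is as a lookup from a name to a reduction along the time axis. The sketch below assumes that design. `REDUCERS` and `reduce_stack` are hypothetical names, and the module may well be organised differently.

```python
import numpy as np

# Hypothetical mapping from statistic name to a reducer over the time axis (axis 0).
REDUCERS = {
    "mean": lambda s: s.mean(axis=0),
    "median": lambda s: np.ma.median(s, axis=0),
    "sum": lambda s: s.sum(axis=0),
    "min": lambda s: s.min(axis=0),
    "max": lambda s: s.max(axis=0),
    "std": lambda s: s.std(axis=0),
}


def reduce_stack(stack, statistic):
    """Reduce a (months, rows, cols) stack to a single (rows, cols) array."""
    if statistic not in REDUCERS:
        raise ValueError(f"Unknown statistic: {statistic!r}")
    return REDUCERS[statistic](np.ma.asarray(stack))


# Three "months" of a 1 x 2 raster.
months = np.array([[[1.0, 2.0]], [[3.0, 4.0]], [[5.0, 6.0]]])
annual_mean = reduce_stack(months, "mean")
annual_sum = reduce_stack(months, "sum")
print(annual_mean, annual_sum)
```

Where every month is valid, the sum is simply the mean times the number of months, so the "sum" and "mean" maps differ in scale but not in spatial pattern.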

### Compare Statistics Side-by-Side

In [None]:
fig, axes = plt.subplots(2, 2, figsize=(14, 12))
axes = axes.flatten()

titles = {
    'mean': 'Annual Mean',
    'sum': 'Annual Total (Sum)',
    'max': 'Annual Maximum',
    'std': 'Annual Std Dev'
}

for idx, stat in enumerate(statistics):
    with rasterio.open(OUTPUT_DIR / f"tgo_annual_{stat}.tif") as src:
        data = src.read(1)
        data_masked = np.ma.masked_equal(data, src.nodata)
        
        im = axes[idx].imshow(data_masked, cmap='YlOrRd', interpolation='nearest')
        axes[idx].set_title(f"{titles[stat]}\n(Min: {data_masked.min():.1f}, Max: {data_masked.max():.1f})",
                           fontsize=11, fontweight='bold')
        axes[idx].axis('off')
        plt.colorbar(im, ax=axes[idx], shrink=0.7)

plt.suptitle('Togo CO₂ Emissions - Multiple Statistics (2022)', 
             fontsize=14, fontweight='bold', y=0.98)
plt.tight_layout()
plt.show()

---

## Example 3: Seasonal Analysis

Compute means for the four meteorological seasons (Northern Hemisphere names, used here only as convenient three-month groupings):
- **Winter (DJF)**: December, January, February
- **Spring (MAM)**: March, April, May
- **Summer (JJA)**: June, July, August
- **Fall (SON)**: September, October, November

In [None]:
seasons = {
    'Winter (DJF)': [12, 1, 2],
    'Spring (MAM)': [3, 4, 5],
    'Summer (JJA)': [6, 7, 8],
    'Fall (SON)': [9, 10, 11]
}

print("Computing seasonal means...\n")

seasonal_results = {}
for season_name, months in seasons.items():
    result = compute_temporal_average(
        input_dir=DATA_DIR,
        output_path=OUTPUT_DIR / f"tgo_{season_name.split()[0].lower()}_mean.tif",
        statistic="mean",
        period_months=months,
        require_complete=False,
        verbose=False,
        overwrite=True
    )
    seasonal_results[season_name] = result
    print(f"✓ {season_name}: {result['input_count']} months averaged")

print("\nAll seasons computed!")
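
A period such as `[12, 1, 2]` wraps around the year boundary. Assuming the input directory holds only the twelve 2022 files, "winter" here combines December 2022 with January and February 2022 rather than a contiguous December to February run. The sketch below shows the selection logic under that assumption; `select_period` and the filenames are hypothetical.

```python
def select_period(files_by_month, period_months):
    """Return (files for the requested months, months with no file)."""
    selected = [files_by_month[m] for m in period_months if m in files_by_month]
    missing = [m for m in period_months if m not in files_by_month]
    return selected, missing


# Hypothetical filenames for a single calendar year.
files_2022 = {m: f"odiac_2022_{m:02d}.tif" for m in range(1, 13)}

djf, missing = select_period(files_2022, [12, 1, 2])
print(djf)      # all three files come from the same calendar year
print(missing)  # empty when every requested month is present
```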

### Compare Seasonal Patterns

In [None]:
fig, axes = plt.subplots(2, 2, figsize=(14, 12))
axes = axes.flatten()

for idx, (season_name, result) in enumerate(seasonal_results.items()):
    with rasterio.open(result['output_file']) as src:
        data = src.read(1)
        data_masked = np.ma.masked_equal(data, src.nodata)
        
        im = axes[idx].imshow(data_masked, cmap='YlOrRd', interpolation='nearest',
                             vmin=data_masked.min(), vmax=data_masked.max())
        axes[idx].set_title(f"{season_name}\n(Mean: {data_masked.mean():.1f} tonne C/km²/month)",
                           fontsize=11, fontweight='bold')
        axes[idx].axis('off')
        plt.colorbar(im, ax=axes[idx], shrink=0.7)

plt.suptitle('Togo CO₂ Emissions - Seasonal Comparison (2022)', 
             fontsize=14, fontweight='bold', y=0.98)
plt.tight_layout()
plt.show()
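
Each panel above scales its colours to its own minimum and maximum, so the same colour can stand for different values in different seasons. For a like-for-like comparison, a shared colour range can be computed first. This is a minimal sketch with a hypothetical helper, `shared_limits`, and made-up arrays.

```python
import numpy as np


def shared_limits(arrays):
    """Return the (min, max) over several masked arrays, ignoring masked cells."""
    lows = [float(a.min()) for a in arrays]
    highs = [float(a.max()) for a in arrays]
    return min(lows), max(highs)


winter = np.ma.masked_equal(np.array([[0.0, 4.0], [-9999.0, 2.0]]), -9999.0)
summer = np.ma.masked_equal(np.array([[1.0, 9.0], [-9999.0, 3.0]]), -9999.0)

vmin, vmax = shared_limits([winter, summer])
print(vmin, vmax)
```

Passing the same `vmin=vmin, vmax=vmax` to every `imshow` call then makes the colour bars directly comparable.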

---

## Example 4: Custom Period Analysis

Analyze a custom period, for example the dry season (November through March).

In [None]:
# Custom period: Dry season (Nov-Dec-Jan-Feb-Mar)
result = compute_temporal_average(
    input_dir=DATA_DIR,
    output_path=OUTPUT_DIR / "tgo_dry_season_mean.tif",
    statistic="mean",
    period_months=[11, 12, 1, 2, 3],
    require_complete=False,
    overwrite=True
)

print(f"\n✓ Custom period processed: {result['input_count']} months")

---

## Example 5: Summary Statistics from Output Files

In [None]:
import pandas as pd

# Collect statistics from all seasonal outputs
summary_data = []

for season_name, result in seasonal_results.items():
    with rasterio.open(result['output_file']) as src:
        data = src.read(1)
        data_masked = np.ma.masked_equal(data, src.nodata)
        
        summary_data.append({
            'Season': season_name,
            'Mean': f"{data_masked.mean():.2f}",
            'Min': f"{data_masked.min():.2f}",
            'Max': f"{data_masked.max():.2f}",
            'Std Dev': f"{data_masked.std():.2f}",
            'Months': result['input_count']
        })

df = pd.DataFrame(summary_data)
print("\nSeasonal Statistics Summary (tonne C/km²/month)")
print("="*70)
print(df.to_string(index=False))

---

## Summary

This notebook demonstrated the `temporal_raster_utils` module with:

1. ✅ **Annual statistics**: mean, sum, max, std
2. ✅ **Seasonal analysis**: DJF, MAM, JJA, SON
3. ✅ **Custom periods**: User-defined month ranges
4. ✅ **Visualization**: Raster plots and comparisons
5. ✅ **Summary tables**: Statistical summaries

### Output Files Created

All outputs are saved in `OUTPUT_DIR` (the `temporal_analysis` folder inside the data directory).

In [None]:
# List all generated files
output_files = sorted(OUTPUT_DIR.glob("*.tif"))

print(f"Generated {len(output_files)} output files:")
print("="*70)

total_size = 0
for f in output_files:
    size_kb = f.stat().st_size / 1024
    total_size += size_kb
    print(f"  {f.name:40s} {size_kb:6.1f} KB")

print("="*70)
print(f"Total size: {total_size/1024:.2f} MB")

---

## Additional Usage Notes

### Month Detection Strategies

The utility supports four month detection methods:

```python
# 1. Automatic (default) - tries filename then metadata
result = compute_temporal_average(
    input_dir="data/",
    output_path="output.tif",
    month_detection="auto"  # default
)

# 2. Filename only - with custom pattern
result = compute_temporal_average(
    input_dir="data/",
    output_path="output.tif",
    month_detection="filename",
    filename_pattern=r"month_(\d{2})\.tif"  # custom regex
)

# 3. Metadata only
result = compute_temporal_average(
    input_dir="data/",
    output_path="output.tif",
    month_detection="metadata",
    metadata_key="TEMPORAL_PERIOD"  # custom key
)

# 4. Manual mapping
result = compute_temporal_average(
    input_dir="data/",
    output_path="output.tif",
    month_detection="manual",
    file_month_mapping={
        "jan.tif": 1,
        "feb.tif": 2,
        # ... etc
    }
)
```
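
To make the filename strategy concrete, here is a minimal sketch of regex-based month detection. It assumes the first capture group of the pattern holds the month number, which matches the custom pattern shown above but is not confirmed as the module's behaviour. `month_from_filename` is a hypothetical helper.

```python
import re

# Pattern taken from the custom-pattern example above.
PATTERN = r"month_(\d{2})\.tif"


def month_from_filename(name, pattern=PATTERN):
    """Return the month (1-12) captured by the pattern, or None if it cannot be read."""
    match = re.search(pattern, name)
    if match is None:
        return None
    month = int(match.group(1))
    # Reject captures that are not valid calendar months.
    return month if 1 <= month <= 12 else None


for name in ["month_03.tif", "month_13.tif", "readme.txt"]:
    print(name, month_from_filename(name))
```

Testing a custom pattern on a handful of real filenames like this, before running the full computation, catches most pattern mistakes early.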

### Validation Options

```python
# Strict: require all 12 months (default)
result = compute_temporal_average(
    input_dir="data/",
    output_path="output.tif",
    require_complete=True  # default
)

# Flexible: allow missing months, with a warning
result = compute_temporal_average(
    input_dir="data/",
    output_path="output.tif",
    require_complete=False
)
```
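
The difference between the two modes can be sketched as a completeness check that either raises or warns. This is an illustration only: `check_months` is a hypothetical function, and the real module may report missing months differently.

```python
import warnings


def check_months(found, required=range(1, 13), require_complete=True):
    """Return the sorted list of missing months; raise or warn if any are missing."""
    missing = sorted(set(required) - set(found))
    if missing and require_complete:
        raise ValueError(f"Missing months: {missing}")
    if missing:
        warnings.warn(f"Proceeding without months: {missing}")
    return missing


# Strict mode: an incomplete year is an error.
try:
    check_months([1, 2, 3])
except ValueError as err:
    print(err)

# A complete year passes and reports nothing missing.
print(check_months(range(1, 13)))
```

For seasonal or custom periods, `required` would be the requested months rather than the full year, which is why the earlier examples pass `require_complete=False` together with `period_months`.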