# DEA Annual Land Cover Processing - Demo

This notebook demonstrates how to use the DEA (Digital Earth Australia) annual land cover processing workflow to:

1. Download state boundaries for NSW and QLD
2. Configure processing parameters
3. Process annual land cover data (1988-present)
4. Generate woody/non-woody classifications
5. Create time-series animations

## Prerequisites

Before running this notebook, ensure you have:

- Installed all required dependencies (`pip install -r requirements.txt`)
- Downloaded state boundaries (run `scripts/fetch_australian_state_geojson.py`)
- Configured data access (ODC or STAC). This is not required yet, because data fetching is to be implemented in sweep-2

## Note on Data Fetching

This is a **template implementation** for sweep-1. The actual data fetching from DEA (via datacube or STAC) will be implemented in sweep-2. The `fetch_dea_raster_for_year()` function currently raises a `NotImplementedError` with instructions for implementation.
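
As a rough idea of what the sweep-2 backend could look like, here is a minimal sketch of a STAC-based fetch. It is not the repository's implementation: `build_search_params()` and `fetch_year_via_stac()` are hypothetical names, and the collection, band, CRS, and resolution are left as parameters because this notebook does not specify them. The sketch assumes the optional `pystac_client` and `odc-stac` packages are installed, and it omits any access configuration the underlying storage may need.

```python
from typing import Any

# STAC endpoint taken from the References section of this notebook.
DEA_STAC_URL = "https://explorer.dea.ga.gov.au/stac/"


def build_search_params(collection: str, bbox: tuple, year: int) -> dict[str, Any]:
    """Build keyword arguments for a STAC item search covering one calendar year."""
    return {
        "collections": [collection],
        "bbox": list(bbox),
        "datetime": f"{year}-01-01/{year}-12-31",
    }


def fetch_year_via_stac(collection, bbox, year, band, crs, resolution):
    """Hypothetical fetch: search the catalogue, then load the matching items."""
    # Imported lazily so build_search_params() works without the optional packages.
    import odc.stac
    import pystac_client

    client = pystac_client.Client.open(DEA_STAC_URL)
    search = client.search(**build_search_params(collection, bbox, year))
    items = list(search.items())
    if not items:
        raise ValueError(f"No STAC items found for {collection} in {year}")
    # Returns a lazily described xarray Dataset on the requested grid.
    return odc.stac.load(
        items, bands=[band], bbox=list(bbox), crs=crs, resolution=resolution
    )
```

Keeping the query construction in a separate pure function makes it possible to unit-test the search parameters without network access.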

## 1. Setup and Imports

In [None]:
import sys
from pathlib import Path

# Add src to the import path (works from the repo root or from notebooks/)
repo_root = Path.cwd().parent if Path.cwd().name == 'notebooks' else Path.cwd()
sys.path.insert(0, str(repo_root / 'src'))

from aus_land_clearing.dea_processor import (
    load_config,
    load_aoi,
    reclassify_dea_classes,
    process_dea_timeseries
)

import numpy as np
import matplotlib.pyplot as plt

## 2. Load Configuration

In [None]:
# Load configuration
config = load_config()

# Display DEA profile settings
dea_config = config['dea_profile']
print("DEA Processing Configuration:")
print("="*50)
print(f"Product ID: {dea_config['product_id']}")
print(f"Time range: {dea_config['start_year']} - {dea_config['end_year']}")
print(f"CRS: {dea_config['crs']}")
print(f"Resolution: {dea_config['resolution']} meters")
print(f"Output directory: {dea_config['output_dir']}")
print(f"\nStates: {', '.join(dea_config['aoi_paths'].keys())}")
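
The cell above reads several keys from `dea_profile`, so a missing key would surface as a `KeyError` partway through the printout. A small check like the following sketch could report all missing keys at once. The key names are the ones this notebook accesses; `missing_profile_keys()` is a hypothetical helper, and every value in `example_profile` is a placeholder rather than the repository's real configuration.

```python
# Keys that this notebook reads from config['dea_profile'].
REQUIRED_KEYS = (
    "product_id", "start_year", "end_year", "crs",
    "resolution", "output_dir", "aoi_paths", "classes_map",
)


def missing_profile_keys(profile: dict) -> list[str]:
    """Return the required keys that are absent from a profile dict."""
    return [key for key in REQUIRED_KEYS if key not in profile]


# Placeholder values for illustration only; not the project's actual settings.
example_profile = {
    "product_id": "example_product",
    "start_year": 1988,
    "end_year": 2022,
    "crs": "EPSG:3577",
    "resolution": 25,
    "output_dir": "outputs/dea",
    "aoi_paths": {"nsw": "boundaries/nsw.geojson", "qld": "boundaries/qld.geojson"},
}

print(missing_profile_keys(example_profile))  # ['classes_map']
```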

## 3. Test Reclassification Function

Before processing real data, let's test the reclassification logic with synthetic data.

In [None]:
# Create synthetic DEA land cover data
synthetic_data = np.array([
    [111, 124, 214],  # Row 1: woody, woody, non-woody
    [112, 215, 125],  # Row 2: woody, non-woody, woody
    [216, 999, 111]   # Row 3: non-woody, unknown, woody
])

print("Synthetic DEA land cover classes:")
print(synthetic_data)

# Reclassify
classes_map = dea_config['classes_map']
woody_data = reclassify_dea_classes(synthetic_data, classes_map)

print("\nReclassified (1=woody, 0=non-woody, NaN=unknown):")
print(woody_data)

# Visualize
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))

im1 = ax1.imshow(synthetic_data, cmap='tab20', interpolation='nearest')
ax1.set_title('Original DEA Classes')
ax1.set_xlabel('Column')
ax1.set_ylabel('Row')
plt.colorbar(im1, ax=ax1)

im2 = ax2.imshow(woody_data, cmap='RdYlGn', vmin=0, vmax=1, interpolation='nearest')
ax2.set_title('Reclassified (Woody/Non-woody)')
ax2.set_xlabel('Column')
ax2.set_ylabel('Row')
plt.colorbar(im2, ax=ax2, label='1=Woody, 0=Non-woody')

plt.tight_layout()
plt.show()
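
For readers who want to see the idea behind the reclassification without opening the source, the following is a minimal sketch of one way it could work. It is not the code in `dea_processor`: `reclassify_sketch()` is a hypothetical stand-in, and the `{"woody": [...], "non_woody": [...]}` layout of the mapping is an assumption. The class codes are the ones labelled woody and non-woody in the comments of the synthetic array above.

```python
import numpy as np

# Assumed mapping layout; codes follow the comments on the synthetic data above.
classes_map_sketch = {
    "woody": [111, 112, 124, 125],
    "non_woody": [214, 215, 216],
}


def reclassify_sketch(data: np.ndarray, classes_map: dict) -> np.ndarray:
    """Map class codes to 1.0 (woody), 0.0 (non-woody), or NaN (anything else)."""
    out = np.full(data.shape, np.nan, dtype=np.float32)
    out[np.isin(data, classes_map["woody"])] = 1.0
    out[np.isin(data, classes_map["non_woody"])] = 0.0
    return out


synthetic = np.array([
    [111, 124, 214],
    [112, 215, 125],
    [216, 999, 111],
])
result = reclassify_sketch(synthetic, classes_map_sketch)

# Share of classified pixels that are woody; NaN (unknown) pixels are ignored.
woody_fraction = float(np.nanmean(result))  # 5 woody of 8 classified -> 0.625
```

Starting from an all-NaN array means any code that is absent from the mapping, such as `999` here, stays unknown instead of being silently counted as non-woody.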

## 4. Load State Boundaries

Load the NSW and QLD boundaries that will be used for spatial subsetting.

In [None]:
# Check if boundary files exist
import os

for state, path in dea_config['aoi_paths'].items():
    if os.path.exists(path):
        print(f"✓ {state.upper()}: {path}")
        # Load and display info
        aoi = load_aoi(path)
        print(f"  CRS: {aoi.crs}")
        print(f"  Bounds: {aoi.total_bounds}")
    else:
        print(f"✗ {state.upper()}: {path} not found")
        print("  Run: python scripts/fetch_australian_state_geojson.py")
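
If the configured boundary paths are relative, the existence check above depends on the directory the notebook is started from. The sketch below shows one way to resolve them against a fixed base directory first. `resolve_aoi_paths()` is a hypothetical helper, and whether `load_config()` already returns absolute paths is not something this notebook states, so treat this as an optional safeguard.

```python
from pathlib import Path


def resolve_aoi_paths(aoi_paths: dict, base_dir: Path) -> dict:
    """Map each state code to an absolute file path, or None if the file is missing."""
    resolved = {}
    for state, raw_path in aoi_paths.items():
        path = Path(raw_path)
        if not path.is_absolute():
            # Interpret relative paths against base_dir, not the current directory.
            path = base_dir / path
        resolved[state] = path.resolve() if path.is_file() else None
    return resolved
```

In this notebook it could be called as `resolve_aoi_paths(dea_config['aoi_paths'], repo_root)`, with any `None` value indicating a boundary file that still needs to be downloaded.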

## 5. Run Processing (Template)

This section demonstrates how to call the processing function. Data fetching is not yet implemented, so the call below is expected to raise `NotImplementedError`, which the cell catches and reports.

In [None]:
# Example: Process a single state for a limited year range
# This will raise NotImplementedError until data fetching is implemented in sweep-2

try:
    result = process_dea_timeseries(
        state_code='nsw',
        years=[2020, 2021, 2022]  # Limited year range for testing
    )
    
    print("Processing completed!")
    print(f"State: {result['state']}")
    print(f"Years processed: {result['years_processed']}")
    print(f"Output directory: {result['output_dir']}")
    
except NotImplementedError as e:
    print("⚠ Data fetching not yet implemented")
    print("="*70)
    print(str(e))
    print("="*70)
    print("\nThis is expected for sweep-1.")
    print("The data fetching backend will be implemented in sweep-2.")

## 6. Alternative: Use Command-Line Script

You can also run the processing using the command-line script:

```bash
# Process all years for both states
python scripts/run_dea_processing.py

# Process only NSW
python scripts/run_dea_processing.py --state nsw

# Process specific year range
python scripts/run_dea_processing.py --years 2020-2023

# Process single year
python scripts/run_dea_processing.py --years 2020 --state qld
```
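
The `--years` option above accepts either a single year or a range. As an illustration of how such a value can be turned into a list of years, here is a minimal sketch. `parse_years()` is a hypothetical helper; the actual parsing in `scripts/run_dea_processing.py` may differ, and treating the range as inclusive is an assumption.

```python
def parse_years(spec: str) -> list[int]:
    """Parse '2020' or '2020-2023' into an inclusive list of years."""
    start_text, sep, end_text = spec.partition("-")
    start = int(start_text)
    # Without a hyphen, the spec is a single year.
    end = int(end_text) if sep else start
    if end < start:
        raise ValueError(f"End year {end} precedes start year {start}")
    return list(range(start, end + 1))


print(parse_years("2020-2023"))  # [2020, 2021, 2022, 2023]
print(parse_years("2020"))       # [2020]
```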

## 7. Next Steps

### For Sweep-2:

1. **Implement data fetching** in `fetch_dea_raster_for_year()` using:
   - Open Data Cube (ODC) if you have a datacube instance
   - STAC API via `odc-stac` and `pystac_client`
   - Direct download from DEA repository

2. **Test with real data**:
   - Process a single year for a small region first
   - Verify outputs and visualizations
   - Scale up to full time series

3. **Optimize performance**:
   - Add parallel processing for multiple years
   - Implement caching for repeated queries
   - Add progress bars for long-running processes
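
For the parallel-processing item above, one possible starting point is a thread pool from the standard library. This is a sketch under assumptions, not part of the current code base: `process_years_parallel()` is a hypothetical helper, and `process_one_year` stands for whatever per-year function sweep-2 ends up providing.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def process_years_parallel(process_one_year, years, max_workers=4):
    """Run process_one_year(year) concurrently; return (results, errors) keyed by year."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(process_one_year, year): year for year in years}
        for future in as_completed(futures):
            year = futures[future]
            try:
                results[year] = future.result()
            except Exception as exc:
                # Record the failure and continue, so one bad year does not abort the run.
                errors[year] = exc
    return results, errors
```

Threads tend to suit download-bound work. If the per-year step turns out to be dominated by raster computation, a process pool or a chunked array backend may be a better fit.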

### Google Earth Engine Alternative:

If you prefer to use GEE, see `gee/dea_annual_landcover_nsw_qld.js` for a template script that uses alternative datasets (ESA WorldCover, Landsat).

## 8. References

- **DEA Land Cover**: https://www.dea.ga.gov.au/products/dea-land-cover
- **DEA STAC API**: https://explorer.dea.ga.gov.au/stac/
- **Open Data Cube**: https://www.opendatacube.org/
- **odc-stac**: https://odc-stac.readthedocs.io/
- **Repository README**: See main README.md for detailed setup instructions