Multi-biome forest change detection dataset generation with Sentinel-2 and Google Earth Engine.
- 🌍 Multi-Biome Support: 6 forest biomes (Tropical Rainforest, Tropical Dry Forest, Temperate, Boreal, Mediterranean, Mangroves)
- 🛰️ Sentinel-2 Integration: Automated cloud-masked S2 SR Harmonized exports
- 📊 Hansen GFC Labels: Global Forest Change loss detection (2001-2024)
- 🖼️ Interactive Visualizations: PNG composites with sortable/filterable HTML gallery, statistics dashboard, and fullscreen modal
- 📈 Baseline Methods: NDVI differencing, spectral differencing, simple threshold, and U-Net deep learning
- 🧠 Deep Learning: PyTorch U-Net implementation with training pipeline and comprehensive metrics
- 🎯 Evaluation Metrics: Pixel-level (IoU, F1, Precision, Recall) and event-level (Polygon IoU, Detection rate) metrics
- ⚙️ Extensible Architecture: Plugin system for custom biome samplers, cloud masks, and exporters
- ✅ Type-Safe: Full typing with mypy strict mode
- 📦 Production-Ready: Structured logging, resumable pipelines, comprehensive tests
💡 For the complete local development guide, see LOCAL_DEVELOPMENT.md
# Clone repository
git clone https://github.com/yourusername/forest-change.git
cd forest-change
# Install in editable mode with all dependencies
pip install -e ".[all]"
earthengine authenticate --project your-ee-project-id
forest-change validate-config --config config.yaml
forest-change generate-aoi \
--config config.yaml \
--output output/aoi/forest_aoi.csv \
--count 200
Generates 200 spatially distributed Area of Interest (AOI) points across the 6 forest biomes.
Default output: output/aoi/forest_aoi.csv
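As a rough illustration of how a minimum spacing between AOI points (the `min_distance_km` setting in the configuration) could be enforced, the sketch below uses rejection sampling with the haversine distance. The function names `haversine_km` and `sample_points` are hypothetical and are not part of the toolkit; the real sampler may work differently.

```python
import math
import random

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def sample_points(bounds, n_points, min_distance_km, seed=0, max_tries=10_000):
    """Draw random points inside `bounds`, rejecting any that fall too close
    to a point that was already accepted (hypothetical helper)."""
    rng = random.Random(seed)
    points = []
    for _ in range(max_tries):
        if len(points) == n_points:
            break
        lat = rng.uniform(bounds["lat_min"], bounds["lat_max"])
        lon = rng.uniform(bounds["lon_min"], bounds["lon_max"])
        if all(haversine_km(lat, lon, p[0], p[1]) >= min_distance_km for p in points):
            points.append((lat, lon))
    return points


# Bounds copied from the Amazon region in the example configuration
amazon = {"lat_min": -10.0, "lat_max": -1.0, "lon_min": -70.0, "lon_max": -55.0}
points = sample_points(amazon, n_points=20, min_distance_km=15)
print(len(points))
```

Drawing latitude uniformly is not strictly area-uniform, which matters little near the equator but would need correcting for high-latitude biomes.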
forest-change export \
--config config.yaml \
--aoi-csv output/aoi/forest_aoi.csv \
--drive-folder S2H_CD_Multibiome
Exports for each AOI × season × year:
- {id}__pre.tif / {id}__post.tif: 4-band Sentinel-2 (B2, B3, B4, B8)
- {id}__label.tif: Binary GFC loss labels
- {id}__metadata.json: GeoJSON with acquisition info, cloud %, biome
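The naming scheme above makes it straightforward to check which samples have a complete pre/post/label triplet after downloading. The following is a minimal sketch that relies only on the `{id}__pre.tif`, `{id}__post.tif`, and `{id}__label.tif` pattern; `group_exports` and `complete_ids` are hypothetical helper names, not toolkit functions.

```python
from collections import defaultdict
from pathlib import Path

PARTS = ("pre", "post", "label")


def group_exports(filenames):
    """Map each sample id to the set of parts (pre/post/label) found for it."""
    groups = defaultdict(set)
    for name in filenames:
        path = Path(name)
        if path.suffix != ".tif":
            continue  # skip metadata JSON and anything else
        sample_id, sep, part = path.stem.rpartition("__")
        if sep and part in PARTS:
            groups[sample_id].add(part)
    return dict(groups)


def complete_ids(groups):
    """Return the ids for which all three GeoTIFFs are present."""
    return sorted(i for i, parts in groups.items() if parts == set(PARTS))


files = [
    "a1__pre.tif", "a1__post.tif", "a1__label.tif", "a1__metadata.json",
    "b2__pre.tif", "b2__label.tif",  # post image missing
]
groups = group_exports(files)
print(complete_ids(groups))  # ['a1']
```

In practice the file list could come from something like `Path("geotiffs/").iterdir()`.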
Two modes available:
# Downloads from Drive, processes, and uploads results back to Drive
forest-change generate-composites \
--config config.yaml \
--drive-geotiff-folder "S2H_CD_Multibiome_2020_2024" \
--drive-metadata-folder "S2H_CD_Multibiome_2020_2024_metadata" \
--drive-output-folder "S2H_CD_Multibiome_2020_2024_composites" \
--service-account service-account-key.json \
--output-dir output/composites \
--dpi 150
# After downloading GeoTIFFs from Google Drive manually
forest-change generate-composites \
--config config.yaml \
--input-dir geotiffs/ \
--output-dir output/composites \
--metadata-dir metadata/ \
--dpi 150
Default output: output/composites/
from pathlib import Path
from forest_change.config.settings import Settings
from forest_change.pipelines.composite import CompositePipeline
settings = Settings.from_file(Path('config.yaml'))
pipeline = CompositePipeline(settings)
# Local mode
stats = pipeline.run(
    input_dir=Path('geotiffs/'),
    output_dir=Path('output/composites'),
    metadata_dir=Path('metadata/'),
    percentile_stretch=(2, 98),
    dpi=150
)
# Drive mode
stats = pipeline.run_from_drive(
    drive_geotiff_folder="S2H_CD_Multibiome_2020_2024",
    drive_metadata_folder="S2H_CD_Multibiome_2020_2024_metadata",
    drive_output_folder="S2H_CD_Multibiome_2020_2024_composites",
    local_output_dir=Path('output/composites'),
    service_account_path=Path('service-account-key.json'),
    percentile_stretch=(2, 98),
    dpi=150
)
Outputs:
- output/composites/{id}.png - 4-panel visualization (PRE/POST/LABEL/OVERLAY)
- output/composites/index.html - Interactive gallery with:
  - Statistics dashboard (avg/max/min loss %)
  - Sortable by loss %, biome, AOI ID, name
  - Filterable by biome
  - Fullscreen modal view (click images, ESC to close)
  - Legend on LABEL panel (green=no loss, red=loss)
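The `percentile_stretch=(2, 98)` argument used above suggests a standard contrast stretch before rendering to PNG. The sketch below shows what such a stretch typically does with NumPy. It is an assumption about the behaviour, not the toolkit's actual code, and `percentile_stretch` here is a standalone hypothetical function.

```python
import numpy as np


def percentile_stretch(band, low=2.0, high=98.0):
    """Rescale a band to [0, 1], clipping values outside the given percentiles."""
    band = np.asarray(band, dtype=np.float64)
    lo, hi = np.percentile(band, [low, high])
    if hi <= lo:  # constant band: avoid dividing by zero
        return np.zeros_like(band)
    return np.clip((band - lo) / (hi - lo), 0.0, 1.0)


# Synthetic reflectance values standing in for one Sentinel-2 band
rng = np.random.default_rng(0)
reflectance = rng.integers(0, 3000, size=(64, 64))
stretched = percentile_stretch(reflectance, 2, 98)
print(stretched.min(), stretched.max())
```

Clipping at the 2nd and 98th percentiles keeps a few very bright pixels (for example residual cloud) from compressing the rest of the image into a narrow dark range. Real data would also need nodata pixels masked out before the percentiles are computed.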
Generate comprehensive AOI statistics, visualizations, and maps:
# Full mode (default): 10+ visualizations + statistics dashboard
forest-change --config config.yaml generate-statistics \
--aoi-csv output/aoi/forest_aoi.csv \
--output-dir output/statistics \
--all
# Compact mode: statistics dashboard only
forest-change --config config.yaml generate-statistics \
--aoi-csv output/aoi/forest_aoi.csv \
--output-dir output/statistics \
--dashboard-only
Default output: output/statistics/
Full mode outputs (10+ visualizations):
- Global distribution maps (Cartopy + simple matplotlib)
- Geographic distribution by biome (6 scatter plots)
- Latitudinal analysis with climate zones
- Spatial density heatmap (hexbin)
- Hemisphere distribution (3 pie charts)
- Statistical summary table
- Comparative box plots (lat, lon, tree cover)
- Tree cover violin plots
- Correlation matrix
- Interactive HTML index (statistics/index.html)
- Interpretation guide (INTERPRETATION_GUIDE.txt)
- Compact statistics dashboard (statistics_dashboard.png)
- Detailed text report (statistics_report.txt)
Dashboard-only mode outputs:
- statistics_dashboard.png - Comprehensive dashboard with 6 panels:
  - Dataset summary (counts, ranges, means)
  - Biome distribution bar chart
  - Hemisphere distribution
  - Latitudinal distribution histogram
  - Climate zone pie chart
  - Tree cover distribution (if available)
- statistics_report.txt - Detailed text report with all metrics
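The hemisphere and climate zone panels can be derived from AOI latitude alone. The sketch below illustrates one way to do this; the zone boundaries (23.5° and 66.5°) and the function names are assumptions chosen for illustration, and the toolkit may use different thresholds or finer zones.

```python
from collections import Counter

TROPIC_LAT = 23.5  # assumed tropical boundary in degrees
POLAR_LAT = 66.5   # assumed polar boundary in degrees


def hemisphere(lat):
    """Northern for lat >= 0, otherwise Southern (the equator counts as Northern)."""
    return "Northern" if lat >= 0 else "Southern"


def climate_zone(lat):
    """Classify a latitude into a coarse zone by its absolute value."""
    a = abs(lat)
    if a < TROPIC_LAT:
        return "Tropical"
    if a < POLAR_LAT:
        return "Temperate"
    return "Polar"


# Made-up AOI ids and latitudes, for illustration only
aoi_latitudes = {"amazon_001": -3.5, "congo_001": 1.0, "iberia_001": 40.0, "siberia_001": 60.0}
hemispheres = Counter(hemisphere(lat) for lat in aoi_latitudes.values())
zones = Counter(climate_zone(lat) for lat in aoi_latitudes.values())
print(dict(hemispheres))
print(dict(zones))
```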
Cartopy is optional and only needed for advanced map projections:
pip install cartopy
Run baseline change detection methods for comparison with foundation models:
# Single method evaluation
forest-change evaluate-baselines \
--input-dir geotiffs/ \
--output-dir output/baselines \
--method ndvi_diff \
--threshold 0.1
# With labels for automatic metrics computation
forest-change evaluate-baselines \
--input-dir geotiffs/ \
--output-dir output/baselines \
--label-dir geotiffs/ \
--method spectral_diff \
--threshold 0.15
Default output: output/baselines/
Available baseline methods:
- ndvi_diff: NDVI differencing (recommended threshold: 0.1)
- spectral_diff: Multi-band spectral differencing with L2 norm (recommended threshold: 0.15)
- simple_threshold: Simple brightness decrease (recommended threshold: 0.2)
- unet: Deep learning U-Net (requires trained model, see below)
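To make the first two methods concrete, here is a minimal NumPy sketch of NDVI differencing and L2-norm spectral differencing. It assumes images are stacked as (band, row, column) in the order B2, B3, B4, B8 and that reflectance is scaled to 0-1; both are assumptions, and the toolkit's own implementation may differ in scaling and sign conventions.

```python
import numpy as np

RED, NIR = 2, 3  # indices of B4 and B8 in an assumed (B2, B3, B4, B8) stack


def ndvi(image):
    """NDVI = (NIR - Red) / (NIR + Red), with 0 where the denominator is 0."""
    red = image[RED].astype(np.float64)
    nir = image[NIR].astype(np.float64)
    denom = nir + red
    return np.divide(nir - red, denom, out=np.zeros_like(denom), where=denom != 0)


def ndvi_diff(pre, post, threshold=0.1):
    """Flag pixels whose NDVI dropped by more than `threshold`."""
    return (ndvi(pre) - ndvi(post)) > threshold


def spectral_diff(pre, post, threshold=0.15):
    """Flag pixels whose per-pixel L2 change across all bands exceeds `threshold`."""
    delta = post.astype(np.float64) - pre.astype(np.float64)
    return np.sqrt((delta ** 2).sum(axis=0)) > threshold


# 2x2 toy scene: dense vegetation everywhere before, one cleared pixel after
pre = np.zeros((4, 2, 2))
pre[0], pre[1], pre[RED], pre[NIR] = 0.03, 0.05, 0.05, 0.5
post = pre.copy()
post[RED, 0, 0], post[NIR, 0, 0] = 0.2, 0.25

print(ndvi_diff(pre, post))
print(spectral_diff(pre, post))
```

Both masks flag only the cleared pixel in this toy scene. NDVI differencing looks specifically at vegetation greenness, while spectral differencing responds to any change in the four bands.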
Outputs:
- {id}__{method}.tif - Binary change map
- {id}__{method}_conf.tif - Confidence map (0-1)
- metrics_{method}.json - Aggregated metrics (if labels provided)
Metrics computed (when labels available):
- Pixel-level: IoU, F1-score, Precision, Recall, Accuracy
- Event-level: Polygon IoU, Detection rate, False event rate, Mean event size
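The pixel-level metrics follow the usual confusion-matrix definitions. The sketch below computes them for a pair of binary masks; `pixel_metrics` is a hypothetical standalone function, and returning 0.0 when a denominator is zero is an assumed convention.

```python
import numpy as np


def pixel_metrics(pred, label):
    """Compute IoU, F1, precision, recall, and accuracy for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    label = np.asarray(label, dtype=bool)
    tp = int(np.sum(pred & label))
    fp = int(np.sum(pred & ~label))
    fn = int(np.sum(~pred & label))
    tn = int(np.sum(~pred & ~label))

    def safe(num, den):
        # Assumed convention: a metric with an empty denominator is reported as 0.0
        return num / den if den else 0.0

    return {
        "iou": safe(tp, tp + fp + fn),
        "f1": safe(2 * tp, 2 * tp + fp + fn),
        "precision": safe(tp, tp + fp),
        "recall": safe(tp, tp + fn),
        "accuracy": safe(tp + tn, tp + fp + fn + tn),
    }


pred = [[1, 1, 0], [0, 0, 0]]
label = [[1, 0, 0], [1, 0, 0]]
metrics = pixel_metrics(pred, label)
print(metrics)
```

Here there is one true positive, one false positive, one false negative, and three true negatives, so IoU is 1/3 while accuracy is 4/6. This gap is why IoU and F1 are more informative than accuracy when loss pixels are rare.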
from pathlib import Path
from forest_change.config.settings import Settings
from forest_change.pipelines.baselines import BaselinesPipeline
settings = Settings.from_file(Path('config.yaml'))
pipeline = BaselinesPipeline(settings)
# Process single image pair
result = pipeline.run(
    pre_path=Path('composites/item_001__pre.tif'),
    post_path=Path('composites/item_001__post.tif'),
    method='ndvi_diff',
    threshold=0.1
)
# Batch processing with metrics
batch_result = pipeline.run_batch(
    input_dir=Path('geotiffs/'),
    output_dir=Path('output/baselines'),
    label_dir=Path('geotiffs/'),  # Optional
    method='spectral_diff',
    threshold=0.15
)
print(f"Processed: {batch_result['stats']['processed']}")
if batch_result['metrics']:
    # Average the per-image IoU values instead of reporting only the first one
    ious = [m['metrics']['pixel_level']['iou'] for m in batch_result['metrics']]
    print(f"Average IoU: {sum(ious) / len(ious):.3f}")
Create a comprehensive web dashboard that integrates composites and statistics:
# Generate dashboard from both composites and statistics
forest-change --config config.yaml generate-dashboard \
--output-dir output/dashboard \
--composites-dir output/composites \
--statistics-dir output/statistics \
--title "Forest Change Detection Dashboard"
# Or generate from composites only
forest-change --config config.yaml generate-dashboard \
--output-dir output/dashboard \
--composites-dir output/composites
# Or from statistics only
forest-change --config config.yaml generate-dashboard \
--output-dir output/dashboard \
--statistics-dir output/statistics
Default output: output/dashboard/
Outputs:
- output/dashboard/index.html - Unified web dashboard with:
  - Overview section with dataset statistics
  - Statistics visualizations (maps, plots, analysis)
  - Composites gallery with interactive filtering/sorting
  - Statistical report viewer
  - Responsive design with modern CSS
  - Navigation tabs for easy browsing
- output/dashboard/dashboard.css - Comprehensive stylesheet
Features:
- Interactive Navigation: Smooth scrolling tabs (Overview, Statistics, Composites, Report)
- Filtering & Sorting: Filter composites by biome, sort by loss %, AOI ID, or name
- Responsive Design: Works on desktop, tablet, and mobile
- Modern UI: CSS variables, gradients, shadows, animations
- Organized Visualizations: Statistics grouped by category (maps, distribution, statistics, etc.)
from pathlib import Path
from forest_change.config.settings import Settings
from forest_change.pipelines.dashboard import DashboardPipeline
settings = Settings.from_file(Path('config.yaml'))
pipeline = DashboardPipeline(settings)
result = pipeline.run(
    output_dir=Path('output/dashboard'),
    composites_dir=Path('output/composites'),
    statistics_dir=Path('output/statistics'),
    title="Forest Change Detection Dashboard"
)
print(f"Dashboard: {result['html_path']}")
print(f"Composites: {result['composites_count']}")
print(f"Statistics: {result['statistics_count']}")
Train a U-Net model for supervised change detection:
# Install PyTorch (required for U-Net)
pip install -e ".[ml]"
# Train U-Net on your dataset
forest-change train-unet \
--train-dir data/train/ \
--val-dir data/val/ \
--output-dir output/unet_checkpoints \
--epochs 50 \
--batch-size 4 \
--lr 1e-4
# Use trained model for inference
forest-change evaluate-baselines \
--input-dir geotiffs/ \
--output-dir output/baselines/unet \
--method unet \
--unet-model-path output/unet_checkpoints/best_model.pt \
--threshold 0.5
Default output: output/unet_checkpoints/
Training data structure:
data/
  train/
    item001__pre.tif
    item001__post.tif
    item001__label.tif
    ...
  val/
    item050__pre.tif
    item050__post.tif
    item050__label.tif
    ...
Architecture: Siamese U-Net with 4 encoder/decoder levels (~7.8M parameters)
Key features:
- Combined BCE + Dice loss for handling class imbalance
- AdamW optimizer with learning rate scheduling
- Automatic checkpointing (best, final, periodic)
- Validation monitoring for early stopping
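To show why a combined BCE + Dice loss helps with class imbalance, the sketch below computes it with NumPy on predicted probabilities. This is an illustration only: the equal weighting (`bce_weight=0.5`) and the smoothing constant are assumptions, and the actual training code is written in PyTorch and may combine the terms differently.

```python
import numpy as np


def bce_dice_loss(probs, target, bce_weight=0.5, eps=1e-7):
    """Weighted sum of binary cross-entropy and (1 - Dice coefficient)."""
    probs = np.clip(np.asarray(probs, dtype=np.float64), eps, 1 - eps)
    target = np.asarray(target, dtype=np.float64)
    # Per-pixel cross-entropy, averaged over all pixels
    bce = -np.mean(target * np.log(probs) + (1 - target) * np.log(1 - probs))
    # Soft Dice: overlap relative to the total predicted and true foreground
    intersection = np.sum(probs * target)
    dice = (2 * intersection + eps) / (np.sum(probs) + np.sum(target) + eps)
    return bce_weight * bce + (1 - bce_weight) * (1 - dice)


target = np.array([0, 0, 0, 1])                # one loss pixel out of four
good = np.array([0.01, 0.01, 0.01, 0.99])      # finds the loss pixel
all_background = np.array([0.01, 0.01, 0.01, 0.01])  # predicts no loss anywhere

print(bce_dice_loss(good, target))
print(bce_dice_loss(all_background, target))
```

The Dice term depends only on the overlap with the foreground, so a model that predicts "no loss" everywhere is penalized heavily even when loss pixels are a small fraction of the image.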
# Run complete workflow example
make quickstart
# Or manually:
python3 examples/quickstart.py
from pathlib import Path
from forest_change.config.settings import Settings
from forest_change.pipelines.aoi_generation import GenerateAOIPipeline
from forest_change.pipelines.export import ExportPipeline
from forest_change.pipelines.composite import CompositePipeline
from forest_change.pipelines.statistics import StatisticsPipeline
from forest_change.pipelines.dashboard import DashboardPipeline
from forest_change.pipelines.baselines import BaselinesPipeline
# Load configuration
settings = Settings.from_file(Path("config.yaml"))
# 1. Generate AOI dataset
aoi_pipeline = GenerateAOIPipeline(settings)
df = aoi_pipeline.run(
    n_total=200,
    output_path=Path("output/aoi/forest_aoi.csv"),
    dry_run=False
)
# 2. Generate statistics and visualizations (full mode)
stats_pipeline = StatisticsPipeline(settings)
stats = stats_pipeline.run(
    aoi_csv=Path("output/aoi/forest_aoi.csv"),
    output_dir=Path("output/statistics"),
    dpi=150,
    cartopy_dpi=300,
    generate_all_visualizations=True  # Full mode with all visualizations
)
print(f"Total AOIs: {stats['summary']['total_aois']}, Biomes: {stats['summary']['biome_count']}")
print(f"Generated {stats['total_visualizations']} visualizations")
# 3. Export to Google Drive
export_pipeline = ExportPipeline(settings)
export_pipeline.run(
    aoi_csv=Path("output/aoi/forest_aoi.csv"),
    drive_folder="S2H_CD_Multibiome",
    test_mode=False,
    resume=True
)
# 4. Generate PNG composites (after downloading from Drive)
composite_pipeline = CompositePipeline(settings)
comp_stats = composite_pipeline.run(
    input_dir=Path("geotiffs/"),
    output_dir=Path("output/composites"),
    metadata_dir=Path("metadata/"),  # Optional
    percentile_stretch=(2, 98),
    dpi=150
)
print(f"Composites: {comp_stats['processed']} processed, {comp_stats['failed']} failed")
# 5. Generate unified dashboard
dashboard_pipeline = DashboardPipeline(settings)
dashboard_result = dashboard_pipeline.run(
    output_dir=Path("output/dashboard"),
    composites_dir=Path("output/composites"),
    statistics_dir=Path("output/statistics"),
    title="Forest Change Detection Dashboard"
)
print(f"Dashboard: {dashboard_result['html_path']}")
# 6. Evaluate baseline methods (optional)
baselines_pipeline = BaselinesPipeline(settings)
baseline_result = baselines_pipeline.run_batch(
    input_dir=Path("geotiffs/"),
    output_dir=Path("output/baselines"),
    label_dir=Path("geotiffs/"),  # Optional, for metrics
    method="ndvi_diff",
    threshold=0.1
)
print(f"Baseline: {baseline_result['stats']['processed']} processed")
if baseline_result['metrics']:
    avg_iou = sum(m['metrics']['pixel_level']['iou'] for m in baseline_result['metrics']) / len(baseline_result['metrics'])
    print(f"Average IoU: {avg_iou:.3f}")
Create config.yaml:
ee_project_id: your-project-id
n_points_total: 200
biomes:
  - name: Tropical Rainforest
    percentage: 30
    min_distance_km: 15
    tree_cover_min: 70
    regions:
      - name: Amazon
        bounds:
          lat_min: -10.0
          lat_max: -1.0
          lon_min: -70.0
          lon_max: -55.0
        weight: 0.5
      - name: Congo Basin
        bounds:
          lat_min: -4.0
          lat_max: 4.0
          lon_min: 10.0
          lon_max: 30.0
        weight: 0.3
      - name: Borneo
        bounds:
          lat_min: -4.0
          lat_max: 7.0
          lon_min: 108.0
          lon_max: 119.0
        weight: 0.2
    cities_exclude:
      - name: Manaus
        lat: -3.1
        lon: -60.0
        buffer_km: 50
Load with:
settings = Settings.from_file(Path("config.yaml"))
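One plausible reading of the configuration is that `percentage` splits `n_points_total` across biomes and `weight` splits each biome's share across its regions. The sketch below shows such a split using the largest-remainder method so that the counts always sum to the requested total. This interpretation and the `allocate` helper are assumptions, not confirmed toolkit behaviour.

```python
import math


def allocate(total, shares):
    """Split `total` into integer counts proportional to `shares`
    using the largest-remainder method (hypothetical helper)."""
    share_sum = sum(shares.values())
    exact = {key: total * value / share_sum for key, value in shares.items()}
    counts = {key: math.floor(x) for key, x in exact.items()}
    leftover = total - sum(counts.values())
    # Hand out the remaining points to the keys with the largest fractional parts
    by_remainder = sorted(exact, key=lambda key: exact[key] - counts[key], reverse=True)
    for key in by_remainder[:leftover]:
        counts[key] += 1
    return counts


# 30% of 200 points go to Tropical Rainforest; "Other biomes" stands in for the rest
biome_points = allocate(200, {"Tropical Rainforest": 30, "Other biomes": 70})
# Region weights copied from the example configuration
region_points = allocate(
    biome_points["Tropical Rainforest"],
    {"Amazon": 0.5, "Congo Basin": 0.3, "Borneo": 0.2},
)
print(biome_points)
print(region_points)
```

With these inputs the rainforest biome receives 60 points, split 30/18/12 across the Amazon, Congo Basin, and Borneo.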
┌─────────────────────────────────────────┐
│ CLI (Typer) │
│ - generate-aoi │
│ - export │
│ - generate-composites │
│ - generate-statistics (unified) │
│ - generate-dashboard │
│ - evaluate-baselines │
│ - train-unet (NEW) │
│ - validate-config │
└─────────────────────────────────────────┘
│
┌─────────────────────────────────────────┐
│ PIPELINES │
│ - AOI Generation (resumable) │
│ - Export Pipeline (resumable) │
│ - Composite Pipeline (Drive mode) │
│ - Statistics Pipeline (dual mode) │
│ * Full: 10+ visualizations + HTML │
│ * Compact: dashboard only │
│ - Dashboard Pipeline │
│ * Unified HTML/CSS web dashboard │
│ * Integrates composites + statistics │
│ - Baselines Pipeline │
│ * NDVI/spectral/threshold methods │
│ * U-Net deep learning (NEW) │
│ * Pixel & event-level metrics │
└─────────────────────────────────────────┘
│
┌─────────────────────────────────────────┐
│ MODELS (NEW) │
│ - U-Net architecture (~7.8M params) │
│ - PyTorch Dataset loader │
│ - Training pipeline (BCE+Dice loss) │
│ - Checkpointing & validation │
└─────────────────────────────────────────┘
│
┌─────────────────────────────────────────┐
│ CORE │
│ - Biome sampling, validation │
│ - Cloud masking, labeling │
│ - State management │
│ - Metrics (pixel & event-level) │
└─────────────────────────────────────────┘
│
┌─────────────────────────────────────────┐
│ IO │
│ - CSV Reader (auto-detect) │
│ - EE Data Source (S2, GFC) │
│ - Drive I/O (download/upload) │
│ - Drive Exporter (resumable tasks) │
│ - GeoTIFF Reader/Writer │
└─────────────────────────────────────────┘
│
┌─────────────────────────────────────────┐
│ PLUGINS │
│ - Biome Samplers (entry points) │
│ - Cloud Mask Strategies │
│ - Export Backends │
└─────────────────────────────────────────┘
Create custom biome sampler:
from forest_change.plugins.interfaces import BiomeSamplerPlugin
class CustomAmazonSampler(BiomeSamplerPlugin):
    def sample(self, biome_config, n_points, existing_points):
        # Your custom logic here; this placeholder returns no points
        sampled_points = []
        return sampled_points
Register in pyproject.toml:
[project.entry-points."forest_change.biome_samplers"]
custom_amazon = "my_package.samplers:CustomAmazonSampler"
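For context, plugins registered this way can be discovered at runtime through the standard library's `importlib.metadata`. The sketch below shows the general mechanism using the `forest_change.biome_samplers` group name from above; `discover_samplers` is a hypothetical function, and how the toolkit itself loads plugins is not shown here.

```python
from importlib.metadata import entry_points

GROUP = "forest_change.biome_samplers"


def discover_samplers(group=GROUP):
    """Return {entry point name: loaded object} for every plugin in `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: selectable entry points
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9: a dict mapping group name to a list of entry points
        selected = eps.get(group, [])
    return {ep.name: ep.load() for ep in selected}


samplers = discover_samplers()
print(sorted(samplers))
```

The result is empty unless a package that declares entry points in this group is installed in the current environment. After installing a package with the registration shown above, `custom_amazon` would appear in the dictionary.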
# Clone repository
git clone https://github.com/yourusername/forest-change.git
cd forest-change
# Install in editable mode with dev dependencies
pip install -e ".[dev]"
# Install pre-commit hooks
pre-commit install
# Run tests
pytest
# Run tests with coverage
pytest --cov=forest_change --cov-report=html
# Type check
mypy forest_change/
# Lint
ruff check forest_change/
# Format
black forest_change/
If you have existing code using aoi_gen.py, export.py, or create_composites.py, see MIGRATION.md for a detailed migration guide with 1:1 function mappings.
See CONTRIBUTING.md for guidelines.
Apache License 2.0. See LICENSE for details.
If you use this toolkit in your research, please cite:
@software{forest_change_toolkit,
title = {Forest Change Detection Toolkit},
author = {Contributors},
year = {2025},
url = {https://github.com/yourusername/forest-change}
}
- Google Earth Engine: Cloud-based geospatial analysis platform
- Hansen GFC: Global Forest Change dataset (Hansen et al., 2013)
- Sentinel-2: ESA Copernicus Earth observation mission