feat(data): GEE tile downloader, band mapping, and cloud masking#7
Merged
added 4 commits
April 15, 2026 19:22
feat(data): add GEE tile downloader with analysis-aware band selection

- Downloads real Sentinel-2 composites via Google Earth Engine
- Reads required bands from config.yaml per analysis_type
- Includes SCL band for downstream cloud masking
- Synthetic fallback with explicit is_synthetic flag when GEE unavailable
- Fix .gitignore so src/climatevision/data/ is no longer ignored
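The synthetic fallback described in this commit could look roughly like the sketch below. It is not the code from gee_downloader.py: the function name `download_tile`, the injected `fetch` callable (a stand-in for the real Earth Engine request), and the dict layout are all assumptions made for illustration. Only the `is_synthetic` flag comes from the PR itself.

```python
from typing import Callable, Sequence

import numpy as np


def download_tile(
    fetch: Callable[[Sequence[str]], np.ndarray],
    bands: Sequence[str],
    tile_size: int = 256,
) -> dict:
    """Return a tile dict, falling back to synthetic data if the fetch fails.

    `fetch` is a hypothetical stand-in for the real GEE download call.
    """
    try:
        data = fetch(bands)
        return {"data": data, "bands": list(bands), "is_synthetic": False}
    except Exception as exc:  # e.g. GEE not installed or not authenticated
        # Seeded generator so the synthetic tile is reproducible in tests.
        rng = np.random.default_rng(0)
        data = rng.random((len(bands), tile_size, tile_size), dtype=np.float32)
        return {
            "data": data,
            "bands": list(bands),
            "is_synthetic": True,  # explicit flag so callers never mistake it for real imagery
            "reason": str(exc),
        }
```

Carrying the flag in the returned structure, rather than only logging the fallback, lets downstream code refuse to run inference on synthetic tiles.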
feat(data): add analysis-specific Sentinel-2 band mapping utilities

- get_bands_for_analysis() reads correct bands from config.yaml
- get_band_indices() maps band names to canonical 13-band stack positions
- is_analysis_enabled() and list_enabled_analysis_types() for config validation
- Includes SCL band helpers for downstream cloud masking
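A minimal sketch of how name-to-index band mapping might work follows. The function names match the ones listed in this commit, but their signatures, the band order in `CANONICAL_BANDS`, the in-memory `CONFIG` dict (standing in for config.yaml), and the analysis type names are assumptions, not the contents of band_mapping.py. The sketch also assumes SCL is carried separately from the 13 spectral bands and therefore has no stack index.

```python
# Assumed canonical order of the 13 Sentinel-2 spectral bands.
CANONICAL_BANDS = (
    "B1", "B2", "B3", "B4", "B5", "B6", "B7",
    "B8", "B8A", "B9", "B10", "B11", "B12",
)

# Hypothetical stand-in for the parsed config.yaml.
CONFIG = {
    "analysis_types": {
        "deforestation": {"enabled": True, "bands": ["B4", "B8", "SCL"]},
        "flood": {"enabled": False, "bands": ["B3", "B8", "B11"]},
    }
}


def get_bands_for_analysis(analysis_type: str, config: dict = CONFIG) -> list:
    """Return the band names configured for one analysis type."""
    try:
        return list(config["analysis_types"][analysis_type]["bands"])
    except KeyError:
        raise ValueError(f"Unknown analysis type: {analysis_type!r}") from None


def get_band_indices(bands) -> list:
    """Map spectral band names to positions in the canonical stack.

    SCL is skipped because, in this sketch, it is not part of the stack.
    """
    indices = []
    for name in bands:
        if name == "SCL":
            continue
        if name not in CANONICAL_BANDS:
            raise ValueError(f"Unknown band: {name!r}")
        indices.append(CANONICAL_BANDS.index(name))
    return indices


def list_enabled_analysis_types(config: dict = CONFIG) -> list:
    """Return the analysis types whose `enabled` flag is true."""
    return [k for k, v in config["analysis_types"].items() if v.get("enabled")]
```

Raising on an unknown band name, instead of silently dropping it, surfaces config typos before any download starts.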
feat(data): integrate SCL cloud masking and export new pipeline modules

- apply_scl_cloud_mask() masks cloudy pixels using Sentinel-2 SCL band
- Default clear labels: vegetation, bare soils, water, snow
- Update __init__.py to expose gee_downloader and band_mapping utilities
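For illustration, SCL-based masking can be sketched with NumPy as below. The signature, the NaN fill value, and the returned tuple are assumptions rather than the actual preprocessing.py implementation. The class codes follow the standard Sentinel-2 Scene Classification Layer (4 vegetation, 5 not vegetated / bare soils, 6 water, 11 snow or ice), which matches the default clear labels listed in the commit.

```python
import numpy as np

# Standard Sentinel-2 SCL codes treated as clear in this sketch:
# 4 = vegetation, 5 = not vegetated (bare soils), 6 = water, 11 = snow/ice.
CLEAR_SCL_LABELS = (4, 5, 6, 11)


def apply_scl_cloud_mask(image, scl, clear_labels=CLEAR_SCL_LABELS, fill_value=np.nan):
    """Mask pixels whose SCL class is not in `clear_labels`.

    image: array of shape (bands, height, width)
    scl:   array of shape (height, width) holding SCL class codes
    Returns (masked_image, clear_mask).
    """
    if image.ndim != 3 or scl.ndim != 2:
        raise ValueError("expected image (C, H, W) and scl (H, W)")
    if image.shape[1:] != scl.shape:
        raise ValueError(
            f"shape mismatch: image {image.shape[1:]} vs scl {scl.shape}"
        )
    clear = np.isin(scl, clear_labels)
    masked = image.astype(np.float32, copy=True)  # float so NaN can be stored
    masked[:, ~clear] = fill_value  # apply the same 2-D mask to every band
    return masked, clear
```

Returning the boolean mask alongside the image lets a caller compute the clear-pixel fraction and skip tiles that are mostly cloud.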
refactor(data): address PR review feedback

- Remove duplicated config logic in gee_downloader.py; import from band_mapping
- Cache config.yaml load in band_mapping.py via lru_cache
- Read synthetic tile size from config.yaml instead of hardcoding 256
- Remove unused json import in gee_downloader.py
- Add shape validation in apply_scl_cloud_mask
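The caching change can be sketched as follows. To keep the example runnable with the standard library only, JSON stands in for YAML here; the PR's loader reads config.yaml. The function names `load_config` and `get_synthetic_tile_size` and the `synthetic.tile_size` key are hypothetical.

```python
import json
from functools import lru_cache
from pathlib import Path


@lru_cache(maxsize=None)
def load_config(path: str) -> dict:
    """Parse the config file once per path; later calls return the cached dict."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def get_synthetic_tile_size(path: str, default: int = 256) -> int:
    """Read the synthetic tile size from config, falling back to `default`.

    The `synthetic.tile_size` key is an assumed layout, not the PR's schema.
    """
    return int(load_config(path).get("synthetic", {}).get("tile_size", default))
```

Because `lru_cache` hands every caller the same dict object, callers should treat it as read-only, and `load_config.cache_clear()` is needed if the file changes while the process is running.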
4 tasks
mvanhorn pushed a commit to mvanhorn/ClimateVision that referenced this pull request May 17, 2026

…mate-Vision#7)

* feat(data): add GEE tile downloader with analysis-aware band selection

- Downloads real Sentinel-2 composites via Google Earth Engine
- Reads required bands from config.yaml per analysis_type
- Includes SCL band for downstream cloud masking
- Synthetic fallback with explicit is_synthetic flag when GEE unavailable
- Fix .gitignore so src/climatevision/data/ is no longer ignored

* feat(data): add analysis-specific Sentinel-2 band mapping utilities

- get_bands_for_analysis() reads correct bands from config.yaml
- get_band_indices() maps band names to canonical 13-band stack positions
- is_analysis_enabled() and list_enabled_analysis_types() for config validation
- Includes SCL band helpers for downstream cloud masking

* feat(data): integrate SCL cloud masking and export new pipeline modules

- apply_scl_cloud_mask() masks cloudy pixels using Sentinel-2 SCL band
- Default clear labels: vegetation, bare soils, water, snow
- Update __init__.py to expose gee_downloader and band_mapping utilities

* refactor(data): address PR review feedback

- Remove duplicated config logic in gee_downloader.py; import from band_mapping
- Cache config.yaml load in band_mapping.py via lru_cache
- Read synthetic tile size from config.yaml instead of hardcoding 256
- Remove unused json import in gee_downloader.py
- Add shape validation in apply_scl_cloud_mask

---------

Co-authored-by: Adeolu Mary Oshadare <adeolu@placeholder.com>
Summary
Adds analysis-aware Sentinel-2 tile downloading, band mapping utilities, and SCL-based cloud masking to the data pipeline.
Changes
- gee_downloader.py: download real Sentinel-2 composites from GEE per analysis_type, with explicit synthetic fallback
- band_mapping.py: single source of truth for analysis-specific bands and model config from config.yaml
- preprocessing.py: apply_scl_cloud_mask() for masking cloudy pixels before inference

Checklist
is_synthetic: True