feat(data): GEE tile downloader, band mapping, and cloud masking #7

Merged: Oshgig merged 4 commits into develop from feature/data-gee-downloader-and-pipeline on Apr 15, 2026

Conversation

Oshgig (Collaborator) commented Apr 15, 2026

Summary

Adds analysis-aware Sentinel-2 tile downloading, band mapping utilities, and SCL-based cloud masking to the data pipeline.

Changes

  • gee_downloader.py: download real Sentinel-2 composites from GEE per analysis_type, with explicit synthetic fallback
  • band_mapping.py: single source of truth for analysis-specific bands and model config from config.yaml
  • preprocessing.py: apply_scl_cloud_mask() for masking cloudy pixels before inference
  • Wired all new modules into `data/__init__.py`
  • Fixed .gitignore so src/climatevision/data/ is tracked correctly
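
A minimal sketch of what SCL-based masking along the lines of `apply_scl_cloud_mask()` could look like. The signature, the `(bands, H, W)` layout, the `fill_value` parameter, and the returned tuple are assumptions for illustration, not the code merged in this PR. The class codes follow the Sentinel-2 Scene Classification Layer, using the default clear labels named in the commits (vegetation, bare soils, water, snow).

```python
import numpy as np

# Sentinel-2 SCL class codes for the labels treated as clear by default:
# 4 = vegetation, 5 = bare soils (not vegetated), 6 = water, 11 = snow/ice.
DEFAULT_CLEAR_LABELS = (4, 5, 6, 11)


def apply_scl_cloud_mask(image, scl, clear_labels=DEFAULT_CLEAR_LABELS, fill_value=0.0):
    """Mask pixels whose SCL class is not in clear_labels.

    image: array of shape (bands, H, W); scl: array of shape (H, W).
    Returns (masked_image, clear_mask). This layout is an assumption.
    """
    image = np.asarray(image)
    scl = np.asarray(scl)

    # Shape validation: the SCL raster must cover the same pixel grid.
    if image.ndim != 3 or scl.shape != image.shape[1:]:
        raise ValueError(
            f"expected image (bands, H, W) and scl (H, W); got {image.shape} and {scl.shape}"
        )

    clear = np.isin(scl, clear_labels)
    # Broadcast the 2-D mask across every band and overwrite non-clear pixels.
    masked = np.where(clear[np.newaxis, :, :], image, fill_value)
    return masked, clear
```

Returning the boolean mask alongside the masked tile lets a caller compute the clear-pixel fraction and skip tiles that are mostly cloud before inference.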

Checklist

  • GEE downloader reads correct bands from config.yaml
  • Synthetic fallback tagged with is_synthetic: True
  • Config loading cached via lru_cache
  • PR review feedback applied
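
The first three checklist items can be sketched together as follows. This is an illustration under stated assumptions, not the merged code: the config keys (`analysis_types`, `bands`, `synthetic.tile_size`), the `deforestation` analysis type, the `GEEUnavailableError` class, and the `fetch` callable are all hypothetical, and the YAML parsing step is stubbed with a dict so the sketch runs without extra dependencies.

```python
from functools import lru_cache


class GEEUnavailableError(RuntimeError):
    """Hypothetical error for when Earth Engine cannot be reached."""


@lru_cache(maxsize=1)
def load_config():
    # Real code would parse config.yaml here; the dict below is a stand-in
    # with hypothetical keys and values.
    return {
        "analysis_types": {
            "deforestation": {"enabled": True, "bands": ["B4", "B8"]},
        },
        "synthetic": {"tile_size": 256},
    }


def get_bands_for_analysis(analysis_type):
    analyses = load_config()["analysis_types"]
    if analysis_type not in analyses:
        raise KeyError(f"unknown analysis_type: {analysis_type!r}")
    # Return a copy so callers cannot mutate the cached config.
    return list(analyses[analysis_type]["bands"])


def download_tile(analysis_type, fetch=None):
    """Try the real download first; fall back to a tagged synthetic tile."""
    # SCL is requested alongside the analysis bands for later cloud masking.
    bands = get_bands_for_analysis(analysis_type) + ["SCL"]
    if fetch is not None:
        try:
            return {"bands": bands, "data": fetch(bands), "is_synthetic": False}
        except GEEUnavailableError:
            pass  # fall through to the synthetic tile

    size = load_config()["synthetic"]["tile_size"]
    data = [[[0.0] * size for _ in range(size)] for _ in bands]
    return {"bands": bands, "data": data, "is_synthetic": True}
```

Because `lru_cache` hands every caller the same dict object, the sketch copies the band list before returning it; mutating the cached config in place would otherwise leak into later calls.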

Adeolu Mary Oshadare added 4 commits April 15, 2026 19:22
- Downloads real Sentinel-2 composites via Google Earth Engine
- Reads required bands from config.yaml per analysis_type
- Includes SCL band for downstream cloud masking
- Synthetic fallback with explicit is_synthetic flag when GEE unavailable
- Fix .gitignore so src/climatevision/data/ is no longer ignored
- get_bands_for_analysis() reads correct bands from config.yaml
- get_band_indices() maps band names to canonical 13-band stack positions
- is_analysis_enabled() and list_enabled_analysis_types() for config validation
- Includes SCL band helpers for downstream cloud masking
- apply_scl_cloud_mask() masks cloudy pixels using Sentinel-2 SCL band
- Default clear labels: vegetation, bare soils, water, snow
- Update __init__.py to expose gee_downloader and band_mapping utilities
- Remove duplicated config logic in gee_downloader.py; import from band_mapping
- Cache config.yaml load in band_mapping.py via lru_cache
- Read synthetic tile size from config.yaml instead of hardcoding 256
- Remove unused json import in gee_downloader.py
- Add shape validation in apply_scl_cloud_mask
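
For readers unfamiliar with the band mapping step, here is a rough sketch of how `get_band_indices()` might map names to stack positions. The ordering below is the standard list of the 13 Sentinel-2 spectral bands; whether the project's canonical stack uses exactly this order, and how it handles SCL, is an assumption.

```python
# The 13 Sentinel-2 MSI spectral bands; assumed here to be the canonical order.
CANONICAL_BANDS = (
    "B1", "B2", "B3", "B4", "B5", "B6", "B7",
    "B8", "B8A", "B9", "B10", "B11", "B12",
)


def get_band_indices(band_names):
    """Return zero-based positions of band_names in the canonical stack."""
    index_of = {name: i for i, name in enumerate(CANONICAL_BANDS)}
    missing = [name for name in band_names if name not in index_of]
    if missing:
        # SCL is a classification layer, not a spectral band, so this sketch
        # assumes it is handled separately and rejects it here.
        raise ValueError(f"not in canonical stack: {missing}")
    return [index_of[name] for name in band_names]


print(get_band_indices(["B4", "B8"]))  # [3, 7]
```
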
Oshgig merged commit 0f4a362 into develop on Apr 15, 2026
Oshgig deleted the feature/data-gee-downloader-and-pipeline branch on April 15, 2026 19:51
mvanhorn pushed a commit to mvanhorn/ClimateVision that referenced this pull request May 17, 2026
…mate-Vision#7)

* feat(data): add GEE tile downloader with analysis-aware band selection

- Downloads real Sentinel-2 composites via Google Earth Engine
- Reads required bands from config.yaml per analysis_type
- Includes SCL band for downstream cloud masking
- Synthetic fallback with explicit is_synthetic flag when GEE unavailable
- Fix .gitignore so src/climatevision/data/ is no longer ignored

* feat(data): add analysis-specific Sentinel-2 band mapping utilities

- get_bands_for_analysis() reads correct bands from config.yaml
- get_band_indices() maps band names to canonical 13-band stack positions
- is_analysis_enabled() and list_enabled_analysis_types() for config validation
- Includes SCL band helpers for downstream cloud masking

* feat(data): integrate SCL cloud masking and export new pipeline modules

- apply_scl_cloud_mask() masks cloudy pixels using Sentinel-2 SCL band
- Default clear labels: vegetation, bare soils, water, snow
- Update __init__.py to expose gee_downloader and band_mapping utilities

* refactor(data): address PR review feedback

- Remove duplicated config logic in gee_downloader.py; import from band_mapping
- Cache config.yaml load in band_mapping.py via lru_cache
- Read synthetic tile size from config.yaml instead of hardcoding 256
- Remove unused json import in gee_downloader.py
- Add shape validation in apply_scl_cloud_mask

---------

Co-authored-by: Adeolu Mary Oshadare <adeolu@placeholder.com>