Complete Python project for collecting, processing, and exporting satellite imagery from Google Earth Engine. Optimized for GAN and VLM training datasets.
- ✓ GEE Authentication - Secure OAuth2 authentication
- ✓ Image Collection - Sentinel-2 satellite imagery retrieval
- ✓ Cloud Masking - Automatic cloud detection and removal
- ✓ Image Processing - NDVI, NDBI, NDWI calculation
- ✓ Visualization - Interactive maps with geemap
- ✓ Export to Drive - Batch export to Google Drive
- ✓ Configurable - Environment-based settings
gee_data_collection/
├── scripts/ # Main Python scripts
│ ├── 01_authenticate.py # GEE authentication
│ ├── 02_initialize.py # GEE initialization
│ ├── 03_collect_images.py # Image collection
│ ├── 04_process_images.py # Image processing
│ ├── 05_visualize.py # Visualization
│ ├── 06_export_to_drive.py # Export to Google Drive
│ └── gee_utils.py # Utility functions
├── config/
│ └── settings.py # Configuration settings
├── notebooks/
│ └── visualization.ipynb # Jupyter notebook for interactive visualization
├── data/ # Local data storage
├── requirements.txt # Python dependencies
├── .env.example # Environment variables template
├── .gitignore # Git ignore rules
└── README.md # This file
| Item | Version | Notes |
|---|---|---|
| Python | 3.6 - 3.12 | Python 3.13 not supported yet |
| Google Account | Any | Gmail or institutional email |
| GEE Account | - | Free at signup.earthengine.google.com |
| Internet Connection | - | Required for GEE access |
# Navigate to your projects directory
cd your_projects_folder
# Windows
python -m venv gee_env
gee_env\Scripts\activate
# macOS/Linux
python3 -m venv gee_env
source gee_env/bin/activate
pip install -r requirements.txt
# Copy environment template
cp .env.example .env
# Edit .env with your settings
# - Set GEE_PROJECT_ID (from GEE Code Editor)
# - Adjust AOI bounds as needed
# - Update other parameters if desired
python scripts/01_authenticate.py
This opens your browser to authorize GEE access.
python scripts/02_initialize.py
Verifies the connection to your GEE project.
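The contents of `01_authenticate.py` and `02_initialize.py` are not shown here, so the following is only a minimal sketch of what the two steps might look like when done by hand. `resolve_project_id` and `connect` are hypothetical helper names; `ee.Authenticate` and `ee.Initialize` are the standard `earthengine-api` calls.

```python
import os


def resolve_project_id(env=None):
    """Return GEE_PROJECT_ID from the given mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    project_id = (env.get("GEE_PROJECT_ID") or "").strip()
    if not project_id:
        raise ValueError("GEE_PROJECT_ID is not set; add it to your .env file")
    return project_id


def connect(project_id, force_auth=False):
    """Authenticate if requested, then initialize an Earth Engine session."""
    import ee  # imported lazily so the helper above works without the package

    if force_auth:
        ee.Authenticate(force=True)  # opens a browser for the OAuth2 flow
    ee.Initialize(project=project_id)
    return ee
```

Calling `connect(resolve_project_id())` after loading `.env` should fail early with a readable message if the project ID is missing, rather than with an API error later on.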
python scripts/03_collect_images.py
Retrieves Sentinel-2 imagery for your area of interest.
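The cloud masking mentioned in the feature list is not spelled out in this README. One common approach for Sentinel-2 is to test the cloud bits of the `QA60` band; the sketch below assumes that approach (the project itself may use a different band, such as `SCL`). `is_clear` and `mask_s2_clouds` are illustrative names, not functions from `gee_utils.py`.

```python
CLOUD_BIT = 1 << 10   # QA60 bit 10: opaque clouds
CIRRUS_BIT = 1 << 11  # QA60 bit 11: cirrus clouds


def is_clear(qa60):
    """True when neither the opaque-cloud nor the cirrus bit is set."""
    return (qa60 & CLOUD_BIT) == 0 and (qa60 & CIRRUS_BIT) == 0


def mask_s2_clouds(image):
    """Earth Engine version of the same test, applied per pixel to an ee.Image."""
    qa = image.select("QA60")
    clear = qa.bitwiseAnd(CLOUD_BIT).eq(0).And(qa.bitwiseAnd(CIRRUS_BIT).eq(0))
    return image.updateMask(clear)
```

With a collection in hand, `collection.map(mask_s2_clouds)` would apply the mask to every scene before compositing.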
python scripts/04_process_images.py
Creates composites and calculates vegetation indices.
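NDVI, NDBI, and NDWI are all normalized differences of two bands. The sketch below shows the arithmetic on plain reflectance values, assuming the usual Sentinel-2 band pairings (B8/B4, B11/B8, and the McFeeters B3/B8 form of NDWI); the project's own `gee_utils.py` may define them differently.

```python
def normalized_difference(a, b):
    """(a - b) / (a + b), or 0.0 when both inputs are zero."""
    total = a + b
    return (a - b) / total if total else 0.0


def ndvi(nir, red):
    """Vegetation index: B8 (NIR) against B4 (red)."""
    return normalized_difference(nir, red)


def ndbi(swir1, nir):
    """Built-up index: B11 (SWIR) against B8 (NIR)."""
    return normalized_difference(swir1, nir)


def ndwi(green, nir):
    """Water index (McFeeters form): B3 (green) against B8 (NIR)."""
    return normalized_difference(green, nir)


# Reflectance values for a hypothetical vegetated pixel
print(round(ndvi(nir=0.45, red=0.05), 2))  # 0.8
```

On an `ee.Image`, the same calculation is available as `image.normalizedDifference(['B8', 'B4'])`, which runs server-side across all pixels.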
# Interactive map (if in Jupyter)
python scripts/05_visualize.py
# Or use Jupyter notebook
jupyter notebook notebooks/visualization.ipynb
python scripts/06_export_to_drive.py
Exports processed images to your Google Drive folder.
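Before exporting, it can help to estimate how large the output will be. Earth Engine exports fail by default above 1e8 pixels unless `maxPixels` is raised. The sketch below gives a rough pixel count for a lat/lon box, using spherical approximations for meters per degree; `estimate_export_pixels` is a hypothetical helper and the result is an order-of-magnitude guide only.

```python
import math

DEFAULT_MAX_PIXELS = 1e8  # Earth Engine's default export limit


def estimate_export_pixels(min_lon, min_lat, max_lon, max_lat, scale_m):
    """Rough pixel count for a lat/lon box exported at scale_m meters per pixel."""
    mid_lat = math.radians((min_lat + max_lat) / 2)
    # Approximate meters per degree of longitude and latitude
    width_m = (max_lon - min_lon) * 111320 * math.cos(mid_lat)
    height_m = (max_lat - min_lat) * 110574
    return math.ceil(width_m / scale_m) * math.ceil(height_m / scale_m)


# Default AOI and EXPORT_SCALE from .env.example
pixels = estimate_export_pixels(88.0, 20.0, 92.5, 26.5, scale_m=10)
print("over default limit:", pixels > DEFAULT_MAX_PIXELS)
```

For the default AOI at 10 m the estimate is well above the limit, so a full-extent export would likely need a larger `maxPixels`, a coarser scale, or the AOI split into tiles.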
# GEE Project
GEE_PROJECT_ID=ee-yourusername
# Area of Interest (Latitude/Longitude bounds)
AOI_MIN_LONGITUDE=88.0
AOI_MIN_LATITUDE=20.0
AOI_MAX_LONGITUDE=92.5
AOI_MAX_LATITUDE=26.5
# Data Collection
START_DATE=2023-01-01
END_DATE=2023-12-31
CLOUD_COVER_THRESHOLD=20
MAX_IMAGES=50
# Export Settings
EXPORT_SCALE=10 # meters (Sentinel-2 native)
EXPORT_CRS=EPSG:4326 # Lat/Lon
EXPORT_FORMAT=GeoTIFF
Edit config/settings.py or .env to change:
- Latitude/Longitude bounds
- Date range
- Cloud cover threshold
- Number of images
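How `config/settings.py` reads these values is not shown, so the following is a sketch under assumptions: it parses `KEY=VALUE` text directly (a real project might use `python-dotenv` instead) and assumes the AOI bounds are ordered `[min_lon, min_lat, max_lon, max_lat]`. `parse_env` and `load_settings` are hypothetical names.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, ignoring blank lines and '#' comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.split(" #", 1)[0].strip()  # drop a trailing inline comment
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        values[key.strip()] = value.strip()
    return values


def load_settings(env):
    """Convert raw strings into typed settings (keys mirror the .env names)."""
    bound_keys = ("AOI_MIN_LONGITUDE", "AOI_MIN_LATITUDE",
                  "AOI_MAX_LONGITUDE", "AOI_MAX_LATITUDE")
    return {
        "project_id": env["GEE_PROJECT_ID"],
        # Assumed order: [min_lon, min_lat, max_lon, max_lat]
        "aoi_bounds": [float(env[k]) for k in bound_keys],
        "cloud_cover_threshold": float(env["CLOUD_COVER_THRESHOLD"]),
        "max_images": int(env["MAX_IMAGES"]),
        "export_scale": int(env["EXPORT_SCALE"]),
    }


sample = """
# GEE Project
GEE_PROJECT_ID=ee-yourusername
AOI_MIN_LONGITUDE=88.0
AOI_MIN_LATITUDE=20.0
AOI_MAX_LONGITUDE=92.5
AOI_MAX_LATITUDE=26.5
CLOUD_COVER_THRESHOLD=20
MAX_IMAGES=50
EXPORT_SCALE=10 # meters (Sentinel-2 native)
"""
settings = load_settings(parse_env(sample))
print(settings["aoi_bounds"], settings["export_scale"])
```

Converting to `float` and `int` at load time surfaces a malformed value as a `ValueError` when the settings are read, not partway through a collection run.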
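import ee
from scripts.gee_utils import *
from config.settings import *

# Start the Earth Engine session
initialize_gee()
# Build the area-of-interest geometry from the configured bounds
aoi = create_aoi_from_bounds(AOI_BOUNDS)
# Sentinel-2 scenes over the AOI with under 20% cloud cover
# (the end date passed to filterDate is exclusive)
collection = ee.ImageCollection(SATELLITE_DATASET) \
    .filterBounds(aoi) \
    .filterDate('2023-01-01', '2023-12-31') \
    .filter(ee.Filter.lt('CLOUDY_PIXEL_PERCENTAGE', 20))

from scripts.gee_utils import calculate_ndvi, initialize_gee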
import ee
initialize_gee()
image = ee.Image('COPERNICUS/S2_SR_HARMONIZED/20230615T040559_20230615T041557_T45RUI')
ndvi = calculate_ndvi(image)

from scripts.gee_utils import create_export_task, start_export_task
from config.settings import *
# your_image and aoi are placeholders for an ee.Image and an AOI geometry you have already created
task = create_export_task(
    image=your_image,
    description='my_export',
    folder='GEE_Exports',
    aoi=aoi,
    scale=10
)
start_export_task(task)

| Issue | Solution |
|---|---|
| ModuleNotFoundError: earthengine-api | Run: pip install -r requirements.txt |
| Authentication fails | Run: python scripts/01_authenticate.py --force |
| No images found | Check date range, cloud threshold, and AOI bounds |
| Map not displaying | Install: pip install geemap |
| Python 3.13 error | Use Python 3.12 or earlier |
| Google Drive folder not found | Create the folder in Google Drive first |
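For the "No images found" case, a quick sanity check of the query parameters can rule out the most common mistakes before contacting GEE. This is a hypothetical helper, not part of the project's scripts, and it only checks the values for internal consistency.

```python
from datetime import datetime


def check_query(bounds, start, end, cloud_threshold):
    """Return a list of problems that commonly lead to an empty collection."""
    min_lon, min_lat, max_lon, max_lat = bounds
    problems = []
    if not (-180 <= min_lon < max_lon <= 180):
        problems.append("longitude bounds must satisfy -180 <= min < max <= 180")
    if not (-90 <= min_lat < max_lat <= 90):
        problems.append("latitude bounds must satisfy -90 <= min < max <= 90")
    start_dt = datetime.strptime(start, "%Y-%m-%d")
    end_dt = datetime.strptime(end, "%Y-%m-%d")
    if start_dt >= end_dt:
        problems.append("START_DATE must be earlier than END_DATE")
    if not (0 < cloud_threshold <= 100):
        problems.append("CLOUD_COVER_THRESHOLD must be greater than 0 and at most 100")
    return problems


# The defaults from .env.example pass every check
print(check_query([88.0, 20.0, 92.5, 26.5], "2023-01-01", "2023-12-31", 20))  # []
```

An empty list means the values are consistent; the collection may still be empty if the threshold is too strict for the season or region.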
After downloading images from Google Drive:
- Organize by category:
  data/
  ├── flood/
  ├── urban/
  ├── forest/
  └── agricultural/
- Resize images:
  # To 256x256
  convert image.tif -resize 256x256 image_256.tif
  # To 128x128
  convert image.tif -resize 128x128 image_128.tif
- Create captions:
  - Basic: flood, urban, forest
  - Enhanced: Satellite view of flooded agricultural area
- Use for training:
  - GAN training: Pairs of image→label or style transfer
  - VLM fine-tuning: Image+caption pairs for vision-language models
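The captioning step above can be sketched as a small script that walks the category folders and writes one image+caption record per file. The caption templates, the record fields, and the JSON Lines output are all assumptions chosen for illustration; adapt them to whatever your training code expects.

```python
import json
from pathlib import Path

# Hypothetical caption templates keyed by the category folder names
CAPTIONS = {
    "flood": "Satellite view of a flooded area",
    "urban": "Satellite view of an urban area",
    "forest": "Satellite view of a forested area",
    "agricultural": "Satellite view of an agricultural area",
}


def build_caption_records(data_dir):
    """Return one {"image", "label", "caption"} record per .tif file found."""
    records = []
    for category_dir in sorted(Path(data_dir).iterdir()):
        # Skip files and folders that are not known categories
        if not category_dir.is_dir() or category_dir.name not in CAPTIONS:
            continue
        for image_path in sorted(category_dir.glob("*.tif")):
            records.append({
                "image": image_path.as_posix(),
                "label": category_dir.name,
                "caption": CAPTIONS[category_dir.name],
            })
    return records


def write_jsonl(records, out_path):
    """Write records as JSON Lines, one image+caption pair per line."""
    with open(str(out_path), "w", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(record) + "\n")
```

For example, `write_jsonl(build_caption_records("data"), "data/captions.jsonl")` would produce a file usable for the basic labels (the `label` field) or the enhanced captions (the `caption` field).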
| Dataset | ID | Resolution | Bands |
|---|---|---|---|
| Sentinel-2 | COPERNICUS/S2_SR_HARMONIZED | 10 m | 11 bands |
| Landsat 8 | LANDSAT/LC08/C02/T1_L2 | 30 m | 11 bands |
| Landsat 9 | LANDSAT/LC09/C02/T1_L2 | 30 m | 11 bands |
| Band | Name | Wavelength | Resolution |
|---|---|---|---|
| B2 | Blue | 490 nm | 10 m |
| B3 | Green | 560 nm | 10 m |
| B4 | Red | 665 nm | 10 m |
| B5 | Vegetation Red Edge | 705 nm | 20 m |
| B8 | NIR | 842 nm | 10 m |
| B11 | SWIR 1 | 1610 nm | 20 m |
| B12 | SWIR 2 | 2190 nm | 20 m |
MIT License - See LICENSE file for details
For issues, errors, or questions:
- Check the Troubleshooting section
- Review GEE documentation
- Check error messages carefully
- Verify your project ID and authentication
Built for GAN+VLM training pipeline with satellite imagery from Google Earth Engine.
Last Updated: May 2026
Python Version: 3.6 - 3.12
GEE API Version: Latest