## Satellite Image Pre-processing Pipeline

This Jupyter notebook implements a preprocessing pipeline for satellite imagery analysis in Córdoba, Argentina, focusing on land use change detection.

## Features

- **Google Earth Engine Integration**: Connects to GEE API for satellite data access
- **Interactive Map Interface**: Allows users to draw Areas of Interest (AOI)
- **Sentinel-2 Data Processing**: Retrieves and processes Sentinel-2 satellite imagery
- **Cloud Masking**: Implements cloud and cirrus cloud masking for clearer imagery
- **NDVI Calculation**: Computes Normalized Difference Vegetation Index
- **Forest Cover Analysis**: Assesses vegetation density and forest cover percentage

## Requirements

- Google Earth Engine account
- Python packages:
  - `earthengine-api`
  - `geemap`
  - `folium`

## Usage

1. **Authentication**: Run the first cells to authenticate with Google Earth Engine
2. **AOI Selection**: Use the interactive map to draw your Area of Interest
3. **Date Configuration**: Enter the following parameters when prompted:
   - Start date (YYYY-MM-DD)
   - End date (YYYY-MM-DD)
   - Change detection interval (1-12 months)
4. **Analysis**: The notebook will:
   - Fetch Sentinel-2 imagery
   - Apply cloud masking
   - Calculate NDVI
   - Assess forest cover
   - Display results on the map

## Output

- Interactive map with:
  - NDVI visualization
  - AOI boundary
  - Forest cover statistics
- Analysis results including:
  - Mean NDVI values
  - Forest cover percentage
  - Suitability assessment for deforestation analysis

## Notes

- The pipeline uses Sentinel-2 Surface Reflectance (SR) data
- Cloud masking uses the QA60 band for quality assessment
- NDVI calculation uses bands B8 (NIR) and B4 (Red)
- A 20% forest cover threshold is used for deforestation analysis suitability


Part of the Córdoba Argentina Chapter Monitoring Land Use Transformation project

In [1]:
import ee
import geemap
import folium
from folium import plugins

In [2]:
import geemap
import folium
from folium import plugins

In [3]:
ee.Authenticate(auth_mode='colab')

True

In [4]:
ee.Initialize(project='ee-mujtabanaqvi29')

In [5]:
# Create an interactive map using geemap
Map = geemap.Map(center=(0, 0), zoom=2)  # Initialize the map
Map.add_basemap('SATELLITE')  # Add satellite imagery as the basemap

# Add draw control for user to draw an AOI
Map.add_draw_control()  # Allows drawing shapes like polygons, points, etc.

# Display the map for the user to interact with
Map

Map(center=[0, 0], controls=(WidgetControl(options=['position', 'transparent_bg'], widget=SearchDataGUI(childr…

In [6]:
# Get the user-drawn AOI
drawn_feature = Map.user_roi  # Retrieve the drawn feature
if drawn_feature is None:
    raise ValueError("No AOI selected! Please draw a polygon or rectangle on the map.")
else:
    AOI = ee.Geometry.Polygon(drawn_feature.getInfo()['coordinates'])  # Convert to EE Geometry
    print("AOI defined.")

AOI defined.


In [10]:
# Input start and end dates
start_date = input("Enter the start date for analysis (YYYY-MM-DD): ")
end_date = input("Enter the end date for analysis (YYYY-MM-DD): ")

# Input the duration for change detection
interval_months = int(input("Enter the duration of change detection in months (e.g., 6): "))
print(f"\nAnalysis Configuration:")
print(f"Start Date: {start_date}")
print(f"End Date: {end_date}")
print(f"Change Detection Interval: {interval_months} months")


Analysis Configuration:
Start Date: 2022-01-01
End Date: 2023-12-25
Change Detection Interval: 6 months


In [11]:
# Load Sentinel-2 ImageCollection for the first 30 days after the start date
sentinel2_30days = ee.ImageCollection('COPERNICUS/S2_SR') \
    .filterDate(start_date, ee.Date(start_date).advance(30, 'day')) \
    .filterBounds(AOI)

# Cloud masking function for Sentinel-2
def mask_clouds(image):
    qa = image.select('QA60')
    cloud_bit_mask = 1 << 10  # Bit 10 represents clouds
    cirrus_bit_mask = 1 << 11  # Bit 11 represents cirrus clouds
    mask = qa.bitwiseAnd(cloud_bit_mask).eq(0).And(qa.bitwiseAnd(cirrus_bit_mask).eq(0))
    return image.updateMask(mask)

# Apply cloud masking and create a median composite for the first 30 days
composite_30days = sentinel2_30days.map(mask_clouds).median().clip(AOI)

# Calculate NDVI for the 30-day composite
ndvi_30days = composite_30days.normalizedDifference(['B8', 'B4']).rename('NDVI')

# Calculate mean NDVI for the AOI
mean_ndvi_30days = ndvi_30days.reduceRegion(
    reducer=ee.Reducer.mean(),
    geometry=AOI,
    scale=10,  # Sentinel-2 resolution
    maxPixels=1e9
).get('NDVI')

# Fetch the NDVI value for Python
mean_ndvi_value = mean_ndvi_30days.getInfo()

# Print the forest cover percentage and check the 20% threshold
if mean_ndvi_value is not None:
    forest_percentage = mean_ndvi_value * 100
    print(f"Mean NDVI (Forest Cover Proxy) for the first 30 days: {forest_percentage:.2f}%")

    if forest_percentage < 20:
        print("The forestation rate is less than 20%. The area is not suitable for deforestation analysis.")
    else:
        print("The forestation rate is sufficient for deforestation analysis.")
else:
    print("Could not compute forest cover for the selected AOI.")

# Visualize NDVI for the 30-day composite on the map
Map.addLayer(
    ndvi_30days,
    {'min': 0, 'max': 1, 'palette': ['white', 'green']},  # White = low vegetation, Green = dense vegetation
    'NDVI - First 30 Days'
)

# Add AOI boundary to the map
Map.addLayer(AOI, {'color': 'red'}, 'AOI Boundary')

# Display the map
print("Sentinel-2 NDVI map for the first 30 days is displayed below:")
Map


Mean NDVI (Forest Cover Proxy) for the first 30 days: 40.37%
The forestation rate is sufficient for deforestation analysis.
Sentinel-2 NDVI map for the first 30 days is displayed below:


Map(bottom=36965.0, center=[-20.925527866647226, -61.50421142578126], controls=(WidgetControl(options=['positi…

Following cell calculated the dynamic Deforesation check (based on selected area size)

In [23]:
def assess_forest_suitability(ndvi_image, aoi_geometry, Map):
    """
    Assess forest suitability using dual thresholds: percentage and absolute area.
    """
    try:
        print("Starting forest assessment...")
        
        # Calculate total AOI area in hectares
        aoi_area_ha = aoi_geometry.area().divide(10000).getInfo()
        print(f"Total AOI area: {aoi_area_ha:.2f} hectares")
        
        # Create forest mask based on NDVI threshold
        NDVI_FOREST_THRESHOLD = 0.3
        forest_mask = ndvi_image.gte(NDVI_FOREST_THRESHOLD)
        print("Created forest mask based on NDVI threshold")
        
        # Calculate forest statistics
        forest_stats = forest_mask.multiply(ee.Image.pixelArea()).reduceRegion(
            reducer=ee.Reducer.sum(),
            geometry=aoi_geometry,
            scale=10,
            maxPixels=1e9
        ).getInfo()
        print("Calculated forest statistics")
        print(f"Raw forest stats: {forest_stats}")
        
        # Calculate forest area
        forest_area_ha = forest_stats.get('area', 0) / 10000  # Convert m² to ha
        print(f"Forest area: {forest_area_ha:.2f} hectares")
        
        # Calculate forest percentage
        forest_percentage = (forest_area_ha / aoi_area_ha) * 100
        print(f"Forest percentage: {forest_percentage:.2f}%")
        
        # Define thresholds
        MIN_FOREST_AREA_HA = 30
        MIN_FOREST_PERCENTAGE = 15
        
        # Assess suitability
        is_suitable = (forest_area_ha >= MIN_FOREST_AREA_HA and 
                      forest_percentage >= MIN_FOREST_PERCENTAGE)
        
        # Print results
        print("\nForest Assessment Results:")
        print(f"Total AOI Area: {aoi_area_ha:.2f} hectares")
        print(f"Forest Area: {forest_area_ha:.2f} hectares")
        print(f"Forest Coverage: {forest_percentage:.2f}%")
        print(f"Minimum Required: {MIN_FOREST_AREA_HA} hectares and {MIN_FOREST_PERCENTAGE}%")
        
        if is_suitable:
            print("\n✓ Area is suitable for deforestation analysis")
        else:
            print("\n✗ Area is not suitable for deforestation analysis")
            if forest_area_ha < MIN_FOREST_AREA_HA:
                print(f"  - Insufficient forest area ({forest_area_ha:.2f} < {MIN_FOREST_AREA_HA} ha)")
            if forest_percentage < MIN_FOREST_PERCENTAGE:
                print(f"  - Insufficient coverage ({forest_percentage:.2f}% < {MIN_FOREST_PERCENTAGE}%)")
        
        # Visualize forest mask
        Map.addLayer(
            forest_mask,
            {'min': 0, 'max': 1, 'palette': ['white', 'darkgreen']},
            'Forest Areas (NDVI >= 0.3)'
        )
        
        return is_suitable, forest_area_ha, forest_percentage
        
    except Exception as e:
        print(f"Error in forest assessment: {str(e)}")
        import traceback
        traceback.print_exc()
        return False, 0, 0

# Call the function
print("\nStarting new forest suitability assessment...")
is_suitable, forest_area, forest_percentage = assess_forest_suitability(ndvi_30days, AOI, Map)

# Display the map
print("\nDisplaying map with forest areas...")
Map


Starting new forest suitability assessment...
Starting forest assessment...
Total AOI area: 149724.38 hectares
Created forest mask based on NDVI threshold
Calculated forest statistics
Raw forest stats: {'NDVI': 996286235.5904474}
Forest area: 0.00 hectares
Forest percentage: 0.00%

Forest Assessment Results:
Total AOI Area: 149724.38 hectares
Forest Area: 0.00 hectares
Forest Coverage: 0.00%
Minimum Required: 30 hectares and 15%

✗ Area is not suitable for deforestation analysis
  - Insufficient forest area (0.00 < 30 ha)
  - Insufficient coverage (0.00% < 15%)

Displaying map with forest areas...


Map(bottom=4673.0, center=[-12.082295837363578, -53.94287109375001], controls=(WidgetControl(options=['positio…

In [16]:
# Import necessary libraries
from datetime import datetime, timedelta
import math

# Function to generate robust time intervals
def generate_time_intervals(start_date, end_date, duration_months):
    intervals = []
    start_date = datetime.strptime(start_date, "%Y-%m-%d")
    end_date = datetime.strptime(end_date, "%Y-%m-%d")

    # Loop through and create intervals
    current_date = start_date
    while current_date < end_date:
        next_date = current_date + timedelta(days=duration_months * 30)  # Approximate 1 month = 30 days
        # Ensure last interval ends exactly at the end_date
        if next_date > end_date:
            next_date = end_date
        intervals.append((current_date.strftime("%Y-%m-%d"), next_date.strftime("%Y-%m-%d")))
        current_date = next_date

    return intervals

# Generate robust intervals
time_intervals = generate_time_intervals(start_date, end_date, interval_months)

# Print the generated intervals
print("Generated Time Intervals:")
for interval in time_intervals:
    print(interval)

# Check Sentinel-2 image availability for each interval
for interval in time_intervals:
    start, end = interval
    image_collection = ee.ImageCollection('COPERNICUS/S2') \
        .filterDate(start, end) \
        .filterBounds(AOI)
    image_count = image_collection.size().getInfo()
    print(f"Interval: {start} to {end} - Sentinel-2 Images Available: {image_count}")


Generated Time Intervals:
('2023-01-01', '2023-06-30')
('2023-06-30', '2023-12-25')
Interval: 2023-01-01 to 2023-06-30 - Sentinel-2 Images Available: 72
Interval: 2023-06-30 to 2023-12-25 - Sentinel-2 Images Available: 70


In [17]:
# Define a cloud masking function
def mask_clouds(image):
    # Check for QA60 band on the server side
    qa_bands = image.bandNames()
    has_qa60 = qa_bands.contains('QA60')

    # Apply cloud masking if QA60 is present
    return ee.Algorithms.If(
        has_qa60,
        # Cloud masking logic
        image.updateMask(
            image.select('QA60').bitwiseAnd(1 << 10).eq(0).And(
                image.select('QA60').bitwiseAnd(1 << 11).eq(0)
            )
        ),
        # If QA60 is missing, return the unmasked image
        image
    )

# Process time intervals
for interval in time_intervals:
    start, end = interval

    # Load Sentinel-2 Surface Reflectance data (Level-2A)
    image_collection = ee.ImageCollection('COPERNICUS/S2_SR') \
        .filterDate(start, end) \
        .filterBounds(AOI) \
        .map(mask_clouds)  # Apply cloud masking

    # Check if there are images available
    image_count = image_collection.size().getInfo()
    if image_count == 0:
        print(f"No images available for interval: {start} to {end}. Skipping.")
        continue

    # Create a median composite
    composite = image_collection.median().clip(AOI)

    # Apply Gaussian smoothing to reduce noise
    smoothed_composite = composite.convolve(ee.Kernel.gaussian(radius=3, sigma=1, units='pixels'))

    # Use a single band to extract CRS and ensure consistent projection
    single_band = composite.select('B4')  # 'B4' is the red band
    crs_string = single_band.projection().crs().getInfo()

    # Reproject the smoothed composite using the extracted CRS
    aligned_composite = smoothed_composite.reproject(
        crs=crs_string,  # CRS string
        scale=10  # Sentinel-2 resolution in meters
    )

    # Visualize the composite for this interval
    Map.addLayer(
        aligned_composite.select(['B4', 'B3', 'B2']),  # RGB bands
        {'min': 0, 'max': 3000, 'gamma': 1.4},
        f"Composite {start} to {end}"
    )

print("All available composites have been processed and displayed on the map.")
Map


All available composites have been processed and displayed on the map.


Map(bottom=2469637.0, center=[-30.40789489142412, -64.39157492230356], controls=(WidgetControl(options=['posit…

In [18]:
# Define the 256x256 pixel grid for the AOI
tile_size_meters = 256 * 10  # Tile size in meters (256 pixels at 10m resolution)

# Generate the grid and clip it to the AOI with an error margin
grid = AOI.coveringGrid(
    proj=ee.Projection(crs_string),  # Use the extracted CRS string as a projection
    scale=tile_size_meters  # Define the scale for each tile
).map(lambda feature: feature.intersection(AOI, ee.ErrorMargin(1)))  # Add an error margin of 1 meter

# Process tiles for each time window
for interval in time_intervals:
    start, end = interval

    # Print current time window being processed
    print(f"Processing tiles for time window: {start} to {end}")

    # Iterate through each tile in the FeatureCollection
    grid_list = grid.toList(grid.size())  # Convert FeatureCollection to a list of features
    for i in range(grid.size().getInfo()):
        tile = ee.Feature(grid_list.get(i))  # Get the tile as a Feature
        tile_geom = tile.geometry()  # Extract the geometry of the tile

        # Add the individual tile to the map
        Map.addLayer(
            tile_geom,
            {'color': 'blue'},  # Display the tile in blue
            f"Tile {i + 1} ({start} to {end})"  # Label the tile with the time window
        )

# Add the entire grid clipped to AOI for visualization
Map.addLayer(
    grid,
    {'color': 'red'},  # Display the grid in red
    "256x256 Tiles (Clipped to AOI)"
)

# Display the map
print("The grid of 256x256 tiles, clipped to the AOI, is displayed for all time windows.")
Map


Processing tiles for time window: 2023-01-01 to 2023-06-30
Processing tiles for time window: 2023-06-30 to 2023-12-25
The grid of 256x256 tiles, clipped to the AOI, is displayed for all time windows.


Map(bottom=4939187.0, center=[-30.415778145780344, -64.3691110610962], controls=(WidgetControl(options=['posit…

The last snippet is to visulaize the best image from a time window verses composites imagee


In [19]:
# Define a cloud masking function
def mask_clouds(image):
    # Check for QA60 band on the server side
    qa_bands = image.bandNames()
    has_qa60 = qa_bands.contains('QA60')

    # Apply cloud masking if QA60 is present
    return ee.Algorithms.If(
        has_qa60,
        # Cloud masking logic
        image.updateMask(
            image.select('QA60').bitwiseAnd(1 << 10).eq(0).And(
                image.select('QA60').bitwiseAnd(1 << 11).eq(0)
            )
        ),
        # If QA60 is missing, return the unmasked image
        image
    )

# Process time intervals
for interval in time_intervals:
    start, end = interval

    # Load Sentinel-2 Surface Reflectance data (Level-2A)
    image_collection = ee.ImageCollection('COPERNICUS/S2_SR') \
        .filterDate(start, end) \
        .filterBounds(AOI) \
        .map(mask_clouds)  # Apply cloud masking

    # Check if there are images available
    image_count = image_collection.size().getInfo()
    if image_count == 0:
        print(f"No images available for interval: {start} to {end}. Skipping.")
        continue

    # Select the best single image with the least cloud cover
    best_image = image_collection.sort('CLOUDY_PIXEL_PERCENTAGE').first()

    # Clip to the AOI
    best_image_clipped = best_image.clip(AOI)

    # Visualize the best single image for this interval
    Map.addLayer(
        best_image_clipped.select(['B4', 'B3', 'B2']),  # RGB bands
        {'min': 0, 'max': 3000, 'gamma': 1.4},
        f"Best Image {start} to {end}"
    )

print("All best single images have been processed and displayed on the map.")
Map


All best single images have been processed and displayed on the map.


Map(bottom=38898.0, center=[-30.477082932837682, -64.57214355468751], controls=(WidgetControl(options=['positi…