# Cloud Masking for Satellite Imagery

This notebook demonstrates how to apply cloud masking to satellite imagery using `rasterio` and `numpy` in Python. Cloud masking identifies cloud-covered pixels and excludes them so they do not contaminate later analysis. The examples use Sentinel-2 Level-2A imagery and its Scene Classification Layer (SCL).

## Prerequisites
- Install required libraries: `rasterio`, `numpy`, `matplotlib` (listed in `requirements.txt`).
- A Sentinel-2 Level-2A product with SCL band (e.g., `S2_L2A_product.SAFE`). Replace file paths with your own data.

## Learning Objectives
- Load and interpret the SCL band for cloud detection.
- Create a cloud mask to exclude clouds and cloud shadows.
- Apply the mask to a raster and visualize the result.

In [None]:
# Import required libraries
import rasterio
import numpy as np
import matplotlib.pyplot as plt
import os

## Step 1: Load Sentinel-2 Data and SCL Band

Load the Scene Classification Layer (SCL) and RGB bands from a Sentinel-2 Level-2A product.

In [None]:
# Define paths to the Sentinel-2 Level-2A product
import glob

product_path = 'S2_L2A_product.SAFE'

def find_band(resolution, suffix):
    """Resolve the wildcard pattern for one band to a single file path."""
    pattern = os.path.join(product_path, 'GRANULE', '*', 'IMG_DATA', resolution, f'*_{suffix}')
    matches = sorted(glob.glob(pattern))
    if not matches:
        raise FileNotFoundError(f'No file matches {pattern}')
    return matches[0]

# rasterio.open() does not expand wildcards, so resolve them with glob first.
# Bands in an unmodified SAFE product are JPEG 2000 files; change '.tif' to '.jp2'
# if that applies to your data.
scl_path = find_band('R20m', 'SCL_20m.tif')
red_path = find_band('R10m', 'B04_10m.tif')
green_path = find_band('R10m', 'B03_10m.tif')
blue_path = find_band('R10m', 'B02_10m.tif')

# Load SCL and RGB bands
with rasterio.open(scl_path) as src_scl, rasterio.open(red_path) as src_red, \
     rasterio.open(green_path) as src_green, rasterio.open(blue_path) as src_blue:
    scl = src_scl.read(1)
    red = src_red.read(1).astype(float)
    green = src_green.read(1).astype(float)
    blue = src_blue.read(1).astype(float)
    profile = src_red.profile

# Print basic information
print(f'SCL shape: {scl.shape}')
print(f'RGB shape: {red.shape}')

## Step 2: Create Cloud Mask

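Use the SCL band to create a mask for clouds and cloud shadows. The relevant Sentinel-2 SCL classes are 3 (cloud shadows), 8 (cloud, medium probability), 9 (cloud, high probability), and 10 (thin cirrus). Classes 8 and 9 describe detection confidence, not cloud altitude.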

In [None]:
# Define cloud-related SCL values to mask
cloud_values = [3, 8, 9, 10]  # Cloud shadows, cloud medium/high probability, thin cirrus

# Create cloud mask (True for clouds/shadows, False for valid pixels)
cloud_mask = np.isin(scl, cloud_values)

# Visualize cloud mask
plt.figure(figsize=(8, 8))
plt.imshow(cloud_mask, cmap='binary')
plt.title('Cloud Mask')
plt.xlabel('Column')
plt.ylabel('Row')
plt.show()
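
Before applying the mask, it can help to check how much of the scene it removes. The following is a minimal sketch on a small synthetic SCL tile; the array `scl_demo` and its values are made up for illustration, and excluding no-data pixels (SCL 0) from the denominator is an assumption about how you want to define cloud cover.

```python
import numpy as np

# Synthetic 4x4 SCL tile (hypothetical values, for illustration only)
scl_demo = np.array([
    [4, 4, 8, 9],
    [4, 3, 8, 9],
    [5, 5, 10, 4],
    [6, 6, 4, 0],
], dtype=np.uint8)

cloud_values = [3, 8, 9, 10]  # same classes as in the cell above
demo_mask = np.isin(scl_demo, cloud_values)

# Assumption: no-data pixels (SCL 0) should not count toward the total
valid = scl_demo != 0
cloud_fraction = demo_mask[valid].mean()

print(f'Cloudy pixels: {demo_mask.sum()} of {valid.sum()} valid ({cloud_fraction:.1%})')
```

On real data, replacing `scl_demo` with `scl` gives the cloud and shadow fraction of the whole tile, which is a quick way to decide whether a scene is worth processing.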

## Step 3: Apply Cloud Mask to RGB Data

Apply the cloud mask to the RGB bands, setting cloud-covered pixels to NaN.

In [None]:
# Resample SCL to match 10m RGB resolution if needed (SCL is 20m)
from rasterio.warp import reproject, Resampling

if scl.shape != red.shape:
    scl_resampled = np.empty_like(red, dtype=scl.dtype)
    with rasterio.open(scl_path) as src_scl:
        reproject(
            source=scl,
            destination=scl_resampled,
            src_transform=src_scl.transform,
            src_crs=src_scl.crs,
            dst_transform=profile['transform'],
            dst_crs=profile['crs'],
            resampling=Resampling.nearest
        )
    cloud_mask = np.isin(scl_resampled, cloud_values)

# Apply mask to RGB bands
red_masked = np.where(cloud_mask, np.nan, red)
green_masked = np.where(cloud_mask, np.nan, green)
blue_masked = np.where(cloud_mask, np.nan, blue)

# Stack and normalize for visualization (scale by the 98th percentile of unmasked pixels)
rgb_masked = np.stack([red_masked, green_masked, blue_masked], axis=-1)
p98 = np.nanpercentile(rgb_masked, 98)
if p98 > 0:
    rgb_masked = rgb_masked / p98
rgb_masked = np.clip(rgb_masked, 0, 1)

# Visualize masked RGB
plt.figure(figsize=(8, 8))
plt.imshow(rgb_masked)
plt.title('Cloud-Masked RGB Composite (B04, B03, B02)')
plt.xlabel('Column')
plt.ylabel('Row')
plt.show()
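
To see what nearest-neighbor resampling and NaN masking do at the pixel level, here is a small numpy-only sketch. It assumes the 20 m and 10 m grids are perfectly aligned and cover the same extent, so that each 20 m pixel maps to exactly a 2x2 block of 10 m pixels; `reproject` in the cell above handles the general case. All array names and values here are hypothetical.

```python
import numpy as np

# Hypothetical 2x2 cloud mask at 20 m and 4x4 band at 10 m
mask_20m = np.array([[True, False],
                     [False, False]])
band_10m = np.arange(16, dtype=float).reshape(4, 4)

# Nearest-neighbor upsampling by a factor of 2:
# repeat each mask pixel twice along rows, then twice along columns
mask_10m = np.repeat(np.repeat(mask_20m, 2, axis=0), 2, axis=1)

# Masked pixels become NaN; all other pixels keep their values
band_masked = np.where(mask_10m, np.nan, band_10m)

print(band_masked)
print('Masked pixels:', np.isnan(band_masked).sum())
print('Mean of unmasked pixels:', np.nanmean(band_masked))
```

NaN-aware functions such as `np.nanmean` skip the masked pixels, whereas `np.mean` would return NaN for the whole array.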

## Step 4: Save Masked Raster

Save the cloud-masked RGB bands as a new GeoTIFF file.

In [None]:
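# Update profile for a 3-band float32 GeoTIFF with NaN as the nodata value
output_profile = profile.copy()
output_profile.update(driver='GTiff', count=3, dtype=rasterio.float32, nodata=np.nan)

# Save masked RGB, casting each band to float32 to match the declared dtype
with rasterio.open('cloud_masked_rgb.tif', 'w', **output_profile) as dst:
    for band_index, band in enumerate([red_masked, green_masked, blue_masked], start=1):
        dst.write(band.astype(rasterio.float32), band_index)

print('Cloud-masked RGB saved to: cloud_masked_rgb.tif')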
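
Note that the array layout differs between plotting and writing: `plt.imshow` expects `(rows, cols, bands)`, while `rasterio` reads and writes `(bands, rows, cols)`. The sketch below, using small hypothetical arrays in place of the masked bands, shows a band-first stack cast to float32 and confirms that the NaN values survive the cast.

```python
import numpy as np

# Hypothetical 2x3 masked bands in float64, standing in for the real masked bands
r = np.array([[0.1, np.nan, 0.3],
              [0.4, 0.5, 0.6]])
g = r * 2
b = r * 3

# Band-first stack (bands, rows, cols), cast to float32 for writing
stack = np.stack([r, g, b], axis=0).astype(np.float32)

print(stack.shape, stack.dtype)
print('NaN count:', np.isnan(stack).sum())  # one masked pixel in each of the 3 bands
```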

## Next Steps

- Replace `S2_L2A_product.SAFE` with your own Sentinel-2 Level-2A product.
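- Adjust cloud mask values based on your needs (e.g., add SCL value 7, which is "unclassified" and was labeled "cloud low probability" in older documentation, or values 0 and 1 for no-data and saturated pixels).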
- Apply the mask to other bands or derived indices (e.g., NDVI).
- Proceed to the next notebook (`10_mosaicking_rasters.ipynb`) for mosaicking rasters.
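
As a sketch of the NDVI suggestion above, the following applies a cloud mask to a derived index. The arrays are hypothetical; on real data the near-infrared band would be B08 (10 m), loaded the same way as the RGB bands, which this notebook does not do. Pixels where the denominator is zero are left as NaN, which is an assumption about how you want to treat them.

```python
import numpy as np

# Hypothetical reflectance values and cloud mask, for illustration only
red_demo = np.array([[0.10, 0.20],
                     [0.30, 0.00]])
nir_demo = np.array([[0.50, 0.20],
                     [0.10, 0.00]])
mask_demo = np.array([[False, False],
                      [True, False]])

# NDVI = (NIR - red) / (NIR + red); skip pixels where the denominator is zero
denom = nir_demo + red_demo
ndvi = np.full(denom.shape, np.nan)
np.divide(nir_demo - red_demo, denom, out=ndvi, where=denom != 0)

# Apply the cloud mask to the index
ndvi[mask_demo] = np.nan

print(ndvi)
```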

## Notes
- Ensure the SCL and RGB band paths are correct.
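- SCL is provided at 20 m (and 60 m) resolution, so it must be resampled to match the 10 m RGB bands. Use nearest-neighbor resampling, because SCL values are class labels and must not be interpolated.
- Masked pixels are set to `np.nan`, which requires a floating-point dtype. NaN marks them unambiguously as missing, and NaN-aware functions such as `np.nanmean` skip them.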
- See `docs/installation.md` for troubleshooting library installation.