<a href="https://colab.research.google.com/github/liangchow/zindi-amazon-secret-runway/blob/main/Data_Visualization/Generate_airstrip_masks.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### Imports and setup

In [1]:
%%capture
!pip -q install rasterio

In [41]:
import rasterio
import geopandas as gpd
from rasterio.features import rasterize
import numpy as np
import os
from shapely.geometry import LineString
from shapely.ops import transform
from pyproj import Transformer

### Mount your Google Drive and install project files

First, we'll mount your Google Drive. Then we'll clone the main branch from the GitHub repo so we have access to all of the files from there.

In [3]:
# mount your drive in case you have any new data uploaded there you want to use
from google.colab import drive
drive.mount('/content/drive')

# clone the main branch from GitHub to get all the data and files from there onto the current runtime session
!apt-get install git
!git clone https://github.com/liangchow/zindi-amazon-secret-runway.git
# pull the latest changes; -C runs git inside the cloned repo (a bare `git pull` from /content fails with "not a git repository")
!git -C zindi-amazon-secret-runway pull

Mounted at /content/drive
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
git is already the newest version (1:2.34.1-1ubuntu1.11).
0 upgraded, 0 newly installed, 0 to remove and 49 not upgraded.
Cloning into 'zindi-amazon-secret-runway'...
remote: Enumerating objects: 214, done.
remote: Counting objects: 100% (52/52), done.
remote: Compressing objects: 100% (51/51), done.
remote: Total 214 (delta 15), reused 3 (delta 1), pack-reused 162 (from 1)
Receiving objects: 100% (214/214), 1.07 MiB | 3.84 MiB/s, done.
Resolving deltas: 100% (82/82), done.


**Note: Before you run the following cell**

The training images are stored in a shared folder, `Sentinel`, on Google Drive. To access it from Google Colab, first create a shortcut to the shared folder and add that shortcut to your own Google Drive.

Now you can access the folder `Sentinel` like any other folder in your Google Drive.
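
If the shortcut is missing, the later cells fail with file-not-found errors, so it can help to check the expected folders first. The following is a minimal sketch: `find_missing_dirs` is a hypothetical helper (not part of the project), and the root path and subfolder names are assumed to match the ones used in the cells of this notebook.

```python
import os

def find_missing_dirs(root, subdirs):
    """Return the expected directories under `root` that do not exist."""
    expected = [root] + [os.path.join(root, s) for s in subdirs]
    return [p for p in expected if not os.path.isdir(p)]

# Assumed layout: the same root and subfolders that this notebook reads and writes.
missing = find_missing_dirs(
    "/content/drive/MyDrive/Zindi-Amazon",
    ["training/images", "training/masks"],
)
if missing:
    print("Not found (is the shortcut in your Drive?):", missing)
```

An empty list means every expected folder is visible to the runtime.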

In [4]:
shared_drive_path = "/content/drive/MyDrive/Zindi-Amazon"

# List files in the shared drive
print(os.listdir(shared_drive_path))

['Inference', 'training']


In [11]:
# Load the airstrip shapefile (the geometries are LineStrings, not polygons)
airstrips_gdf = gpd.read_file("/content/zindi-amazon-secret-runway/Data_Visualization/data/pac_2024_training/pac_2024_training.shp")

In [14]:
# Buffer distance around each airstrip line, in meters (applied after reprojecting to the raster CRS)
buffer_distance = 50
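
To get a feel for what this buffer means in the mask, the arithmetic below converts it to a strip width in pixels. This is only a sketch: the 10 m pixel size is an assumption (it is not stated in this notebook), so check the raster's transform for the real value.

```python
import math

buffer_distance = 50  # meters, same value as in the cell above
pixel_size = 10       # meters per pixel; ASSUMED, read it from the raster's transform

# Buffering a line by d on both sides gives a strip 2*d wide
strip_width_m = 2 * buffer_distance
strip_width_px = math.ceil(strip_width_m / pixel_size)
print(strip_width_m, strip_width_px)  # 100 10
```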

In [36]:
# Create a subset for testing (note: the mask loop further down still iterates over the full airstrips_gdf)
airstrips_gdf_subset = airstrips_gdf[airstrips_gdf['id'].isin([125, 130, 126])]
airstrips_gdf_subset

|     | id  | yr   | largo   | Activo | geometry |
|-----|-----|------|---------|--------|----------|
| 105 | 125 | 2018 | 743.585 | 0      | LINESTRING (-71.13234 -12.46927, -71.12678 -12... |
| 106 | 126 | 2021 | 685.435 | 0      | LINESTRING (-72.73038 -11.04639, -72.72785 -11... |
| 108 | 130 | 2022 | 925.346 | 0      | LINESTRING (-73.15258 -10.59165, -73.14414 -10... |


In [37]:
def check_mask_values(mask):
    # Check if both 0 and 1 are present in the mask
    has_zero = np.any(mask == 0)
    has_one = np.any(mask == 1)

    if has_zero and has_one:
        print("The mask contains both 0 and 1.")
        return True
    elif has_zero:
        print("The mask contains only 0.")
        return False
    elif has_one:
        print("The mask contains only 1.")
        return False
    else:
        print("The mask does not contain 0 or 1.")
        return False
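
A more compact way to inspect a mask is to count its distinct values with `np.unique`, which also makes it easy to see what fraction of pixels is labelled as airstrip. This is a sketch only; `mask_summary` is a hypothetical helper name, not something defined elsewhere in the project.

```python
import numpy as np

def mask_summary(mask):
    """Return {value: pixel_count} for every distinct value in the mask."""
    values, counts = np.unique(mask, return_counts=True)
    return {int(v): int(c) for v, c in zip(values, counts)}

# Toy 4x4 mask with a 2-pixel "airstrip"
toy = np.zeros((4, 4), dtype="uint8")
toy[1, 1:3] = 1

summary = mask_summary(toy)
print(summary)                       # {0: 14, 1: 2}
print(summary.get(1, 0) / toy.size)  # 0.125
```

Unlike the printed messages above, the returned dictionary also exposes unexpected values (anything other than 0 or 1) directly.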

In [42]:
# Function to reproject a geometry
def reproject_geometry(geometry, from_crs, to_crs):
    transformer = Transformer.from_crs(from_crs, to_crs, always_xy=True)
    return transform(transformer.transform, geometry)
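
The reason for reprojecting before buffering is that a buffer distance is expressed in the units of the geometry's CRS: degrees for lon/lat, meters for UTM. The sketch below shows the same steps on a made-up two-point line. It assumes the shapefile is in EPSG:4326 (the coordinates look like lon/lat, but this is not confirmed here) and uses EPSG:32719, the raster CRS reported in the output further down.

```python
from pyproj import Transformer
from shapely.geometry import LineString
from shapely.ops import transform

# Hypothetical line in lon/lat (ASSUMED EPSG:4326), not a real airstrip record
line_wgs84 = LineString([(-71.132, -12.469), (-71.127, -12.466)])

# always_xy=True keeps (lon, lat) -> (easting, northing) axis order
to_utm = Transformer.from_crs("EPSG:4326", "EPSG:32719", always_xy=True)
line_utm = transform(to_utm.transform, line_wgs84)

# In UTM the units are meters, so buffer(50) yields a strip about 100 m wide
buffered = line_utm.buffer(50)
print(f"length: {line_utm.length:.0f} m")
print(f"buffer type: {buffered.geom_type}, area: {buffered.area:.0f} m^2")
```

Buffering the original lon/lat line by 50 would instead mean 50 degrees, which is why the order of the two steps matters.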

In [47]:
# Iterate over each airstrip to create individual mask rasters
for idx, row in airstrips_gdf.iterrows():
    # Get the airstrip ID
    airstrip_id = row['id']

    # Check for an associated TIFF file
    img = f"{shared_drive_path}/training/images/Sentinel_AllBands_Training_Id_{airstrip_id}.tif"

    if not os.path.exists(img):
        print(f"No TIFF file found for airstrip ID: {airstrip_id}")
    else:
        print(f"Creating mask for airstrip ID: {airstrip_id}")

        # Load the TIFF file and read it
        with rasterio.open(img) as img_src:
            # Get the raster CRS
            img_crs = img_src.crs
            print(f"Raster CRS: {img_crs}")

            # Get the raster metadata (to match extent and resolution)
            img_meta = img_src.meta.copy()
            # Get the raster's affine transform (maps pixel coordinates to CRS coordinates)
            img_transform = img_src.transform
            # Get raster dimensions
            img_width = img_src.width
            img_height = img_src.height

            # Reproject the airstrip to match the raster CRS
            airstrip_crs = airstrips_gdf.crs  # Assuming the GeoDataFrame has a CRS
            reprojected_airstrip = reproject_geometry(row.geometry, airstrip_crs, img_crs)

            # Create a buffer around the reprojected airstrip
            buffered_airstrip = reprojected_airstrip.buffer(buffer_distance)

            # Rasterize the buffer: pixels inside it become 1, everything else 0
            # (rasterize() allocates the output array itself, so no blank mask is needed)
            shape = [(buffered_airstrip, 1)]
            mask = rasterize(
                shapes=shape,
                out_shape=(img_height, img_width),
                transform=img_transform,
                fill=0,   # Assign 0 to areas outside the polygon
                dtype='uint8',
            )

            # Update metadata for the output raster
            out_meta = img_meta.copy()
            out_meta.update({
                "count": 1,
                "dtype": "uint8",
                "nodata": None  # Disable nodata (otherwise QGIS will not display the 0s)
            })

            check_mask_values(mask)

            # Save the individual mask raster

            # write to google colab runtime
            # output_raster = f"Mask_Buffer{buffer_distance}m_Id_{airstrip_id}.tif"

            # write to shared folder on google drive
            output_raster = f"{shared_drive_path}/training/masks/Mask_Buffer{buffer_distance}m_Id_{airstrip_id}.tif"
            with rasterio.open(output_raster, "w", **out_meta) as dest:
                dest.write(mask, 1)

            print(f"Saved mask for airstrip ID {airstrip_id} as {output_raster}")

Creating mask for airstrip ID: 1
Raster CRS: EPSG:32719
The mask contains both 0 and 1.
Saved mask for airstrip ID 1 as /content/drive/MyDrive/Zindi-Amazon/training/masks/Mask_Buffer50m_Id_1.tif
Creating mask for airstrip ID: 2
Raster CRS: EPSG:32719
The mask contains both 0 and 1.
Saved mask for airstrip ID 2 as /content/drive/MyDrive/Zindi-Amazon/training/masks/Mask_Buffer50m_Id_2.tif
No TIFF file found for airstrip ID: 3
No TIFF file found for airstrip ID: 4
No TIFF file found for airstrip ID: 5
No TIFF file found for airstrip ID: 6
No TIFF file found for airstrip ID: 7
No TIFF file found for airstrip ID: 8
Creating mask for airstrip ID: 9
Raster CRS: EPSG:32719
The mask contains both 0 and 1.
Saved mask for airstrip ID 9 as /content/drive/MyDrive/Zindi-Amazon/training/masks/Mask_Buffer50m_Id_9.tif
Creating mask for airstrip ID: 10
Raster CRS: EPSG:32719
The mask contains both 0 and 1.
Saved mask for airstrip ID 10 as /content/drive/MyDrive/Zindi-Amazon/training/masks/Mask_Buffer50m