# Working with Raster Bands

## Preparing Your Workspace

### Option 1 (recommended): Run in Google Colab
[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/kevinlacaille/presentations/blob/main/scipy2024/6_image_processing.ipynb)

### Option 2: Run a local Jupyter instance
You can also open this notebook in your own local Jupyter instance.

**Prerequisites**

- Install: rasterio, exiftool, PyExifTool, OpenCV (`opencv-python`), NumPy, Matplotlib
- Download data

In [None]:
!pip install rasterio
!apt-get install -y exiftool
!pip install PyExifTool
!wget https://raw.githubusercontent.com/kevinlacaille/presentations/main/scipy2024/data/presentation/8928dec4ddbffff/DJI_0876.JPG
# Download the zip file
!wget https://github.com/kevinlacaille/presentations/raw/main/scipy2024/data/presentation/vancouver_data.zip -O /content/vancouver_data.zip
# Unzip the file
!unzip /content/vancouver_data.zip -d /content/vancouver_data

In [None]:
import rasterio
import numpy as np
import matplotlib.pyplot as plt
import cv2 as cv
import exiftool
import json

In [None]:
# Silence divide-by-zero and invalid-value warnings (the VARI denominator can be zero)
np.seterr(divide='ignore', invalid='ignore')

In [None]:
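def get_bands(image_path):
    # Open the image and read each band as a NumPy array.
    # A JPEG stores its bands in R, G, B order, so band 1 is red and band 3 is blue.
    with rasterio.open(image_path) as src:
        red = src.read(1)
        green = src.read(2)
        blue = src.read(3)
    # Stack in R, G, B order so matplotlib displays true colour
    rgb = np.dstack((red, green, blue))
    return blue, green, red, rgb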

In [None]:
def get_metadata(image_path):
    # Get the metadata of the image
    with exiftool.ExifTool() as et:
        metadata = json.loads(et.execute(b'-j', image_path))
    return metadata

In [None]:
def get_gsd(metadata, height=15):
    # Relative altitude reported in the image metadata, in metres
    altitude = float(metadata[0].get("XMP:RelativeAltitude"))
    print(f"Altitude: {altitude} m")
    focal_length = metadata[0].get("EXIF:FocalLength")  # in mm
    image_width = metadata[0].get("File:ImageWidth")  # in px
    # Pixel pitch (m) = sensor width (m) / image width (px);
    # 6.17e-3 m is the assumed sensor width
    pixel_pitch = 6.17e-3 / image_width
    # GSD at canopy level, for trees of the given height (m)
    gsd_tree = (altitude - height) * pixel_pitch / (focal_length / 1000)
    # GSD at ground level
    gsd = altitude * pixel_pitch / (focal_length / 1000)
    return gsd, gsd_tree
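
As a sanity check on the formula, here is a minimal standalone sketch of the same GSD calculation. The flight values (60 m altitude, 4.5 mm focal length, 4000 px image width) are made-up illustrative numbers, not values read from the image metadata, and `ground_sample_distance` is a hypothetical helper.

```python
def ground_sample_distance(distance_m, focal_length_mm, image_width_px,
                           sensor_width_m=6.17e-3):
    """Ground distance covered by one pixel, in metres."""
    # Physical width of one pixel on the sensor
    pixel_pitch_m = sensor_width_m / image_width_px
    # Similar triangles: ground size / distance = pixel pitch / focal length
    return distance_m * pixel_pitch_m / (focal_length_mm / 1000)


altitude_m = 60.0      # illustrative flight altitude
tree_height_m = 15.0   # same default as get_gsd

gsd_ground = ground_sample_distance(altitude_m, 4.5, 4000)
gsd_canopy = ground_sample_distance(altitude_m - tree_height_m, 4.5, 4000)
print(f"ground: {gsd_ground * 100:.2f} cm/px, canopy: {gsd_canopy * 100:.2f} cm/px")
```

The canopy is closer to the camera than the ground, so each pixel covers less distance at treetop level.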


In [None]:
def get_kernel_size(gsd_tree):
    diameter_of_tree = 10  # meters

    # Number of pixels the tree will cover
    diameter_of_tree_px = diameter_of_tree / gsd_tree
    area_of_tree = np.pi * (diameter_of_tree_px / 2)**2
    # Kernel size for morphological operations
    kernel_size = int(np.sqrt(diameter_of_tree_px))
    # Ensure kernel size is odd
    if kernel_size % 2 == 0:
        kernel_size += 1
    print(f"Kernel size: {kernel_size}")
    return area_of_tree, kernel_size
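
To make the kernel-size heuristic concrete, here is a small sketch of the same arithmetic using only the standard library. The 2 cm/px canopy GSD is an assumed illustrative value, and `kernel_size_for_tree` is a hypothetical name.

```python
import math


def kernel_size_for_tree(tree_diameter_m, gsd_m_per_px):
    # Tree diameter expressed in pixels
    diameter_px = tree_diameter_m / gsd_m_per_px
    # Area of a circular crown, in pixels
    area_px = math.pi * (diameter_px / 2) ** 2
    # Same heuristic as get_kernel_size: square root of the diameter, forced odd
    kernel = int(math.sqrt(diameter_px))
    if kernel % 2 == 0:
        kernel += 1
    return area_px, kernel


area_px, kernel = kernel_size_for_tree(10, 0.02)
print(f"crown area: {area_px:.0f} px, kernel size: {kernel}")
```

An odd kernel size is required by `cv.GaussianBlur`, and it also gives morphological kernels a well-defined centre pixel.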

In [None]:
def get_vari(blue, green, red):
    # Calculate the VARI index
    vari = (green.astype(float) - red.astype(float)) / (
        green.astype(float) + red.astype(float) - blue.astype(float))
    return vari
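
The following sketch evaluates VARI, (green - red) / (green + red - blue), on three hand-picked pixels to show why the bands are cast to float first and where NaN values come from. The pixel values are invented for illustration.

```python
import numpy as np

# Three pixels: leafy green, neutral gray, pure black
red = np.array([[50, 100, 0]], dtype=np.uint8)
green = np.array([[150, 100, 0]], dtype=np.uint8)
blue = np.array([[40, 100, 0]], dtype=np.uint8)

# uint8 arithmetic wraps around: 50 - 150 gives 156 rather than -100
wrapped = red - green

# Casting to float avoids the wrap-around
r, g, b = (band.astype(float) for band in (red, green, blue))
with np.errstate(divide="ignore", invalid="ignore"):
    vari = (g - r) / (g + r - b)

print(wrapped)
print(vari)
```

The green pixel scores 0.625, the gray pixel scores 0, and the black pixel is NaN because its denominator is zero.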

In [None]:
def threshold(vari, vari_min=0.1, vari_max=0.5):
    # Generate the vegetation mask: 1 where VARI >= vari_min, NaN elsewhere
    # (vari_max is accepted but not applied; only the lower bound is used)
    vegetation_mask = np.full(vari.shape, np.nan)
    vegetation_mask[vari >= vari_min] = 1

    # Generate the non-vegetation mask: 1 where VARI < vari_min, NaN elsewhere
    non_vegetation_mask = np.full(vari.shape, np.nan)
    non_vegetation_mask[vari < vari_min] = 1

    return vegetation_mask, non_vegetation_mask
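
The `vari_max` argument of `threshold` is never used. If an upper bound is wanted, one possible variant is sketched below; this is an assumption about how the bound might be applied, not necessarily what the original pipeline intended, and `threshold_range` is a hypothetical name.

```python
import numpy as np


def threshold_range(vari, vari_min=0.1, vari_max=0.5):
    # Comparisons with NaN evaluate to False, so NaN pixels are never in range
    in_range = (vari >= vari_min) & (vari <= vari_max)
    vegetation_mask = np.where(in_range, 1.0, np.nan)
    # Everything with a valid VARI outside the range counts as non-vegetation
    non_vegetation_mask = np.where(~in_range & ~np.isnan(vari), 1.0, np.nan)
    return vegetation_mask, non_vegetation_mask


vari = np.array([[-0.2, 0.1, 0.3],
                 [0.5, 0.8, np.nan]])
veg, non_veg = threshold_range(vari)
print(veg)
print(non_veg)
```

Pixels whose VARI is NaN stay NaN in both masks, so they are counted as neither vegetation nor non-vegetation.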

In [None]:
def smoothing(mask, kernel_size=7):
    # Apply a Gaussian blur to the mask
    blur = cv.GaussianBlur(mask, (kernel_size, kernel_size), 0)
    return blur

In [None]:
def morphological_operations(mask, kernel_size=18):
    # Apply opening and closing morphological operations to the mask

    opening_kernel = np.ones((kernel_size, kernel_size), np.uint8)
    opening = cv.morphologyEx(mask, cv.MORPH_OPEN, opening_kernel)

    closing_kernel = np.ones((kernel_size, kernel_size), np.uint8)
    closing = cv.morphologyEx(opening, cv.MORPH_CLOSE, closing_kernel)

    return closing
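
To show what opening and closing do, here is a plain NumPy re-implementation for binary masks with an odd, square kernel. It is an illustrative sketch only; the notebook itself relies on `cv.morphologyEx`, and the helper names below are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def _window_reduce(mask, k, reducer, pad_value):
    # Pad so the output keeps the input shape (k must be odd)
    pad = k // 2
    padded = np.pad(mask, pad, mode="constant", constant_values=pad_value)
    windows = sliding_window_view(padded, (k, k))
    return reducer(windows, axis=(-2, -1))


def erode(mask, k):
    # A pixel survives only if every pixel under the kernel is set
    return _window_reduce(mask, k, np.min, pad_value=1)


def dilate(mask, k):
    # A pixel is set if any pixel under the kernel is set
    return _window_reduce(mask, k, np.max, pad_value=0)


def opening(mask, k):
    # Erosion then dilation: removes specks smaller than the kernel
    return dilate(erode(mask, k), k)


def closing(mask, k):
    # Dilation then erosion: fills holes smaller than the kernel
    return erode(dilate(mask, k), k)


# A 3x3 blob plus one isolated noise pixel
noisy = np.zeros((7, 7), dtype=np.uint8)
noisy[1:4, 1:4] = 1
noisy[5, 5] = 1
opened = opening(noisy, 3)

# A 5x5 blob with a one-pixel hole
holed = np.zeros((9, 9), dtype=np.uint8)
holed[2:7, 2:7] = 1
holed[4, 4] = 0
closed = closing(holed, 3)
```

Opening drops the isolated pixel while keeping the blob, and closing fills the hole without growing the blob.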

In [None]:
def segmentation(mask, rgb):
    # Create a purple overlay with the same shape as the RGB image
    mask_overlay = np.zeros_like(rgb)
    mask_overlay[:, :, 0] = 128  # Red channel for purple
    mask_overlay[:, :, 1] = 0  # Green channel for purple
    mask_overlay[:, :, 2] = 128  # Blue channel for purple

    # Apply the filtered mask to the mask overlay
    mask_overlay[mask != 1] = [0, 0, 0]

    # Create the inverse overlay (everything outside the mask), also in purple
    inverse_mask_overlay = np.zeros_like(rgb)
    inverse_mask_overlay[:, :, 0] = 128  # Red channel for purple
    inverse_mask_overlay[:, :, 1] = 0  # Green channel for purple
    inverse_mask_overlay[:, :, 2] = 128  # Blue channel for purple

    # Apply the inverse mask to the inverse mask overlay
    inverse_mask_overlay[mask == 1] = [0, 0, 0]

    return mask_overlay, inverse_mask_overlay
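
The same two overlays can be built with a single boolean index. A minimal sketch, assuming the mask holds 1 for selected pixels and NaN or 0 elsewhere; `make_overlays` is a hypothetical helper, not part of the notebook.

```python
import numpy as np

PURPLE = (128, 0, 128)


def make_overlays(mask, rgb, color=PURPLE):
    # NaN == 1 is False, so NaN pixels fall outside the selection
    selected = mask == 1
    overlay = np.zeros_like(rgb)
    inverse_overlay = np.zeros_like(rgb)
    # Colour the selected pixels in one overlay and the rest in the other
    overlay[selected] = color
    inverse_overlay[~selected] = color
    return overlay, inverse_overlay


rgb_demo = np.zeros((2, 2, 3), dtype=np.uint8)
mask_demo = np.array([[1, np.nan],
                      [0, 1]])
overlay, inverse_overlay = make_overlays(mask_demo, rgb_demo)
```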

In [None]:
def visualize_segmentation(rgb, mask_overlay, inverse_mask_overlay):
    fig, ax = plt.subplots(1, 3, figsize=(18, 6))

    plt.sca(ax[0])
    plt.imshow(rgb)
    plt.axis('off')
    plt.title('RGB')

    plt.sca(ax[1])
    plt.imshow(rgb)
    plt.imshow(mask_overlay, alpha=0.5)
    plt.axis('off')
    plt.title('vegetation segmentation')

    plt.sca(ax[2])
    plt.imshow(rgb)
    plt.imshow(inverse_mask_overlay, alpha=0.5)
    plt.axis('off')
    plt.title('non-veg segmentation')

    plt.show()

In [None]:
def process_image(image_path):
    print(f"Processing image: {image_path}")
    # Get the bands of the image
    blue, green, red, rgb = get_bands(image_path)
    # Get the metadata of the image
    metadata = get_metadata(image_path)
    # Get the GSD of the image
    gsd, gsd_tree = get_gsd(metadata)
    # Get the kernel size for morphological operations
    area_of_tree, kernel_size = get_kernel_size(gsd_tree)
    # Calculate the VARI index
    vari = get_vari(blue, green, red)
    # Generate the vegetation and non-vegetation masks
    vegetation_mask, non_vegetation_mask = threshold(vari)
    # Smooth the vegetation mask with a Gaussian blur
    smoothed_mask = smoothing(vegetation_mask)
    # Apply morphological operations to the smoothed mask
    filtered_mask = morphological_operations(smoothed_mask, kernel_size)
    # Apply the mask to the RGB image
    mask_overlay, inverse_mask_overlay = segmentation(filtered_mask, rgb)

    return rgb, gsd, area_of_tree, vegetation_mask, mask_overlay, inverse_mask_overlay


### Test the pipeline

In [None]:
import os
# Define both potential file paths
image_path = "/content/DJI_0876.JPG" if os.path.exists(
    "/content/DJI_0876.JPG"
) else "data/presentation/8928dec4ddbffff/DJI_0876.JPG"
# Process the image
rgb, gsd, area_of_tree, vegetation_mask, mask_overlay, inverse_mask_overlay = process_image(
    image_path)
visualize_segmentation(rgb, mask_overlay, inverse_mask_overlay)

### Batch processing data

In [None]:
import glob
import os

# Define primary and secondary paths
primary_path = '/content/'
secondary_path = 'data/presentation/*/*/'

# Find images in both paths
primary_images = glob.glob(os.path.join(primary_path, '**/*.JPG'),
                           recursive=True)
secondary_images = glob.glob(os.path.join(secondary_path, '*.JPG'))

# Combine both lists, ensuring no duplicates
images = list(set(primary_images + secondary_images))

In [None]:
num_images = 0
total_trees = 0
for image_path in images:
    rgb, gsd, area_of_tree, vegetation_mask, mask_overlay, inverse_mask_overlay = process_image(
        image_path)
    num_vegetation_pixels = np.nansum(vegetation_mask)
    # Estimated tree count: vegetation pixels / pixel area of one tree
    n_trees = num_vegetation_pixels / area_of_tree
    print(f"Number of trees: {n_trees}")

    visualize_segmentation(rgb, mask_overlay, inverse_mask_overlay)

    total_trees += n_trees
    num_images += 1
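
The tree count in the loop is simply the vegetation area divided by the area of one crown, both measured in pixels. Here is a toy example with a synthetic mask; the mask size and the 20 px crown diameter are invented for illustration.

```python
import numpy as np

# Synthetic mask: NaN everywhere except a 50 x 40 block of vegetation
mask = np.full((100, 100), np.nan)
mask[20:70, 30:70] = 1

tree_diameter_px = 20
area_of_tree_px = np.pi * (tree_diameter_px / 2) ** 2

# nansum ignores NaN, so it counts the pixels set to 1
vegetation_px = np.nansum(mask)
n_trees = vegetation_px / area_of_tree_px
print(f"{vegetation_px:.0f} px / {area_of_tree_px:.0f} px per tree = {n_trees:.1f} trees")
```

The result is fractional, and any grass or shrubs included in the mask inflate it.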


## Estimate carbon absorbed

In [None]:
# Average number of trees per scene: total trees / number of images
avg_n_trees = total_trees / num_images
print(f'Average number of trees per scene: {avg_n_trees:.0f}')

# CO2 absorbed per tree per year in kg (https://ecotree.green/en/how-much-co2-does-a-tree-absorb#:~:text=A%20tree%20absorbs%20approximately%2025kg%20of%20CO2%20per%20year&text=But%20really%20a%20tree%20absorbs,a%20tree%20absorbs%20so%20interesting.)
carbon_absorbed_per_tree = 25
# Calculate the amount of CO2 absorbed by the trees in an average scene
carbon_absorbed = avg_n_trees * carbon_absorbed_per_tree
print(f'Average carbon absorbed per scene: {carbon_absorbed:.0f} kg')
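
A minimal sketch of the averaging step, using made-up per-image tree counts. `co2_per_scene` is a hypothetical helper; the 25 kg per tree figure is the one used above.

```python
def co2_per_scene(tree_counts, kg_co2_per_tree=25):
    # Average over every image, not just the last one processed
    avg_trees = sum(tree_counts) / len(tree_counts)
    return avg_trees, avg_trees * kg_co2_per_tree


# Illustrative counts for three images
avg_trees, kg_co2 = co2_per_scene([12.4, 8.1, 15.5])
print(f"{avg_trees:.1f} trees per scene, {kg_co2:.0f} kg CO2 per year")
```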

## Extrapolate for a city
First, let's find the footprint of the camera.

In [None]:
# Count the total pixels (rgb.size counts all three channels)
total_pixels = rgb.size / 3
# The camera points straight down, so we can assume the footprint is a rectangle
# Footprint area (m^2) = number of pixels * ground area of one pixel
footprint_area = total_pixels * gsd**2

print(f'Footprint area: {footprint_area:.2f} m^2')
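
Equivalently, the footprint is the ground width times the ground height of the image. Here is a short sketch with assumed values (a 4000 x 3000 px image at 2 cm/px); `footprint_area_m2` is a hypothetical helper.

```python
def footprint_area_m2(width_px, height_px, gsd_m_per_px):
    # Ground width (m) times ground height (m) for a nadir-pointing camera
    return (width_px * gsd_m_per_px) * (height_px * gsd_m_per_px)


area_m2 = footprint_area_m2(4000, 3000, 0.02)
print(f"Footprint: {area_m2:.0f} m^2")
```

With these numbers the image covers 80 m by 60 m, or 4800 m^2.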

In [None]:
# Extrapolate the carbon absorbed by the city (Vancouver, BC, Canada)
area = 115.18 * 1000 * 1000  # 115.18km^2

num_scenes = area / footprint_area

print(f'Number of scenes to cover city: {num_scenes:.0f}')


In [None]:
# Calculate the carbon absorbed by the city
carbon_absorbed_city = carbon_absorbed * num_scenes
# Reported annual emissions in tonnes CO2e (see the reference below)
city_emissions_tonnes = 28_048

print(f'Amount of carbon absorbed by the city: {carbon_absorbed_city:.0f} kg')
print(f'Trees absorb {carbon_absorbed_city / 1000:.0f} tonnes of CO2 per year')
print(f'City of Vancouver emits {city_emissions_tonnes:,} tonnes of CO2e per year')
print(
    f'Trees absorb {carbon_absorbed_city / 1000 / city_emissions_tonnes * 100:.2f}% of the CO2 emitted by the city'
)

The City of Vancouver emits 28,048 tonnes CO2e per year ([reference](https://metrovancouver.org/services/air-quality-climate-action/Documents/annual-corporate-energy-and-greenhouse-gas-emissions-management-report-2018-2022.pdf)). By this estimate, these trees absorb roughly 16% of Vancouver's greenhouse gas emissions!
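
The whole extrapolation chain fits in one small function. The sketch below uses round, invented inputs purely to show the unit conversions (kg to tonnes, ratio to percent); it does not reproduce the figures above, and `extrapolate_to_city` is a hypothetical name.

```python
def extrapolate_to_city(kg_co2_per_scene, footprint_m2, city_area_m2,
                        city_emissions_tonnes):
    # How many camera footprints tile the city
    num_scenes = city_area_m2 / footprint_m2
    # Total absorbed CO2, converted from kg to tonnes
    absorbed_tonnes = kg_co2_per_scene * num_scenes / 1000
    # Share of the city's emissions, as a percentage
    percent = absorbed_tonnes / city_emissions_tonnes * 100
    return num_scenes, absorbed_tonnes, percent


# Illustrative inputs: 300 kg per scene, 5000 m^2 footprint,
# a 1 km^2 city emitting 1000 tonnes per year
num_scenes_demo, absorbed_t, pct = extrapolate_to_city(300, 5000, 1_000_000, 1000)
print(f"{num_scenes_demo:.0f} scenes, {absorbed_t:.0f} t absorbed, {pct:.1f}% of emissions")
```

Because the result scales linearly with the per-scene estimate, any error in the tree count carries straight through to the city-wide figure.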

### Drawbacks
Here, let's talk about where this method fails:
- The segmentation depends heavily on tree colour, shadows, sensor calibration, time of day, lighting, weather, season, tree species, and tree size and shape.
  - Adaptive thresholding helps with variable lighting conditions.
  - Temporal analysis would help distinguish between different types of vegetation.
- It is difficult to separate trees from grass and shrubs.
  - Texture analysis could help separate these.
- The tree count depends heavily on the tree size estimate.
- The carbon capture figure is only an estimate.
- We are working with small-number statistics, so the extrapolation is uncertain.
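
As an illustration of the adaptive thresholding idea mentioned above, here is a sketch of local-mean thresholding, one simple form of it, in plain NumPy. Each pixel is compared with the mean of its neighbourhood instead of a fixed global value. The function name, window size, and offset are assumptions chosen for this toy example; OpenCV offers `cv.adaptiveThreshold` for 8-bit images.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def local_mean_threshold(values, window=5, offset=0.0):
    # Mean of the window x window neighbourhood around every pixel (window must be odd)
    pad = window // 2
    padded = np.pad(values, pad, mode="reflect")
    local_mean = sliding_window_view(padded, (window, window)).mean(axis=(-2, -1))
    # Keep pixels that stand out from their surroundings by more than the offset
    return values > local_mean + offset


# Synthetic index image: a left-to-right brightness gradient plus two bright spots
values = np.tile(np.arange(9) * 0.05, (9, 1))
values[4, 2] += 0.3
values[4, 6] += 0.3

global_mask = values >= 0.1
local_mask = local_mean_threshold(values, window=5, offset=0.1)
print(global_mask.sum(), local_mask.sum())
```

The fixed threshold flags most of the brighter side of the gradient, while the local version flags only the two spots that stand out from their surroundings.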