# Features: tree features

In this notebook, we transform the tree canopy data array, which holds the tree canopy map around each station, into a feature data frame. The feature data frame contains the tree canopy coverage within several buffer distances around each station.

In [None]:
import geopandas as gpd
import numpy as np
import pandas as pd
import xarray as xr

In [None]:
stations_gdf_filepath = "../data/interim/stations-gdf.gpkg"
tree_canopy_filepath = "../data/interim/tree-canopy.nc"
buffer_dists = [10, 30, 60, 90]

dst_canopy_res = 1
tree_threshold = 1
dst_filepath = "../data/interim/tree-features.csv"

In [None]:
stations_gdf = gpd.read_file(stations_gdf_filepath)
tree_canopy_da = xr.open_dataarray(tree_canopy_filepath)

To obtain the tree canopy coverage within each buffer distance around a station, we mask the tree canopy data array with a circular kernel whose radius equals the buffer distance (in pixels). We then count the pixels inside the kernel whose value exceeds `tree_threshold` and divide that count by the number of pixels in the kernel, which gives the fraction of the buffer covered by tree canopy.

In [None]:
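def get_kernel(kernel_pixel_radius, dtype="uint8"):
    """Get a circular kernel of ones with the given radius (in pixels)."""
    # the kernel has an even side length (2 * radius) so that it matches the square
    # window sliced around each station below
    kernel_pixel_len = 2 * kernel_pixel_radius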

    y, x = np.ogrid[
        -kernel_pixel_radius : kernel_pixel_len - kernel_pixel_radius,
        -kernel_pixel_radius : kernel_pixel_len - kernel_pixel_radius,
    ]
    mask = x * x + y * y <= kernel_pixel_radius * kernel_pixel_radius

    kernel = np.zeros((kernel_pixel_len, kernel_pixel_len), dtype=dtype)
    kernel[mask] = 1
    return kernel


# use the largest buffer distance regardless of the order of `buffer_dists`
largest_buffer_pixels = int(max(buffer_dists) / dst_canopy_res)
canopy_df = pd.DataFrame(index=stations_gdf.index)
for buffer_dist in buffer_dists:
    # convert the buffer distance to pixels so that the slice below stays correct when
    # the canopy resolution is not 1
    buffer_pixels = int(buffer_dist / dst_canopy_res)
    kernel = get_kernel(buffer_pixels)
    # since the station is located at the center of the array, we use the slice below to
    # select the square window around the station whose half-side is the buffer
    # distance (in pixels) so that we can then apply the kernel to it
    _slice = slice(
        largest_buffer_pixels - buffer_pixels, largest_buffer_pixels + buffer_pixels
    )
    canopy_df[buffer_dist] = (
        tree_canopy_da[:, _slice, _slice].where(kernel, other=0) > tree_threshold
    ).sum(dim=("i", "j")) / kernel.sum()
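
The kernel-based coverage computation above can be sanity-checked on synthetic data. The sketch below is only an illustration under stated assumptions: the toy array, the `station` dimension name and the helper `circular_kernel` are hypothetical and not part of the notebook's data. Only the `i` and `j` dimension names and the "greater than threshold" rule are taken from the cell above.

```python
import numpy as np
import xarray as xr


def circular_kernel(pixel_radius):
    """Boolean circular mask of shape (2 * pixel_radius, 2 * pixel_radius)."""
    y, x = np.ogrid[-pixel_radius:pixel_radius, -pixel_radius:pixel_radius]
    return x * x + y * y <= pixel_radius * pixel_radius


# hypothetical toy setup: 2 stations, largest buffer of 4 pixels
largest = 4
side = 2 * largest
data = np.zeros((2, side, side))
data[0] = 2  # station 0: canopy everywhere (values above the threshold)
data[1, :, largest:] = 2  # station 1: canopy on the right half only
toy_da = xr.DataArray(data, dims=("station", "i", "j"))

threshold = 1  # assumed to play the role of `tree_threshold`
coverage = {}
for radius in (2, 4):
    kernel = circular_kernel(radius)
    kernel_da = xr.DataArray(kernel, dims=("i", "j"))
    # square window centered on the station, with the same shape as the kernel
    window = slice(largest - radius, largest + radius)
    is_tree = toy_da[:, window, window] > threshold
    # count tree pixels inside the circle and normalize by the kernel area
    fraction = (is_tree & kernel_da).sum(dim=("i", "j")) / kernel.sum()
    coverage[radius] = fraction.values

print(coverage)
```

In this toy case, the fully covered station yields a coverage of 1 for every radius. The half-covered station does not yield exactly 0.5 because a kernel with an even side length is not perfectly symmetric about its center, which is worth keeping in mind when interpreting small buffers.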

In [None]:
# dump to file
canopy_df.to_csv(dst_filepath)
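
When the CSV file is read back in a later step, the buffer distances used as column labels come back as strings rather than integers. The following is a minimal sketch of a round trip, assuming a hypothetical feature table (the station identifiers, the `station_id` index name and the values are made up for illustration) and an in-memory buffer instead of `dst_filepath`.

```python
import io

import pandas as pd

# hypothetical feature table: two stations, two buffer distances as integer labels
features = pd.DataFrame(
    {10: [0.25, 0.5], 30: [0.4, 0.6]},
    index=pd.Index(["station-a", "station-b"], name="station_id"),
)

# write to an in-memory buffer (stands in for the CSV file on disk)
csv_buffer = io.StringIO()
features.to_csv(csv_buffer)
csv_buffer.seek(0)

# the first column holds the station index
loaded = pd.read_csv(csv_buffer, index_col=0)
raw_columns = list(loaded.columns)  # labels are strings at this point

# cast the labels to recover the integer buffer distances
loaded.columns = loaded.columns.astype(int)
print(raw_columns, list(loaded.columns))
```

Casting the labels right after loading keeps lookups such as `loaded[30]` consistent with how the columns were created in this notebook.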