# About

This notebook adds canopy height features to the points sampled in the notebook `2_sample_pts_from_polygons.ipynb`. It assumes the csv files with the sampled points are located in the 'temp' folder. The canopy height rasters for Santa Barbara County were obtained from the California Forest Observatory (CFO) using the `1_download_CFO_canopy_height_raster.ipynb` notebook and are located in the 'SantabarbaraCounty_lidar' folder.

`lidar_year` must be 2016, 2018, or 2020. Ideally, `aoi_year = lidar_year`, but due to data availability it is recommended to set `lidar_year = 2016` when `aoi_year` is 2012 or 2014.

In the process of adding the canopy height features, this notebook creates four additional temporary rasters in a given year from the CFO canopy height layer *H*. These layers are avg_lidar, max_lidar, min_lidar, and min_max_diff. For a given year, the avg_lidar layer is created by replacing the value of a pixel *p* in *H* by the average of the values of *H* in a 3x3 window centered at *p* (effectively a convolution of the raster *H* with a 3x3 matrix with constant weights 1/9). The max_lidar is created by replacing the value of a pixel *p* in *H* with the maximum value of *H* in a 3x3 window centered at *p*. The min_lidar layer is created similarly, now taking the minimum value over the window. Finally, the min_max_diff layer is the difference between the max_lidar and the min_lidar layers. All the functions to create these raster layers and sample information from them are in `lidar_sampling_functions`. 
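The window operations described above can be sketched with plain NumPy. This is only an illustration and not the code in `lidar_sampling_functions`: the function name `window_stats_3x3` is hypothetical, border pixels are assumed to be handled by repeating the edge values, and nodata values are not treated specially.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def window_stats_3x3(H):
    """Return (avg, max, min, max - min) of H over a 3x3 window centered at each pixel."""
    # Assumption: border pixels are handled by repeating the edge values
    padded = np.pad(np.asarray(H, dtype=float), 1, mode="edge")
    # windows[i, j] is the 3x3 neighborhood centered at pixel (i, j) of H
    windows = sliding_window_view(padded, (3, 3))
    avg = windows.mean(axis=(2, 3))   # same as convolving with constant weights 1/9
    mx = windows.max(axis=(2, 3))
    mn = windows.min(axis=(2, 3))
    return avg, mx, mn, mx - mn

# Toy canopy height layer
H = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
avg_lidar, max_lidar, min_lidar, min_max_diff = window_stats_3x3(H)
```

For the center pixel of this toy layer the window covers the whole array, so the average is 5, the maximum 9, the minimum 1, and the difference 8.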



**NOTEBOOK VARIABLES:**

- `years` (int array): years of the points which will have lidar features added. Must be a subset of [2012, 2014, 2016, 2018, 2020]

- `aois` (array): the areas of interest of the points which will have lidar features added. Must be a subset of `['campus_lagoon','carpinteria','gaviota', 'point_conception']`.

- `delete_pts` (bool): whether to delete the input files with the original points or not.

To add canopy height features to all points from all aois and all years, you need to set `years = [2012, 2014, 2016, 2018, 2020]` and `aois = ['campus_lagoon','carpinteria','gaviota','point_conception']`.

Note: no points were sampled from point_conception in 2016. The notebook automatically skips this combination.
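As a rough sketch of how `years` and `aois` combine, the snippet below lists every (aoi, year) pair and leaves out the one with no sampled points. The names `missing` and `valid_combinations` are hypothetical and do not appear in the notebook, which applies the same exclusion inline.

```python
from itertools import product

years = [2012, 2014, 2016, 2018, 2020]
aois = ['campus_lagoon', 'carpinteria', 'gaviota', 'point_conception']

# The only combination with no sampled points
missing = {('point_conception', 2016)}

def valid_combinations(aois, years):
    """Return all (aoi, year) pairs except those known to have no points."""
    return [(aoi, year) for aoi, year in product(aois, years)
            if (aoi, year) not in missing]

combos = valid_combinations(aois, years)
```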


**OUTPUT:**
For each csv of points from the specified years and aois, the notebook creates a dataframe with the original features from the initial points dataset (see notebook `2_sample_pts_from_polygons`) augmented with the columns lidar (canopy height), avg_lidar, max_lidar, min_lidar, and min_max_diff. For the years 2016, 2018, and 2020, the values in these columns are obtained by sampling the canopy height rasters at the points.
Due to data availability, for the years 2012 and 2014 the new canopy height columns are all populated with -1.
Each dataframe is saved as a csv file in the 'temp' folder (one csv per aoi and year combination).
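The file naming convention used for the input and output csv files can be summarized in a small helper. This is a sketch only: `points_paths` is a hypothetical function, not part of the notebook or of `lidar_sampling_functions`, and it assumes the 'temp' folder sits in the current working directory, as the cells below do.

```python
import os

def points_paths(aoi, year, folder="temp"):
    """Return (input_csv, output_csv) paths for one aoi and year."""
    base = os.path.join(os.getcwd(), folder)
    # Input: points sampled by the previous notebook
    in_fp = os.path.join(base, f"{aoi}_points_{year}.csv")
    # Output: same points with the canopy height columns added
    out_fp = os.path.join(base, f"{aoi}_pts_spectral_lidar_{year}.csv")
    return in_fp, out_fp

in_fp, out_fp = points_paths("gaviota", 2018)
```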

In [1]:
import os
import pandas as pd
import geopandas as gpd
import rioxarray as riox

import rasterio

import pystac_client
import planetary_computer as pc

import lidar_sampling_functions as lsf

# Specify notebook variables

In [18]:
# ***************************************************
# ************* NOTEBOOK VARIABLES ******************

years = [2012, 2014, 2016, 2018, 2020]

aois = ['campus_lagoon','carpinteria','gaviota','point_conception']

delete_pts = False
    
# ***************************************************
# ***************************************************

In [5]:
# years requested by the user for which no canopy height raster is available
no_lidar_years = sorted(set(years) & {2012, 2014})

for year in no_lidar_years:
    for aoi in aois:
        pts_fp = os.path.join(os.getcwd(),
                              'temp',
                              aoi+'_points_'+str(year)+'.csv')
        pts = pd.read_csv(pts_fp)
        # match columns with ones that will result from lidar sampling    
        pts = pts.drop(['y', 'x', 'Unnamed: 0'], axis=1)   
        
        # use -1 as a placeholder for all canopy height features
        pts['lidar'] = -1
        pts['max_lidar']= -1
        pts['min_lidar'] = -1
        pts['min_max_diff'] = -1
        pts['avg_lidar'] = -1
        
        ## Save points with added null LIDAR data
        ptslidar_fp = os.path.join(os.getcwd(), 
                                   'temp', 
                                   aoi +'_pts_spectral_lidar_'+str(year)+'.csv')
        pts.to_csv(ptslidar_fp, index=False)

        ## Delete original csv files (points without LIDAR)
        if delete_pts:
            os.remove(pts_fp)
    

# Add canopy height data to the points from all aois for each year with available lidar data

In [9]:
# years requested by the user for which a canopy height raster is available
lidar_years = sorted(set(years) & {2016, 2018, 2020})

for year in lidar_years:
    # ------------------------------
    # Open canopy height raster and create auxiliary min, max, and avg rasters
    lidar_rast_r = rasterio.open(lsf.path_to_lidar(year))

    lsf.save_min_max_rasters(rast_reader = lidar_rast_r,
                             folder_path = os.path.join(os.getcwd(), 'temp'),
                             year = year)

    lsf.save_avg_rasters(rast_reader = lidar_rast_r,
                         folder_path = os.path.join(os.getcwd(), 'temp'),
                         year = year)
    
    # file paths to auxiliary LIDAR rasters
    # TODO: maybe the previous functions should return these file paths
    lidar_fps = [os.path.join(os.getcwd(), 'temp', 'lidar_' + tag + str(year) + '.tif')
                 for tag in ['maxs_', 'mins_', 'avgs_']]
    
    # ------------------------------
    # Add lidar data for each aoi
    for aoi in aois:
        if ('point_conception' != aoi) or (year != 2016):  # there are no points for point_conception in 2016
            pts_fp = os.path.join(os.getcwd(),
                                  'temp',
                                  aoi+'_points_'+str(year)+'.csv')

            ## Obtain CRS from itemid and create pts for sampling
            itemid = pd.read_csv(pts_fp).naip_id[0]
            pts = lsf.geodataframe_from_csv(pts_fp, lsf.crs_from_itemid(itemid))
            pts_xy = lsf.pts_for_lidar_sampling(pts, lidar_rast_r.crs)

            ## Sample canopy_height at point, and max, min and avg canopy height around point
            lidar_samples = lsf.sample_raster(pts_xy, lidar_rast_r)

            maxs_rast_r = rasterio.open(lidar_fps[0])
            max_samples = lsf.sample_raster(pts_xy, maxs_rast_r)

            mins_rast_r = rasterio.open(lidar_fps[1])
            min_samples = lsf.sample_raster(pts_xy, mins_rast_r)

            avg_rast_r = rasterio.open(lidar_fps[2])
            avg_samples = lsf.sample_raster(pts_xy, avg_rast_r)

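            ## Add all LIDAR data to pts dataframe
            pts['lidar'] = lidar_samples
            pts['max_lidar'] = max_samples
            pts['min_lidar'] = min_samples
            pts['min_max_diff'] = pts.max_lidar - pts.min_lidar  # include difference
            pts['avg_lidar'] = avg_samples

            ## Close the auxiliary raster readers so their files can be deleted later
            for reader in (maxs_rast_r, mins_rast_r, avg_rast_r):
                reader.close()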

            ## Save points with added LIDAR data
            ptslidar_fp = os.path.join(os.getcwd(), 
                                       'temp', 
                                       aoi +'_pts_spectral_lidar_'+str(year)+'.csv')
            pts.to_csv(ptslidar_fp, index=False)

            ## Delete original csv files (points without LIDAR)
            if delete_pts:
                os.remove(pts_fp)

    # ------------------------------
    # Close the canopy height raster and delete the auxiliary LIDAR rasters created for this year
    lidar_rast_r.close()
    for fp in lidar_fps:
        os.remove(fp)

