# Example of Sparse Vegetation Detection Algorithm (SVDA)

This Jupyter Notebook is a tutorial that walks through the typical SVDA processing steps. It calls several functions from the Python file `SVDA_functions.py`, which is imported at the beginning.

For this tutorial, we use an existing ATL03 file. We do not provide the original file because it is very large (ATL03 files are usually several GB in size); you can download it from [https://nsidc.org/data/ATL03](https://nsidc.org/data/ATL03) (see below). The output of the processing steps is included in the example directory.

Initial setup and import of required modules. Make sure to adjust the path to where you stored the GitHub repository.

In [None]:
import os, h5py, glob, sys, warnings, tqdm
import pandas as pd
import numpy as np
import geopandas as gp
from pyproj import Transformer
from pyproj import proj

sys.path.append('../python')

#from SVDA_helper_functions import *
from SVDA_functions import *

In [None]:
# Data extraction from ATL03 product (HDF5 formatted files)
# This example uses only a single file; the code can also process a directory full of files

#The ATL03_20200320133708_12950614_003_01.h5 file is 4.7GB and is available at https://nsidc.org/data/ATL03
ATL03_input_path = '../ATL03_20200320133708_12950614_003_01.h5'

# Output
ATL03_output_path = '../ATL03_example_data/hdf'

# Region of interest to be clipped from the ATL03 file:
ROI_fname = '../ATL08_example_data/ROI_westernNamibia.shp'
EPSG_Namibia_Code = 'epsg:32733'

# 1. Signal photons extraction from ATL03 data product

First step in SVDA processing: extract the relevant geographic data from the large ATL03 file. The function `ATL03_signal_photons(fname, ATL03_output_path, ROI_fname, EPSG_Code)` reads an ATL03 HDF5 file and extracts the following attributes for each beam (gt1l, gt1r, gt2l, gt2r, gt3l, gt3r):
```
heights/lat_ph
heights/lon_ph
heights/h_ph
heights/dist_ph_along
heights/signal_conf_ph
```
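
For orientation: in ATL03, `heights/signal_conf_ph` holds one confidence value per photon and surface type (five columns, in the order land, ocean, sea ice, land ice, inland water), with values from -2 to 4. The sketch below uses a small synthetic array to show how land photons of at least a chosen confidence could be selected. The threshold `min_conf` and the function name `select_land_signal` are illustrative assumptions, not the values or names used in `SVDA_functions.py`.

```python
import numpy as np

# Column index of the land surface type in signal_conf_ph
# (column order: land, ocean, sea ice, land ice, inland water).
LAND_COLUMN = 0

def select_land_signal(signal_conf_ph, min_conf=2):
    """Return a boolean mask of photons whose land confidence is >= min_conf.

    min_conf is a hypothetical threshold chosen for illustration only.
    """
    signal_conf_ph = np.asarray(signal_conf_ph)
    return signal_conf_ph[:, LAND_COLUMN] >= min_conf

# Synthetic example: 4 photons x 5 surface types
conf = np.array([
    [4, -1, -1, -1, -1],   # high-confidence land photon
    [0, -1, -1, -1, -1],   # noise over land
    [-1, 3, -1, -1, -1],   # ocean photon, not evaluated for land
    [2, -1, -1, -1, -1],   # low-confidence land photon
])
mask = select_land_signal(conf)
print(mask)
```

With these synthetic values, only the first and the last photon pass the land filter.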

The function extracts along-track distances, converts latitude and longitude to local UTM coordinates (see the EPSG code above), keeps the land photons that fall within the region of interest given by `ROI_fname` (usually a shapefile in EPSG:4326 coordinates), and writes them to a compressed HDF file in `ATL03_output_path`. The output file name starts with 'ATL03_Land_', followed by the date and time of the beam.
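
The EPSG code passed to the function identifies the UTM zone used for the local coordinates. As a sanity check, the code of a WGS 84 / UTM zone can be derived from a longitude and latitude. The sketch below reproduces 'epsg:32733' for a point in western Namibia; the function name `utm_epsg_code` and the sample point are illustrative assumptions, and the special zones around Norway and Svalbard are ignored.

```python
import math

def utm_epsg_code(lon, lat):
    """Return the EPSG string of the WGS 84 / UTM zone containing (lon, lat).

    Hypothetical helper for illustration; ignores the irregular zones
    around Norway and Svalbard.
    """
    zone = int(math.floor((lon + 180.0) / 6.0)) + 1
    zone = min(zone, 60)                    # lon == 180 would otherwise give zone 61
    base = 32600 if lat >= 0 else 32700     # 326xx = northern, 327xx = southern hemisphere
    return f"epsg:{base + zone}"

# Illustrative point in western Namibia (15 deg E, 22 deg S)
print(utm_epsg_code(15.0, -22.0))
```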

In [None]:
ATL03_files = list(glob.glob(ATL03_input_path))
ATL03_files.sort()
for fname in ATL03_files:  
    ATL03_signal_photons(fname, ATL03_output_path, ROI_fname, EPSG_Namibia_Code, reprocess=False)

## Plot of Land Data with plotly (zoom in to area of interest)

Here we use plotly express to create an interactive 2D map of the photons (see the examples near the end of the notebook for interactive 3D maps).

*Note: Because there are usually many photons, we restrict the 2D plot to show only every 100th photon point.*
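
Taking every 100th row keeps the photons evenly spaced in file order, which is usually what you want for a quick look along the track. A random sample is an alternative if regular striding could hide periodic patterns. The sketch below compares both on a synthetic table; the data and the 1 % fraction are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a photon table with 1000 rows
df = pd.DataFrame({"Photon_Height": np.arange(1000, dtype=float)})

# Regular thinning: rows 0, 100, 200, ...
every_100th = df.iloc[::100, :]

# Random thinning: 1 % of the rows, restored to the original order
random_1pct = df.sample(frac=0.01, random_state=0).sort_index()

print(len(every_100th), len(random_1pct))
```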

First, load the data and subset:

In [None]:
ATL03_land_files = glob.glob(os.path.join(ATL03_output_path, 'ATL03_Land_*.hdf'))
ATL03_land_files.sort()
ATL03_df = pd.read_hdf(ATL03_land_files[0], mode='r')
ATL03_df = ATL03_df.iloc[::100, :]

In [None]:
ATL03_df.head()

In [None]:
import plotly.express as px
fig = px.scatter_mapbox(ATL03_df, 
                        lat='Latitude', 
                        lon='Longitude',
                        color='Photon_Height', zoom=9)
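# Note: Stamen tiles are now served by Stadia Maps and may require an API key;
# "open-street-map" is a token-free alternative if the basemap stays blank.
fig.update_layout(mapbox_style="stamen-terrain")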
fig.update_layout(margin={"r":0,"t":0,"l":0,"b":0})
fig.show()

# 2. Ground and preliminary canopy photons classification

The function `ATL03_ground_preliminary_canopy_photons` takes the output `ATL03_Land_*.hdf` from `ATL03_signal_photons` (created in Step 1) with along-track information and performs an initial ground and preliminary canopy photon classification. It stores the results in two new HDF files, `ATL03_Ground_*.hdf` and `ATL03_PreCanopy_*.hdf`.

In [None]:
ATL03_land_files = glob.glob(os.path.join(ATL03_output_path, 'ATL03_Land_*.hdf'))
ATL03_land_files.sort()
for fname in ATL03_land_files:
    ATL03_ground_preliminary_canopy_photons(fname, ATL03_output_path, reprocess=False)

Load the new data into pandas DataFrames:

In [None]:
ATL03_PreCanopy_files = glob.glob(os.path.join(ATL03_output_path, 'ATL03_PreCanopy_*.hdf'))
ATL03_PreCanopy_files.sort()
df_PreCanopy = pd.read_hdf(ATL03_PreCanopy_files[0], mode='r')

ATL03_Ground_files = glob.glob(os.path.join(ATL03_output_path, 'ATL03_Ground_*.hdf'))
ATL03_Ground_files.sort()
df_Ground = pd.read_hdf(ATL03_Ground_files[0], mode='r')

The ground-classified data is stored in:

In [None]:
df_Ground.head()

And the preliminary canopy is stored in:

In [None]:
df_PreCanopy.head()

## Plotly Map of Ground Photons and preliminary canopy as 2D Map

In [None]:
import plotly.express as px
fig = px.scatter_mapbox(df_Ground, 
                        lat='Latitude', 
                        lon='Longitude',
                        color='Ground_Height', zoom=9)
fig.update_layout(mapbox_style="stamen-terrain")
fig.update_layout(margin={"r":0,"t":0,"l":0,"b":0})
fig.show()

We scale the colorscale to the 5th and 95th percentile of the column *PreCanopy_Height*:

In [None]:
import plotly.express as px
fig = px.scatter_mapbox(df_PreCanopy, 
                        lat='Latitude', 
                        lon='Longitude',
                        color='PreCanopy_Height', color_continuous_scale='viridis',
                        range_color=list(np.percentile(df_PreCanopy['PreCanopy_Height'], (5, 95))),
                        zoom=9)
fig.update_layout(mapbox_style="stamen-terrain")
fig.update_layout(margin={"r":0,"t":0,"l":0,"b":0})
fig.show()
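
Clipping the colour range to the 5th and 95th percentiles keeps a few extreme heights from compressing the rest of the colour scale. The sketch below illustrates the effect on synthetic heights with a single outlier; the numbers are made up for illustration.

```python
import numpy as np

# 99 plausible heights between 0 and 2 m plus one extreme outlier at 50 m
heights = np.concatenate([np.linspace(0.0, 2.0, 99), [50.0]])

# Full data range versus percentile-based range
full_range = (heights.min(), heights.max())
lo, hi = np.percentile(heights, (5, 95))

print("full range:      ", full_range)
print("5th-95th range:  ", (lo, hi))
```

The percentile-based upper limit stays below 2 m, whereas the full range is stretched to 50 m by the single outlier.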

## Plotly Map of preliminary canopy and ground Photons as 3D Map
This plot also shows all land-classified photons (thinned to every 100th photon).
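
In the 3D plots, preliminary canopy photons are drawn at `Ground_interp_Height + PreCanopy_Height`, that is, the interpolated ground elevation plus the height above ground. The sketch below shows that relationship on synthetic numbers. It assumes linear interpolation of the ground along track; the interpolation actually used in `SVDA_functions.py` may differ, and all variable names here are hypothetical.

```python
import numpy as np

# Hypothetical along-track positions (m) and elevations (m) of ground photons
ground_x = np.array([0.0, 10.0, 20.0])
ground_h = np.array([100.0, 102.0, 101.0])

# Hypothetical photons above the ground
photon_x = np.array([5.0, 15.0])
photon_h = np.array([103.0, 102.5])

# Ground elevation underneath each photon (assumed linear interpolation)
ground_interp = np.interp(photon_x, ground_x, ground_h)

# Height above ground, analogous to a canopy-height column
height_above_ground = photon_h - ground_interp

# Adding both back together recovers the absolute elevation used as z in the plot
absolute_elevation = ground_interp + height_above_ground
print(ground_interp, height_above_ground)
```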

In [None]:
import plotly.graph_objects as go
ATL03_land_files = glob.glob(os.path.join(ATL03_output_path, 'ATL03_Land_*.hdf'))
ATL03_land_files.sort()

df = pd.read_hdf(ATL03_land_files[0], mode='r')
df_Land = df.iloc[::100, :]

ATL03_PreCanopy_files = glob.glob(os.path.join(ATL03_output_path, 'ATL03_PreCanopy_*.hdf'))
ATL03_PreCanopy_files.sort()
df_PreCanopy = pd.read_hdf(ATL03_PreCanopy_files[0], mode='r')

ATL03_Ground_files = glob.glob(os.path.join(ATL03_output_path, 'ATL03_Ground_*.hdf'))
ATL03_Ground_files.sort()
df_Ground = pd.read_hdf(ATL03_Ground_files[0], mode='r')

fig = go.Figure()
Land_data = go.Scatter3d(name='All Land data',
    x=df_Land['Easting'], y=df_Land['Northing'], z=df_Land['Photon_Height'],
    mode='markers',
    marker=dict(
        size=1,
        color='black',
        opacity=0.8
    )
)

Ground_data = go.Scatter3d(name='Classified Ground data',
    x=df_Ground['Easting'], y=df_Ground['Northing'], z=df_Ground['Ground_Height'],
    mode='markers',
    marker=dict(
        size=5,
        color='red',
        opacity=0.8
    )
)

PreCanopy_data = go.Scatter3d(name='Preliminary Canopy data',
    x=df_PreCanopy['Easting'], y=df_PreCanopy['Northing'], 
                              z=df_PreCanopy['Ground_interp_Height']+df_PreCanopy['PreCanopy_Height'],
    mode='markers',
    marker=dict(
        size=3,
        color=df_PreCanopy['PreCanopy_Height'],
        colorscale='Viridis',
        opacity=0.8
    )
)

Groundi_data = go.Scatter3d(name='Interpolated ground surface',
    x=df_PreCanopy['Easting'], y=df_PreCanopy['Northing'], z=df_PreCanopy['Ground_interp_Height'],
    mode='lines',
    line=dict(
        color='red',
        width=2
    )
)

fig.add_trace(Land_data)
fig.add_trace(Ground_data)
fig.add_trace(PreCanopy_data)
fig.add_trace(Groundi_data)

# tight layout
fig.update_layout(margin=dict(l=0, r=0, b=0, t=0), title='Land, Ground, and Preliminary Canopy')
fig.show()

# 3. Canopy and Top-of-Canopy photons classification

The function `ATL03_canopy_and_top_of_canopy_photons` takes the output from Step 2 (`ATL03_ground_preliminary_canopy_photons`) stored in `ATL03_PreCanopy_*.hdf` and performs a refined canopy and Top-of-Canopy (TOC) classification. The output is stored in `ATL03_TOC_*.hdf`.

In [None]:
ATL03_precanopy_files = glob.glob(os.path.join(ATL03_output_path, 'ATL03_PreCanopy_*.hdf'))
ATL03_precanopy_files.sort()
for fname in ATL03_precanopy_files:
    ATL03_canopy_and_top_of_canopy_photons(fname, ATL03_output_path, reprocess=False)

Load the newly generated file and display the first few rows. The classified top-of-canopy heights used in the plots below are in the column *TOC_Height*:

In [None]:
ATL03_TOC_files = glob.glob(os.path.join(ATL03_output_path, 'ATL03_TOC_*.hdf'))
ATL03_TOC_files.sort()
df_TOC = pd.read_hdf(ATL03_TOC_files[0], mode='r')
df_TOC.head()

## Plotly Map of TOC, preliminary canopy, and ground photons as 3D Map
This plot also shows all land-classified photons (thinned to every 100th photon).
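
The pattern of globbing, sorting, and reading the first file recurs throughout this notebook, and indexing with `[0]` raises an unhelpful `IndexError` if a previous step produced no output. A small helper could make this more robust; `first_match` is a hypothetical function and not part of `SVDA_functions.py`.

```python
import glob
import os

def first_match(directory, pattern):
    """Return the first file (in sorted order) in `directory` matching `pattern`.

    Hypothetical helper for illustration. Raises FileNotFoundError with a
    readable message instead of an IndexError when nothing matches.
    """
    files = sorted(glob.glob(os.path.join(directory, pattern)))
    if not files:
        raise FileNotFoundError(f"No files matching {pattern!r} in {directory!r}")
    return files[0]
```

It could then be used as `pd.read_hdf(first_match(ATL03_output_path, 'ATL03_TOC_*.hdf'), mode='r')`.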

First, load the relevant pandas DataFrames:

In [None]:
ATL03_PreCanopy_files = glob.glob(os.path.join(ATL03_output_path, 'ATL03_PreCanopy_*.hdf'))
ATL03_PreCanopy_files.sort()
df_PreCanopy = pd.read_hdf(ATL03_PreCanopy_files[0], mode='r')

ATL03_Ground_files = glob.glob(os.path.join(ATL03_output_path, 'ATL03_Ground_*.hdf'))
ATL03_Ground_files.sort()
df_Ground = pd.read_hdf(ATL03_Ground_files[0], mode='r')

ATL03_TOC_files = glob.glob(os.path.join(ATL03_output_path, 'ATL03_TOC_*.hdf'))
ATL03_TOC_files.sort()
df_TOC = pd.read_hdf(ATL03_TOC_files[0], mode='r')

In [None]:
import plotly.graph_objects as go

fig = go.Figure()
Land_data = go.Scatter3d(name='All Land data',
    x=df_Land['Easting'], y=df_Land['Northing'], z=df_Land['Photon_Height'],
    mode='markers',
    marker=dict(
        size=1,
        color='black',
        opacity=0.8
    )
)

Ground_data = go.Scatter3d(name='Classified Ground data',
    x=df_Ground['Easting'], y=df_Ground['Northing'], z=df_Ground['Ground_Height'],
    mode='markers',
    marker=dict(
        size=3,
        color='red',
        opacity=0.8
    )
)

PreCanopy_data = go.Scatter3d(name='Preliminary Canopy data',
    x=df_PreCanopy['Easting'], y=df_PreCanopy['Northing'], 
                              z=df_PreCanopy['PreCanopy_Height']+df_PreCanopy['Ground_interp_Height'],
    mode='markers',
    marker=dict(
        size=3,
        color='gray',
        opacity=0.8
    )
)

TOC_data = go.Scatter3d(name='Canopy and Top of the Canopy (TOC) data',
    x=df_TOC['Easting'], y=df_TOC['Northing'], 
                        z=df_TOC['TOC_Height']+df_TOC['Ground_interp_Height'],
    mode='markers',
    marker=dict(
        size=5,
        color=df_TOC['TOC_Height'],
        colorscale='Viridis',
        opacity=0.8
    )
)

Groundi_data = go.Scatter3d(name='Interpolated ground surface',
    x=df_PreCanopy['Easting'], y=df_PreCanopy['Northing'], z=df_PreCanopy['Ground_interp_Height'],
    mode='lines',
    line=dict(
        color='red',
        width=2
    )
)


fig.add_trace(Land_data)
fig.add_trace(Ground_data)
fig.add_trace(PreCanopy_data)
fig.add_trace(TOC_data)
fig.add_trace(Groundi_data)

# tight layout
fig.update_layout(margin=dict(l=0, r=0, b=0, t=0), title='Land, Ground, Preliminary Canopy, and TOC')
fig.show()


# 4. Grass photons classification

The function `ATL03_GrassHeight_photons` takes the output from Step 2 (`ATL03_ground_preliminary_canopy_photons`) stored in `ATL03_PreCanopy_*.hdf` and performs the grass height calculations. The output is stored in `ATL03_GrassHeight_*.hdf`.

In [None]:
ATL03_precanopy_files = glob.glob(os.path.join(ATL03_output_path, 'ATL03_PreCanopy_*.hdf'))
ATL03_precanopy_files.sort()
for fname in ATL03_precanopy_files:
    ATL03_GrassHeight_photons(fname, ATL03_output_path, reprocess=False)

Load the generated data. The grass heights used in the plot below are in the column *Grass_Height*.

In [None]:
ATL03_GrassHeight_files = glob.glob(os.path.join(ATL03_output_path, 'ATL03_GrassHeight_*.hdf'))
ATL03_GrassHeight_files.sort()
df_GrassHeight = pd.read_hdf(ATL03_GrassHeight_files[0], mode='r')
df_GrassHeight.head()

## Using plotly to plot an interactive map of the canopy height and grass height data

In [None]:
import plotly.graph_objects as go
ATL03_PreCanopy_files = glob.glob(os.path.join(ATL03_output_path, 'ATL03_PreCanopy_*.hdf'))
ATL03_PreCanopy_files.sort()
df_PreCanopy = pd.read_hdf(ATL03_PreCanopy_files[0], mode='r')

ATL03_Ground_files = glob.glob(os.path.join(ATL03_output_path, 'ATL03_Ground_*.hdf'))
ATL03_Ground_files.sort()
df_Ground = pd.read_hdf(ATL03_Ground_files[0], mode='r')

ATL03_TOC_files = glob.glob(os.path.join(ATL03_output_path, 'ATL03_TOC_*.hdf'))
ATL03_TOC_files.sort()
df_TOC = pd.read_hdf(ATL03_TOC_files[0], mode='r')

ATL03_GrassHeight_files = glob.glob(os.path.join(ATL03_output_path, 'ATL03_GrassHeight_*.hdf'))
ATL03_GrassHeight_files.sort()
df_GrassHeight = pd.read_hdf(ATL03_GrassHeight_files[0], mode='r')

fig = go.Figure()
Land_data = go.Scatter3d(name='All Land data',
    x=df_Land['Easting'], y=df_Land['Northing'], z=df_Land['Photon_Height'],
    mode='markers',
    marker=dict(
        size=1,
        color='black',
        opacity=0.8
    )
)

Ground_data = go.Scatter3d(name='Classified Ground data',
    x=df_Ground['Easting'], y=df_Ground['Northing'], z=df_Ground['Ground_Height'],
    mode='markers',
    marker=dict(
        size=3,
        color='red',
        opacity=0.8
    )
)

PreCanopy_data = go.Scatter3d(name='Preliminary Canopy data',
    x=df_PreCanopy['Easting'], y=df_PreCanopy['Northing'], 
                              z=df_PreCanopy['PreCanopy_Height']+df_PreCanopy['Ground_interp_Height'],
    mode='markers',
    marker=dict(
        size=3,
        color='gray',
        opacity=0.8
    )
)

Groundi_data = go.Scatter3d(name='Interpolated ground surface',
    x=df_PreCanopy['Easting'], y=df_PreCanopy['Northing'], z=df_PreCanopy['Ground_interp_Height'],
    mode='lines',
    line=dict(
        color='red',
        width=2
    )
)

GrassHeight_data = go.Scatter3d(name='Grass Height',
    x=df_GrassHeight['Easting'], y=df_GrassHeight['Northing'], 
                                z=df_GrassHeight['Ground_interp_Height']+df_GrassHeight['Grass_Height'],
    mode='markers',
    marker=dict(
        size=5,
        color=df_GrassHeight['Grass_Height'],
        colorscale='Viridis',
        opacity=0.8
    )
)

fig.add_trace(Land_data)
fig.add_trace(Ground_data)
fig.add_trace(PreCanopy_data)
fig.add_trace(GrassHeight_data)
fig.add_trace(Groundi_data)

# tight layout
fig.update_layout(margin=dict(l=0, r=0, b=0, t=0), title='Land, Ground, and Grass Height')
fig.show()
