# Visualization for LEVEL ONE Data
This notebook visualizes GEDI Level 1B data with the GeoViews library. We first load the data from the GEDI files and plot the shot locations on a map.
Then we reconstruct individual waveforms for a few selected shots.

In [2]:
import os
import h5py
import numpy as np
import pandas as pd
import geopandas as gp
import geoviews as gv
from geoviews import opts, tile_sources as gvts
import holoviews as hv
gv.extension('bokeh', 'matplotlib')
from shapely.geometry import Point
import warnings
from shapely.errors import ShapelyDeprecationWarning
warnings.filterwarnings("ignore", category=ShapelyDeprecationWarning) 

In [3]:
inDir = os.path.join(os.getcwd(), "GEDI_sample_files")  # Build the path portably instead of hard-coding a Windows separator
print(inDir)

C:\Users\jingb\OneDrive\Documents\GEDI Outlier Detection\GEDI_sample_files


In [4]:
gedi_lv1 = [g for g in os.listdir(inDir) if g.startswith('GEDI01') and g.endswith('.h5')]  # List all GEDI level 1 files in inDir
gedi_lv1

['GEDI01_B_2022004042652_O17343_04_T10772_02_005_02_V002.h5',
 'GEDI01_B_2022207041426_O20491_04_T09293_02_005_03_V002.h5']

In [5]:
gedi_lv2 = [g for g in os.listdir(inDir) if g.startswith('GEDI02') and g.endswith('.h5')]  # List all GEDI level 2 files in inDir
gedi_lv2

['GEDI02_A_2021050140102_O12405_02_T10912_02_003_02_V002.h5',
 'GEDI02_A_2021086153349_O12964_03_T08275_02_003_02_V002.h5']

### Overview of a LEVEL ONE file's data structure

In [6]:
eg_file_path = os.path.join(inDir, gedi_lv1[1])  # Select an example file
eg_file = h5py.File(eg_file_path, 'r')
print('Loading file: ' + eg_file_path)
print('The file contains the following groups: ' + str(list(eg_file.keys())))

print("The file's metadata contains the following attributes: ")
for g in eg_file['METADATA']['DatasetIdentification'].attrs: print(g) 

print(eg_file['METADATA']['DatasetIdentification'].attrs['purpose'])
print(eg_file['METADATA']['DatasetIdentification'].attrs['abstract'])

beamNames = [g for g in eg_file.keys() if g.startswith('BEAM')]

print("The file contains the following beams: ")
for b in beamNames: 
    print(f"{b} is a {eg_file[b].attrs['description']}")



Loading file: C:\Users\jingb\OneDrive\Documents\GEDI Outlier Detection\GEDI_sample_files\GEDI01_B_2022207041426_O20491_04_T09293_02_005_03_V002.h5
The file contains the following groups: ['BEAM0000', 'BEAM0001', 'BEAM0010', 'BEAM0011', 'BEAM0101', 'BEAM0110', 'BEAM1000', 'BEAM1011', 'METADATA']
The file's metadata contains the following attributes: 
PGEVersion
VersionID
abstract
characterSet
creationDate
credit
fileName
language
originatorOrganizationName
purpose
shortName
spatialRepresentationType
status
topicCategory
uuid
The purpose of the L1B dataset is to provide geolocated waveforms and supporting datasets for each laser shot for all eight GEDI beams.  This includes corrected and smoothed waveforms, geolocation parameters, and geophysical corrections.
The GEDI L1A and L1B standard data product contains precise latitude, longitude, and height above the reference ellipsoid of the laser shot received bounce point for all laser shots.  Geophysical corrections are applied to the surfa

In [7]:
eg_file_objs = []
eg_file.visit(eg_file_objs.append)                                           # Retrieve list of datasets
gediSDS = [o for o in eg_file_objs if isinstance(eg_file[o], h5py.Dataset)]  # Search for relevant SDS inside data file
[i for i in gediSDS if beamNames[0] in i][0:10]                              # Print the first 10 datasets for selected beam

['BEAM0000/all_samples_sum',
 'BEAM0000/ancillary/master_time_epoch',
 'BEAM0000/ancillary/mean_samples',
 'BEAM0000/ancillary/smoothing_width',
 'BEAM0000/beam',
 'BEAM0000/channel',
 'BEAM0000/geolocation/altitude_instrument',
 'BEAM0000/geolocation/altitude_instrument_error',
 'BEAM0000/geolocation/bounce_time_offset_bin0',
 'BEAM0000/geolocation/bounce_time_offset_bin0_error']

### Loading all beam data into a Pandas dataframe

In [8]:
lonSample, latSample, shotSample, srfSample, degradeSample, beamSample = [], [], [], [], [], []  # Set up lists to store data

for b in beamNames:
    # Open the SDS
    lats = eg_file[f'{b}/geolocation/latitude_bin0'][()]
    lons = eg_file[f'{b}/geolocation/longitude_bin0'][()]
    shots = eg_file[f'{b}/shot_number'][()]
    srf = eg_file[f'{b}/stale_return_flag'][()]
    degrade = eg_file[f'{b}/geolocation/degrade'][()]
    
    # Take every 100th shot and append to list
    step = 100
    shotSample.extend(str(s) for s in shots[::step])
    lonSample.extend(lons[::step])
    latSample.extend(lats[::step])
    srfSample.extend(srf[::step])
    degradeSample.extend(degrade[::step])
    beamSample.extend([b] * len(shots[::step]))
                
# Write all of the sample shots to a dataframe
latslons = pd.DataFrame({'Beam': beamSample, 'Shot Number': shotSample, 'Longitude': lonSample, 'Latitude': latSample,
                         'Stale Return Flag': srfSample, 'Degrade': degradeSample})

print("Output shape is: " + str(latslons.shape))
latslons

Output shape is: (4727, 6)


Index,Beam,Shot Number,Longitude,Latitude,Stale Return Flag,Degrade
0,BEAM0000,204910000400278627,-61.841223,0.407430,0,0
1,BEAM0000,204910000400278727,-61.811346,0.364843,1,0
2,BEAM0000,204910000400278827,-61.781708,0.322684,1,0
3,BEAM0000,204910000400278927,-61.751786,0.280149,0,0
4,BEAM0000,204910000400279027,-61.721563,0.237148,0,0
...,...,...,...,...,...,...
4722,BEAM1011,204911100400328870,-43.400519,-23.704680,0,0
4723,BEAM1011,204911100400328970,-43.363103,-23.745582,0,0
4724,BEAM1011,204911100400329070,-43.326729,-23.785303,0,0
4725,BEAM1011,204911100400329170,-43.290709,-23.824590,0,0
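
Keeping every 100th shot is a strided selection, so each beam contributes the ceiling of its shot count divided by the step. The sketch below illustrates this on a plain Python list standing in for one beam's `shot_number` array. The data are made up and `subsample` is a hypothetical helper, not part of the notebook.

```python
import math

def subsample(values, step=100):
    """Keep the elements at positions 0, step, 2*step, ... (hypothetical helper)."""
    return values[::step]

# Stand-in for one beam's shot numbers (toy data, not real GEDI values)
toy_shots = list(range(1050))

by_slice = subsample(toy_shots)
by_modulo = [s for i, s in enumerate(toy_shots) if i % 100 == 0]

print(by_slice == by_modulo)                           # Both approaches pick the same shots
print(len(by_slice), math.ceil(len(toy_shots) / 100))  # Sample count equals ceil(n / step)
```

The same slice syntax works on the NumPy arrays read from the HDF5 datasets.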


In [9]:
# Take the lat/lon dataframe and convert each lat/lon to a shapely point
latslons['geometry'] = latslons.apply(lambda row: Point(row.Longitude, row.Latitude), axis=1)
# Convert to a Geodataframe
latslons = gp.GeoDataFrame(latslons)
latslons = latslons.drop(columns=['Latitude','Longitude'])
latslons['geometry']

0         POINT (-61.84122 0.40743)
1         POINT (-61.81135 0.36484)
2         POINT (-61.78171 0.32268)
3         POINT (-61.75179 0.28015)
4         POINT (-61.72156 0.23715)
                   ...             
4722    POINT (-43.40052 -23.70468)
4723    POINT (-43.36310 -23.74558)
4724    POINT (-43.32673 -23.78530)
4725    POINT (-43.29071 -23.82459)
4726    POINT (-43.25472 -23.86393)
Name: geometry, Length: 4727, dtype: geometry

### Visualizing GEDI beam path

In [10]:
# Define a helper function for visualizing GEDI points
def pointVisual(features, vdims):
    points = gv.Points(features, vdims=vdims).options(
        tools=['hover'], height=500, width=900, size=5, color='yellow',
        fontsize={'xticks': 10, 'yticks': 10, 'xlabel': 16, 'ylabel': 16},
        title=f'Beams of {eg_file_path}')
    return gvts.EsriImagery * points

In [11]:
# Defining the vdims below will allow you to hover over specific shots and view information about them.
vdims = [f for f in latslons if f != 'geometry']  # Every column except the geometry
vdims


['Beam', 'Shot Number', 'Stale Return Flag', 'Degrade']

In [12]:
# Visualize the GEDI points over a map
pointVisual(latslons, vdims)

### Reconstructing waveforms from a few shots from the dataframe
I chose waveforms near the edge of the continent to see how elevation and plant cover change there.

In [20]:
shots = [204910500400327385, 204911100400326970, 204910000400334827, 204910600400788689]
wave_indices = dict()

for shot in shots:
    print('Looking for shot number ' + str(shot))
    # Loop through each beam name
    for beam in beamNames:
        print('Looking in beam ' + beam)
        # Find the index of the shot in the current beam
        shot_numbers = eg_file[f'{beam}/shot_number'][()]
        try:
            index = np.where(shot_numbers == shot)[0][0]
            wave_indices[shot] = [beam, index]
        except IndexError:
            continue

wave_indices

Looking for shot number 204910500400327385
Looking in beam BEAM0000
Looking in beam BEAM0001
Looking in beam BEAM0010
Looking in beam BEAM0011
Looking in beam BEAM0101
Looking in beam BEAM0110
Looking in beam BEAM1000
Looking in beam BEAM1011
Looking for shot number 204911100400326970
Looking in beam BEAM0000
Looking in beam BEAM0001
Looking in beam BEAM0010
Looking in beam BEAM0011
Looking in beam BEAM0101
Looking in beam BEAM0110
Looking in beam BEAM1000
Looking in beam BEAM1011
Looking for shot number 204910000400334827
Looking in beam BEAM0000
Looking in beam BEAM0001
Looking in beam BEAM0010
Looking in beam BEAM0011
Looking in beam BEAM0101
Looking in beam BEAM0110
Looking in beam BEAM1000
Looking in beam BEAM1011
Looking for shot number 204910600400788689
Looking in beam BEAM0000
Looking in beam BEAM0001
Looking in beam BEAM0010
Looking in beam BEAM0011
Looking in beam BEAM0101
Looking in beam BEAM0110
Looking in beam BEAM1000
Looking in beam BEAM1011


{204910500400327385: ['BEAM0101', 56100],
 204911100400326970: ['BEAM1011', 56100],
 204910000400334827: ['BEAM0000', 56200],
 204910600400788689: ['BEAM0110', 56100]}
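
The search above scans every beam for each shot. The beam can also be read from the shot number itself, assuming the `OOOOOBBRRGNNNNNNNN` layout described in the GEDI product documentation (orbit, beam, reserved, sub-orbit granule, shot index). That layout is an assumption here; this notebook only confirms that it agrees with the four shots shown. `decode_shot_number` is a hypothetical helper.

```python
def decode_shot_number(shot_number):
    """Split a GEDI shot number into its fields (assumed layout OOOOOBBRRGNNNNNNNN)."""
    digits = str(shot_number).zfill(18)
    return {
        "orbit": int(digits[0:5]),
        # Beam groups are named with the 4-bit binary form of the beam number
        "beam": "BEAM" + format(int(digits[5:7]), "04b"),
        "reserved": int(digits[7:9]),
        "sub_orbit_granule": int(digits[9:10]),
        "shot_index": int(digits[10:18]),
    }

for shot in [204910500400327385, 204911100400326970,
             204910000400334827, 204910600400788689]:
    fields = decode_shot_number(shot)
    print(shot, fields["beam"], fields["orbit"])
```

Looking only in the decoded beam would replace eight array scans per shot with one, at the cost of relying on the assumed number layout.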

In [43]:
wf_data_vis = dict()
wf_graphs = dict()
for shot in shots:
    index = wave_indices[shot][1]
    beam = wave_indices[shot][0]
    
    # From the SDS list, use list comprehension to find sample_count, sample_start_index, and rxwaveform
    sdsCount = eg_file[[g for g in gediSDS if g.endswith('/rx_sample_count') and beam in g][0]]
    sdsStart = eg_file[[g for g in gediSDS if g.endswith('/rx_sample_start_index') and beam in g][0]]
    sdsWaveform = [g for g in gediSDS if g.endswith('/rxwaveform') and beam in g][0]

    print(f"Shot {shot} has the following information: ")
    print(sdsCount, sdsStart, sdsWaveform)

    # Locate this shot's samples within the flattened rxwaveform array
    wfCount = sdsCount[index]           # Number of samples in the waveform
    wfStart = int(sdsStart[index]) - 1  # Convert the 1-based start index to a zero-based Python int
    
    # Grab additional information about the shot: the unique `shot_number` and its lat/lon location
    wfShot = eg_file[f'{beam}/shot_number'][index]
    wfLat = eg_file[f'{beam}/geolocation/latitude_bin0'][index]
    wfLon = eg_file[f'{beam}/geolocation/longitude_bin0'][index]
    
    print(f"This waveform (shot ID: {wfShot}, index {index}) is located at {str(wfLat)}, {str(wfLon)}. It is from beam {beam} \
    and is stored in rxwaveform beginning at index {wfStart} and ending at index {wfStart + wfCount}")
    print(f"The beam {beam} has " + "{:,}".format(eg_file[sdsWaveform].shape[0]) + " values stored in its waveform data.")

    # Grabbing the elevation recorded at the start and end of the full waveform capture
    zStart = eg_file[f'{beam}/geolocation/elevation_bin0'][index]   # Height of the start of the rx window
    zEnd = eg_file[f'{beam}/geolocation/elevation_lastbin'][index]  # Height of the end of the rx window
    
    # Retrieve the waveform sds layer using the sample start index and sample count information to slice the correct dimensions
    waveform = eg_file[sdsWaveform][wfStart: wfStart + wfCount]
    # Find elevation difference from start to finish and divide into equal intervals based on sample_count
    zStretch = np.add(zEnd, np.multiply(range(wfCount, 0, -1), ((zStart - zEnd) / int(wfCount))))
    # print(zStretch)
    
    # match the waveform amplitude values with the elevation and convert to Pandas df
    wvDF = pd.DataFrame({'Amplitude (DN)': waveform, 'Elevation (m)': zStretch})
    wf_data_vis[shot] = wvDF
    
    # Create a holoviews interactive Curve plot with additional parameters defining the plot aesthetics 
    wfVis = hv.Curve(wvDF).opts(color='darkgreen', tools=['hover'], height=500, width=400,
               xlim=(np.min(waveform) - 10, np.max(waveform) + 10), ylim=(np.min(zStretch), np.max(zStretch)),
               fontsize={'xticks':10, 'yticks':10,'xlabel':16, 'ylabel': 16, 'title':13}, line_width=2.5, title=f'{str(wfShot)}')
    wf_graphs[shot] = wfVis
    
print(wf_data_vis)
print(wf_graphs)

Shot 204910500400327385 has the following information: 
<HDF5 dataset "rx_sample_count": shape (58409,), type "<u2"> <HDF5 dataset "rx_sample_start_index": shape (58409,), type "<u8"> BEAM0101/rxwaveform
This waveform (shot ID: 204910500400327385, index 56100) is located at -22.93657643755092, -44.06742759992179. It is from beam BEAM0101     and is stored in rxwaveform beginning at index 79662000 and ending at index 79662876
The beam BEAM0101 has 82,940,780 values stored in its waveform data.
Shot 204911100400326970 has the following information: 
<HDF5 dataset "rx_sample_count": shape (58496,), type "<u2"> <HDF5 dataset "rx_sample_start_index": shape (58496,), type "<u8"> BEAM1011/rxwaveform
This waveform (shot ID: 204911100400326970, index 56100) is located at -22.951903527514236, -44.08043385321297. It is from beam BEAM1011     and is stored in rxwaveform beginning at index 79662000 and ending at index 79662829
The beam BEAM1011 has 83,064,320 values stored in its waveform data.
Sho
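
Each beam stores all of its waveforms end to end in a single one-dimensional `rxwaveform` array. The 1-based `rx_sample_start_index` and `rx_sample_count` locate one shot's samples inside it. A minimal sketch of that slicing on toy lists follows. The values are made up and `slice_waveform` is a hypothetical helper.

```python
def slice_waveform(flat_samples, start_index_1based, sample_count):
    """Return one shot's samples from a flattened waveform array."""
    start = int(start_index_1based) - 1          # Convert to zero-based indexing
    return flat_samples[start:start + int(sample_count)]

# Three toy waveforms of lengths 3, 2 and 4 stored back to back
flat = [10, 11, 12, 20, 21, 30, 31, 32, 33]
start_indices = [1, 4, 6]   # 1-based, as in rx_sample_start_index
sample_counts = [3, 2, 4]   # as in rx_sample_count

waveforms = [slice_waveform(flat, s, c) for s, c in zip(start_indices, sample_counts)]
print(waveforms)
```

Note that the end of the slice, `start + sample_count`, is exclusive, so the "ending at index" value printed by the cell above is one past the last sample.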

In [44]:
(wf_graphs[shots[0]].opts(width=240)
 + wf_graphs[shots[1]].opts(width=240, labelled=[])
 + wf_graphs[shots[2]].opts(width=240, labelled=[])
 + wf_graphs[shots[3]].opts(width=240, labelled=[]))
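
The elevation axis in these plots comes from spreading the samples evenly between `elevation_bin0` and `elevation_lastbin`. The sketch below reproduces the notebook's formula in plain Python with made-up numbers; `elevation_axis` is a hypothetical name. With this formula the first sample sits exactly at the start elevation and the last sample sits one step above the end elevation.

```python
def elevation_axis(z_start, z_end, sample_count):
    """Assign an elevation to each waveform sample, from z_start downward."""
    step = (z_start - z_end) / sample_count      # Elevation change per sample
    return [z_end + k * step for k in range(sample_count, 0, -1)]

axis = elevation_axis(z_start=100.0, z_end=90.0, sample_count=5)
print(axis)  # [100.0, 98.0, 96.0, 94.0, 92.0]
```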