# Access to ICESat

This notebook provides code for downloading and accessing ICESat data.

## Basic concept of ICESat

Unlike ICESat-2, ICESat does not provide a water product (ATL13), but it does provide GLAH14, which describes the land surface. According to the literature, previous studies extracted water levels from it using a water mask.

I found 1) a Python file for downloading ICESat data directly, and 2) a package that I still need to figure out how to use.



### Py file from NSIDC

### Package nsidc-subsetter

There is a package for downloading ICESat-1, ICESat-2, and IceBridge data: https://github.com/tsutterley/nsidc-subsetter

In [None]:
cd

In [None]:
!git clone https://github.com/choms516/nsidc-subsetter.git

This code does not support GLAH14. I am thinking of adding GLAH14 to it if that is possible.

First, just run the code as it is to see whether it works.

In [None]:
cd ICESat_water_level/extraction/

In [None]:
cd nsidc-subsetter/

In [None]:
import codes.nsidc_subset_altimetry as ns

In [None]:
ff = './ICESat1'
prod = 'GLAH12'
vs = 34
us_id = 'whaudtlr516@gmail.com'
bound_box = [-50.33333,68.56667,-49.33333,69.56667]
times = ['2009-01-01T00:00:00','2009-12-31T23:59:59']
fmt = 'NetCDF4' 

In [None]:
ns.nsidc_subset_altimetry(ff, prod, vs, USER=us_id,
    BBOX=bound_box, TIME=times, FORMAT=fmt, VERBOSE=True, UNZIP=True)

Failed today...

## Open ICESat

In [1]:
import os
from pathlib import Path
import h5py
import pandas as pd

Make a list of downloaded files

In [2]:
# Load files: set the data directory
data_home = Path('/home/jovyan/ICESat_water_level/extraction/icesat/')
# List the downloaded files so they can be checked
files = list(data_home.glob('*.H5'))

There are 89 files that cover Tonle Sap Lake.

In [9]:
len(files)

89

Just open a file

In [3]:
test = files[1]

In [17]:
f = h5py.File(test, 'r')
print(f)

<HDF5 file "GLAH14_634_2103_002_0211_0_01_0001.H5" (mode r)>


There are 5 groups: ['ANCILLARY_DATA', 'BROWSE', 'Data_1HZ', 'Data_40HZ', 'METADATA']

In [5]:
for grs in list(f):
    print(grs)
    print(list(f[grs]))
    print(' ')

ANCILLARY_DATA
[]
 
BROWSE
['Image_00', 'Image_01', 'Image_02', 'Image_03']
 
Data_1HZ
['Time', 'Geolocation', 'Packet_data', 'Quality', 'Transmit_Energy', 'Reflectivity', 'Elevation_Flags', 'Atmosphere', 'DS_UTCTime_1']
 
Data_40HZ
['Time', 'Geolocation', 'Elevation_Surfaces', 'Elevation_Corrections', 'Elevation_Angles', 'Elevation_Offsets', 'Quality', 'Elevation_Flags', 'Transmit_Energy', 'Geophysical', 'Reflectivity', 'Waveform', 'Atmosphere', 'DS_DEMhiresArElv', 'DS_PeakNumber', 'DS_UTCTime_40']
 
METADATA
['COLLECTIONMETADATA', 'INVENTORYMETADATA', 'PROVENANCE']
 


In [33]:
f['Data_40HZ']['Elevation_Surfaces']['d_elev'][:]

array([17.875, 17.923, 17.921, ..., 56.454, 58.095, 57.457])

In [21]:
f['METADATA']['COLLECTIONMETADATA']['Temporal'][:]

AttributeError: 'slice' object has no attribute 'encode'

In [23]:
list(f['METADATA']['COLLECTIONMETADATA']['ECSCollection'])

[]
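
The `AttributeError` above is what h5py raises when a group, rather than a dataset, is indexed with `[:]`, so `Temporal` is probably a group. The empty list for `ECSCollection` suggests the values may be stored as HDF5 attributes instead of datasets. A minimal sketch for checking this, assuming the metadata lives in `.attrs`; `collect_attrs` is a hypothetical helper name, not part of h5py or the NSIDC tools:

```python
import h5py

def collect_attrs(node, path="/"):
    """Return {path: {attribute name: value}} for node and everything below it.

    Hypothetical helper: it only assumes that metadata may be stored as
    HDF5 attributes (.attrs) rather than as datasets.
    """
    found = {}
    if len(node.attrs) > 0:
        found[path] = dict(node.attrs)
    # Only groups (including the file itself) have children to walk into
    if isinstance(node, h5py.Group):
        for name, child in node.items():
            child_path = path.rstrip("/") + "/" + name
            found.update(collect_attrs(child, child_path))
    return found

# Usage with the file opened earlier in this notebook:
# for path, attrs in collect_attrs(f["METADATA"], "/METADATA").items():
#     print(path, attrs)
```

If the result is empty, the metadata is stored some other way and this assumption does not hold for GLAH14.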

I need to find more information on GLAH14, specifically the differences between `Data_1HZ` and `Data_40HZ`. I also need to subset the files to the Tonle Sap Lake (TSL) boundaries.

In my understanding, `Data_40HZ` holds the original information. Let's pick out the useful parameters first, then I will subset them by ROI.

Groups
- Data_40HZ/Geolocation
  - `d_lat`: latitude of shots
  - `d_lon`: longitude of shots

- Data_40HZ/Elevation_Surfaces
  - `d_elev`: elevation without any corrections

- Data_40HZ/Elevation_Corrections
  - `d_satElevCorr`: saturation elevation correction (correction to elevation for saturated waveforms). To apply it, add it to the elevation estimates.

- Data_40HZ/Quality
  - `sat_corr_flg`: saturation correction flag, indicating the likely quality of the elevation data
  - `i_satNdx`: saturation index, indicating possible data quality concerns from saturation effects

- Data_40HZ/Geophysical
  - `d_DEM_elv`: DEM elevation at the footprint location from SRTM30

- ANCILLARY_DATA
  - `internal_time_delay_date`: time
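
The note on `d_satElevCorr` above says the correction must be added to the elevation. A minimal sketch of that step, assuming invalid samples are marked with the largest float64 value (the `1.797693e+308` that appears in the `DEM` column further down in this notebook). `FILL_VALUE`, `mask_fill`, and `corrected_elevation` are hypothetical names, and the fill value should be checked against each dataset's attributes:

```python
import numpy as np

# Assumption: invalid samples are stored as the largest float64 value
FILL_VALUE = np.finfo(np.float64).max

def mask_fill(values, fill=FILL_VALUE):
    """Replace fill values with NaN so they cannot leak into later arithmetic."""
    values = np.asarray(values, dtype=np.float64)
    return np.where(values >= fill, np.nan, values)

def corrected_elevation(elev, sat_elev_corr, fill=FILL_VALUE):
    """Add the saturation elevation correction to the uncorrected elevation.

    A shot comes out as NaN if either input is a fill value.
    """
    return mask_fill(elev, fill) + mask_fill(sat_elev_corr, fill)
```

Letting NaN propagate is a conservative choice made for this sketch, not the documented GLAH14 procedure. Whether an invalid correction should instead count as zero needs to be checked against the product documentation.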

In [8]:
def glah14_to_df(filename):
    """Read selected 40 Hz variables from a GLAH14 HDF5 file into a DataFrame."""
    # The context manager closes the file once the arrays are copied into memory
    with h5py.File(filename, 'r') as f:
        d40 = f['Data_40HZ']
        lat = d40['Geolocation']['d_lat'][:]
        lon = d40['Geolocation']['d_lon'][:]
        elev = d40['Elevation_Surfaces']['d_elev'][:]
        sec = d40['Elevation_Corrections']['d_satElevCorr'][:]
        scf = d40['Quality']['sat_corr_flg'][:]
        satndx = d40['Quality']['i_satNdx'][:]
        dem = d40['Geophysical']['d_DEM_elv'][:]

    glah14_df = pd.DataFrame({'Latitude':lat,'Longitude':lon,'Elevation':elev,
                            's_El_Corr':sec, 's_Corr_f':scf,'in_sat':satndx,
                            'DEM':dem})
    return glah14_df

In [24]:
test_df = glah14_to_df(test)
print(test_df)

          Latitude   Longitude  Elevation  s_El_Corr  s_Corr_f  in_sat  \
0        75.376446  298.446866     17.875        0.0         0       0   
1        75.377939  298.445066     17.923        0.0         0       0   
2        75.379429  298.443276     17.921        0.0         0       0   
3        75.380913  298.441503     17.842        0.0         0       0   
4        75.382396  298.439736     17.891        0.0         0       0   
...            ...         ...        ...        ...       ...     ...   
1082195  37.859500  334.629687     57.231        0.0         0       0   
1082196  37.861056  334.629417     58.130        0.0         0       0   
1082197  37.862613  334.629147     56.454        0.0         0       0   
1082198  37.864169  334.628876     58.095        0.0         0       0   
1082199  37.865726  334.628605     57.457        0.0         0       0   

                   DEM  
0        1.797693e+308  
1        1.797693e+308  
2        1.797693e+308  
3        1.

### Subsetting to the ROI.

In [28]:
# Bounds of Tonle Sap Lake: [lon_min, lon_max, lat_min, lat_max]
sp_ex = [103.643, 104.667, 12.375, 13.287]

In [38]:
test_df_subset = test_df.loc[(test_df['Longitude']>=sp_ex[0]) 
                  & (test_df['Longitude']<=sp_ex[1])
                  & (test_df['Latitude']>=sp_ex[2])
                  & (test_df['Latitude']<=sp_ex[3])]
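
The same filter can be wrapped in a function so that it can be applied to every file in `files`. A minimal sketch, assuming the column names produced by `glah14_to_df` and bounds ordered as `[lon_min, lon_max, lat_min, lat_max]`; `subset_bbox` is a hypothetical helper name. The longitudes printed above appear to use a 0 to 360 convention, which matches the Tonle Sap bounds as written, but a region west of Greenwich would need its bounds converted first.

```python
import pandas as pd

def subset_bbox(df, bounds):
    """Keep the rows whose Longitude/Latitude fall inside bounds (inclusive).

    bounds is assumed to be [lon_min, lon_max, lat_min, lat_max].
    """
    lon_min, lon_max, lat_min, lat_max = bounds
    inside = (df["Longitude"].between(lon_min, lon_max)
              & df["Latitude"].between(lat_min, lat_max))
    return df.loc[inside]
```

With the names defined in this notebook, something like `pd.concat([subset_bbox(glah14_to_df(p), sp_ex) for p in files], ignore_index=True)` would collect the lake shots from all files into one table.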