<center>
<img src='./img/nsidc_logo.png'/>

# **Using Coiled to Produce ICESat-2 Sea Ice Height Time Series**

</center>

---

## **1. Tutorial Introduction/Overview**

This tutorial was designed for the "DAAC data access in the cloud hands-on experience" session at the 2023 NSIDC DAAC User Working Group (UWG) Meeting. It is a copy of the `2_ATL07_timeseries` notebook, set up for use with Coiled.


TODOS:
* Explain Coiled
* Question for Luis: Why would I use the decorator function (`@coiled.function()`) rather than creating a cluster explicitly:

```
cluster = coiled.Cluster(n_workers=20, region="us-west-2")
client = cluster.get_client()
client
```
* How do we incorporate https://medium.com/coiled-hq/processing-a-250-tb-dataset-with-coiled-dask-and-xarray-574370ba5bde ? 


### **Credits**

The notebook was created by Andy Barrett, Luis Lopez, and Amy Steiker, all of NSIDC.

For questions regarding the notebook, or to report problems, please create a new issue in the [NSIDC-Data-Tutorials repo](https://github.com/nsidc/NSIDC-Data-Tutorials/issues).

### **Learning Objectives**

*After completing this notebook you will be able to...* 

### **Prerequisites**

TBD - Need to include prereqs for Coiled (how to gain access, etc.) 


*To get the most out of this tutorial notebook, you should be familiar with the following concepts/data sets/programming languages...*

*The main packages/libraries that will be used in this notebook are...*

*The GIS concepts applied in this tutorial are...*

### **Example of end product (recommended, not required)** 

Include a figure that illustrates the end product of the notebook.  This could be a data plot, map or some other type of visualization.

Please include figures in an "img" folder located at the same level as the notebook within your tutorial folder.

<div>
<img align="left" width="50%" height="100px" src='./img/example_end_product.png'/>
</div>

### **Time requirement**

*TBD...*

## **2. Tutorial steps**

### **Import Packages**

In [1]:
# For Coiled cloud compute
import coiled

# For searching NASA data
import earthaccess

# For reading data, analysis and plotting
import h5py
import numpy as np
import pandas as pd
import geopandas as gpd
import xarray as xr
# import hvplot.xarray
# import pprint
# from affine import Affine
# from pyproj import CRS



### **Authenticate**

In [2]:
auth = earthaccess.login()

EARTHDATA_USERNAME and EARTHDATA_PASSWORD are not set in the current environment, try setting them or use a different strategy (netrc, interactive)
You're now authenticated with NASA Earthdata Login
Using token with expiration date: 10/06/2023
Using .netrc file for EDL


### **Search for ICESat-2 ATL10 data**

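The search uses the spatial extent from https://icesat-2-2023.hackweek.io/tutorials/sea_ice/1_sea_ice_tutorial.html and extends that tutorial's first time period to one week (2019-09-16 to 2019-09-23). For reference, the hackweek settings are: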


```
# Spatial extent: Ross Sea, Antarctica
spatial_extent = [-180, -78, -160, -74]

# Time range
date_range = ['2019-09-16','2019-09-16'] # first time period
# date_range = ['2019-11-13','2019-11-13'] # second time period
```

In [3]:
results = earthaccess.search_data(
    short_name = 'ATL10',
    version = '006',
    cloud_hosted = True,
    bounding_box = (-180, -78, -160, -74),
    temporal = ('2019-09-16','2019-09-23'),
)

Granules found: 14


In [4]:
# Display each granule; a plain loop avoids returning a list of None values
for r in results:
    display(r)

### **Extract freeboard segments**

We now build geopandas GeoDataFrames from the search results.

Because ATL10 is not a gridded product, we need to extract coordinates and variables from their groups inside the HDF5 file.

#### Open the files using the `open` method. 

The auth object created at the start of the notebook is used to provide Earthdata Login authentication and AWS credentials.

In [5]:
files = earthaccess.open(results)

 Opening 14 granules, approx size: 0.0 GB



#### Geopandas Read function 

The function below extracts latitude, longitude, segment distance, segment length, surface type, and freeboard height for each strong beam. See the [NSIDC ATL10 User Guide](https://nsidc.org/sites/default/files/documents/user-guide/atl10-v006-userguide.pdf) for more details on these variables.

In [8]:
## Based on the READ function from Younghyun Koo for the sea ice tutorial at the IS2 hackweek

## Luis: Can we modify this function so that it reads many files from earthaccess and puts them into a single gdf?
@coiled.function()
def read_atl10(filename):
    """Read the strong-beam freeboard segments of one ATL10 granule.

    Returns a list of GeoDataFrames, one per strong beam.
    """
    # Collect one GeoDataFrame per strong beam
    tracks = []

    with h5py.File(filename, 'r') as f:

        # The spacecraft orientation determines which beams are the strong ones
        orient = f['orbit_info/sc_orient'][0]

        if orient == 0:
            strong_beams = [f"gt{i}l" for i in [1, 2, 3]]
        elif orient == 1:
            strong_beams = [f"gt{i}r" for i in [1, 2, 3]]
        else:
            # Any other orientation value: read no beams
            strong_beams = []

        for beam in strong_beams:

            lat = f[beam]['freeboard_segment/latitude'][:]
            lon = f[beam]['freeboard_segment/longitude'][:]
            seg_x = f[beam]['freeboard_segment/seg_dist_x'][:] / 1000  # (m to km)
            seg_len = f[beam]['freeboard_segment/heights/height_segment_length_seg'][:]
            fb = f[beam]['freeboard_segment/beam_fb_height'][:]
            surface_type = f[beam]['freeboard_segment/heights/height_segment_type'][:]

            # Treat freeboard values above 100 as invalid so dropna() removes them
            fb[fb > 100] = np.nan

            df = pd.DataFrame({'lat': lat, 'lon': lon, 'seg_x': seg_x, 'seg_len': seg_len,
                               'freeboard': fb, 'stype': surface_type})
            df['beam'] = beam
            df = df.dropna().reset_index(drop=True)

            # Attach point geometries in geographic coordinates (WGS 84)
            gdf = gpd.GeoDataFrame(
                df, geometry=gpd.points_from_xy(df.lon, df.lat), crs="EPSG:4326"
            )
            tracks.append(gdf)

    return tracks


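AttributeError: module 'coiled' has no attribute 'function'

Note: this error most likely means that the `coiled` version installed in this environment predates the `coiled.function` decorator. Upgrading the `coiled` package may resolve it.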

In [35]:
file = files[0]
tracks = read_atl10(file)

Exception ignored in: <function CachingFileManager.__del__ at 0x7f9751c03f70>
Traceback (most recent call last):
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/xarray/backends/file_manager.py", line 243, in __del__
    ref_count = self._ref_counter.decrement(self._key)
AttributeError: 'CachingFileManager' object has no attribute '_ref_counter'


### The ATL10 granule was loaded into 3 different geopandas dataframes, one for each strong beam
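
The beam selection inside `read_atl10` can be pulled out into a small helper, which makes the rule easier to see and to check. This is a minimal sketch: `strong_beams` is a hypothetical name, and the mapping simply mirrors the `if`/`elif` branches of the function above (orientation 0 selects the left beams, orientation 1 the right beams, any other value selects none).

```python
def strong_beams(sc_orient):
    """Return the strong-beam group names for an `orbit_info/sc_orient` value.

    Mirrors the branches in read_atl10: 0 -> left beams, 1 -> right beams,
    any other value -> no beams.
    """
    if sc_orient == 0:
        side = "l"
    elif sc_orient == 1:
        side = "r"
    else:
        return []
    return [f"gt{i}{side}" for i in (1, 2, 3)]


# One group name per strong beam, hence up to three GeoDataFrames per granule
print(strong_beams(1))
```

The sample output shown for `tracks[0]` has `beam` equal to `gt1r`, which corresponds to the orientation value 1 branch.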

In [38]:
tracks[0]

| | lat | lon | seg_x | seg_len | freeboard | stype | beam | geometry |
|---|---|---|---|---|---|---|---|---|
| 0 | -59.741469 | 3.516870 | 26728.174842 | 13.376839 | 0.181045 | 1 | gt1r | POINT (3.51687 -59.74147) |
| 1 | -59.741522 | 3.516859 | 26728.180798 | 12.673558 | 0.182037 | 1 | gt1r | POINT (3.51686 -59.74152) |
| 2 | -59.741583 | 3.516847 | 26728.187580 | 13.375313 | 0.196155 | 1 | gt1r | POINT (3.51685 -59.74158) |
| 3 | -59.741633 | 3.516837 | 26728.193151 | 12.669574 | 0.253846 | 1 | gt1r | POINT (3.51684 -59.74163) |
| 4 | -59.741687 | 3.516826 | 26728.199205 | 12.666320 | 0.272020 | 1 | gt1r | POINT (3.51683 -59.74169) |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 128113 | -68.014697 | -171.706977 | 32558.087404 | 39.016903 | 0.188267 | 1 | gt1r | POINT (-171.70698 -68.01470) |
| 128114 | -68.014502 | -171.707036 | 32558.109229 | 38.899364 | 0.237892 | 7 | gt1r | POINT (-171.70704 -68.01450) |
| 128115 | -68.014354 | -171.707080 | 32558.125831 | 38.138302 | 0.258053 | 1 | gt1r | POINT (-171.70708 -68.01435) |
| 128116 | -68.014154 | -171.707138 | 32558.148318 | 40.912815 | 0.226246 | 9 | gt1r | POINT (-171.70714 -68.01415) |


### **Calculate grid indices of segment centers**

Using pyproj and Affine
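
The index arithmetic can be sketched with the standard library alone, assuming the segment centers have already been projected to grid x/y coordinates in metres (the step pyproj would perform). The grid origin, cell size, and the name `grid_index` are illustrative assumptions, not values taken from this tutorial. For a north-up grid, the inverse of an `affine.Affine` transform performs the same arithmetic before the result is floored to integers.

```python
import math

# Illustrative north-up grid definition (assumed values, in metres)
X_MIN = -3_950_000.0   # x coordinate of the grid's upper-left corner
Y_MAX = 4_350_000.0    # y coordinate of the grid's upper-left corner
CELL = 25_000.0        # cell size


def grid_index(x, y, x_min=X_MIN, y_max=Y_MAX, cell=CELL):
    """Return the (row, col) of the grid cell containing projected point (x, y).

    Columns increase with x; rows increase as y decreases (north-up grid).
    """
    col = math.floor((x - x_min) / cell)
    row = math.floor((y_max - y) / cell)
    return row, col


# A point just inside the upper-left corner falls in cell (0, 0)
row, col = grid_index(-3_949_000.0, 4_349_000.0)
print(row, col)
```

Flooring, rather than rounding, assigns each segment center to the cell that contains it, with cell edges defined by the corner coordinates.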

### **Assign to grid and calculate grid cell mean**
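
Once each segment has a row and column index, the grid cell mean is a grouped average of the freeboard values. The sketch below uses only the standard library to show the logic. `grid_cell_means` is a hypothetical helper, and the sample values are made up for illustration. With the GeoDataFrames produced earlier, a pandas `groupby` on the row and column columns would be the more natural route.

```python
from collections import defaultdict


def grid_cell_means(rows, cols, values):
    """Average `values` over all points that share the same (row, col) cell."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for r, c, v in zip(rows, cols, values):
        sums[(r, c)] += v
        counts[(r, c)] += 1
    # Only cells that received at least one point appear in the result
    return {cell: sums[cell] / counts[cell] for cell in sums}


# Made-up example: two segments in cell (0, 0), one in cell (1, 2)
means = grid_cell_means([0, 0, 1], [0, 0, 2], [0.2, 0.4, 0.3])
print(means)
```

Cells with no segments are absent from the result; when writing to a full grid array they would need an explicit fill value such as NaN.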

## **3. Learning outcomes recap (optional)**

Provide a brief summary of the learning outcomes of the tutorial


## **4. Additional resources (optional)**

List some additional resources for users to consult, if applicable/desired.

________

### **When your tutorial is ready for review,  please read our [Contributor Guide](https://github.com/nsidc/NSIDC-Data-Tutorials/blob/main/contributor_guide.md) for next steps.**