<span style='color:#009999'> <span style='font-family:serif'> <font size="15"> **ECCOv4 from NASA Earth Data Cloud**<span style='color:#0066cc'> 

<img src="img/ECCOv4.png" alt="drawing" width="750"/>    


<span style='color:#0066cc'><font size="5"> **About the "Estimating the Circulation and Climate of the Ocean" [ECCO](https://ecco-group.org/) consortium**
1. <font size="3"><span style='color:Black'> Assimilates data from various sources to constrain the simulated global ocean and atmosphere model.
2. <font size="3"><span style='color:Black'> Remains widely used by climate scientists.
3. <font size="3"><span style='color:Black'> ECCO belongs to a hierarchy of global models that all share the same grid topology (Cube Sphere), but differ in horizontal resolution.



<span style='color:#ff6666'><font size="5">**Requirements**
1. <font size="3"><span style='color:Black'> Have a Bearer Token for EarthData in the Cloud (See `GetStarted` Notebook).
2. <font size="3"><span style='color:Black'> Upload the Bearer Token from local file `token.json`



<font size="3"><span style='color:Black'> This notebook uses [xarray](https://xarray.dev/) with [pydap](https://pydap.github.io/pydap/) as an `engine` to enable parallelism. It also presents an OPeNDAP-savvy approach that can accelerate scientific workflows when remote datasets are available via Hyrax in the Cloud (cloud OPeNDAP URLs).


 <span style='color:#ff6666'><font size="5"> **Objectives**
 
 
- <font size="3"><span style='color:Black'> Demonstrate how to use NASA's `Common Metadata Repository` ([CMR](https://cmr.earthdata.nasa.gov/search)) to find the `cloud OPeNDAP URLs` associated with a collection.
- <font size="3"><span style='color:Black'> Demonstrate the use of `Constraint Expressions` to reduce metadata during Virtual Dataset creation.
- <font size="3"><span style='color:Black'> Use <span style='color:#ff6666'>**PyDAP**<span style='color:black'>'s `consolidate_metadata` to accelerate data cube creation via `xarray.open_mfdataset`.
- <font size="3"><span style='color:Black'> Demonstrate an advanced workflow for remote access and plotting of **Level 4** ECCOv4 data with complex topology, available via Hyrax in the Cloud / cloud OPeNDAP.



<span style='color:#ff6666'><font size="5">**Browsing Data**:

<font size="3"><span style='color:Black'> Broad information about the dataset can be found on the PO.DAAC website (see [here](https://podaac.jpl.nasa.gov/cloud-datasets?view=list&ids=Projects&values=ECCO)).


<font size="3"><span style='color:Black'> Some Collections of interest can be found following the links below:

- <font size="3"> [Native grid](https://podaac.jpl.nasa.gov/dataset/ECCO_L4_GEOMETRY_LLC0090GRID_V4R4)
- <font size="3"> [Temperature and Salinity](https://podaac.jpl.nasa.gov/dataset/ECCO_L4_TEMP_SALINITY_LLC0090GRID_MONTHLY_V4R4)
- <font size="3"> [Velocities](https://podaac.jpl.nasa.gov/dataset/ECCO_L4_OCEAN_VEL_LLC0090GRID_MONTHLY_V4R4)
- <font size="3"> [Mixed layer depth](https://podaac.jpl.nasa.gov/dataset/ECCO_L4_MIXED_LAYER_DEPTH_LLC0090GRID_MONTHLY_V4R4)


In [None]:
import matplotlib.pyplot as plt
import numpy as np
import requests
from pydap.client import open_url
from pydap.net import create_session
import json
import cartopy.crs as ccrs
import xarray as xr
import datetime as dt
from pydap.client import consolidate_metadata

<span style='color:#ff6666'><font size="5">**Finding Cloud OPeNDAP URLs with NASA's CMR**:

<span style='font-family:serif'> <font size="3"><span style='color:Black'> Below we illustrate how to find OPeNDAP URLs via the **CMR**

<span style='color:#0066cc'><font size="3.5"> **To find (cloud) OPeNDAP URLs you will need:**

* One of `Collection Concept ID` or `dataset DOI`
* Time Range


Here, we will use the Collection Concept ID associated with the [Temperature and Salinity](https://podaac.jpl.nasa.gov/dataset/ECCO_L4_TEMP_SALINITY_LLC0090GRID_MONTHLY_V4R4) collection. For example:

<img src="img/ECCO_conceptID_doi.png" alt="drawing" width="750"/>    





In [None]:
session = requests.Session()

In [None]:
# CMR API base url
cmrurl='https://cmr.earthdata.nasa.gov/search/'
doi = '10.5067/ECL5M-OTS44'
doisearch = cmrurl + 'collections.json?doi=' + doi
print(doisearch)

concept_id = session.get(doisearch).json()['feed']['entry'][0]['id']
print(concept_id)
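
The lookup above assumes that the DOI search returns at least one collection. A minimal sketch of a more defensive version follows. The helper names `build_doi_search_url` and `extract_concept_id`, as well as the sample payload and its placeholder concept ID, are hypothetical; only the `feed` / `entry` / `id` layout is taken from the cell above.

```python
from urllib.parse import urlencode

CMR_URL = "https://cmr.earthdata.nasa.gov/search/"


def build_doi_search_url(doi, base=CMR_URL):
    """Return the collection search URL for a DOI, with the query URL-encoded."""
    return base + "collections.json?" + urlencode({"doi": doi})


def extract_concept_id(payload):
    """Return the first collection concept ID, or raise if the search is empty."""
    entries = payload.get("feed", {}).get("entry", [])
    if not entries:
        raise LookupError("CMR returned no collection for this DOI")
    return entries[0]["id"]


# Hypothetical payload mimicking the layout used above; the ID is a placeholder.
sample_payload = {"feed": {"entry": [{"id": "C0000000000-EXAMPLE"}]}}

doi_url = build_doi_search_url("10.5067/ECL5M-OTS44")
sample_id = extract_concept_id(sample_payload)
print(doi_url)
print(sample_id)
```

In the notebook, `extract_concept_id(session.get(doi_url).json())` would play the role of the one-line lookup.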

<span style='font-family:serif'> <font size="5.5"><span style='color:#0066cc'> **Specify time range**

<font size="3"><span style='color:Black'> This dataset covers `01-01-1992` to `01-18-2018`. 


In [None]:
start_date =  dt.datetime(1992, 1, 1)
end_date = dt.datetime(2017, 12, 31)

print(start_date,end_date,sep='\n')

dt_format = '%Y-%m-%dT%H:%M:%SZ' # format requirement for datetime search
temporal_str = start_date.strftime(dt_format) + ',' + end_date.strftime(dt_format)
print(temporal_str)

<span style='font-family:serif'> <font size="5.5"><span style='color:#0066cc'> **Get all available cloud OPeNDAP URLs via CMR**

The cell below will search/find all OPeNDAP URLs associated with the Collection concept ID.

The results will be stored in the variable `granules_urls`.
    

In [None]:
def get_opendap_urls(concept_id, time_range, _session=None):
    """
    Queries NASA's `Common Metadata Repository` to identify all OPeNDAP URLS
    given collection concept ID and temporal time range.
    """
    cmr_url = 'https://cmr.earthdata.nasa.gov/search/granules'
    if not _session:
        _session = requests.Session() 
    cmr_response = _session.get(
        cmr_url,
        params={'concept_id': concept_id, 'temporal': time_range, 'page_size': 500},
        headers={'Accept': 'application/json'},
    )
    cmr_response.raise_for_status()  # fail early on HTTP errors
    granules = cmr_response.json()['feed']['entry']
    granules_urls = []

    # Filter and only retain the OPeNDAP URLs
    for granule in granules:
        item = next((item['href'] for item in granule['links'] if "opendap" in item["href"]), None)
        if item is not None:
            granules_urls.append(item)
    return granules_urls
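
The filtering step inside `get_opendap_urls` can be tried offline. The sketch below applies the same "first link containing `opendap`" rule to a hand-written list of granule entries. The helper name `first_opendap_link`, the sample entries, and their `example.com` URLs are all hypothetical stand-ins for a real CMR response.

```python
def first_opendap_link(granule):
    """Return the first link whose href mentions 'opendap', or None."""
    return next(
        (
            link["href"]
            for link in granule.get("links", [])
            if "opendap" in link.get("href", "")
        ),
        None,
    )


# Hypothetical granule entries: the first has an OPeNDAP link, the second does not.
sample_granules = [
    {
        "links": [
            {"href": "https://example.com/archive/granule_1992-01.nc"},
            {"href": "https://opendap.example.com/granules/granule_1992-01"},
        ]
    },
    {"links": [{"href": "https://example.com/archive/granule_1992-02.nc"}]},
]

# Granules without an OPeNDAP link are skipped, as in get_opendap_urls.
sample_urls = [
    url for url in map(first_opendap_link, sample_granules) if url is not None
]
print(sample_urls)
```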

In [None]:
%%time
granules_urls = get_opendap_urls(concept_id, temporal_str)

In [None]:
print("We found: ", len(granules_urls), " total Cloud OPeNDAP URLs associated with this collection!")
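
Because `get_opendap_urls` requests a single page of at most 500 results, it can help to compare the number of URLs found with the number expected. The sketch below assumes one granule per calendar month, which the word MONTHLY in the collection name suggests but which is not verified here. The helper `count_months` is hypothetical.

```python
import datetime as dt


def count_months(start, end):
    """Number of calendar months touched by the closed interval [start, end]."""
    return (end.year - start.year) * 12 + (end.month - start.month) + 1


# Same date range as used for the CMR temporal search.
expected = count_months(dt.datetime(1992, 1, 1), dt.datetime(2017, 12, 31))
page_size = 500  # same value passed to CMR in get_opendap_urls

print("expected monthly granules:", expected)
print("fits in a single page:", expected <= page_size)
```

If the expected count exceeded `page_size`, the search would need to be paged to avoid silently truncated results.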

<span style='font-family:serif'> <font size="5.5"><span style='color:#0066cc'> **Pydap Approach**

<span style='font-family:serif'> <font size="3.5"> We can use <span style='color:#ff6666'>**PyDAP**<span style='color:black'> to inspect the metadata associated with each of the urls.

<span style='font-family:serif'> <font size="3.5">Below we illustrate the use of <span style='color:#ff6666'>**PyDAP**<span style='color:black'> with Token authentication to access OPeNDAP metadata.

<span style='font-family:serif'> <font size="3.5"> This will be useful when accessing OPeNDAP URLs via xarray.


<span style='font-family:serif'> <font size="5.5"><span style='color:#0066cc'> **Import Token Authorization and create Session**
 


In [None]:
# load token json data
with open('token.json', 'r') as fp:
    token = json.load(fp)

# pass Token Authorization to a new Session.
my_session = create_session(session_kwargs=token)

<span style='font-family:serif'> <font size="5.5"><span style='color:#0066cc'> **Lazy access to remote data via pydap's client API**

<font size="3"> <span style='color:#ff6666'>**PyDAP**<span style='color:black'> exploits OPeNDAP's separation between metadata and data to create lazy dataset objects that point to the data. These lazy objects contain all the attributes detailed in OPeNDAP's metadata files (DMR).
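
To make the metadata/data separation concrete, the sketch below builds the two kinds of request URL for one dataset. It assumes the usual DAP4 suffix convention (`.dmr` for the metadata response, `.dap` for the data response); the helper `dap4_endpoints` and the `example.com` URL are hypothetical, and the client normally builds these requests itself.

```python
def dap4_endpoints(url, ce=""):
    """Return (metadata_url, data_url) for a DAP4 dataset URL.

    Assumes the '.dmr' / '.dap' suffix convention; the optional constraint
    expression `ce` is applied to the data request only.
    """
    query = "?dap4.ce=" + ce if ce else ""
    return url + ".dmr", url + ".dap" + query


# Hypothetical dataset URL used only for illustration.
meta_url, data_url = dap4_endpoints(
    "https://opendap.example.com/granules/granule_1992-01", "/THETA;/SALT"
)
print(meta_url)  # small metadata document describing variables and attributes
print(data_url)  # binary data response, fetched only when values are needed
```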

In [None]:
%%time
pyds = open_url(granules_urls[0], session=my_session, protocol='dap4')

In [None]:
pyds.tree()

<span style='font-family:serif'> <font size="5.5"><span style='color:#0066cc'> **Not all Variables are of interest. Let's use Constraint Expressions!**

<font size="3">  Consider that we only want
- `THETA`
- `SALT`

<font size="3">  and their `dimensions`. 

In [None]:
print("dimension of THETA:" , pyds['THETA'].dims)
print("dimension of SALT:" , pyds['SALT'].dims)

<span style='color:#0066cc'><font size="5"> **Construct Constraint Expression**

<font size="3"> The Constraint Expression instructs the Hyrax Data Server to return only our desired variables.

<font size="3">  This variable will be named `CE`. We will add it to each (granule) cloud OPeNDAP URL. This will allow us to construct a `Data Cube`.


In [None]:
dims = pyds['SALT'].dims
Vars = ['/THETA', '/SALT'] + list(dims)  # list() guards against dims being a tuple

# Below construct Constraint Expression
CE = "?dap4.ce=" + ";".join(Vars)
print("constraint expression: ", CE)

In [None]:
print(" Each Cloud OPeNDAP URL will look like: \n", granules_urls[0]+CE)
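
The constraint expression is a semicolon-separated list of fully qualified variable names appended as a `dap4.ce` query parameter. A small sketch of that construction, runnable without a server, is given below. The helper `build_ce` is hypothetical, and the dimension names in `sample_vars` are assumed for illustration; in the notebook they come from `pyds['SALT'].dims`.

```python
def build_ce(variables):
    """Join variable names into a DAP4 constraint expression query string."""
    # Fully qualified DAP4 names start with '/', so add it when missing.
    names = [v if v.startswith("/") else "/" + v for v in variables]
    return "?dap4.ce=" + ";".join(names)


# Assumed dimension names; the real ones are read from the remote dataset.
sample_vars = ["THETA", "SALT", "/time", "/k", "/tile", "/j", "/i"]
sample_ce = build_ce(sample_vars)
print(sample_ce)
```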

<span style='color:#0066cc'><font size="5"> **Construct DAP4 URLS:**
 

<font size="3"> A DAP4 url begins with `dap4` as a scheme. 

<font size="3"> **NOTE**: This is only for xarray and <span style='color:#ff6666'>**PyDAP**<span style='color:black'>.


In [None]:
new_urls = [url.replace("https", "dap4", 1)+CE for url in granules_urls]  # replace only the scheme
new_urls[:4]
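
A plain `str.replace` without a count rewrites every occurrence of `https`, including any that appear in the path. The sketch below swaps only the scheme and rejects anything that is not an http(s) URL. The helper `to_dap4` and the `example.com` URL are hypothetical and chosen to expose the difference.

```python
def to_dap4(url, ce=""):
    """Swap an http(s) scheme for 'dap4' and append a constraint expression."""
    scheme, sep, rest = url.partition("://")
    if not sep or scheme not in ("http", "https"):
        raise ValueError(f"expected an http(s) URL, got: {url!r}")
    return "dap4://" + rest + ce


# Hypothetical URL whose path also contains the substring 'https'.
sample_url = "https://opendap.example.com/collections/https_mirror/granule_1992-01"

safe = to_dap4(sample_url, "?dap4.ce=/THETA;/SALT")
naive = sample_url.replace("https", "dap4")

print(safe)   # path left untouched
print(naive)  # path altered as well: '.../dap4_mirror/...'
```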

<span style='color:#0066cc'><font size="5"> **Consolidate the Metadata of all cloud OPeNDAP URLs**

<font size="3"> You can construct a persistent reference to all Cloud OPeNDAP URLs for later use!


In [None]:
cached_session = create_session(use_cache=True, session_kwargs=token)

In [None]:
# clear just in case
cached_session.cache.clear()

In [None]:
%%time
consolidate_metadata(new_urls, cached_session)

## Create a datacube with xarray and pydap as an engine!




In [None]:
%%time
ds = xr.open_mfdataset(new_urls, engine='pydap', session=cached_session, parallel=True, combine='nested', concat_dim='time')

In [None]:
ds

## Download some data

So far, only metadata has been downloaded. Below we plot some data in the North Atlantic Ocean.





In [None]:
%%time
ds['THETA'].isel(time=0, k=0, tile=2).plot(cmap='RdBu_r', vmin=-4, vmax=30);

In [None]:
ds['THETA'].isel(time=0, k=0, tile=2).attrs