<span style='color:#0066cc'> <span style='font-family:serif'> <font size="13"> **Accessing MERRA-2 Data with OPeNDAP**<span style='color:#0066cc'>

<font size="3"><span style='color:Black'> MERRA (Modern-Era Retrospective analysis for Research and Applications) data, 

<span style='color:#0066cc'><font size="5"> **About the "Modern-Era Retrospective analysis for Research and Applications" Version 2 [MERRA-2](https://gmao.gsfc.nasa.gov/reanalysis/MERRA-2/docs/) data**
1. <font size="3"><span style='color:Black'> Assimilates observation types not available to its predecessor, MERRA, and includes updates to the Goddard Earth Observing System (GEOS) model and analysis scheme so as to provide a viable ongoing climate analysis beyond MERRA’s terminus.
2. <font size="3"><span style='color:Black'>The Modern-Era Retrospective Analysis for Research and Applications, version 2 (MERRA-2), is the latest atmospheric reanalysis of the modern satellite era produced by NASA’s Global Modeling and Assimilation Office (GMAO).
3. <font size="3"><span style='color:Black'> Other improvements in the quality of MERRA-2 compared with MERRA include the reduction of some spurious trends and jumps related to changes in the observing system and reduced biases and imbalances in aspects of the water cycle.

**Source**: https://doi.org/10.1175/JCLI-D-16-0758.1



<span style='color:#ff6666'><font size="5">**Requirements**
1. <font size="3"><span style='color:Black'> Have a Bearer Token for EarthData in the Cloud (See `GetStarted` Notebook).
2. <font size="3"><span style='color:Black'> Load the Bearer Token from the local file `token.json`.



In [None]:
from pydap.net import create_session
from pydap.client import get_cmr_urls, consolidate_metadata, open_url
import xarray as xr
import datetime as dt
import json
import matplotlib.pyplot as plt

<span style='font-family:serif'> <font size="5.5"><span style='color:#0066cc'> **Import Token Authorization and create Session**
 


<font size="3.5"> Here we use the Bearer Token to create an authenticated session. The Bearer Token should be stored in a local JSON file, created after completing the `GetStarted` Notebook.



In [None]:
# load token json data
with open('token.json', 'r') as fp:
    token = json.load(fp)

# pass Token Authorization to a new Session.
my_session = create_session(use_cache=True, session_kwargs=token)

<span style='font-family:serif'> <font size="5.5"><span style='color:#0066cc'> **Query opendap urls using NASA's CMR API**

In [None]:
merra2_doi = "10.5067/VJAFPLI1CSIV" # DOI listed, e.g., in the GES DISC MERRA-2 documentation:
                                    # https://disc.gsfc.nasa.gov/datasets/M2T1NXSLV_5.12.4/summary?keywords=MERRA-2

<span style='font-family:serif'> <font size="5.5"><span style='color:#0066cc'> **Filter data via Temporal Searches**

<font size="3.5"> Users can specify a date range to restrict the results returned by NASA's CMR.

<font size="3.5"> There are two ways to specify each date:

1. A `datetime` object from Python's `datetime` package, built as `datetime(year, month, day)`.
2. A string with the format `YYYY-MM-DDTHH:MM:SSZ`.
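
As a minimal sketch of the two formats (the variable names here are illustrative and not part of the notebook), the same date range can be written either as `datetime` objects or as formatted strings:

```python
import datetime as dt

# Option 1: datetime objects (year, month, day)
start = dt.datetime(2023, 1, 1)
end = dt.datetime(2023, 1, 31)
time_range_dt = [start, end]

# Option 2: strings in the YYYY-MM-DDTHH:MM:SSZ format,
# derived here from the same datetime objects
fmt = "%Y-%m-%dT%H:%M:%SZ"
time_range_str = [start.strftime(fmt), end.strftime(fmt)]
print(time_range_str)
```

Deriving the strings from the `datetime` objects keeps the two representations consistent if the range is changed later.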


In [None]:
time_range=[dt.datetime(2023, 1, 1), dt.datetime(2023, 1, 31)] # One month of data

In [None]:
url_limits = 100 # controls the max number of urls returned. Default is 50

In [None]:
urls = get_cmr_urls(doi=merra2_doi, time_range=time_range, limit=url_limits) # you can increase the limit of results
len(urls)

### You can inspect OPeNDAP's server data request form by clicking on each individual data url.

In [None]:
urls[:2]

<span style='font-family:serif'> <font size="5.5"><span style='color:#0066cc'> **Server-side Metadata reduction**

<font size="3.5"> Many of NASA's files contain too many variables, beyond those of interest, and processing their metadata can add unnecessary time to data analysis workflows. Below, we use pydap directly to add query parameters (Constraint Expressions) that instruct the remote NASA OPeNDAP server which variables of interest we need.
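
To make the idea concrete, here is a small sketch of how such a Constraint Expression can be assembled and appended to a URL. The URL and the helper `add_constraint` are hypothetical placeholders for illustration only; the `?dap4.ce=` prefix and the `;` separator follow the pattern used later in this notebook.

```python
def add_constraint(url, variables):
    """Append a DAP4 constraint expression selecting `variables` to `url`."""
    return url + "?dap4.ce=" + ";".join(variables)

# Hypothetical URL, for illustration only (not a real granule).
example_url = "dap4://example.org/opendap/granule.nc4"

# Variables of interest plus the dimensions they depend on.
keep = ["/T2M", "/SLP", "/time", "/lat", "/lon"]

constrained = add_constraint(example_url, keep)
print(constrained)
```

Note that the dimensions are listed alongside the variables, mirroring what the notebook does when it retains each variable's dimensions.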


In [None]:
new_urls = [url.replace("https", "dap4", 1) for url in urls] # the "dap4" scheme tells pydap to use the DAP4 protocol

In [None]:
pyds = open_url(new_urls[0], session=my_session)

In [None]:
print("All variables within dataset: \n", list(pyds.variables()))

In [None]:
Keep_vars = ["/T2M", "/U2M", "/V2M", "/SLP"] # these are the variables we want
dims = list(set([dim for var in Keep_vars for dim in pyds[var].dims]))  # retain their dimensions
Keep_vars += dims
CE="?dap4.ce=" + (';').join(Keep_vars) # need to add this to each url

In [None]:
opendap_urls = [url + CE for url in new_urls]
opendap_urls[:2]

<span style='font-family:serif'> <font size="5.5"><span style='color:#0066cc'> **Consolidate metadata**

<font size="3.5"> All URLs belonging to the same Collection share many identical variables and metadata. The following function
consolidates that redundant metadata.


In [None]:
my_session.cache.clear()

In [None]:
%%time
consolidate_metadata(opendap_urls, concat_dim='time', session=my_session)

In [None]:
len(my_session.cache.urls())

<span style='font-family:serif'> <font size="5.5"><span style='color:#0066cc'> **Create Virtual Aggregated Dataset with Xarray**

<font size="3.5"> Now, you can create a virtually aggregated view of the dataset that is ready to analyze with Xarray and Pydap as an engine.


In [None]:
%%time
ds = xr.open_mfdataset(opendap_urls, engine='pydap', session=my_session, combine='nested', concat_dim="time", chunks={"time":1})
ds

<span style='font-family:serif'> <font size="5.5"><span style='color:#0066cc'> **Making a plot downloads data.**

<font size="3.5"> <span style='color:#ff6666'>**NOTE**<span style='color:black'>: When creating the dataset, we specify a chunking in time. Without this, requesting even a single time step downloads the whole (remote) chunk of data (24 time values). 


In [None]:
%%time
fig, ax = plt.subplots(figsize=(12, 6))
ds['SLP'].isel(time=0).plot();

<font size="3.5"> You can inspect the OPeNDAP url used by Xarray to download data below:

In [None]:
my_session.cache.urls()[0].replace("%5B", "[").replace("%5D", "]").replace("%3A", ":") # decoded
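
The manual replacements above cover only three escape sequences. As an alternative sketch, the standard library's `urllib.parse.unquote` decodes any percent-encoded character. The URL below is a made-up example, not one produced by the server.

```python
from urllib.parse import unquote

# Hypothetical percent-encoded URL, for illustration only.
encoded = "https://example.org/granule.nc4.dap?dap4.ce=/SLP%5B0%3A1%3A0%5D"

# unquote() turns %5B, %3A, %5D (and any other escapes) back into characters.
decoded = unquote(encoded)
print(decoded)
```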

<font size="5"> <span style='color:#ff6666'> **dap responses** <span style='color:black'> 

<font size="3.5"> `.dap` responses are OPeNDAP-native, binary-encoded, chunked data streamed over HTTP by remote OPeNDAP servers. `Pydap` decodes them into NumPy arrays. OPeNDAP's `dap` responses are part of the DAP4 protocol and, unlike NetCDF4 datasets, are streamable.