## Harmony's NetCDF-to-Zarr service

Example notebook: https://github.com/nasa/harmony-netcdf-to-zarr/blob/main/docs/NetCDF-to-Zarr-Example-Usage.ipynb

### Authentication prerequisites:

The harmony.Client class will attempt to use credentials from a local .netrc file, located in the home directory of the filesystem where this notebook is running. This will need to contain entries for Earthdata Login (at minimum for the UAT environment):

`machine urs.earthdata.nasa.gov`
    `login <prod_edl_username>`
    `password <prod_edl_password>`

`machine uat.urs.earthdata.nasa.gov`
    `login <uat_edl_username>`
    `password <uat_edl_password>`
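
Before creating a client, it can help to confirm that the `.netrc` file actually contains the entries listed above. The following is a minimal sketch using only the Python standard library; `missing_edl_hosts` and `EDL_HOSTS` are hypothetical names introduced for illustration and are not part of harmony-py.

```python
import netrc
from pathlib import Path

# Hosts taken from the .netrc entries listed above.
EDL_HOSTS = ("urs.earthdata.nasa.gov", "uat.urs.earthdata.nasa.gov")


def missing_edl_hosts(netrc_path, hosts=EDL_HOSTS):
    """Return the hosts from `hosts` that have no entry in the given .netrc file.

    Hypothetical helper for illustration; not part of harmony-py.
    """
    path = Path(netrc_path)
    if not path.is_file():
        # No file at all: every host is missing.
        return list(hosts)
    entries = netrc.netrc(str(path))
    # authenticators() returns None when the file has no entry for the host
    # (and no `default` entry to fall back on).
    return [host for host in hosts if entries.authenticators(host) is None]
```

Calling `missing_edl_hosts(Path.home() / ".netrc")` returns an empty list when both Earthdata Login entries are present.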


In [1]:
from datetime import datetime

from harmony import Client, Collection, Environment, LinkType, Request
from pprint import pprint
from s3fs import S3FileSystem
import matplotlib.pyplot as plt
import xarray as xr

### Setting up a Harmony client:

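This notebook was written to make requests against test data in the UAT environment, which is selected by passing env=Environment.UAT to the client. In the cell below that argument is commented out, so the client is created with its default environment. An instance of the harmony.Client class is created first; it simplifies interactions with the Harmony API, including request submission and retrieval of results.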

In [2]:
harmony_client = Client()  # env=Environment.UAT is left out here; pass it to target UAT

### Setting up an S3 connection:

The s3fs.S3FileSystem class creates a connection to S3, so that typical filesystem commands can be used against the contents of S3 (see the s3fs documentation). The same instance is used to interact with the outputs of the requests later in this notebook. The credentials needed to access Harmony outputs stored in AWS S3 can be generated using the harmony.Client class:

In [3]:
s3_credentials = harmony_client.aws_credentials()

s3_fs = S3FileSystem(key=s3_credentials['aws_access_key_id'],
                     secret=s3_credentials['aws_secret_access_key'],
                     token=s3_credentials['aws_session_token'],
                     client_kwargs={'region_name':'us-west-2'})
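
The mapping from credential fields to S3FileSystem keyword arguments can be isolated in a small function that fails early when a field is absent. This is only a sketch: `s3fs_kwargs` and `REQUIRED_KEYS` are hypothetical names, and the field names and region are copied from the cell above rather than from any documented schema.

```python
# Field names as used in the cell above (assumed to be the complete set needed).
REQUIRED_KEYS = ("aws_access_key_id", "aws_secret_access_key", "aws_session_token")


def s3fs_kwargs(credentials, region="us-west-2"):
    """Translate a credentials mapping into S3FileSystem keyword arguments.

    Hypothetical helper for illustration; not part of harmony-py or s3fs.
    """
    missing = [key for key in REQUIRED_KEYS if key not in credentials]
    if missing:
        # Report every absent field at once instead of failing on the first.
        raise ValueError(f"credentials are missing: {', '.join(missing)}")
    return {
        "key": credentials["aws_access_key_id"],
        "secret": credentials["aws_secret_access_key"],
        "token": credentials["aws_session_token"],
        "client_kwargs": {"region_name": region},
    }
```

With this in place, the connection could be created as `S3FileSystem(**s3fs_kwargs(s3_credentials))`.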

### Converting a single granule:

The request below targets the Harmony Example L2 Data collection. Each granule is a small NetCDF-4 file containing 4 science variables and the swath dimension variables.

First, a request is constructed via the harmony.Request class. Only the data collection, the output format and the granule to convert are specified, so the request creates a Zarr store from a single granule in the collection.

The request will then be submitted to the Harmony API using the harmony.Client object, and URLs for the results can be retrieved once the job is completed.

In [5]:
collection = Collection(id='C1276812863-GES_DISC')

# Specify a request to create Zarr output for one granule in the collection:
single_granule_request = Request(collection=collection, format='application/x-zarr',
                                 granule_id='G1970102658-GES_DISC')

# Submit the request and wait for it to complete:
single_granule_job_id = harmony_client.submit(single_granule_request)
harmony_client.wait_for_processing(single_granule_job_id, show_progress=True)

# Filter the results to only include links to resources in S3:
single_granule_result_urls = list(harmony_client.result_urls(single_granule_job_id, link_type=LinkType.s3))

KeyError: 'jobID'
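
The `KeyError: 'jobID'` above shows that a lookup of the `jobID` key failed during the request; the output does not show where or why. One plausible reading, stated here as an assumption, is that the service returned a body without a job identifier, such as an error message. The sketch below shows how such a lookup could be made to fail with a more readable message. `extract_job_id` is a hypothetical helper, and the `description` field is an assumed name for where an error message might appear.

```python
def extract_job_id(payload):
    """Return payload['jobID'], or raise a readable error if it is absent.

    Hypothetical helper for illustration; `payload` stands for a decoded
    JSON response body represented as a dict.
    """
    try:
        return payload["jobID"]
    except KeyError:
        # 'description' is an assumed field name; fall back to the whole body.
        detail = payload.get("description", payload)
        raise RuntimeError(f"No jobID in response: {detail}") from None
```

Surfacing the rest of the response in the exception makes it easier to tell a rejected request (for example an unknown collection or granule) from a problem in the client code.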