# This Notebook is no longer supported, a newer version exists [here](https://github.com/podaac/tutorials/blob/master/notebooks/harmony%20subsetting/Harmony%20L2%20Subsetter.ipynb).
## Cloud Level 2 Subsetter (L2SS) API 
This notebook demonstrates how to subset swath (L2) data using the data and services hosted in the cloud.


## Before Beginning

Before you begin this tutorial, make sure you have an Earthdata Login account, which is required to access data from the NASA Earthdata system. Please visit https://urs.earthdata.nasa.gov to register for an Earthdata Login account. It is free to create and only takes a moment to set up.

You will also need a netrc file containing your NASA Earthdata Login credentials in order to execute this notebook. A netrc file can be created manually in a text editor and saved to your home directory. For additional information see: [Authentication for NASA Earthdata](https://nasa-openscapes.github.io/2021-Cloud-Hackathon/tutorials/04_NASA_Earthdata_Authentication.html#authentication-via-netrc-file).
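
If you would rather create the netrc file from Python than from a text editor, the sketch below shows one way to do it. The helper name `write_netrc_entry` and the placeholder credentials are hypothetical, and the demo writes to a temporary file rather than to your real `~/.netrc`. Note that the helper overwrites whatever is at `path`, so it is only suitable when the file holds a single entry.

```python
import netrc
import os
import stat
import tempfile

EARTHDATA_HOST = "urs.earthdata.nasa.gov"


def write_netrc_entry(path, login, password, host=EARTHDATA_HOST):
    """Write a one-entry netrc file (hypothetical helper; overwrites `path`)."""
    with open(path, "w") as fh:
        fh.write(f"machine {host} login {login} password {password}\n")
    # Restrict the file to the owner, since it stores a password in plain text
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    return path


# Demonstration against a temporary file with placeholder credentials
demo_path = os.path.join(tempfile.mkdtemp(), "netrc_demo")
write_netrc_entry(demo_path, "my_username", "my_password")

# Read the entry back with the standard-library parser to confirm the format
login, _, password = netrc.netrc(demo_path).authenticators(EARTHDATA_HOST)
```

To write the real file, you would pass the path to `.netrc` in your home directory (for example `os.path.expanduser("~/.netrc")`) together with your own credentials.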

### Learning Objective: 
- Subset a specific file/granule that has already been found, using the PO.DAAC L2 subsetter


In [1]:
from harmony import BBox, Client, Collection, Request, Environment, LinkType
from IPython.display import display, JSON
import tempfile
import shutil
import xarray as xr
import cartopy.crs as ccrs
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1 import make_axes_locatable
from pandas.plotting import register_matplotlib_converters
import numpy as np

##  Subset of a PO.DAAC Granule

We build on the root URL in order to actually perform a transformation.  The first transformation is a subset of a selected granule.  _At this time, this requires discovering the granule ID from CMR_.  That information can then be appended to the root URL and used to call Harmony with the help of the `harmony-py` library.

**Notes:**
  The L2 subsetter currently streams the data back to the user, and does not stage data in S3 for redirects. This is functionality we will be adding over time.
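
To make "appending to the root URL" concrete, here is a minimal sketch that assembles a subsetting URL by hand. The URL shape is copied from the curl command printed further down in this notebook; `build_subset_url` is a hypothetical helper, not part of `harmony-py`, which builds and submits this URL for you (and adds parameters such as `forceAsync`).

```python
from urllib.parse import urlencode

HARMONY_ROOT = "https://harmony.earthdata.nasa.gov"


def build_subset_url(collection_id, granule_id, west, south, east, north,
                     root=HARMONY_ROOT):
    """Assemble a Harmony coverages URL for a spatial subset of one granule.

    Hypothetical helper for illustration only.
    """
    path = (f"{root}/{collection_id}/ogc-api-coverages/1.0.0"
            "/collections/all/coverage/rangeset")
    # A list of pairs keeps both `subset` parameters and their order
    params = [
        ("subset", f"lat({south}:{north})"),
        ("subset", f"lon({west}:{east})"),
        ("granuleId", granule_id),
    ]
    return f"{path}?{urlencode(params)}"


url = build_subset_url("C1940471193-POCLOUD", "G1969371708-POCLOUD", 0, 0, 1, 1)
print(url)
```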
  
Create a Harmony-py client

In [2]:
harmony_client = Client(env=Environment.PROD)

With the client created, we can construct and validate the request. As this is a subsetting request for a single granule, we specify the collection, the spatial bounds, and the ID of the granule we want to subset.

In [3]:
collection = Collection(id='C1940471193-POCLOUD') #Jason-1 GDR SSHA version E NetCDF

request = Request(
    collection=collection,
    spatial=BBox(0,0,1,1), # 1 degree box
    granule_id='G1969371708-POCLOUD' #JA1_GPR_2PeP374_173_20120303_121639_20120303_125911.nc
)

request.is_valid()

True

Now that we have a valid request we simply need to call the `submit` function using the client we created earlier and pass in the request as a parameter.

_Tip:_ if you want to see the request before submitting it, use the `request_as_curl` function on the client to get an equivalent curl command for the request that will be submitted.

In [4]:
print(harmony_client.request_as_curl(request))
job_id = harmony_client.submit(request)
print(f'Job ID: {job_id}')

curl -X GET -H 'Accept: */*' -H 'Accept-Encoding: gzip, deflate' -H 'Connection: keep-alive' -H 'Cookie: urs_user_already_logged=yes; token=*****; _urs-gui_session=0c2f471216e220fc8ef81d7f18a5ddfb' -H 'User-Agent: Windows/10 CPython/3.8.12 harmony-py/0.4.2 python-requests/2.25.1' 'https://harmony.earthdata.nasa.gov/C1940471193-POCLOUD/ogc-api-coverages/1.0.0/collections/all/coverage/rangeset?forceAsync=true&subset=lat%280%3A1%29&subset=lon%280%3A1%29&granuleId=G1969371708-POCLOUD'
Job ID: 8fad49e8-c95f-4a98-8e99-d5b053d86de7


In [5]:
print(harmony_client.status(job_id))

print('\nWaiting for the job to finish')
results = harmony_client.result_json(job_id, show_progress=True)

{'status': 'running', 'message': 'The job is being processed', 'progress': 0, 'created_at': datetime.datetime(2022, 10, 25, 17, 5, 0, 76000, tzinfo=tzutc()), 'updated_at': datetime.datetime(2022, 10, 25, 17, 5, 0, 438000, tzinfo=tzutc()), 'created_at_local': '2022-10-25T10:05:00-07:00', 'updated_at_local': '2022-10-25T10:05:00-07:00', 'data_expiration': datetime.datetime(2022, 11, 24, 17, 5, 0, 76000, tzinfo=tzutc()), 'data_expiration_local': '2022-11-24T09:05:00-08:00', 'request': 'https://harmony.earthdata.nasa.gov/C1940471193-POCLOUD/ogc-api-coverages/1.0.0/collections/all/coverage/rangeset?forceAsync=true&subset=lat(0%3A1)&subset=lon(0%3A1)&granuleId=G1969371708-POCLOUD', 'num_input_granules': 1}

Waiting for the job to finish


 [ Processing:   0% ] |                                                   | [|]


ConnectionError: ('Connection aborted.', TimeoutError(10060, 'A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond', None, 10060, None))
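
The run captured above ended with a `ConnectionError` while waiting for the job result. One possible way to make that step more tolerant of a dropped connection is to retry it with a growing delay. The sketch below assumes the failure is transient; `retry_call` is a hypothetical helper and not part of `harmony-py`. It catches `OSError` by default because both the built-in `ConnectionError` and the exceptions raised by the `requests` library derive from it.

```python
import time


def retry_call(func, attempts=3, base_delay=1.0, exceptions=(OSError,),
               sleep=time.sleep):
    """Call func(), retrying on `exceptions` with exponential backoff.

    Hypothetical helper: waits base_delay, then 2x, 4x, ... between
    attempts and re-raises the last error once attempts are exhausted.
    """
    for attempt in range(attempts):
        try:
            return func()
        except exceptions:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


# Possible use with the client from this notebook (not executed here):
# results = retry_call(
#     lambda: harmony_client.result_json(job_id, show_progress=True)
# )
```

Because the job keeps running on the server side, retrying the status or result call with the same `job_id` does not resubmit the request.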

In [None]:
temp_dir = tempfile.mkdtemp()
futures = harmony_client.download_all(job_id, directory=temp_dir, overwrite=True)
file_names = [f.result() for f in futures]
file_names

In [None]:
ds = xr.open_dataset(file_names[0])
ds

In [None]:
ds.ssha.plot()

## Verify the subsetting worked

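The bounds to check against are those of the bounding box used in the request, `BBox(0, 0, 1, 1)`, given as west, south, east, north.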


In [None]:
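# Bounding box used in the request: BBox(west, south, east, north)
bblon_min, bblat_min, bblon_max, bblat_max = 0, 0, 1, 1

lat_max = ds.lat.max()
lat_min = ds.lat.min()

lon_min = ds.lon.min()
lon_max = ds.lon.max()


# Points lying exactly on the box edge count as inside the subset
if lat_max <= bblat_max and lat_min >= bblat_min:
    print("Successful Latitude subsetting")
else:
    assert False, "Latitude subsetting failed"


if lon_max <= bblon_max and lon_min >= bblon_min:
    print("Successful Longitude subsetting")
else:
    assert False, "Longitude subsetting failed"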
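
The same check can be wrapped in a small reusable function, which is convenient when verifying several subsets. This is a minimal sketch in plain Python; `within_bbox` is a hypothetical helper, it treats points on the box edge as inside, and it assumes longitudes use the same convention as the bounding box (no wrap-around handling). A NaN coordinate fails the check.

```python
def within_bbox(lats, lons, west, south, east, north):
    """Return (lat_ok, lon_ok): whether every coordinate lies inside the box.

    Hypothetical helper; bounds are inclusive.
    """
    lat_ok = all(south <= float(v) <= north for v in lats)
    lon_ok = all(west <= float(v) <= east for v in lons)
    return lat_ok, lon_ok


# Small made-up coordinate lists, purely to exercise the function
inside = within_bbox([0.1, 0.5, 1.0], [0.0, 0.3, 0.9], 0, 0, 1, 1)
outside = within_bbox([0.1, 1.2], [0.2, 0.4], 0, 0, 1, 1)
print(inside, outside)

# Possible use with the dataset from this notebook (not executed here):
# within_bbox(ds.lat.values.ravel(), ds.lon.values.ravel(), 0, 0, 1, 1)
```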
    

## Plot swath onto a map

In [None]:
ax = plt.axes(projection=ccrs.PlateCarree())
ax.coastlines()

plt.scatter(ds.lon, ds.lat, lw=2, c=ds.ssha)
plt.colorbar()
plt.clim(-0.3, 0.3)

plt.show()