# Example of erddap dataset access in pyferret

## Basic Description of Content in this Notebook

This notebook walks through a few tasks:
- using pyferret (in a jupyter notebook) : python3, pyferret, pyferret-magic
- accessing erddap data (via opendap in python/xarray and ferret)
    + xarray is a "pandas like library for multi-dimensional array data" and plays very well with CF/COARDS netcdf files
- using erddapy (python) for erddap exploration (**FUTURE WORK**) 
    + see [https://ioos.github.io/erddapy](https://ioos.github.io/erddapy) for basic description and guidance
- searching erddap clients for potential datasets
- some additional tips and tricks

Short list of convenient ERDDAP Servers
- https://coastwatch.pfeg.noaa.gov/erddap/info/index.html (SWFSC)
- https://ferret.pmel.noaa.gov/pmel/erddap/index.html (PMEL - FOCI Public data would be here)
- https://polarwatch.noaa.gov/erddap/index.html
- http://erddap.aoos.org/erddap/index.html

GitHub repo of additional resources (including more servers, multi-search urls, and software/tools developed for erddap)
- https://github.com/IrishMarineInstitute/awesome-erddap

**note** - erddap is at v2.11 as of Jan 2021. Older versions work just fine, but the version a server runs can be read as a rough proxy for the amount of support/development energy at each institution.

In [1]:
import pyferret

In [2]:
%load_ext ferretmagic

# note: in the following cells, `%%ferret` is the cell magic that integrates pyferret with JupyterLab;
#  it is not part of the standard pyferret implementation

## List the details of a single dataset using pyferret/ferret

Notice that I am using the full url to a public dataset but with **no** filetype ending, for example
https://ferret.pmel.noaa.gov/pmel/erddap/tabledap/dy1104_profile_data - This is how ferret recognizes the opendap protocol. The cell below applies the same pattern to a dataset on a different server.

In [7]:
%%ferret

yes? use "http://akutan.pmel.noaa.gov:8080/erddap/tabledap/PSEA1001_profile_data"
show data
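A suffix-free url is the opendap base address, and the same address with `.dds` or `.das` appended returns the standard DAP2 structure and attribute descriptions as plain text. The sketch below only builds those urls (the helper name `dap_metadata_urls` is mine, not part of pyferret or erddap), which can be handy for checking in a browser what `show data` should report.

```python
def dap_metadata_urls(dataset_url):
    """Return the opendap base url plus its DDS and DAS metadata urls."""
    base = dataset_url.rstrip("/")
    return {
        "opendap": base,       # no suffix: the form ferret/xarray open
        "dds": base + ".dds",  # structure: variables, types, dimensions
        "das": base + ".das",  # attributes: units, long_name, global metadata
    }


urls = dap_metadata_urls(
    "https://ferret.pmel.noaa.gov/pmel/erddap/tabledap/dy1104_profile_data"
)
for kind, url in urls.items():
    print(f"{kind:8s} {url}")
```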

## Do the same with xarray

In [4]:
import xarray as xa

In [5]:
dataset = xa.open_dataset("https://ferret.pmel.noaa.gov/pmel/erddap/tabledap/dy1104_profile_data")

In [None]:
dataset

In [7]:
#the above implementation fails and it is unclear why, but the failure is not unique to the PMEL ctd tabular data

Notice: Data retrieved via tabledap arrives in serial form (notice the S.{parameter} names) because the data is stored in a "DSG" (discrete sampling geometry) format and is effectively indexed by a unique table row.  I am not sufficiently versed in ferret to know how to manipulate this currently, and there are alternatives to ferret for working with this style of data.
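To make the "indexed by table row" idea concrete, here is a minimal plain-python sketch of regrouping a flat table into one list of observations per profile. The rows are made up for illustration (they are not values from the PMEL dataset), and the `depth` and `temperature` column names are assumptions; `id` is used as the key because it appears in the dataset's `cdm_profile_variables`.

```python
from collections import defaultdict


def group_rows_by_profile(rows, key="id"):
    """Group flat table rows into {profile id: [rows of that profile]}."""
    profiles = defaultdict(list)
    for row in rows:
        profiles[row[key]].append(row)
    return dict(profiles)


# Made-up rows standing in for a tabledap response: one row per observation,
# with the profile-level id repeated on every row of the same cast.
rows = [
    {"id": "castA", "depth": 1.0, "temperature": 5.2},
    {"id": "castA", "depth": 2.0, "temperature": 5.1},
    {"id": "castB", "depth": 1.0, "temperature": 4.8},
]

profiles = group_rows_by_profile(rows)
for profile_id, observations in profiles.items():
    print(profile_id, [obs["depth"] for obs in observations])
```

With pandas the same regrouping would normally be a `groupby` on the id column.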

## Plot the data retrieved

Gridded datasets will be more straightforward to use via opendap than the tabular datasets

In [8]:
%%ferret

yes? use "https://coastwatch.pfeg.noaa.gov/erddap/griddap/ncdcOisst21NrtAgg"
show data

***Recent SST from the NCDC OI 0.25deg product***

In [9]:
%%ferret -s 400,400

shade sst[l=253]
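A rough xarray counterpart to `shade sst[l=253]` is sketched below. It assumes the variable is called `sst`, that the time dimension is called `time`, and that any remaining depth-like dimension has length one; none of this is checked against the server here. Ferret's `l` index counts from 1, while xarray's `isel` counts from 0, so the index is shifted by one. The function names are my own.

```python
def ferret_l_to_index(l):
    """Convert Ferret's 1-based L (time step) index to a 0-based index."""
    if l < 1:
        raise ValueError("Ferret L indices start at 1")
    return l - 1


def plot_sst_step(url, l=253):
    """Open a griddap dataset over opendap and plot one time step of sst."""
    # Imported here so the index helper above works without xarray installed.
    import xarray as xa

    ds = xa.open_dataset(url)
    # Assumption: variable "sst" with a "time" dimension; squeeze() drops
    # any length-one dimension (such as a single depth level).
    sst = ds["sst"].isel(time=ferret_l_to_index(l)).squeeze()
    sst.plot()  # for a 2-D DataArray xarray draws a pcolormesh
    return sst


print(ferret_l_to_index(253))
```

Calling `plot_sst_step("https://coastwatch.pfeg.noaa.gov/erddap/griddap/ncdcOisst21NrtAgg")` needs network access, xarray with an opendap-capable backend, and matplotlib.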

# Alternative erddap/opendap ingest to ferret/xarray

Not to confuse the issue, but there are other opendap / thredds programs, and there is information you can glean from the opendap services that run parallel to the data service.  ERDDAP can provide data in many different formats (see the example image) and your opendap software may need to communicate with a specific service.

<img src="images/fileoutput.png" width="400" height="200" />


# ERDDAP Website-driven Tips and Tricks

## Complex Searching

(the following is a screen grab from the '?' to the right of the dataset search bar on an erddap server)

<img src="images/ERDDAPSearchTips.png" width="400" height="200" />


## Multi-ERDDAP search locations

[http://erddap.com](http://erddap.com) - this site is set up by a third party (the Irish Marine Institute)

[https://coastwatch.pfeg.noaa.gov/erddap/download/SearchMultipleERDDAPs.html](https://coastwatch.pfeg.noaa.gov/erddap/download/SearchMultipleERDDAPs.html) - developed by Bob Simons and SWFSC

## Using advanced search (and erddapy) to refine datasets

The advanced search page is available on each independent erddap server (https://ferret.pmel.noaa.gov/pmel/erddap/search/advanced.html, for example, on the pmel public server).  The examples below use a python package (erddapy - https://ioos.github.io/erddapy/v0.9.0/01-longer_intro-output.html) to supply search terms RESTfully, but the web interface is functionally identical.  I will only be passing "Full Text Search" terms, but other search parameters can be supplied as well.

For the first example, I am supplying the Full Text Search term "foci".
For the second, I am supplying the Full Text Search term "foci -Bering", where the '-' removes any result containing the word that follows it.

In [3]:
# using erddapy - an alternative to the opendap protocol of xarray/ferret
from erddapy import ERDDAP
from erddapy.doc_helpers import show_iframe

In [4]:
#specify general erddap url
e = ERDDAP(server="https://ferret.pmel.noaa.gov/pmel/erddap/")

In [5]:
#pass search terms in as you would on the web interface, show the html response in-line
from erddapy.doc_helpers import show_iframe

search_url = e.get_search_url(search_for="foci", response="html")

**search for just 'foci' on the pmel server**

In [6]:
show_iframe(search_url)

**find all foci datasets and remove any dataset with 'Bering' in the searchable terms (this drops a single dataset from the previous results)**

In [7]:
search_url = e.get_search_url(search_for="foci -Bering", response="html")
show_iframe(search_url)
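The same full text search can also be written as a plain url without erddapy. The sketch below targets the simple search endpoint (`search/index.<type>`) rather than the advanced one that erddapy builds, and asks for csv so the result list could be read as a table; the helper name `erddap_search_url` is mine.

```python
from urllib.parse import urlencode


def erddap_search_url(server, search_for, response="csv", items_per_page=1000):
    """Build a full text search url for an erddap server."""
    query = urlencode(
        {"page": 1, "itemsPerPage": items_per_page, "searchFor": search_for}
    )
    return f"{server.rstrip('/')}/search/index.{response}?{query}"


url = erddap_search_url("https://ferret.pmel.noaa.gov/pmel/erddap/", "foci -Bering")
print(url)
```

Passing the resulting url to `pandas.read_csv` should give one row per matching dataset, assuming the server is reachable.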

# ERDDAPY and opendap tabular and gridded

In [8]:
d = ERDDAP(server="https://ferret.pmel.noaa.gov/pmel/erddap/",
    protocol='tabledap',
    response='opendap',
)
d.dataset_id="dy1104_profile_data"

isnc=True
isCSV = False
if isCSV:
    df_m = d.to_pandas(
                #index_col='time (UTC)',
                parse_dates=True,
                skiprows=(1,)  # units information can be dropped.
                )
    df_m.sort_index(inplace=True)
    df_m.columns = [col.split()[0] for col in df_m.columns]  # drop the "(units)" part of each column name

if isnc:
    df_m = d.to_xarray()
df_m

In [9]:
from netCDF4 import Dataset

opendap_url = d.get_download_url(
    response="opendap",
)
with Dataset(opendap_url) as nc:
    print(nc)

<class 'netCDF4._netCDF4.Dataset'>
root group (NETCDF3_CLASSIC data model, file format DAP2):
    AIR_TEMP: 6.6
    BAROMETER: 0
    CAST: 028
    cdm_data_type: Profile
    cdm_profile_variables: id, cast, cruise, time, longitude, lon360, latitude
    COMPOSITE: 0
    Conventions: CF-1.6, COARDS, ACDD-1.3
    creation_date: June 27, 2018 19:24 UTC
    creator_name: PMEL EcoFOCI
    creator_type: institution
    CRUISE: dy1104
    DATA_CMNT: ,
    DATA_TYPE: CTD
    Easternmost_Easting: -163.8388
    EPIC_FILE_GENERATOR: EcoFOCI_netCDF_write.py 0.4.0
    featureType: Profile
    geospatial_lat_max: 60.07633
    geospatial_lat_min: 56.65417
    geospatial_lat_units: degrees_north
    geospatial_lon_max: -163.8388
    geospatial_lon_min: -172.1755
    geospatial_lon_units: degrees_east
    geospatial_vertical_max: 72.0
    geospatial_vertical_min: 0.0
    geospatial_vertical_positive: down
    geospatial_vertical_units: m
    history: FERRET V7.42 (optimized)  2-Aug-18
2021-01-19T17:36:1

# ERDDAP for OpenDap endpoints

Given Ferret/PyFerret/Xarray's challenge with opendap access of tabular datasets... how to better share the "content" of the netcdf file?

- xarray for gridded data
- erddapy for python (tabular or gridded)
- weblink (and netcdf files inherently), perhaps autodownloaded (or just the final data point?)
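For the weblink option, a tabledap request is just a url of the form `<server>/tabledap/<dataset id>.<file type>?<variables>&<constraints>`, so a subset can be shared as a single link. Here is a minimal sketch of building one; the helper name `tabledap_url` is mine, the variable names come from the `cdm_profile_variables` listing above, and the latitude cut-off is an arbitrary example value.

```python
from urllib.parse import quote


def tabledap_url(server, dataset_id, variables=(), constraints=(), file_type="csv"):
    """Build a tabledap download url with optional variables and constraints."""
    query = ",".join(variables)
    for constraint in constraints:
        # Percent-encode operators such as > and <, but keep = readable.
        query += "&" + quote(constraint, safe="=")
    url = f"{server.rstrip('/')}/tabledap/{dataset_id}.{file_type}"
    return f"{url}?{query}" if query else url


link = tabledap_url(
    "https://ferret.pmel.noaa.gov/pmel/erddap/",
    "dy1104_profile_data",
    variables=("id", "time", "latitude", "longitude"),
    constraints=("latitude>=58",),
)
print(link)
```

Changing `file_type` to `"nc"` would request a netcdf file instead of csv for the same subset.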