# ERDDAP Access
Find OOI and IOOS salinity data within a specified time interval and bounding box, using ERDDAP's advanced search and data access RESTful APIs.

_Note: This Jupyter notebook originated from [an erddapy notebook from the IOOS gallery](https://ioos.github.io/notebooks_demos/notebooks/2018-03-01-erddapy)_

[ERDDAP](https://coastwatch.pfeg.noaa.gov/erddapinfo/) is a data server that gives you a simple, consistent way to download data in the format and the spatial and temporal coverage that you want. ERDDAP is a web application with an interface for people to use. It is also a RESTful web service that allows data access directly from any computer program (e.g. Matlab, R, or webpages).

This notebook uses the Python client [erddapy](https://pyoceans.github.io/erddapy) to help construct the RESTful URLs and translate the responses into pandas and xarray objects.

A typical ERDDAP RESTful URL looks like:

[https://data.ioos.us/gliders/erddap/tabledap/whoi_406-20160902T1700.mat?depth,latitude,longitude,salinity,temperature,time&time>=2016-07-10T00:00:00Z&time<=2017-02-10T00:00:00Z&latitude>=38.0&latitude<=41.0&longitude>=-72.0&longitude<=-69.0](https://data.ioos.us/gliders/erddap/tabledap/whoi_406-20160902T1700.mat?depth,latitude,longitude,salinity,temperature,time&time>=2016-07-10T00:00:00Z&time<=2017-02-10T00:00:00Z&latitude>=38.0&latitude<=41.0&longitude>=-72.0&longitude<=-69.0)

Let's break it down to smaller parts:

- **server**: https://data.ioos.us/gliders/erddap/
- **protocol**: tabledap
- **dataset_id**: whoi_406-20160902T1700
- **response**: .mat
- **variables**: depth,latitude,longitude,salinity,temperature,time
- **constraints**:
    - time>=2016-07-10T00:00:00Z
    - time<=2017-02-10T00:00:00Z
    - latitude>=38.0
    - latitude<=41.0
    - longitude>=-72.0
    - longitude<=-69.0
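
As a rough illustration of how these parts fit together, the sketch below reassembles the example URL with plain string formatting. The helper `build_tabledap_url` is hypothetical (it is not part of erddapy), and it performs no percent-encoding, which a real client would normally handle; the constraint keys follow the `'time>='` style used with erddapy later in this notebook.

```python
def build_tabledap_url(server, dataset_id, response, variables, constraints):
    """Assemble a tabledap request URL from its parts (hypothetical helper)."""
    # server + protocol + dataset_id + response form the path
    base = f"{server.rstrip('/')}/tabledap/{dataset_id}.{response}"
    # the query starts with the comma-separated variable list
    query = ",".join(variables)
    # each constraint is appended as "&<name><operator><value>"
    for key, value in constraints.items():
        query += f"&{key}{value}"
    return f"{base}?{query}"


url = build_tabledap_url(
    server="https://data.ioos.us/gliders/erddap/",
    dataset_id="whoi_406-20160902T1700",
    response="mat",
    variables=["depth", "latitude", "longitude", "salinity", "temperature", "time"],
    constraints={
        "time>=": "2016-07-10T00:00:00Z",
        "time<=": "2017-02-10T00:00:00Z",
        "latitude>=": 38.0,
        "latitude<=": 41.0,
        "longitude>=": -72.0,
        "longitude<=": -69.0,
    },
)
print(url)
```

In the rest of the notebook, erddapy builds the equivalent URLs for us through `get_search_url` and `get_download_url`.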

In [None]:
import pandas as pd
from erddapy import ERDDAP
from erddapy.utilities import urlopen
import hvplot.xarray
import hvplot.pandas
import holoviews as hv
import numpy as np

## 1. Search ERDDAP "catalog"

In [None]:
# ERDDAP for OOI
server = 'http://erddap.dataexplorer.oceanobservatories.org/erddap'
protocol = 'tabledap'

In [None]:
ooi = ERDDAP(server=server, protocol=protocol)

A search for everything looks like this. The only effective filtering parameters being passed are `protocol=tabledap` and `response=csv`.

In [None]:
df = pd.read_csv(urlopen(ooi.get_search_url(response='csv', search_for='all')))
len(df)

Now we'll refine our search by adding temporal, bounding box and variable constraints. 

In [None]:
min_time = '2018-07-01T00:00:00Z'
max_time = '2018-07-15T00:00:00Z'
min_lon, max_lon = -127, -122
min_lat, max_lat = 44, 48
standard_name = 'sea_water_practical_salinity'
cdm_data_type = 'timeseries'

kw = {
    'standard_name': standard_name,
    'min_lon': min_lon,
    'max_lon': max_lon,
    'min_lat': min_lat,
    'max_lat': max_lat,
    'min_time': min_time,
    'max_time': max_time,
    'cdm_data_type': cdm_data_type,
}

In [None]:
search_url = ooi.get_search_url(response='csv', **kw)
search_df = pd.read_csv(urlopen(search_url))
search_df = search_df[['Institution', 'Dataset ID','tabledap']]
search_df

## 2. Read data from one dataset, manually
Let's inspect a specific `dataset_id`.

In [None]:
dataset_id = 'ooi-ce01issm-sbd17-06-ctdbpc000'

Construct the ERDDAP URL to get the data

In [None]:
ooi.dataset_id = dataset_id
ooi.constraints = {'time>=': min_time,'time<=': max_time}
ooi.response='csv'
ooi.variables = [ 'time', ooi.get_var_by_attr(dataset_id=dataset_id, standard_name=standard_name)[0]]
print(ooi.get_download_url())

Read the data into Xarray

In [None]:
ds = ooi.to_xarray(decode_times=True)
#ds = ds.swap_dims({'row':'time'})
#[ds[var].plot() for var in ds.data_vars];

In [None]:
ds.sea_water_practical_salinity.hvplot(grid=True)

## 3. Read data from all datasets, automatically

Let's narrow this down by keeping only the "CTDBP" data, i.e. the datasets whose ID contains `ctdbp`.

In [None]:
ctdbp = search_df[search_df['Dataset ID'].str.contains("ctdbp")].reset_index()
print(len(ctdbp))
ctdbp

In [None]:
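def alllonlat(e, cdm_data_type, min_time, max_time):
    """Return datasetID, minLongitude and minLatitude for datasets overlapping the time window."""
    # A dataset overlaps the window if it starts before max_time and ends after min_time
    url = (
        '{}/tabledap/allDatasets.csv?datasetID%2CminLongitude%2CminLatitude'
        '&cdm_data_type=%22{}%22&minTime%3C={}&maxTime%3E={}'
    ).format(e.server, cdm_data_type, max_time, min_time)
    print(url)
    # The second line of an ERDDAP CSV response holds units, not data, so skip it
    df = pd.read_csv(urlopen(url), skiprows=[1])
    return df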
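
The `minTime<=max_time` and `maxTime>=min_time` constraints in the `allDatasets` query select datasets whose time coverage overlaps the requested window. The sketch below illustrates that overlap test on its own, with no network access; the `overlaps` helper and the coverage values are hypothetical and only meant to show the logic.

```python
from datetime import datetime

FMT = "%Y-%m-%dT%H:%M:%SZ"


def overlaps(ds_min, ds_max, query_min, query_max):
    """True if a dataset's [ds_min, ds_max] coverage intersects the query window."""
    return ds_min <= query_max and ds_max >= query_min


query_min = datetime.strptime("2018-07-01T00:00:00Z", FMT)
query_max = datetime.strptime("2018-07-15T00:00:00Z", FMT)

# Hypothetical dataset coverages, chosen to exercise each case
coverage = {
    "starts_before_ends_inside": ("2018-06-01T00:00:00Z", "2018-07-05T00:00:00Z"),
    "ended_before_query": ("2017-01-01T00:00:00Z", "2018-06-30T00:00:00Z"),
    "starts_after_query": ("2018-08-01T00:00:00Z", "2018-09-01T00:00:00Z"),
    "spans_query": ("2015-01-01T00:00:00Z", "2020-01-01T00:00:00Z"),
}

results = {
    name: overlaps(
        datetime.strptime(start, FMT),
        datetime.strptime(end, FMT),
        query_min,
        query_max,
    )
    for name, (start, end) in coverage.items()
}
print(results)
```

Only the datasets that start before the window ends and end after it starts are kept, which is why `max_time` is paired with `minTime` and `min_time` with `maxTime` in the URL.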

In [None]:
dfll = alllonlat(ooi, 'TimeSeries', min_time, max_time)
# extract lon,lat values of matching datasets from allDatasets dataframe
dfr = dfll[dfll['datasetID'].isin(search_df['Dataset ID'])]

In [None]:
dfr

In [None]:
dfr.hvplot.points(x='minLongitude', y='minLatitude', geo=True, 
                  tiles='OSM', color='red', alpha=0.2, hover_cols=['datasetID'],
                  xlim=(min_lon, max_lon), padding=20, title='OOI Stations with Salinity')

Loop through the datasets, extracting the data from each one.

In [None]:
df_list = []
hv_list = []
for dataset_id in ctdbp['Dataset ID'].values:
    ooi.dataset_id = dataset_id
    ooi.variables = [ 
        'time', 
        ooi.get_var_by_attr(dataset_id=dataset_id,  standard_name=standard_name)[0]
    ]
    try: 
        ds = ooi.to_xarray(decode_times=True)
        df_list.append(ds)
        print(dataset_id)
        hv_list.append(ds[ooi.variables[-1]].hvplot(label=dataset_id))
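    except Exception as err:
        # Skip datasets that fail to load, but report why
        print(f'Skipping {dataset_id}: {err}')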

In [None]:
hv.Overlay(hv_list).opts(legend_position='right', width=900, legend_offset=(0,0))

### Find all the IOOS salinity data
Let's run the same query against the IOOS sensors ERDDAP server.

In [None]:
ioos = ERDDAP(server="http://erddap.sensors.ioos.us/erddap", protocol=protocol)

In [None]:
search_url = ioos.get_search_url(response='csv', **kw)
search_df = pd.read_csv(urlopen(search_url))
search_df = search_df[['Institution', 'Dataset ID','tabledap']]
search_df

In [None]:
ioos.constraints = {'time>=': min_time,'time<=': max_time}
ioos.response='csv'

In [None]:
df_list = []
hv_list = []

for dataset_id in search_df['Dataset ID'].values:
    ioos.dataset_id = dataset_id
    ioos.variables = [ 
        'time', 
        ioos.get_var_by_attr(dataset_id=dataset_id,  standard_name=standard_name)[0]
    ]
    try: 
        ds = ioos.to_xarray(decode_times=True)
        df_list.append(ds)
        print(dataset_id)
        hv_list.append(ds[ioos.variables[-1]].hvplot(label=dataset_id))
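    except Exception as err:
        # Skip datasets that fail to load, but report why
        print(f'Skipping {dataset_id}: {err}')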

In [None]:
hv.Overlay(hv_list).opts(legend_position='right', width=900, legend_offset=(0,0))

In [None]:
dfll = alllonlat(ioos, 'TimeSeries', min_time, max_time)
# extract lon,lat values of matching datasets from allDatasets dataframe
dfr = dfll[dfll['datasetID'].isin(search_df['Dataset ID'])]

In [None]:
dfr.hvplot.points(x='minLongitude', y='minLatitude', geo=True, 
                  tiles='OSM', color='red', alpha=0.4, 
                  xlim=(min_lon, max_lon), padding=20, title='IOOS Stations with Salinity')