# ERDDAP FUNCTIONALITY

## Table of contents
### [Error resolution](#errorresolution)
### [Functions](#functions)
* [create_erddap_url](#createerddapurl)
* [erddap_pull](#erddap_pull)
### [Examples](#examples)
* [Tabledap](#tabledap)
  * [eMOLT](#emolt)
  * [EcoMon](#ecomon)
* [Griddap](#griddap)
  * [ACSPO](#acspo)
  * [GHRSST](#ghrsst)

# TO DO
* tabledap --> get so you dont need to input all variables individually
  * look into incorporating ERDDAPY here
* incorporating other servers
  * do we want 1 really flexible function that pulls from ERDDAP/THREDDS/etc
  * or do we want separate functions depending on the server type? 

In [16]:
import xarray as xr
import requests
import pandas as pd

# Error resolution</font> <a class="anchor" id="errorresolution"></a>
If you try to use xarray to open a netcdf using an ERDDAP link, the error ```OSError: [Errno -75] NetCDF: Malformed or unexpected Constraint ``` occurs (as shown below). ERDDAP is looking for a SSL certificate verification. If you remove the ```verify=False``` from the ```requests.get()``` function, it will throw a ```SSL3_GET_SERVER_CERTIFICATE:certificate verify failed``` error. 


In [13]:
#error
url = 'https://comet.nefsc.noaa.gov/erddap/griddap/noaa_coastwatch_acspo_v2_reanalysis.nc?sea_surface_temperature%5B(2024-01-31T00:00:00Z):1:(2024-01-01T00:00:00Z)%5D%5B(35):1:(46)%5D%5B(-76):1:(-63)%5D,sst_dtime%5B(2024-01-31T00:00:00Z):1:(2024-01-01T00:00:00Z)%5D%5B(35):1:(46)%5D%5B(-76):1:(-63)%5D'
xr.open_dataset(url)

OSError: [Errno -75] NetCDF: Malformed or unexpected Constraint: 'https://comet.nefsc.noaa.gov/erddap/griddap/noaa_coastwatch_acspo_v2_reanalysis.nc?sea_surface_temperature%5B(2024-01-31T00:00:00Z):1:(2024-01-01T00:00:00Z)%5D%5B(35):1:(46)%5D%5B(-76):1:(-63)%5D,sst_dtime%5B(2024-01-31T00:00:00Z):1:(2024-01-01T00:00:00Z)%5D%5B(35):1:(46)%5D%5B(-76):1:(-63)%5D'

In [12]:
#fixed
url_new = requests.get(url,verify=False).content
xr.open_dataset(url_new, decode_timedelta=True)




# <font color='black'>Functions</font> <a class="anchor" id="functions"></a>

### <font color='blue'>```create_erddap_url```</font> <a class="anchor" id="createerddapurl"></a>
**Purpose:** 
* Create an ERDDAP url for subsetting a dataset
* NOTE: must have the variable names that you want to grab from the dataset

**Arguments:** 
* <u>dataID</u> (string): dataset ID from ERDDAP
* <u>file_type</u> (string): either ```'nc'``` or ```'csv'```
* <u>daptype</u> (string): either ```'tabledap'``` if using point data, or ```'griddap'``` if using gridded data
* <u>var</u> (list of strings): list of variables from dataset
* <u>latmin</u> (int/float) *optional*: minimum latitude for bounding box. If not explicitly defined, default is to subset to the NWA)
* <u>latmax</u> (int/float) *optional*: maxmimum latitude for bounding box. If not explicitly defined, default is to subset to the NWA)
* <u>lonmin</u> (int/float) *optional*: minimum longitude for bounding box. If not explicitly defined, default is to subset to the NWA)
* <u>lonmax</u> (int/float) *optional*: maximum longitude for bounding box. If not explicitly defined, default is to subset to the NWA)
* <u>date_end</u> (string): end date for data slice. Should be formatted as ```'yyyy-mm-dd'```
* <u>date_start</u> (string): start date for data slice. Should be formatted as ```'yyyy-mm-dd'```
* <u>base_url</u> (string) *optional*: beginning part of erddap url. Defaults to ```'https://comet.nefsc.noaa.gov/'```

**Sample Usage:** <br>
```
url =create_erddap_url(dataID='ocdbs_v_erddap1', file_type='csv', daptype='tabledap', var=['sea_surface_temperature'], date_end='2024-06-01', date_start='2024-01-31')
```

**History:** <br>
>* 2/26/25 function initialized

In [1]:
def create_erddap_url(dataID, file_type, daptype, var, latmin=35,latmax=46, lonmin=-76, lonmax=-63, date_end=None, date_start=None, base_url='https://comet.nefsc.noaa.gov/'):
    base = base_url + 'erddap/' + daptype + '/'+ dataID + '.' + file_type +'?'
    if daptype=='griddap':
        varz= []
        for v in var: 
            varz.append(v + '%5B(' + date_start+'T00:00:00Z):1:('+date_end+'T00:00:00Z)%5D%5B(' + str(latmin)+'):1:('+str(latmax)+')%5D%5B('+str(lonmin)+'):1:('+str(lonmax)+')%5D')
        url =base+ ",".join(varz)
    elif daptype=='tabledap':
        varz=[]
        for v in var:
            varz.append(v)
            if v.__contains__('lat'):
                latvar=v
            elif v.__contains__('lon'):
                lonvar=v
            elif v.__contains__('time') or v.__contains__('TIME'):
                timevar=v
        #var.remove(timevar)
        #for v in var:
        #    varz.append(v)#+'%2C')
        url=base+ "%2C".join(varz)+'&'+timevar+'%3E='+date_start+'&'+timevar+'%3C='+date_end+'&'+latvar+'%3C='+str(latmin)+'&'+latvar+'%3C='+str(latmax)+'&'+lonvar+'%3C='+str(lonmin)+'&'+lonvar+'%3C='+str(lonmax)        
    return url

### <font color='blue'>```erddap_pull```</font> <a class="anchor" id="erddappull"></a>
**Purpose:** 
* lazy-load data from ERDDAP url 

**Arguments:** 
* <u>file_type</u> (string): file type. either ```'nc'``` or ```'csv'```
* <u>url</u> (string): erddap url. Either directly pulled from erddap's website or created using ```creatae_erddap_url```

**Sample Usage:** <br>
```
data = erddap_pull('nc',url)
```

**History:** <br>
>* 2/26/25 function initialized

In [None]:
import requests
def erddap_pull(file_type, url):
    if file_type=='nc':
        url = requests.get(url, verify=False).content
        data=xr.open_dataset(url)
    elif file_type =='csv':
        data=pd.read_csv(url)
    return data

# <font color='black'>Examples</font> <a class="anchor" id="examples"></a>

## <font color='black'>Tabledap</font> <a class="anchor" id="tabledap"></a>

##### <font color='black'>eMOLT</font> <a class="anchor" id="emolt"></a>

In [17]:
url= create_erddap_url(dataID='eMOLT_realtime_bottom_temps_and_profiles',file_type='csv',daptype='tabledap',var=['latitude','longitude','time','temperature'], 
                  date_start='2015-01-01', date_end='2024-06-01')

erddap_pull('csv',url)

Unnamed: 0,latitude,longitude,time,temperature
0,degrees_north,degrees_east,UTC,degree_C
1,34.6238,-76.224,2021-01-14T19:35:26Z,9.87
2,34.6254,-76.2214,2021-01-14T19:36:56Z,9.81
3,34.6268,-76.2186,2021-01-14T19:38:26Z,9.79
4,34.6283,-76.2158,2021-01-14T19:39:56Z,9.8
5,34.6297,-76.2129,2021-01-14T19:41:26Z,9.84
6,34.6312,-76.2102,2021-01-14T19:42:56Z,9.85
7,34.6327,-76.2073,2021-01-14T19:44:26Z,9.9
8,34.6343,-76.2047,2021-01-14T19:45:56Z,10.02
9,34.6361,-76.202,2021-01-14T19:47:26Z,10.13


##### <font color='black'>EcoMon</font> <a class="anchor" id="ecomon"></a>

In [18]:
url= create_erddap_url(dataID='ocdbs_v_erddap1',file_type='csv',daptype='tabledap',var=['latitude','longitude','UTC_DATETIME','sea_water_temperature'], 
                  date_start='2024-01-01', date_end='2024-06-01')
erddap_pull('csv',url)

Unnamed: 0,latitude,longitude,UTC_DATETIME,sea_water_temperature
0,degrees_north,degrees_east,UTC,degrees_C
1,27.6417,-80.1983,2024-04-14T07:46:00Z,22.61
2,27.6417,-80.1983,2024-04-14T07:46:00Z,22.61
3,27.6417,-80.1983,2024-04-14T07:46:00Z,22.63
4,27.6417,-80.1983,2024-04-14T07:46:00Z,22.65
...,...,...,...,...
769,34.7083,-76.29,2024-05-17T17:51:00Z,19.07
770,34.7083,-76.29,2024-05-17T17:51:00Z,19.08
771,34.7083,-76.29,2024-05-17T17:51:00Z,19.08
772,34.7083,-76.29,2024-05-17T17:51:00Z,19.09


## <font color='black'>Griddap</font> <a class="anchor" id="griddap"></a>

##### <font color='black'>ACSPO</font> <a class="anchor" id="acspo"></a>

In [21]:
url =create_erddap_url(dataID='noaa_coastwatch_acspo_v2_reanalysis', file_type='nc', daptype='griddap', var=['sea_surface_temperature','sst_dtime'], date_end='2024-01-01', date_start='2024-01-31')
erddap_pull('nc',url)

  data=xr.open_dataset(url)


##### <font color='black'>SST GHRSST</font> <a class="anchor" id="coastwatch"></a>

In [22]:
#griddap coastwatch
url =create_erddap_url(dataID='noaacwBLENDEDsstDLDaily', file_type='nc', daptype='griddap', var=['analysed_sst'], date_start='2024-01-01', 
                       date_end='2024-01-31', base_url='https://coastwatch.noaa.gov/')


erddap_pull('nc',url)

