# Marine EOV Broker



In [1]:
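# Import only the names this notebook uses instead of a wildcard import.
from lib.MarineRiBroker import MarineBroker, ERDDAP_OUTPUT_FORMATS, EOV_LIST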
import logging
import matplotlib.pyplot as plt

logger = logging.getLogger()
logger.setLevel(logging.INFO)
# logger.setLevel(logging.DEBUG)

print(ERDDAP_OUTPUT_FORMATS)
print(EOV_LIST)

['csv', 'geoJson', 'json', 'nc', 'ncCF', 'odvTxt']
['EV_OXY', 'EV_SEATEMP', 'EV_SALIN', 'EV_CURR', 'EV_CHLA', 'EV_CO2', 'EV_NUTS']


## Start the broker

Startup takes some time (about 19 s of wall time in the run below), and its performance still needs improvement. This is because the broker:
* loads the vocabularies
* loads the dataset metadata from every ERDDAP server
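
Both startup steps rely on remote calls, and one of them is listing what each ERDDAP server exposes. As a minimal sketch, assuming the standard ERDDAP `info` endpoints (whether `MarineBroker` uses them is not shown in this notebook), the listing and per-dataset metadata URLs could be built as follows. The helper names, the `https://example.org/erddap` server and the `demo_dataset` ID are hypothetical.

```python
from urllib.parse import urlencode


def dataset_index_url(server, fmt="json", items_per_page=1000):
    """URL of the paginated dataset listing of an ERDDAP server."""
    base = server.rstrip("/")
    params = urlencode({"page": 1, "itemsPerPage": items_per_page})
    return f"{base}/info/index.{fmt}?{params}"


def dataset_info_url(server, dataset_id, fmt="json"):
    """URL of the metadata (attributes and variables) of one dataset."""
    base = server.rstrip("/")
    return f"{base}/info/{dataset_id}/index.{fmt}"


# Hypothetical server and dataset ID, for illustration only.
print(dataset_index_url("https://example.org/erddap"))
print(dataset_info_url("https://example.org/erddap", "demo_dataset"))
```

One metadata request per dataset adds up quickly, which may be part of why wall time is much larger than CPU time at startup.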


**Question:**
Do we want to work with all the datasets on the ERDDAP servers, or do we want to build a fixed list of them?

In [2]:
%%time
broker = MarineBroker()

INFO:root:Querying vocabulary server for EOV : EV_OXY
INFO:root:Querying vocabulary server for EOV : EV_SEATEMP
INFO:root:Querying vocabulary server for EOV : EV_SALIN
INFO:root:Querying vocabulary server for EOV : EV_CURR
INFO:root:Querying vocabulary server for EOV : EV_CHLA
INFO:root:Querying vocabulary server for EOV : EV_CO2
INFO:root:Querying vocabulary server for EOV : EV_NUTS


CPU times: user 3.2 s, sys: 288 ms, total: 3.49 s
Wall time: 18.9 s


## Create a request to the broker
The user must provide the EOVs, the minimum and maximum date, latitude and longitude, and the desired output format.

When creating a query, the broker:
* first looks at every dataset to see whether it matches any EOV requested by the user
* then checks whether these datasets match the time range and bounding box requested by the user
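
The two steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the broker's implementation: the `DatasetSummary` class, the `match_datasets` function and the `demo_*` catalog entries are all hypothetical, and "matching" the time range and bounding box is interpreted here as interval overlap.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DatasetSummary:
    """Hypothetical per-dataset metadata; not the broker's real class."""
    dataset_id: str
    eovs: frozenset
    time_start: date
    time_end: date
    lon_min: float
    lon_max: float
    lat_min: float
    lat_max: float


def overlaps(a_min, a_max, b_min, b_max):
    """True when the closed intervals [a_min, a_max] and [b_min, b_max] intersect."""
    return a_min <= b_max and b_min <= a_max


def match_datasets(datasets, eovs, start, end, lon_min, lon_max, lat_min, lat_max):
    wanted = set(eovs)
    # Step 1: keep datasets that provide at least one requested EOV.
    candidates = [d for d in datasets if wanted & d.eovs]
    # Step 2: keep candidates whose time range and bounding box intersect the request.
    return [
        d.dataset_id
        for d in candidates
        if overlaps(d.time_start, d.time_end, start, end)
        and overlaps(d.lon_min, d.lon_max, lon_min, lon_max)
        and overlaps(d.lat_min, d.lat_max, lat_min, lat_max)
    ]


# Made-up catalog entries, for illustration only.
catalog = [
    DatasetSummary("demo_baltic", frozenset({"EV_SALIN", "EV_SEATEMP"}),
                   date(2000, 1, 1), date(2015, 12, 31), 9.0, 30.0, 53.0, 66.0),
    DatasetSummary("demo_pacific", frozenset({"EV_SALIN"}),
                   date(2000, 1, 1), date(2015, 12, 31), 150.0, 180.0, -20.0, 20.0),
    DatasetSummary("demo_old", frozenset({"EV_OXY"}),
                   date(1990, 1, 1), date(1999, 12, 31), -10.0, 30.0, 30.0, 60.0),
    DatasetSummary("demo_currents", frozenset({"EV_CURR"}),
                   date(2000, 1, 1), date(2015, 12, 31), -10.0, 30.0, 30.0, 60.0),
]

matches = match_datasets(
    catalog, ["EV_SALIN", "EV_OXY"], date(2011, 1, 1), date(2011, 1, 31),
    lon_min=-10, lon_max=30, lat_min=30, lat_max=60,
)
print(matches)
```

Only `demo_baltic` survives both steps: `demo_currents` fails the EOV check, `demo_old` the time check and `demo_pacific` the longitude check. The sketch ignores bounding boxes that cross the antimeridian.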

In [3]:
%%time
# logger.setLevel(logging.DEBUG)
# Arguments: EOV codes, start date, end date, the four bounding-box limits, output format.
# response = broker.submit_request(["EV_SALIN", "EV_OXY", "EV_SEATEMP"], "2011-01-01", "2011-01-31", -10, 30, 30, 60, "csv")
response = broker.submit_request(["EV_SALIN", "EV_OXY", "EV_SEATEMP", "EV_CO2", "EV_CHLA"], "2011-01-01", "2011-01-31", -10, 30, 30, 60, "csv")

CPU times: user 471 ms, sys: 17.8 ms, total: 489 ms
Wall time: 25.1 s


## Results

The interesting part!
The broker returns a BrokerResponse object. Its **queries** attribute is a pandas DataFrame.

For each dataset found for the user request, this DataFrame holds all the global attributes, the query URL and the ErddapRequest object.

In [4]:
response.queries

|  | query_url | abstract | acknowledgement | area_keywords | area_keywords_urn | bathymetry_source | cdm_data_type | Conventions | creator_name | data_access | ... | references | software_version | subsetVariables | testOutOfDate | user_manual_version | creator_email | geospatial_vertical_max | geospatial_vertical_min | geospatial_vertical_positive | geospatial_vertical_units |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SDC_BAL_CLIM_TS_V2_m | https://www.ifremer.fr/erddap/griddap/SDC_BAL_... | ... | ... | Baltic Sea | SDN:C19::2 | The GEBCO Digital Atlas published by the Briti... | Grid | CF-1.6, COARDS, ACDD-1.3 | Swedish Meteorological and Hydrological Institute | http://sdn.oceanbrowser.net/data/SeaDataCloud-... | ... |  |  |  |  |  |  |  |  |  |  |
| SDC_BAL_CLIM_TS_V2_s | https://www.ifremer.fr/erddap/griddap/SDC_BAL_... | ... | ... | Baltic Sea | SDN:C19::2 | The GEBCO Digital Atlas published by the Briti... | Grid | CF-1.6, COARDS, ACDD-1.3 | Swedish Meteorological and Hydrological Institute | http://sdn.oceanbrowser.net/data/SeaDataCloud-... | ... |  |  |  |  |  |  |  |  |  |  |
| ArgoFloats-synthetic-BGC | https://www.ifremer.fr/erddap/tabledap/ArgoFlo... |  |  |  |  |  | TrajectoryProfile | Argo-3.1 CF-1.6, COARDS, ACDD-1.3 | Argo |  | ... | http://www.argodatamgt.org/Documentation | 1.11 (version 30.06.2020 for ARGO_simplified_p... | data_type, data_centre, platform_type, wmo_ins... | now-5days | 1.0 |  |  |  |  |  |
| ArgoFloats | https://www.ifremer.fr/erddap/tabledap/ArgoFlo... |  |  |  |  |  | TrajectoryProfile | Argo-3.1, CF-1.6, COARDS, ACDD-1.3 | Argo |  | ... | http://www.argodatamgt.org/Documentation |  |  |  | 3.1 | support@argo.net |  |  |  |  |
| SDC_GLO_AGG_V2 | https://www.ifremer.fr/erddap/tabledap/SDC_GLO... |  |  |  |  |  | Point | COARDS, CF-1.6, ACDD-1.3 |  |  | ... |  |  | obs_date_qc, position_qc, depth_qc, temp_qc, p... |  |  |  | 7959.866209999999 | -3283.04761 | down | m |
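
Each row carries a ready-made `query_url`. For readers unfamiliar with ERDDAP, the sketch below shows how a tabledap URL is conventionally assembled: the dataset ID, the output format as a file extension, a comma-separated variable list, then `&`-separated constraints with `<` and `>` percent-encoded. It illustrates the general ERDDAP convention rather than the broker's own code, and it does not cover griddap datasets, which use a different syntax. The function name, server, dataset ID and variable names are hypothetical.

```python
from urllib.parse import quote


def tabledap_url(server, dataset_id, variables, constraints, fmt="csv"):
    """Assemble an ERDDAP tabledap request URL.

    constraints is a list of (variable, operator, value) tuples,
    for example ("time", ">=", "2011-01-01T00:00:00Z").
    """
    query = ",".join(variables)
    for name, operator, value in constraints:
        query += f"&{name}{operator}{value}"
    # Percent-encode everything except the separators ERDDAP expects verbatim.
    encoded = quote(query, safe="=&,:")
    return f"{server.rstrip('/')}/tabledap/{dataset_id}.{fmt}?{encoded}"


# Hypothetical server, dataset and variables, for illustration only.
url = tabledap_url(
    "https://example.org/erddap",
    "demo_dataset",
    ["time", "latitude", "longitude", "temp"],
    [
        ("time", ">=", "2011-01-01T00:00:00Z"),
        ("time", "<=", "2011-01-31T23:59:59Z"),
        ("latitude", ">=", 30),
        ("latitude", "<=", 60),
    ],
)
print(url)
```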


**Or just the list of dataset IDs**

In [5]:
response.get_datasets_list()

['SDC_BAL_CLIM_TS_V2_m',
 'SDC_BAL_CLIM_TS_V2_s',
 'ArgoFloats-synthetic-BGC',
 'ArgoFloats',
 'SDC_GLO_AGG_V2']

### Access a dataset with its dataset ID and check its parameters

In [6]:
response.get_dataset("ArgoFloats").found_eovs

{'EV_SALIN': 'psal', 'EV_OXY': 'doxy', 'EV_SEATEMP': 'temp'}
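
The mapping goes from EOV code to the dataset's own variable name. A short sketch of two things it can be used for, working on a plain copy of the dictionary shown above and the EOV list passed to `submit_request` earlier: inverting it to label variables by EOV, and listing the requested EOVs that this dataset does not provide. The variable names `variable_to_eov` and `missing` are illustrative only.

```python
# Copy of the mapping returned for "ArgoFloats" above.
found_eovs = {"EV_SALIN": "psal", "EV_OXY": "doxy", "EV_SEATEMP": "temp"}

# EOVs requested in the submit_request call earlier in this notebook.
requested = ["EV_SALIN", "EV_OXY", "EV_SEATEMP", "EV_CO2", "EV_CHLA"]

# Invert the mapping: dataset variable name -> EOV code.
variable_to_eov = {variable: eov for eov, variable in found_eovs.items()}

# Requested EOVs with no matching variable in this dataset.
missing = [eov for eov in requested if eov not in found_eovs]

print(variable_to_eov)
print(missing)
```

For this dataset, `missing` contains `EV_CO2` and `EV_CHLA`: the dataset was selected because it matches some of the requested EOVs, not all of them.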

### Execute a query & get the result as a Pandas DataFrame...

In [7]:
df = response.query_to_pandas_dataframe("SDC_BAL_CLIM_TS_V2_s")
df

| time | depth | latitude | longitude | Water_body_salinity | ITS_90_water_temperature |
|---|---|---|---|---|---|
| 2011-01-16 | 0.0 | 53.0 | 9.0000 |  |  |
| 2011-01-16 | 0.0 | 53.0 | 9.0625 |  |  |
| 2011-01-16 | 0.0 | 53.0 | 9.1250 |  |  |
| 2011-01-16 | 0.0 | 53.0 | 9.1875 |  |  |
| 2011-01-16 | 0.0 | 53.0 | 9.2500 |  |  |
| 2011-01-16 | ... | ... | ... | ... | ... |
| 2011-01-16 | 300.0 | 60.0 | 29.7500 |  |  |
| 2011-01-16 | 300.0 | 60.0 | 29.8125 |  |  |
| 2011-01-16 | 300.0 | 60.0 | 29.8750 |  |  |
| 2011-01-16 | 300.0 | 60.0 | 29.9375 |  |  |


### ... or an Xarray dataset

In [8]:
ds_sdc = response.query_to_xarray(response.queries.index[0])
ds_sdc

### Retrieve only a specific EOV

In [9]:
ds_argo = response.query_to_pandas_dataframe("ArgoFloats", "EV_SEATEMP")
ds_argo

trajectory  profile  obs 
0           0        0       15.715
                     1       15.904
                     2       15.814
                     3       15.840
                     4       15.840
                              ...  
28          103      8095     8.880
                     8096     8.880
                     8097     8.880
                     8098     8.880
                     8099     8.881
Name: temp, Length: 24429600, dtype: float32
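
The reported length can be cross-checked against the index shown above. Assuming the three index levels are zero-based and dense (an assumption, although the numbers agree with it), the last labels 28, 103 and 8099 give the sizes below, and `float32` values take 4 bytes each.

```python
# Sizes inferred from the last index labels in the output above (zero-based).
n_trajectories = 28 + 1
n_profiles = 103 + 1
n_observations = 8099 + 1

n_values = n_trajectories * n_profiles * n_observations
print(n_values)  # 24429600, the Length reported by pandas

# float32 uses 4 bytes per value; the index itself is not counted here.
size_bytes = n_values * 4
size_mib = size_bytes / 2**20
print(f"{size_mib:.1f} MiB")  # about 93.2 MiB of raw values
```

Retrieving a single EOV therefore still yields a large object, so narrowing the time range or bounding box is the more effective way to reduce the download.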

### Download a dataset as a NetCDF file

In [10]:
response.query_to_file_download("SDC_BAL_CLIM_TS_V2_s", "nc")

True
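
The call only reports success as `True`. As an optional sanity check, the downloaded file can be inspected for a NetCDF signature: classic NetCDF files start with the bytes `CDF`, and NetCDF-4 files are HDF5 files, which normally start with the HDF5 signature. This is a minimal sketch; the helper name is hypothetical, and the path where the broker writes the file is not shown in this notebook, so it has to be supplied by the reader.

```python
NETCDF_CLASSIC_MAGIC = b"CDF"
HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"


def looks_like_netcdf(path):
    """Return True if the file starts with a NetCDF classic or HDF5 signature.

    Only the first bytes are checked; this does not validate the whole file.
    """
    with open(path, "rb") as handle:
        head = handle.read(len(HDF5_MAGIC))
    return head.startswith(NETCDF_CLASSIC_MAGIC) or head.startswith(HDF5_MAGIC)
```

A server error page saved under a `.nc` name would fail this check, whereas it could go unnoticed with a plain success flag.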