In [1]:
import xarray as xr
import hvplot.pandas # needed for hvplot to work with pandas DataFrames
import hvplot.xarray  # needed for hvplot to work with xarray DataArrays
from datetime import datetime, timedelta, date
import requests
from IPython.display import HTML

# harfbuzz-devel 
# On https://aqua.usegalaxy.eu:
#   1. select the Jupyter Interactive GIS tool
#   2. run: conda install xarray netcdf4

# Overview - North Sea

* Metadata 
    - Discovery metadata - [ACDD](https://wiki.esipfed.org/Attribute_Convention_for_Data_Discovery_1-3)
    - Use metadata - [Climate and Forecast Convention](https://cfconventions.org/)
* Data
    - [OPeNDAP](https://www.opendap.org/)
    - Many client libraries: Python, R, Java, C++
    - Time series and trajectories
    - Easy subsetting and lazy loading
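
To illustrate what subsetting looks like at the protocol level, here is a minimal sketch that builds an OPeNDAP (DAP2) request URL with a constraint expression, so the server returns only a slice of one variable. The dataset URL and the variable name `temp_water_avg` are taken from the Glomma example in this notebook; the helper name `dap_subset_url`, the index range 0 to 9, and the assumption that the variable is one-dimensional along time are illustrative only.

```python
def dap_subset_url(base_url, variable, start, stop, stride=1, response="ascii"):
    """Build a DAP2 URL requesting variable[start:stride:stop] (indices inclusive)."""
    return f"{base_url}.{response}?{variable}[{start}:{stride}:{stop}]"


# Dataset URL from the Glomma example; the index range is a made-up illustration.
base = "https://thredds.niva.no/thredds/dodsC/datasets/loggers/glomma/baterod.nc"
url = dap_subset_url(base, "temp_water_avg", 0, 9)
print(url)
```

Client libraries such as xarray issue equivalent requests behind the scenes, so in practice you rarely need to write these URLs by hand.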

```mermaid
flowchart TD
  tds["NIVA THREDDS"] -- "metadata" --> adc["adc.csw.met.no"]
  adc -- "metadata" --> ddas["DDAS(Data Discovery and Access Service)"]
  ddas -- "metadata" --> AIP["AIP(AquaINFRA Interaction Platform)"]
  tds --"data - OPeNDAP" --> galaxy["USEGALAXY"]
  tds --"data - OPeNDAP" --> binder["Other++"]
```

# Examples

* [Glomma River Logger](https://thredds.niva.no/thredds/catalog/subcatalogs/loggers.html?dataset=no.niva:af047ff6-e92a-47a0-a9ab-1b2d1e011092)
* Color Fantasy Ferrybox
    - [Color Fantasy - Daily](https://thredds.niva.no/thredds/catalog/subcatalogs/ferryboxes.html?dataset=no.niva:af11ba01-dfe3-4432-b9d2-4e6fd10714db)
    - [Color Fantasy NorSoop - historical](https://thredds.niva.no/thredds/catalog/subcatalogs/ferryboxes.html?dataset=no.niva:14bb8759-81d8-4a1a-948a-14219d374fab)


# AquaINFRA Interaction Platform

Searching on https://aquainfra.dev.52north.org used to work, but currently does not.

> There is currently a minor bug in DDAS; see https://vm4072.kaj.pouta.csc.fi/ddas/oapir/search?q=glomma&limit=100&collections=arcticdatacentre
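
For reference, the DDAS search link above can be assembled from its query parameters. This is a small sketch using only the parameters visible in that link (`q`, `limit`, `collections`); the helper name `ddas_search_url` is hypothetical, and nothing is assumed about the response format.

```python
from urllib.parse import urlencode

# Endpoint copied from the link above.
DDAS_SEARCH = "https://vm4072.kaj.pouta.csc.fi/ddas/oapir/search"


def ddas_search_url(query, collections="arcticdatacentre", limit=100):
    """Return the DDAS search URL for a free-text query."""
    params = {"q": query, "limit": limit, "collections": collections}
    return f"{DDAS_SEARCH}?{urlencode(params)}"


print(ddas_search_url("glomma"))
```

The resulting URL could be fetched with `requests.get`, which is already imported in the first cell.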


## Usegalaxy - setup
1. Go to [Use Galaxy - Interactive JupyterGIS Notebook](https://aqua.usegalaxy.eu/?tool_id=interactive_tool_jupytergis_notebook&version=latest) and launch the tool
2. In a terminal run `conda install xarray netcdf4`
3. Follow along in a notebook
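
Before following along, it can help to confirm that the packages are importable in the notebook kernel. The sketch below is one possible check; the helper name `missing_packages` is hypothetical, and including `hvplot` in the list assumes it is needed because the first cell imports it.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


# Import names, not conda package names: netcdf4 is imported as netCDF4.
required = ["xarray", "netCDF4", "hvplot"]
print("Missing:", missing_packages(required) or "none")
```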

## [Glomma dataset example](https://thredds.niva.no/thredds/catalog/subcatalogs/loggers.html?dataset=no.niva:af047ff6-e92a-47a0-a9ab-1b2d1e011092)

This example shows how to connect to the Glomma dataset, how to subset the data and how to plot data variables.

### Connect to the dataset in R

```R
library(tidync)

url <- "https://thredds.niva.no/thredds/dodsC/datasets/loggers/glomma/baterod.nc"

# Connect to the NetCDF dataset at the URL
data_glomma <- tidync(url)
```

### Connect to the dataset in Python

In [2]:
ds_glomma = xr.open_dataset(
    "https://thredds.niva.no/thredds/dodsC/datasets/loggers/glomma/baterod.nc"
)
ds_glomma
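
Once the dataset is open, the CF "use metadata" is available as attributes on each variable. The following sketch lists the `standard_name` and `units` attributes per data variable; the helper name `describe_variables` is hypothetical, and whether a given variable actually carries these attributes depends on the file.

```python
def describe_variables(ds):
    """Map each data variable name to (standard_name, units); None where absent."""
    return {
        name: (var.attrs.get("standard_name"), var.attrs.get("units"))
        for name, var in ds.data_vars.items()
    }


# Usage in the notebook, assuming ds_glomma was opened as above:
# for name, (standard_name, units) in describe_variables(ds_glomma).items():
#     print(name, standard_name, units)
# ds_glomma.attrs.get("title")  # ACDD discovery attribute, if the file sets it
```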

### Subset the data

If you don't need the full dataset, subset it first so that only the selected data is downloaded. Here we select the data from July 2023:

In [3]:
ds_glomma_jul = ds_glomma.sel(time="2023-07")
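
To see how much a subset saves, you can compare the estimated in-memory size of both objects. This sketch relies on the `nbytes` attribute of xarray datasets, which is derived from shapes and dtypes; the helper name `size_report` is hypothetical.

```python
def size_report(full, subset):
    """Compare the estimated in-memory size of a full dataset and a subset."""
    full_mb = full.nbytes / 1e6
    subset_mb = subset.nbytes / 1e6
    share = 100 * subset.nbytes / full.nbytes
    return f"full: {full_mb:.1f} MB, subset: {subset_mb:.1f} MB ({share:.1f}% of full)"


# Usage in the notebook:
# print(size_report(ds_glomma, ds_glomma_jul))
```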

### Plot data variables on a graph

In this dataset, temperature is called __temp_water_avg__ and pH is called __phvalue_avg__. Let's plot these variables for July 2023:

In [5]:
ds_glomma_jul.temp_water_avg.hvplot()

In [6]:
ds_glomma_jul.phvalue_avg.hvplot()

## Color Fantasy dataset example

This example shows how to visualize data variables on a map.

### Connect to dataset and subset the data

As in the previous example, connect to the dataset, then select a recent time window:

[Color Fantasy - Daily](https://thredds.niva.no/thredds/catalog/subcatalogs/ferryboxes.html?dataset=no.niva:af11ba01-dfe3-4432-b9d2-4e6fd10714db)

In [7]:
ds_daily = xr.open_dataset(
    "https://thredds.niva.no/thredds/dodsC/datasets/nrt/color_fantasy.nc"
)
# Time window: from the day before yesterday up to yesterday
yesterday = (datetime.now() - timedelta(days=1)).date()
ereyesterday = (datetime.now() - timedelta(days=2)).date()
ds_latest = ds_daily.sel(time=slice(ereyesterday, yesterday))
ds_latest
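
The cell above uses `datetime.now()`, which is local time. If the dataset timestamps are in UTC (an assumption, not verified here), a window computed in UTC may be a closer match. The helper `recent_window` below is a hypothetical sketch of that idea.

```python
from datetime import datetime, timedelta, timezone


def recent_window(days=2, now=None):
    """Return (start, end) dates: `days` days ago and yesterday, computed in UTC."""
    now = now or datetime.now(timezone.utc)
    end = (now - timedelta(days=1)).date()
    start = (now - timedelta(days=days)).date()
    return start, end


start, end = recent_window()
print(start.isoformat(), end.isoformat())

# Usage in the notebook:
# ds_daily.sel(time=slice(start.isoformat(), end.isoformat()))
```

Passing ISO date strings rather than `date` objects should make the slice cover all of the end day, following pandas partial-string slicing.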

### Plot variable on a graph and on a map

In [9]:
ds_latest.salinity.hvplot(x='time')

In [10]:
df_latest = ds_latest.to_dataframe()
title = f"Sea Water Salinity {ereyesterday:%Y-%m-%d} to {yesterday:%Y-%m-%d}"
df_latest.hvplot.points(
    "longitude", "latitude", color="salinity",
    geo=True, tiles="OSM", frame_width=600, frame_height=400,
    title=title,
)

# Summary

* Available data products
    - Sensor data from the ferrybox on Color Fantasy
    - Sensor data from Glomma
* WIP data products
    - Sampling and modeling of rivers
    - Ramses sensor (spectral imaging radiometer) data from Color Fantasy
* DDAS needs a bugfix
* useGalaxy support for OPeNDAP links would be nice!

## Extra - Color Fantasy historical

[Color Fantasy NorSoop - historical](https://thredds.niva.no/thredds/catalog/subcatalogs/ferryboxes.html?dataset=no.niva:14bb8759-81d8-4a1a-948a-14219d374fab)

In [None]:
ds = xr.open_dataset("https://thredds.niva.no/thredds/dodsC/datasets/norsoop/color_fantasy/merged_acdd_color_fantasy.nc")

To plot one of the variables on a map, isolate the data corresponding to one round trip. This example uses 2022-06-01 to 2022-06-04 as the time range:

In [None]:
ds_one_trip = ds.sel(time=slice("2022-06-01", "2022-06-04"))
ds_one_trip

### Plot data variable on a map

Transform the dataset to a pandas DataFrame, then plot temperature using hvplot:

In [None]:
df = ds_one_trip.to_dataframe()
df.head(3)

| time | latitude | longitude | temperature | salinity | oxygen_sat | temperature_qc | salinity_qc | oxygen_sat_qc | trajectory_name |
|---|---|---|---|---|---|---|---|---|---|
| 2022-06-01 00:00:00 | 57.582778 | 11.39399 | 13.96 | 26.916 | 104.24 | 1 | 1 | 1 | b'color_fantasy' |
| 2022-06-01 00:01:00 | 57.588303 | 11.390728 | 14.03 | 26.8 | 104.14 | 1 | 1 | 1 | b'color_fantasy' |
| 2022-06-01 00:02:00 | 57.593637 | 11.387573 | 14.03 | 26.814 | 104.09 | 1 | 1 | 1 | b'color_fantasy' |
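
The `trajectory_name` column in the output above holds bytes values (`b'color_fantasy'`). If plain text is preferred, for example in hover labels, a small decoding step is one option. The helper name `decode_label` is hypothetical, and UTF-8 is assumed as the encoding.

```python
def decode_label(value, encoding="utf-8"):
    """Return text for a bytes label such as b'color_fantasy'; pass other values through."""
    if isinstance(value, bytes):
        return value.decode(encoding).strip()
    return value


print(decode_label(b"color_fantasy"))

# Usage in the notebook:
# df["trajectory_name"] = df["trajectory_name"].map(decode_label)
```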


In [None]:
df.hvplot.points(
    "longitude", "latitude", color="temperature",
    geo=True, tiles="OSM", frame_width=600, frame_height=400
)

