# Meteor M85/1 cruise vs. FOCI NEMO Test data

## Description

We use an [Intake driver](https://github.com/ESM-VFC/intake_pangaeapy) for [`pangaeapy`](https://github.com/pangaea-data-publisher/pangaeapy) to load hydrographic observational data from Meteor cruise M85/1 and
- plot positions on a map
- create a [temperature-salinity diagram](https://en.wikipedia.org/wiki/Temperature%E2%80%93salinity_diagram) of the whole cruise.

We then load a NEMO test dataset that covers the same calendar period and
- plot surface temperature on a map together with the observed temperature
- select data for the same locations and time stamps as in the observational data set and repeat the temperature-salinity diagrams
 
Along the way, there are a few obstacles:
- Selecting NEMO data on a curvilinear horizontal grid is not directly implemented in xarray, so we use [`xorca_lonlat2ij`](https://git.geomar.de/python/xorca_lonlat2ij) to find the closest indices on the sphere.
- The mask information lives in a different file (the mesh-mask) than the actual data, so we inelegantly mask the data by exploiting the fact that over land the values never change from an exact `0`.
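
To make the first obstacle concrete, here is a minimal sketch of a nearest-neighbour lookup on the sphere for a curvilinear grid. It is not the implementation used by `xorca_lonlat2ij`; the helper names `lonlat_to_xyz` and `nearest_grid_indices` and the toy grid are assumptions made up for illustration. The idea is to convert all positions to unit vectors and pick the grid point with the largest dot product, which corresponds to the smallest great-circle distance.

```python
import numpy as np


def lonlat_to_xyz(lon_deg, lat_deg):
    """Convert longitude/latitude in degrees to unit vectors on the sphere."""
    lon = np.deg2rad(np.asarray(lon_deg, dtype=float))
    lat = np.deg2rad(np.asarray(lat_deg, dtype=float))
    return np.stack(
        [np.cos(lat) * np.cos(lon), np.cos(lat) * np.sin(lon), np.sin(lat)],
        axis=-1,
    )


def nearest_grid_indices(grid_lon, grid_lat, target_lon, target_lat):
    """Return (j, i) index arrays of the grid point closest to each target."""
    grid_xyz = lonlat_to_xyz(grid_lon, grid_lat).reshape(-1, 3)
    target_xyz = lonlat_to_xyz(target_lon, target_lat).reshape(-1, 3)
    # For unit vectors, the largest dot product marks the smallest
    # great-circle distance.
    flat_index = np.argmax(target_xyz @ grid_xyz.T, axis=1)
    return np.unravel_index(flat_index, np.shape(grid_lon))


# Hypothetical toy grid: longitudes are skewed with latitude, so the grid
# is not a simple product of 1D longitude and latitude axes.
base_lon, base_lat = np.meshgrid(
    np.arange(-40.0, -19.0, 5.0), np.arange(10.0, 31.0, 5.0)
)
grid_lon = base_lon + 0.1 * base_lat
grid_lat = base_lat

j, i = nearest_grid_indices(grid_lon, grid_lat, [-27.9], [20.2])
print(j, i, grid_lon[j, i], grid_lat[j, i])
```

This brute-force search compares every target with every grid point, which is fine for a small regional subset but would likely need a spatial index (for example a k-d tree) for a global grid and many positions.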

_**Note** that we cannot expect a lot of similarity between the in-situ observational data and a free-running climate model._

## Parameters

In [None]:
# parameters

esm_vfc_data_dir = "../esm-vfc-data/"
nemo_catalog_url = "https://raw.githubusercontent.com/ESM-VFC/esm-vfc-catalogs/master/catalogs/NEMO_ORCA05_FOCI_Test_Minimal.yaml"
meteor_catalog_url = "https://raw.githubusercontent.com/ESM-VFC/esm-vfc-catalogs/master/catalogs/METEOR_cruises.yaml"

## Tech preamble

In [None]:
import numpy as np
import xarray as xr

In [None]:
# set up intake catalog
import intake
from esmvfc_cattools import download_zenodo_files_for_entry
import os

os.environ["ESM_VFC_DATA_DIR"] = esm_vfc_data_dir

In [None]:
# set up plotting
import hvplot.pandas
import hvplot.xarray
import geoviews.feature as gf
from cartopy import crs

In [None]:
# import a tool for looking up NEMO indices
import xorca_lonlat2ij as xll2ij

## Get obs data, extract near-surface measurements, plot positions

In [None]:
meteor_catalog = intake.open_catalog(meteor_catalog_url)
list(meteor_catalog)

In [None]:
obs_df = meteor_catalog["M85_1_bottles"].read()

In [None]:
# restrict to measurements at minimal depth per Event (= station)
near_surface_obs = obs_df.loc[
    obs_df.groupby("Event")["Depth water"].idxmin()
]
near_surface_obs = near_surface_obs.set_index("Event")
near_surface_obs
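
The selection above relies on `groupby(...).idxmin()` returning, for each `Event`, the row label of the shallowest sample, which `.loc` then uses to pull the full rows. A minimal sketch on made-up numbers (the values below are invented for illustration; only the column names follow the cruise data):

```python
import pandas as pd

# Toy bottle data: two stations with several sampling depths each
obs = pd.DataFrame({
    "Event": ["A", "A", "B", "B", "B"],
    "Depth water": [50.0, 5.0, 10.0, 200.0, 1000.0],
    "Tpot": [18.0, 22.0, 21.0, 12.0, 4.0],
})

# Row labels of the minimum depth within each Event
shallowest_rows = obs.groupby("Event")["Depth water"].idxmin()

# Pull the complete rows and index the result by station
shallowest = obs.loc[shallowest_rows].set_index("Event")
print(shallowest)
```

Each station contributes exactly one row, so the result can later be converted to xarray objects that share a single `Event` dimension.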

In [None]:
(
    near_surface_obs.hvplot(
        "Longitude", "Latitude", geo=True, kind="points", hover=False)
    * gf.coastline
)

_**FIXME:** Hover tool shows wrong values ("Latitude: 7945355th"???)._

In [None]:
(
    obs_df.hvplot.scatter("Tpot", "Sal", alpha=0.2, label="all data", hover=False)
    * near_surface_obs.hvplot.scatter("Tpot", "Sal", alpha=0.8, label="surface data", hover=False)
)

## Load catalog and fetch data

In [None]:
model_data_cat = intake.open_catalog(nemo_catalog_url)
download_zenodo_files_for_entry(
    model_data_cat["NEMO_ORCA05_FOCI_Test_Minimal_grid_T"]
)
download_zenodo_files_for_entry(
    model_data_cat["NEMO_ORCA05_FOCI_Test_Minimal_mesh_mask"]
)

## Restrict to North Atlantic, calc mean SST, plot with obs positions

In [None]:
# hydrographic data
model_dataset = model_data_cat["NEMO_ORCA05_FOCI_Test_Minimal_grid_T"].to_dask()
model_dataset = model_dataset.set_coords(["nav_lat", "nav_lon"])
model_dataset = model_dataset.isel(x=slice(410, 620), y=slice(320, 450))
model_dataset = xr.decode_cf(model_dataset)

# Need the grid definitions
model_meshmask = model_data_cat["NEMO_ORCA05_FOCI_Test_Minimal_mesh_mask"].to_dask()
model_meshmask = model_meshmask.isel(x=slice(410, 620), y=slice(320, 450))
model_meshmask = xr.decode_cf(model_meshmask)

In [None]:
display(model_dataset)

In [None]:
display(model_meshmask)

In [None]:
# Compute here (turning the lazy dask array into a NumPy array) so that
# datashade works (see https://datashader.org/user_guide/Performance.html)
model_mean_sst = model_dataset.sosstsst.mean("time_counter").compute()
model_mean_sst = model_mean_sst.where(model_mean_sst != 0)
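
The zero-based masking can be illustrated on a toy array. The sketch below is an assumption-laden variant, not what the notebook does: instead of masking every exact `0`, it masks only points that are `0` at all time steps, which matches the observation that land values never change from `0` and keeps an ocean point that happens to hit `0.0` once. The array values are invented.

```python
import numpy as np
import xarray as xr

# Toy field with dims (time_counter, y, x).
# Points (y=0, x=0) and (y=1, x=1) stay at exactly 0 and play the role of land.
# Point (y=0, x=1) is 0.0 at the first time step only, so it is ocean.
data = xr.DataArray(
    np.array([
        [[0.0, 0.0], [5.0, 0.0]],
        [[0.0, 1.0], [6.0, 0.0]],
    ]),
    dims=("time_counter", "y", "x"),
)

# A point counts as ocean if it differs from 0 at any time step
is_ocean = (data != 0).any("time_counter")

# Land points become NaN at all time steps
masked = data.where(is_ocean)
print(masked)
```

Where the mesh-mask file is available on the same grid subset, using its land-sea mask is likely the more robust choice, because it does not depend on the data values at all.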

In [None]:
(
    model_mean_sst.hvplot.quadmesh(
        "nav_lon", "nav_lat",
        geo=True, datashade=True, hover=False)
    * near_surface_obs.hvplot(
        "Longitude", "Latitude",
        geo=True, kind="points", color="red", hover=False)
    * gf.coastline
)

## Extract model data along ship track (surface positions)

In [None]:
xll2ij.get_ij?

In [None]:
positions = list(zip(
    near_surface_obs["Latitude"],
    near_surface_obs["Longitude"],
))

depths = near_surface_obs["Depth water"].to_xarray()
depths

times = near_surface_obs["Date/Time"].to_xarray()

lat_ind, lon_ind = xll2ij.get_ij(
    model_meshmask, positions, 't', xgcm=False, xarray_out=True)
lat_ind = lat_ind.rename({"location": "Event"})
lon_ind = lon_ind.rename({"location": "Event"})

In [None]:
# select
ship_track_data = model_dataset.isel(y=lat_ind, x=lon_ind)
ship_track_data = ship_track_data.sel(deptht=depths, method="nearest")
ship_track_data = ship_track_data.sel(time_counter=times, method="nearest")

# mask
ship_track_data = ship_track_data.where(ship_track_data.votemper != 0)

display(ship_track_data)
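
The selection above depends on xarray's vectorized (pointwise) indexing: when the indexers are `DataArray` objects sharing one dimension (here `Event`), `isel` returns one value per station instead of the outer product of all `y` and `x` indices. A minimal sketch with a toy field and made-up station names:

```python
import numpy as np
import xarray as xr

# Toy 2D field on a (y, x) grid
field = xr.DataArray(np.arange(12).reshape(3, 4), dims=("y", "x"))

# One (y, x) index pair per station, both along a shared "Event" dimension
stations = ["st1", "st2"]
y_ind = xr.DataArray([0, 2], dims="Event", coords={"Event": stations})
x_ind = xr.DataArray([1, 3], dims="Event", coords={"Event": stations})

# Pointwise selection: result has the single dimension "Event"
pointwise = field.isel(y=y_ind, x=x_ind)

# For contrast, plain lists select the full 2 x 2 outer product
outer = field.isel(y=[0, 2], x=[1, 3])

print(pointwise.values, outer.shape)
```

The same mechanism is why the depth and time lookups with `sel(..., method="nearest")` yield one model value per observation rather than a full depth-by-time block.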

In [None]:
(
    ship_track_data.to_dataframe().hvplot.scatter("votemper", "vosaline", label="surface data, model", hover=False)
    * near_surface_obs.hvplot.scatter("Tpot", "Sal", alpha=0.8, label="surface data, obs", hover=False)
)

## Extract model data along ship track (all depths)

In [None]:
positions = list(zip(
    obs_df["Latitude"],
    obs_df["Longitude"],
))

depths = obs_df["Depth water"].to_xarray()
depths

times = obs_df["Date/Time"].to_xarray()

lat_ind, lon_ind = xll2ij.get_ij(
    model_meshmask, positions, 't', xgcm=False, xarray_out=True)
lat_ind = lat_ind.rename({"location": "index"})
lon_ind = lon_ind.rename({"location": "index"})

In [None]:
# select
ship_track_data = model_dataset.isel(y=lat_ind, x=lon_ind)
ship_track_data = ship_track_data.sel(deptht=depths, method="nearest")
ship_track_data = ship_track_data.sel(time_counter=times, method="nearest")

# mask
ship_track_data = ship_track_data.where(ship_track_data.votemper != 0)

display(ship_track_data)

In [None]:
(
    ship_track_data.to_dataframe().hvplot.scatter(
        "votemper", "vosaline", alpha=0.4, label="all data, model", hover=False
    )
    * obs_df.hvplot.scatter(
        "Tpot", "Sal", alpha=0.4, label="all data, obs", hover=False
    )
)