# Pre-matched data with auxiliary data



In [None]:
import modelskill as ms
import numpy as np
import pandas as pd
import mikeio

In [None]:
fn = "examples/metocean/eur_matched.dfs0"
mikeio.read(fn)

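The function `from_matched()` takes a pandas DataFrame, a dfs0 file or a `mikeio.Dataset` containing already matched data and returns a `Comparer` object. Here, item 0 is the model result, item 1 is the observation, and items 2 and 3 are kept as auxiliary data.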

In [None]:
cmp = ms.from_matched(fn, obs_item=1, mod_items=0, aux_items=[2,3])
cmp.aux_names
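
To make the item arguments concrete, here is a minimal pandas-only sketch of how integer items can be read as column positions. The values are made up for illustration; only the column names and their order mirror the ones used in this notebook, and the lookup via `df.columns` is an illustration, not modelskill's internal code.

```python
import pandas as pd

# Hypothetical matched data: one row per time step, model and observation side by side
df = pd.DataFrame(
    {
        "Hm0, model": [1.2, 1.5, 1.9],
        "Observation": [1.1, 1.7, 1.8],
        "Wind speed": [9.0, 12.5, 16.0],
        "Wind Direction": [280.0, 300.0, 350.0],
    },
    index=pd.to_datetime(
        ["2020-01-01 00:00", "2020-01-01 01:00", "2020-01-01 02:00"]
    ),
)

# Same item numbers as in the from_matched() call above
obs_item, mod_items, aux_items = 1, 0, [2, 3]

# Integer items read as zero-based column positions
obs_name = df.columns[obs_item]
mod_name = df.columns[mod_items]
aux_names = [df.columns[i] for i in aux_items]

print("observation:", obs_name)
print("model:", mod_name)
print("auxiliary:", aux_names)
```

A DataFrame laid out like this should be usable as input to `from_matched()` in place of the dfs0 file name.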

In [None]:
# NOTE: we rename data_vars to avoid spaces in names
cmp = cmp.rename({"Wind speed": "wind_speed", "Wind Direction": "wind_dir"})

In [None]:
cmp.aux_names

In [None]:
cmp

In [None]:
cmp.data

In [None]:
cmp.skill()

In [None]:
cmp.plot.scatter(quantiles=0, figsize=(6,6));
cmp.plot.timeseries();

## Filter 

Filter on auxiliary data using `query()` or `where()`. Below, we consider only wave data when the wind speed is above 15 m/s.

In [None]:
cmp.query("wind_speed > 15.0")

In [None]:
cmp2 = cmp.where(cmp.data.wind_speed>15.0)
cmp2
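
The two calls above express the same condition in two styles: a string expression and a boolean mask. As a plain-pandas analogy (made-up values, not modelskill's implementation), the sketch below shows that both styles pick the same rows. It also hints at why the variables were renamed earlier: in pandas, a column name containing a space would need backticks inside the query string.

```python
import pandas as pd

# Hypothetical observations with wind speed as auxiliary column
df = pd.DataFrame(
    {
        "Observation": [1.1, 2.4, 3.0, 1.8],
        "wind_speed": [9.0, 16.2, 18.5, 14.9],
    }
)

# Style 1: condition written as a string expression
by_query = df.query("wind_speed > 15.0")

# Style 2: condition written as a boolean mask
mask = df["wind_speed"] > 15.0
by_mask = df[mask]

# Both styles keep exactly the rows where the wind speed exceeds 15 m/s
print(by_query)
print(by_query.equals(by_mask))
```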

In [None]:
# notice that the model data is kept, but the observations are filtered
cmp2.plot.timeseries();

More auxiliary data can be added, for example variables derived from the existing data. Below, the residual (model minus observation) is stored as a new variable.

In [None]:
cmp.data["residual"] = cmp.data["Hm0, model"] - cmp.data["Observation"]

In [None]:
large_residuals = np.abs(cmp.data.residual)>0.3
cmp3 = cmp.where(large_residuals)
# cmp3.plot.scatter(figsize=(6,6));
cmp3.plot.timeseries();
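
The residual filter can be checked by hand on a few numbers. This is a small pandas sketch with made-up values and hypothetical column names `model` and `obs`; it only illustrates the arithmetic of the two cells above.

```python
import numpy as np
import pandas as pd

# Hypothetical matched values
df = pd.DataFrame(
    {
        "model": [1.0, 1.5, 2.0, 2.5],
        "obs": [1.1, 1.0, 2.1, 3.0],
    }
)

# Residual defined as model minus observation, as above
df["residual"] = df["model"] - df["obs"]

# Keep only points where the absolute residual exceeds 0.3,
# regardless of whether the model over- or underestimates
large = np.abs(df["residual"]) > 0.3

print(df[large])
```

Taking the absolute value means that both overestimates and underestimates larger than the threshold are selected.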

In [None]:
cmp3.data.data_vars

In [None]:
cmp3.data.Observation.values

## Aggregate

Let's split the data by wind direction sector and compute the skill of the significant wave height prediction for each sector.

In [None]:
# Note: in this short example wind direction is between 274 and 353 degrees
df = cmp.data.wind_dir.to_dataframe()
cmp.data["windsector"] = pd.cut(df.wind_dir, [255, 285, 315, 345, 360], labels=["W", "WNW", "NNW", "N"])
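
To see how `pd.cut` assigns sectors, here is a short sketch using the same bin edges and labels on a handful of made-up directions. With the default settings each bin includes its right edge, and values outside all bins become missing.

```python
import pandas as pd

# Hypothetical wind directions in degrees; 250 lies below the first bin edge
wind_dir = pd.Series([250.0, 274.0, 285.0, 300.0, 353.0])

# Same edges and labels as above: (255, 285], (285, 315], (315, 345], (345, 360]
sector = pd.cut(
    wind_dir,
    [255, 285, 315, 345, 360],
    labels=["W", "WNW", "NNW", "N"],
)

# 285 falls in "W" because the right edge is inclusive; 250 gets no label
print(pd.DataFrame({"wind_dir": wind_dir, "sector": sector}))
```

If the data covered the full compass, the bin edges would have to span 0 to 360 degrees to avoid unlabeled points.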

In [None]:
ss = cmp.skill(by="windsector")
ss.style()
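
Conceptually, a grouped skill table computes each metric separately per group. The sketch below reproduces that idea for RMSE with a plain pandas `groupby`. It uses made-up numbers and hypothetical column names, assumes RMSE is the square root of the mean squared difference between model and observation, and is not modelskill's own code.

```python
import numpy as np
import pandas as pd

# Hypothetical matched values, each tagged with a wind sector
df = pd.DataFrame(
    {
        "model": [1.0, 2.0, 3.0, 4.0],
        "obs": [1.0, 4.0, 3.0, 1.0],
        "windsector": ["W", "W", "N", "N"],
    }
)

# Squared error per point, averaged within each sector, then square-rooted
sq_err = (df["model"] - df["obs"]) ** 2
rmse = np.sqrt(sq_err.groupby(df["windsector"]).mean())

print(rmse)
```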

In [None]:
ss["rmse"].plot.bar(title="Hm0 RMSE by wind sector");

In [None]:
cmp.where(cmp.data.windsector=="W").plot.timeseries();