# Multi-model comparison

We often want to compare the results of multiple models. Two typical use cases are:

**Calibration**. We have several "runs" of the same model with different settings and would like to find the best one.

**Validation**. We would like to compare our model with alternative models, e.g. a regional DHI model or an external model. 

In this notebook, we consider several wave models for the Southern North Sea and compare them to both point measurements and satellite altimetry data.


In [None]:
import numpy as np
from fmskill import ModelResult
from fmskill import PointObservation, TrackObservation, Connector

## Define observations

In [None]:
o1 = PointObservation('data/SW/HKNA_Hm0.dfs0', item=0, x=4.2420, y=52.6887, name="HKNA")
o2 = PointObservation("data/SW/eur_Hm0.dfs0", item=0, x=3.2760, y=51.9990, name="EPL")
o3 = TrackObservation("data/SW/Alti_c2_Dutch.dfs0", item=3, name="c2")

## Define models

In [None]:
mr1 = ModelResult('data/SW/HKZN_local_2017_DutchCoast.dfsu', name='SW_1', item=0)
mr2 = ModelResult('data/SW/HKZN_local_2017_DutchCoast_v2.dfsu', name='SW_2', item=0)
mr3 = ModelResult('data/SW/ERA5_DutchCoast.nc', name='ERA5', item="swh")

## Connect observations and model results

In [None]:
con = Connector([o1, o2, o3], [mr1, mr2, mr3])
con

In [None]:
con.modelresults

In [None]:
con.plot_observation_positions();

In [None]:
con.plot_temporal_coverage();

In [None]:
cc = con.extract()    # returns a collection of comparisons

In [None]:
cc["EPL"]   # select a single comparer from the collection like this

## Perform analysis
You can restrict any analysis to a specific `observation` or a specific `model`. Observations and models can be referred to by their _name_ or _index_.

The main analysis methods are:
* skill()
* mean_skill()
* scatter()
* taylor()
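
To give a feel for the kind of numbers a skill table contains, the sketch below computes three common metrics (bias, RMSE and scatter index) with plain NumPy. The helper `basic_metrics`, the sample values and the exact formulas are assumptions made for this illustration. They follow common textbook definitions and may differ in detail from what fmskill implements.

```python
import numpy as np

def basic_metrics(obs, mod):
    """Return bias, rmse and scatter index for paired observation/model values.

    Hypothetical helper for illustration only; it is not part of fmskill.
    """
    obs = np.asarray(obs, dtype=float)
    mod = np.asarray(mod, dtype=float)
    residual = mod - obs                      # positive where the model overestimates
    bias = residual.mean()
    rmse = np.sqrt(np.mean(residual**2))
    # Scatter index, assumed here to be the unbiased RMSE
    # divided by the mean absolute observation
    urmse = np.sqrt(np.mean((residual - bias) ** 2))
    si = urmse / np.mean(np.abs(obs))
    return {"bias": bias, "rmse": rmse, "si": si}

# Made-up significant wave heights [m], not taken from the data files above
obs = [1.0, 2.0, 3.0, 4.0]
mod = [1.5, 2.5, 2.5, 4.5]
print(basic_metrics(obs, mod))
```

With these definitions, `rmse**2 == bias**2 + urmse**2`, which is why the scatter index can be small for a model with a large but constant offset.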

In [None]:
cc.skill()

In [None]:
cc.skill(observation="c2")

In [None]:
cc.mean_skill(model=0, observation=[0,"c2"])

In [None]:
cc.scatter(model='SW_1', cmap='OrRd')

In [None]:
cc.taylor(normalize_std=True, aggregate_observations=False)

### Time series plot (specifically for point comparisons)
If you select a comparison from the collection that is a PointComparer, you can plot it as a time series.

In [None]:
cc['EPL'].plot_timeseries(figsize=(12,4));

## Filtering on time
Use the `start` and `end` arguments to restrict the analysis to part of the time series.
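
Conceptually, a time filter restricts the paired data to a window before the metrics are computed. The sketch below shows that idea with pandas label slicing on a made-up hourly series. The variable names are illustrative, and whether fmskill treats `start` and `end` exactly like pandas slicing (for example, whether the end date is inclusive) is an assumption here.

```python
import numpy as np
import pandas as pd

# Made-up hourly residuals (model minus observation), for illustration only
time = pd.date_range("2017-10-27", periods=72, freq="h")
residual = pd.Series(np.linspace(-0.5, 0.5, len(time)), index=time)

# Label-based slicing on a DatetimeIndex;
# in pandas, a date-only end label includes that whole day
first_part = residual.loc[:"2017-10-28"]
last_part = residual.loc["2017-10-28":]

# The same metric evaluated on the full series and on a sub-period
rmse_all = np.sqrt((residual**2).mean())
rmse_first = np.sqrt((first_part**2).mean())
print(len(residual), len(first_part), len(last_part))
```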

In [None]:
cc.skill(model="SW_1", end='2017-10-28')

In [None]:
cc.scatter(model='SW_2', start='2017-10-28', cmap='OrRd', figsize=(6,7))

## Filtering on area
You can restrict the analysis to a specific `area` by providing a bounding box or a closed polygon.
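
As a rough picture of what an area filter does, the sketch below tests the two point-observation positions used in this notebook against a bounding box and a closed polygon with the same coordinates as in this notebook. The helpers `in_bbox` and `in_polygon` are hypothetical and written in plain Python; the polygon test uses the standard ray-casting (even-odd) rule. How fmskill performs this test internally, including how it treats points exactly on the boundary, is not covered here.

```python
def in_bbox(x, y, bbox):
    """True if (x, y) lies inside bbox = [x0, y0, x1, y1] (edges included)."""
    x0, y0, x1, y1 = bbox
    return x0 <= x <= x1 and y0 <= y <= y1

def in_polygon(x, y, polygon):
    """True if (x, y) lies inside the polygon, using ray casting.

    A horizontal ray is cast from the point towards +x; the point is
    inside if the ray crosses the polygon edges an odd number of times.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        xa, ya = polygon[i]
        xb, yb = polygon[(i + 1) % n]
        if (ya > y) != (yb > y):                     # edge straddles the ray's y level
            x_cross = xa + (y - ya) * (xb - xa) / (yb - ya)
            if x < x_cross:                          # crossing lies to the right of the point
                inside = not inside
    return inside

bbox = [0.5, 52.5, 5, 54]
polygon = [[6, 51], [0, 55], [0, 51], [6, 51]]

positions = {"HKNA": (4.2420, 52.6887), "EPL": (3.2760, 51.9990)}
for name, (x, y) in positions.items():
    print(name, in_bbox(x, y, bbox), in_polygon(x, y, polygon))
```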

In [None]:
bbox = np.array([0.5,52.5,5,54])    # bounding box: [x0, y0, x1, y1]
polygon = np.array([[6,51],[0,55],[0,51],[6,51]])    # closed polygon: last vertex repeats the first

In [None]:
ax = con.plot_observation_positions();
ax.plot([bbox[0],bbox[2],bbox[2],bbox[0],bbox[0]],[bbox[1],bbox[1],bbox[3],bbox[3],bbox[1]]);
ax.plot(polygon[:,0],polygon[:,1]);

In [None]:
cc.skill(model="SW_1", area=bbox)

In [None]:
cc.scatter(model="SW_2", area=polygon) # , backend='plotly'

## Skill object

The skill() and mean_skill() methods return a skill object that can visualize results in various ways. The primary methods of the skill object are:

* style()
* plot_bar()
* plot_line()
* plot_grid()
* sel()

In [None]:
s = cc.skill()

In [None]:
s.style()

In [None]:
s.style(columns='rmse')

In [None]:
s.plot_bar('rmse');

In [None]:
s = cc.skill(by=['model','freq:12H'], metrics=['bias','rmse','si'])

In [None]:
s.style()

In [None]:
s.plot_line('rmse', title='Hm0 rmse [m]');

In [None]:
s.plot_grid('si', fmt='0.1%', title='Hm0 Scatter index');

### The sel() method can subset the skill object 

A new skill object is returned.
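
The string conditions passed to `sel()` read much like pandas `DataFrame.query` expressions. The sketch below shows analogous operations on a plain pandas DataFrame. The numbers are made up for illustration and are not results from the models above, and it is an assumption that the skill object behaves the same way internally.

```python
import pandas as pd

# Made-up skill values for illustration only
df = pd.DataFrame(
    {
        "model": ["SW_1", "SW_1", "SW_2", "SW_2"],
        "observation": ["HKNA", "EPL", "HKNA", "EPL"],
        "rmse": [0.35, 0.22, 0.29, 0.24],
        "mae": [0.25, 0.19, 0.21, 0.20],
    }
).set_index(["model", "observation"])

by_model = df.xs("SW_1", level="model")            # rows belonging to one model
large_rmse = df.query("rmse > 0.25")               # rows where the condition holds
subset = df.query("rmse > 0.3")[["rmse", "mae"]]   # condition plus column selection

print(large_rmse)
```

Each operation returns a new DataFrame and leaves `df` untouched, so selections can be chained freely.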

In [None]:
s = cc.skill()
s.style()

In [None]:
s.sel(model='SW_1').style()

In [None]:
s.sel(observation='HKNA').style()

In [None]:
s.sel('rmse>0.25').style()

In [None]:
s.sel('rmse>0.3', columns=['rmse','mae']).style()