# Selecting data

The primary way to filter data in ModelSkill is the `sel()` method,
which is accessible on most ModelSkill data structures. The `sel()`
method is a wrapper around
[`xarray.Dataset.sel`](https://xarray.pydata.org/en/stable/generated/xarray.Dataset.sel.html#xarray.Dataset.sel)
and can be used to select data based on time, location and/or variable.
The `sel()` method returns a new data structure of the same type with
the selected data.
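
Because `sel()` wraps `xarray.Dataset.sel`, it may help to see how label-based selection behaves in plain xarray. The following is a minimal sketch on synthetic data (the `waterlevel` variable and its values are invented for illustration and are not ModelSkill output). It shows that both endpoints of a time slice are included and that a date-only end label covers that whole day.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic hourly series covering three days (illustrative data only)
time = pd.date_range("2018-01-01", periods=72, freq="h")
ds = xr.Dataset(
    {"waterlevel": ("time", np.arange(72, dtype=float))},
    coords={"time": time},
)

# Label-based selection: both slice endpoints are included,
# and the end label "2018-01-02" covers that entire day
subset = ds.sel(time=slice("2018-01-01", "2018-01-02"))

print(type(subset).__name__)   # the result is again a Dataset
print(subset.sizes["time"])    # number of hourly steps in the two selected days
```

The result has the same type as the input, which mirrors how the ModelSkill `sel()` methods return an object of the same kind as the one they were called on.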

## TimeSeries data

Point and track time series data, for both observations and model
results, are stored in
[`TimeSeries`](../api/TimeSeries.html#modelskill.TimeSeries) objects,
which use
[`xarray.Dataset`](https://xarray.pydata.org/en/stable/generated/xarray.Dataset.html#xarray.Dataset)
as their data container. The
[`sel()`](../api/TimeSeries.html#modelskill.TimeSeries.sel) method can
be used to select data based on time and returns a new
[`TimeSeries`](../api/TimeSeries.html#modelskill.TimeSeries) object with
the selected data.

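```python
import modelskill as ms

# Load a point observation from a NetCDF file
o = ms.observation("../data/obs.nc", item="waterlevel", gtype="point")

# Keep data from 2018-01-01 through the end of 2018-02-01
# (both slice endpoints are inclusive)
o_1month = o.sel(time=slice("2018-01-01", "2018-02-01"))
o_1month
```

```
<PointObservation>: obs
Location: nan, nan
Time: 2018-01-01 00:00:00 - 2018-02-01 23:00:00
Quantity:  []
```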

## Comparer objects

[`Comparer`](../api/Comparer.html#modelskill.Comparer) and
[`ComparerCollection`](../api/ComparerCollection.html#modelskill.ComparerCollection)
contain matched data from observations and model results. The `sel()`
method can be used to select data based on time, model, quantity or
other criteria and returns a new comparer object with the selected data.

```python
# One point observation and two model results to compare against it
o = ms.observation("../data/SW/HKNA_Hm0.dfs0", item=0,
                   x=4.2420, y=52.6887,
                   name="HKNA")
m1 = ms.model_result("../data/SW/HKZN_local_2017_DutchCoast.dfsu",
                     item="Sign. Wave Height",
                     name="m1")
m2 = ms.model_result("../data/SW/CMEMS_DutchCoast_2017-10-28.nc",
                     item="VHM0",
                     name="m2")
```

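```python
# Match the observation with both model results
cmp = ms.match(o, [m1, m2])

# Select by time range (both endpoints inclusive)
cmp_1month = cmp.sel(time=slice("2018-01-01", "2018-02-01"))

# Select a single model by name
cmp_m1 = cmp.sel(model="m1")
```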

## Skill objects

The [`skill()`](../api/Comparer.html#modelskill.Comparer.skill) and
[`mean_skill()`](../api/ComparerCollection.html#modelskill.ComparerCollection.mean_skill)
methods return a
[`SkillTable`](../api/SkillTable.html#modelskill.SkillTable) object with
skill scores from comparing observation and model result data using
different metrics (e.g. root mean square error). The data of the
[`SkillTable`](../api/SkillTable.html#modelskill.SkillTable) object is
stored in a
[`pandas.DataFrame`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html#pandas.DataFrame)
(possibly with a MultiIndex), which can be accessed via the `data` attribute. The `sel()` method can
be used to select specific rows and returns a new
[`SkillTable`](../api/SkillTable.html#modelskill.SkillTable) object with
the selected data.

```python
# Compute skill scores, then keep only the rows for model "m1"
sk = cmp.skill()
sk_m1 = sk.sel(model="m1")
```
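
To illustrate what selecting rows from a MultiIndex `pandas.DataFrame` looks like, here is a small sketch in plain pandas. It is not the `SkillTable` implementation: the index level names (`model`, `observation`), the observation name `obs_B` and all numbers are assumptions made up for the example.

```python
import pandas as pd

# Hypothetical skill scores; level names and values are invented for illustration
index = pd.MultiIndex.from_product(
    [["m1", "m2"], ["HKNA", "obs_B"]], names=["model", "observation"]
)
df = pd.DataFrame(
    {"n": [100, 80, 100, 80], "rmse": [0.21, 0.25, 0.30, 0.28]},
    index=index,
)

# Keep only the rows for model "m1"; drop_level=False preserves both index levels
df_m1 = df.xs("m1", level="model", drop_level=False)
print(df_m1)
```

A comparable filter could in principle be applied to the DataFrame exposed by the `data` attribute, but `sel()` has the advantage of returning a `SkillTable` rather than a bare DataFrame.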