# Demo: Custom observation types
This demo shows how to use the default observation types and how to define your own.

In [None]:
%config InlineBackend.print_figure_kwargs = {'bbox_inches':None} # otherwise the legend is cut off in IPython inline plots

In [None]:
import metobs_toolkit

#Initialize an empty Dataset
your_dataset = metobs_toolkit.Dataset()

## Default observation types

An observation record must always be linked to an *observation type*, which is specified by the ``Obstype`` class. 
An Obstype represents one observation type (e.g. temperature), and it handles unit conversions and string representations of that observation type. 

By default, a set of standard observation types is stored in a Dataset:

In [None]:
your_dataset.show()

The output shows that an Obstype holds a **standard unit**. This is the preferred unit for storing and visualizing the data. The toolkit converts all observations to their standard unit in all import methods. *(The same holds for Modeldata, which is converted to the standard units upon import.)*

An optional **description** gives more detail on the observation type. 

Multiple **known units** can be defined, as long as the conversion to the standard unit is defined. 

**Aliases** are equivalent names for the same unit. 

Finally, each Obstype has a unique **name** for convenience. You can use this name to refer to the Obstype in the Dataset methods.

As an example, take a look at the temperature observation type to see its standard unit, other known units and aliases:


In [None]:
temperature_obstype = your_dataset.obstypes['temp'] # 'temp' is the name of the observation type
print(temperature_obstype)

temperature_obstype.get_info()
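
To make the relation between standard unit, known units and aliases concrete, here is a small toolkit-independent sketch. The class `SketchObstype`, its methods and the unit names used below are hypothetical and only illustrate the idea; they are not the implementation or the API of `metobs_toolkit.Obstype`.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class SketchObstype:
    """Hypothetical stand-in for an observation type (NOT the toolkit's Obstype)."""

    name: str
    std_unit: str
    description: str = ""
    # known unit -> function converting a value in that unit to the standard unit
    known_units: Dict[str, Callable[[float], float]] = field(default_factory=dict)
    # unit -> equivalent spellings (aliases) of that same unit
    aliases: Dict[str, List[str]] = field(default_factory=dict)

    def resolve_unit(self, unit: str) -> str:
        """Map a unit name, or one of its aliases, to the canonical unit name."""
        if unit == self.std_unit or unit in self.known_units:
            return unit
        for canonical, names in self.aliases.items():
            if unit in names:
                return canonical
        raise ValueError(f"Unknown unit {unit!r} for obstype {self.name!r}")

    def to_standard(self, value: float, unit: str) -> float:
        """Convert a value expressed in any known unit to the standard unit."""
        canonical = self.resolve_unit(unit)
        if canonical == self.std_unit:
            return value
        return self.known_units[canonical](value)


# Illustrative values only; the unit names and aliases are assumptions.
temp = SketchObstype(
    name="temp",
    std_unit="Celsius",
    description="2 m air temperature",
    known_units={
        "Kelvin": lambda x: x - 273.15,
        "Fahrenheit": lambda x: (x - 32.0) / 1.8,
    },
    aliases={"Celsius": ["°C"], "Kelvin": ["K"], "Fahrenheit": ["°F"]},
)

print(temp.resolve_unit("K"))          # alias resolved to its canonical unit
print(temp.to_standard(21.5, "°C"))    # already in the standard unit
print(temp.to_standard(50.0, "°F"))    # converted to the standard unit
```

The point of the design is that every known unit only needs a conversion *to* the standard unit, so all data can be stored and compared in a single unit.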

## Creating and updating observation types
If you want to create a new observation type, you can do this by creating an Obstype and adding it to your (empty) Dataset:

In [None]:
co2_concentration = metobs_toolkit.Obstype(obsname='co2',
                                           std_unit='ppm')

#add other units to it (if needed)
co2_concentration.add_unit(unit_name='ppb',
                           conversion=['x / 1000'], #1 ppb = 0.001 ppm
                          )

#Set a description
co2_concentration.set_description(desc='The CO2 concentration measured at 2m above surface')

#add it to your dataset
your_dataset.add_new_observationtype(co2_concentration)

#You can see the CO2 concentration is now added to the dataset
your_dataset.show()
                                           

You can also update the units of the known observation types:

In [None]:
your_dataset.add_new_unit(obstype = 'temp', 
                          new_unit= 'your_new_unit',
                          conversion_expression = ['x+3', 'x * 2'])
# The conversion means: 1 [your_new_unit] = (1 + 3) * 2 [°C]
your_dataset.obstypes['temp'].get_info()
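
The conversion is given as a list of expressions in `x` that are applied one after the other. The sketch below mimics that chaining in plain Python, so the order of operations is easy to check. The function `apply_conversion` is hypothetical and is not how the toolkit evaluates expressions internally.

```python
from typing import List


def apply_conversion(value: float, expressions: List[str]) -> float:
    """Apply each expression in order, feeding the previous result back in as x."""
    result = value
    for expression in expressions:
        # eval is used only for brevity in this sketch, with trusted input
        # and builtins disabled; do not use it on untrusted strings.
        result = eval(expression, {"__builtins__": {}}, {"x": result})
    return result


# 1 [your_new_unit] -> (1 + 3) * 2 = 8 [standard unit]
print(apply_conversion(1, ["x+3", "x * 2"]))   # prints 8

# 1500 ppb -> 1.5 ppm
print(apply_conversion(1500, ["x / 1000"]))    # prints 1.5
```

Note that the order of the list matters: `['x * 2', 'x+3']` would give `1 * 2 + 3 = 5` instead of 8.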

## Obstypes for Modeldata
### ModelObstype
An extension to the `Obstype` class is the `ModelObstype` class, which is used for interacting with a Google Earth Engine (GEE) dataset. In addition to what a regular `Obstype` holds, a `ModelObstype` stores which band (of the GEE dataset) represents the observation, and it handles the unit conversion from the model unit to the standard unit. 

*Note:* All methods that work on `Obstype` also work on `ModelObstype`.
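
As a rough mental model, a model observation type can be thought of as an observation type plus a band name and a model unit. The sketch below is independent of the toolkit: `SketchModelObstype` and its fields are hypothetical names, and the sample values are made up. The band names `temperature_2m` (in Kelvin) and `total_precipitation` (in m) are taken from the ERA5-Land hourly catalog.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class SketchModelObstype:
    """Hypothetical illustration (NOT the toolkit's ModelObstype)."""

    obsname: str        # name of the observation type
    std_unit: str       # unit in which the data is stored
    model_band: str     # band of the model dataset that holds this variable
    model_unit: str     # unit of the values in that band
    to_std: Callable[[float], float]  # model unit -> standard unit

    def from_band_values(self, bands: Dict[str, float]) -> float:
        """Pick the right band and convert its value to the standard unit."""
        raw_value = bands[self.model_band]
        return self.to_std(raw_value)


temp_in_model = SketchModelObstype(
    obsname="temp",
    std_unit="Celsius",
    model_band="temperature_2m",
    model_unit="Kelvin",
    to_std=lambda kelvin: kelvin - 273.15,
)

# Made-up band values for a single location and timestamp.
sample = {"temperature_2m": 283.15, "total_precipitation": 0.002}
print(round(temp_in_model.from_band_values(sample), 2))
```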


A `ModelObstype` is specific to one GEE dataset. Therefore, the known `ModelObstype`s are stored in each `GeeDynamicDataset`. By default, every Dataset holds an ERA5-land `GeeDynamicDataset`.

In [None]:
your_dataset.gee_datasets

In [None]:
era5_model = your_dataset.gee_datasets['ERA5-land']
era5_model

To see all the known `ModelObstype`s of the era5_model, we can inspect the `modelobstypes` attribute or use the `GeeDynamicDataset.get_info()` method. 

In [None]:
print(era5_model.modelobstypes)
# or
era5_model.get_info()

As an example, we will create a new `ModelObstype` that represents the accumulated precipitation as it is present in the ERA5-land GEE dataset. We then extract precipitation timeseries as a demo.

In [None]:
import pandas as pd
from datetime import datetime
#Create a new observation type
precipitation = metobs_toolkit.Obstype(obsname='cumulated_precip',
                                      std_unit='m',
                                      description='Cumulated total precipitation since midnight per square meter')

#Create the ModelObstype
precip_in_era5 = metobs_toolkit.ModelObstype(
                        obstype=precipitation,
                        model_band='total_precipitation', #look this up: https://developers.google.com/earth-engine/datasets/catalog/ECMWF_ERA5_LAND_HOURLY#bands 
                        model_unit='m',
               )
# Add it to the ERA5 model
era5_model.add_modelobstype(precip_in_era5)

era5_model.modelobstypes


In [None]:

# Import the demo data and metadata into your dataset
your_dataset.import_data_from_file(
                input_data_file=metobs_toolkit.demo_datafile,
                input_metadata_file=metobs_toolkit.demo_metadatafile,
                template_file=metobs_toolkit.demo_template)
metadf = your_dataset.metadf
metadf


In [None]:

# Now we add the metadata to the model 
era5_model.set_metadf(metadf)

# Define a time period
tstart = datetime(2023,1,12)
tend = datetime(2023,1,15)


#Extract timeseries data at the location of the stations
era5_model.extract_timeseries_data(
                    obstypes=['cumulated_precip'],
                    startdt_utc=tstart,
                    enddt_utc=tend)
                

era5_model.get_info()


In [None]:
era5_model.modeldf

In [None]:
era5_model.make_plot(obstype_model='cumulated_precip')

### ModelObstype_Vectorfield
At a specific height, the wind can be seen (by approximation) as a 2D vector field. The vector components are often stored in different bands/variables in a model.

For example, if you want the 10m wind speed from ERA5, you will not find a band for the wind speed itself. There are only bands for the
u and v components of the wind. 

The `ModelObstype_Vectorfield` class represents a model observation type for which no single band exists, but which can be constructed from (orthogonal) components. The vector amplitude and direction are computed, and the corresponding `ModelObstype`s are created.

By default, the *wind* is added as a `ModelObstype_Vectorfield` for the ERA5-land `GeeDynamicDataset`.

In [None]:
era5_model.modelobstypes

So we can see that *wind* corresponds to two bands (the u and v components).

When extracting the wind data from ERA5, the toolkit will:
 1. Download the u and v wind components for your period and locations.
 2. Convert each component to its standard unit (m/s for the wind components).
 3. Compute the amplitude and the direction (in degrees from North, clockwise).
 4. Add a `ModelObstype` for the amplitude and one for the direction.
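
Step 3 above can be sketched in plain Python as follows. This is an illustration and not the toolkit's code: the function name is hypothetical, and it assumes the common meteorological convention in which the direction is the one the wind blows *from*, with u positive towards the east and v positive towards the north. The text above only states "degrees from North, clockwise", so check the toolkit output if the from/to distinction matters for your use case.

```python
import math
from typing import Tuple


def wind_speed_and_direction(u: float, v: float) -> Tuple[float, float]:
    """Return (speed, direction) for wind components u (eastward) and v (northward).

    Assumption: direction is where the wind comes FROM, in degrees
    clockwise from North, within [0, 360).
    """
    speed = math.hypot(u, v)
    # atan2(u, v) is the clockwise angle from North of the direction the wind
    # blows TO; adding 180 degrees gives the direction it comes FROM.
    direction = (math.degrees(math.atan2(u, v)) + 180.0) % 360.0
    return speed, direction


# Wind blowing towards the west (u < 0, v = 0), i.e. an easterly wind.
print(wind_speed_and_direction(-10.0, 0.0))

# Components of 3 m/s and 4 m/s combine to an amplitude of 5 m/s.
print(wind_speed_and_direction(3.0, 4.0)[0])
```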

In [None]:
era5_model.set_metadf(metadf)

tstart = datetime(2023,1,12)
tend = datetime(2023,1,15)

era5_model.extract_timeseries_data(
                    obstypes=['wind'],
                    startdt_utc=tstart,
                    enddt_utc=tend)
                

era5_model.modelobstypes


We can see that *wind_speed* and *wind_direction* are added to the known `ModelObstype`s of the era5_model. The modeldata now contains both as columns.

In [None]:
era5_model.modeldf

In [None]:
era5_model.make_plot(obstype_model='wind_speed')