Improve documentation
gutzbenj committed Oct 1, 2022
1 parent 2bbc8b2 commit d5abcf7
Showing 37 changed files with 2,647 additions and 1,754 deletions.
19 changes: 19 additions & 0 deletions CONTRIBUTING.rst
@@ -0,0 +1,19 @@
Contributions
#############

Thank you for considering contributing to wetterdienst! We are an open community that works respectfully together on
environmental data. We are a colorful mix of people: some of us have an environmental background, others come from
computer science related fields. This also means that our contributions may differ in quality and precision. However,
you are welcome to contribute in any way you can, whether it be

- requesting the implementation of a dataset by providing URLs to the data/metadata source and a rough description of
  the data itself and its meaning to you
- sketching out new weather services and a possible way to access them
- reporting a problem, even a vague one such as "my notebook is running hot and my system seems to have crashed"
- reporting a very specific caching issue that you know exactly how to handle

This also means that contributions can be issues as well as pull requests with specific code changes. When working on
code, you may follow our development guide (see the documentation) to reduce the time needed to set up your environment.

Whenever you reach out to us, Andreas Motl or I (Benjamin Gutzmann) will probably respond within a few days and try to
resolve your problem quickly.
1,540 changes: 1,037 additions & 503 deletions THIRD_PARTY_NOTICES

Large diffs are not rendered by default.

8 changes: 7 additions & 1 deletion docs/contribution/development.rst
@@ -1,6 +1,10 @@
Development
###########

Whether you work on an issue, try to implement a new feature or add a new weather service, you'll need a proper
working environment. The following describes how to set up such an environment and how to satisfy our code quality
rules. Code that violates these rules fails on GitHub CI and blocks a PR.

1. Clone the library and install the environment.

This setup procedure will outline how to install the library and the minimum
@@ -76,8 +80,10 @@ Development
5. Push your changes and submit them as a pull request

Thank you in advance!
That's it, you're almost done! We'd already like to thank you for your commitment!

6. Wait for our feedback. We'll probably come back to you in a few days and let you know if there's anything that may
need some more polishing.

.. note::

3 changes: 1 addition & 2 deletions docs/contribution/index.rst
@@ -3,6 +3,5 @@ Contribution
.. toctree::
:maxdepth: 1

introduction
development
implementing_services
services
12 changes: 0 additions & 12 deletions docs/contribution/introduction.rst

This file was deleted.

263 changes: 263 additions & 0 deletions docs/contribution/services.rst
@@ -0,0 +1,263 @@
Services
########

The core purpose of wetterdienst is to provide data, but as we don't collect the data ourselves, we rely on consuming
data from already existing services, mostly governmental ones. To simplify the implementation of weather services, we
created enumerations and classes that should be used to adapt whatever API a service offers to the general scheme of
wetterdienst, with a handful of attributes that define each API and streamline internal workflows. The following
paragraphs describe how a new weather service should be implemented in wetterdienst, as far as our own experience
goes. We give examples based on the DwdObservationRequest implementation.


Step 1: Enumerations
********************

Enumerations are the basis for implementing a new service. A weather service requires five enumerations:

- parameter enumeration
- unit enumeration
- dataset enumeration
- resolution enumeration
- period enumeration

Parameter enumeration
=====================

The parameter enumeration could look like this:

.. code-block:: python

    from enum import Enum

    from wetterdienst import Resolution
    from wetterdienst.util.parameter import DatasetTreeCore

    class DwdObservationParameter(DatasetTreeCore):
        # the string "MINUTE_1" has to match the name of a resolution, here Resolution.MINUTE_1
        class MINUTE_1(DatasetTreeCore):
            # precipitation
            class PRECIPITATION(Enum):
                QUALITY = "qn"
                PRECIPITATION_HEIGHT = "rs_01"
                PRECIPITATION_HEIGHT_DROPLET = "rth_01"
                PRECIPITATION_HEIGHT_ROCKER = "rwh_01"
                PRECIPITATION_INDEX = "rs_ind_01"

            # flat layer: links to the parameters of the dataset above
            PRECIPITATION_HEIGHT = PRECIPITATION.PRECIPITATION_HEIGHT
            PRECIPITATION_HEIGHT_DROPLET = PRECIPITATION.PRECIPITATION_HEIGHT_DROPLET
            PRECIPITATION_HEIGHT_ROCKER = PRECIPITATION.PRECIPITATION_HEIGHT_ROCKER
            PRECIPITATION_INDEX = PRECIPITATION.PRECIPITATION_INDEX

            class ANOTHER_DATASET(Enum):
                QUALITY = "qn"
                # this parameter can't be accessed via MINUTE_1.PRECIPITATION_HEIGHT
                # but has to be queried with something like
                # parameter=(MINUTE_1.ANOTHER_DATASET.PRECIPITATION_HEIGHT, MINUTE_1.ANOTHER_DATASET)
                PRECIPITATION_HEIGHT = "rs_01"

.. hint::

    Here `MINUTE_1` represents the resolution of the data. Its name has to match one of the names of the core
    resolution enumeration (here `Resolution.MINUTE_1`), because the available parameters are looked up via the
    requested resolution.
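
The name-based lookup described in the hint can be sketched with the standard library alone. This is only an illustration: `Resolution`, `ExampleParameter` and `parameters_for` below are simplified, hypothetical stand-ins and not the actual wetterdienst classes or functions.

```python
from enum import Enum


class Resolution(Enum):
    """Hypothetical stand-in for the core resolution enumeration."""

    MINUTE_1 = "1_minute"
    HOURLY = "hourly"


class ExampleParameter:
    """Hypothetical parameter tree; nested class names mirror Resolution names."""

    class MINUTE_1:
        class PRECIPITATION(Enum):
            PRECIPITATION_HEIGHT = "rs_01"


def parameters_for(resolution: Resolution):
    """Look up the parameter container of a resolution by the resolution's name."""
    try:
        return getattr(ExampleParameter, resolution.name)
    except AttributeError:
        # the tree has no nested class named like this resolution
        raise KeyError(f"no parameters defined for resolution {resolution.name}") from None


minute_1 = parameters_for(Resolution.MINUTE_1)
```

If the nested class were named differently, for example `MINUTELY`, the lookup for `Resolution.MINUTE_1` would fail. That is why the names have to match exactly.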

As the DWD observations are offered in datasets, `DwdObservationParameter` has two layers of parameters:

- a flat layer of all available parameters for a given resolution, with one favored parameter wherever two parameters
  of the same name exist in different datasets
- a deep layer of each dataset and its own parameters
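
The relation between the two layers can be illustrated with plain standard-library enums. The class below is a reduced, hypothetical stand-in for one resolution level (it does not use `DatasetTreeCore`); it only shows that the flat entry is a link to the favored dataset member rather than a copy.

```python
from enum import Enum


class MINUTE_1:
    """Hypothetical stand-in for one resolution level of a parameter tree."""

    class PRECIPITATION(Enum):
        QUALITY = "qn"
        PRECIPITATION_HEIGHT = "rs_01"

    class ANOTHER_DATASET(Enum):
        # same parameter name, but no flat link points to this member
        PRECIPITATION_HEIGHT = "rs_01"

    # flat layer: link to the favored dataset member
    PRECIPITATION_HEIGHT = PRECIPITATION.PRECIPITATION_HEIGHT


flat = MINUTE_1.PRECIPITATION_HEIGHT
deep = MINUTE_1.PRECIPITATION.PRECIPITATION_HEIGHT
other = MINUTE_1.ANOTHER_DATASET.PRECIPITATION_HEIGHT
```

Here `flat` and `deep` are the very same enum member, while `other` is a different member that merely shares the original column name. This is why a parameter of a non-favored dataset has to be requested together with its dataset.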

Here we have a dataset `PRECIPITATION` in `MINUTE_1` resolution, which has four parameters and one quality column.
Those parameters are flattened out by adding links to them on the resolution level. This way we can now access
parameters as follows:

.. code-block:: python

    # PRECIPITATION_HEIGHT of PRECIPITATION dataset
    DwdObservationRequest(
        parameter=DwdObservationParameter.MINUTE_1.PRECIPITATION_HEIGHT
    )

    # same as above
    DwdObservationRequest(
        parameter=DwdObservationParameter.MINUTE_1.PRECIPITATION.PRECIPITATION_HEIGHT
    )

    # PRECIPITATION_HEIGHT of the exact PRECIPITATION dataset, assuming that there would be another dataset with the
    # same parameter
    DwdObservationRequest(
        parameter=(
            DwdObservationParameter.MINUTE_1.PRECIPITATION.PRECIPITATION_HEIGHT,
            DwdObservationParameter.MINUTE_1.PRECIPITATION,
        )
    )

.. hint::

    The values of the enumerations should be the original names of the parameters as used by the service, so that
    renaming mappings can be created from them.
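
How such a renaming mapping could be derived from an enumeration is sketched below. The helper `rename_mapping` and the sample record are hypothetical and only serve as an illustration; the actual renaming in wetterdienst may work differently.

```python
from enum import Enum


class PRECIPITATION(Enum):
    QUALITY = "qn"
    PRECIPITATION_HEIGHT = "rs_01"
    PRECIPITATION_INDEX = "rs_ind_01"


def rename_mapping(dataset):
    """Map each original column name (enum value) to the lowercase enum name."""
    return {member.value: member.name.lower() for member in dataset}


# a raw record as a provider might deliver it (the numbers are made up)
raw_record = {"qn": 3, "rs_01": 0.2, "rs_ind_01": 1}

mapping = rename_mapping(PRECIPITATION)
record = {mapping[key]: value for key, value in raw_record.items()}
```

Because the enum values carry the provider's own column names, no separate translation table has to be maintained next to the enumeration.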

Unit enumeration
================

The unit enumeration has to match the parameter enumeration except that it should only have deep levels. For the above
example it should look like:

.. code-block:: python

    from wetterdienst.util.parameter import DatasetTreeCore
    from wetterdienst.metadata.unit import OriginUnit, SIUnit, UnitEnum

    class DwdObservationUnit(DatasetTreeCore):
        # the string "MINUTE_1" has to match the name of a resolution, here Resolution.MINUTE_1
        class MINUTE_1(DatasetTreeCore):
            # precipitation
            class PRECIPITATION(UnitEnum):
                QUALITY = OriginUnit.DIMENSIONLESS.value, SIUnit.DIMENSIONLESS.value
                PRECIPITATION_HEIGHT = (
                    OriginUnit.MILLIMETER.value,
                    SIUnit.KILOGRAM_PER_SQUARE_METER.value,
                )
                PRECIPITATION_HEIGHT_DROPLET = (
                    OriginUnit.MILLIMETER.value,
                    SIUnit.KILOGRAM_PER_SQUARE_METER.value,
                )
                PRECIPITATION_HEIGHT_ROCKER = (
                    OriginUnit.MILLIMETER.value,
                    SIUnit.KILOGRAM_PER_SQUARE_METER.value,
                )
                PRECIPITATION_INDEX = (
                    OriginUnit.DIMENSIONLESS.value,
                    SIUnit.DIMENSIONLESS.value,
                )

Each parameter is represented by a tuple of the original unit and the SI unit. General conversions are easily
possible with the pint unit system; for other, more complex conversions we may have to define special mappings.
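
To illustrate how a tuple of original unit and SI unit can drive a conversion, here is a minimal sketch that uses a plain lookup table instead of pint. All names (`ExampleUnit`, `CONVERSIONS`, `to_si`), the unit strings and the temperature parameter are assumptions made for this example only. The precipitation factor of 1 follows from 1 mm of water on 1 m² weighing about 1 kg.

```python
from enum import Enum

# hypothetical conversion table: (origin unit, SI unit) -> (factor, offset)
CONVERSIONS = {
    ("millimeter", "kilogram_per_square_meter"): (1.0, 0.0),
    ("degree_celsius", "degree_kelvin"): (1.0, 273.15),
    ("dimensionless", "dimensionless"): (1.0, 0.0),
}


class ExampleUnit(Enum):
    """Hypothetical unit enumeration: each value is (origin unit, SI unit)."""

    PRECIPITATION_HEIGHT = ("millimeter", "kilogram_per_square_meter")
    TEMPERATURE_AIR = ("degree_celsius", "degree_kelvin")
    QUALITY = ("dimensionless", "dimensionless")


def to_si(parameter: ExampleUnit, value: float) -> float:
    """Convert a value from the origin unit of a parameter to its SI unit."""
    factor, offset = CONVERSIONS[parameter.value]
    return value * factor + offset


precipitation_si = to_si(ExampleUnit.PRECIPITATION_HEIGHT, 2.5)
temperature_si = to_si(ExampleUnit.TEMPERATURE_AIR, 20.0)
```

Note that in a plain `Enum`, two members with the same tuple value become aliases of each other, so this sketch deliberately gives every member a distinct tuple.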

Other enumerations
==================

The remaining enumerations are simple enumerations. The only thing to consider is that all names match the ones used
in the parameter enumeration, the resolution enumeration and the period enumeration:

.. code-block:: python

    from enum import Enum

    from wetterdienst import Resolution, Period

    class DwdObservationDataset(Enum):
        # 1_minute
        PRECIPITATION = "precipitation"

    class DwdObservationResolution(Enum):
        # 1_minute
        MINUTE_1 = Resolution.MINUTE_1.value

    class DwdObservationPeriod(Enum):
        # 1_minute
        HISTORICAL = Period.HISTORICAL.value

Step 2: Request class
*********************

The request class represents a request and carries all the required attributes as well as the values class that is
responsible for acquiring the data later on. The implementation is based on `ScalarRequestCore` from `wetterdienst.core`.

Attributes:

.. code-block:: python

    # excerpt: abstract properties that a request class has to define

    @property
    @abstractmethod
    def provider(self) -> Provider:
        """Provider of the data"""
        pass

    @property
    @abstractmethod
    def kind(self) -> Kind:
        """Kind of the data"""
        pass

    @property
    @abstractmethod
    def _resolution_base(self) -> Optional[Resolution]:
        """Optional enumeration for multiple resolutions"""
        pass

    @property
    @abstractmethod
    def _resolution_type(self) -> ResolutionType:
        """Resolution type, multi, fixed, ..."""
        pass

    @property
    @abstractmethod
    def _period_type(self) -> PeriodType:
        """Period type, fixed, multi, ..."""
        pass

    @property
    @abstractmethod
    def _period_base(self) -> Optional[Period]:
        """Period base enumeration from which a period string can be parsed"""
        pass

    @property
    @abstractmethod
    def _parameter_base(self) -> Enum:
        """parameter base enumeration from which parameters can be parsed e.g.
        DWDObservationParameter"""
        pass

    @property
    @abstractmethod
    def _data_range(self) -> DataRange:
        """State whether data from this provider is given in fixed data chunks
        or has to be defined over start and end date"""
        pass

    @property
    @abstractmethod
    def _has_datasets(self) -> bool:
        """Boolean if weather service has datasets (when multiple parameters are stored
        in one table/file)"""
        pass

    @property
    def _unique_dataset(self) -> bool:
        """If ALL parameters are stored in one dataset e.g. all daily data is stored in
        one file"""
        if self._has_datasets:
            raise NotImplementedError("define if only one big dataset is available")
        return False

    @property
    @abstractmethod
    def _has_tidy_data(self) -> bool:
        """If data is generally provided tidy -> then data should not be tidied but
        rather tabulated if data is requested to not being tidy"""
        pass

    @property
    @abstractmethod
    def _unit_tree(self):
        pass

    @property
    @abstractmethod
    def _values(self):
        """Class to get the values for a request"""
        pass

`ScalarRequestCore` has one abstract method that has to be implemented: `_all`, which returns a listing of stations
for the requested datasets/parameters. The listing includes:

- station_id
- from_date
- to_date
- height
- name
- state
- latitude
- longitude
The names can be mapped using the `Columns` enumeration.
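
The pattern behind the request class can be sketched with the standard library. `RequestCore` below is a heavily reduced, hypothetical stand-in for `ScalarRequestCore` with only two of the properties, the column mapping is a plain dictionary instead of the `Columns` enumeration, and the station record is made up.

```python
from abc import ABC, abstractmethod
from enum import Enum


class Provider(Enum):
    """Hypothetical stand-in for the provider enumeration."""

    EXAMPLE = "example"


class RequestCore(ABC):
    """Reduced stand-in for ScalarRequestCore (not the real class)."""

    @property
    @abstractmethod
    def provider(self) -> Provider:
        """Provider of the data"""

    @property
    @abstractmethod
    def _has_datasets(self) -> bool:
        """Whether parameters are grouped into datasets"""

    @abstractmethod
    def _all(self) -> list:
        """Return the station listing"""


class ExampleRequest(RequestCore):
    # plain class attributes are enough to satisfy the abstract properties
    provider = Provider.EXAMPLE
    _has_datasets = False

    # provider-specific column names mapped to the standard names
    _column_mapping = {
        "id": "station_id",
        "start": "from_date",
        "end": "to_date",
        "elevation": "height",
        "station_name": "name",
        "region": "state",
        "lat": "latitude",
        "lon": "longitude",
    }

    def _all(self) -> list:
        # made-up raw listing as a provider might deliver it
        raw = [
            {
                "id": "00001",
                "start": "2000-01-01",
                "end": "2020-12-31",
                "elevation": 100.0,
                "station_name": "Example Station",
                "region": "Example State",
                "lat": 50.0,
                "lon": 10.0,
            }
        ]
        return [{self._column_mapping[key]: value for key, value in row.items()} for row in raw]


stations = ExampleRequest()._all()
```

As long as one of the abstract members is missing, Python refuses to instantiate the class with a `TypeError`, so an incomplete implementation is noticed early.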

Step 3: Values class
*********************

The values class is based on `ScalarValuesCore` and manages the acquisition of the actual data. The request class
references it via the `_values` property. It has to implement the `_collect_station_parameter` method, which takes
care of getting the values of a parameter/dataset for a station id.
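
The division of work between the base class and `_collect_station_parameter` might look like the following sketch. `ValuesCore` is a simplified, hypothetical stand-in for `ScalarValuesCore`; the method signature, the `all` method and the in-memory data are assumptions made for this example.

```python
from abc import ABC, abstractmethod
from enum import Enum


class ExampleDataset(Enum):
    PRECIPITATION = "precipitation"


class ExampleParameter(Enum):
    PRECIPITATION_HEIGHT = "rs_01"


class ValuesCore(ABC):
    """Reduced stand-in for ScalarValuesCore (not the real class)."""

    def __init__(self, station_ids, parameters):
        self.station_ids = station_ids
        self.parameters = parameters  # list of (parameter, dataset) pairs

    @abstractmethod
    def _collect_station_parameter(self, station_id, parameter, dataset):
        """Return the values of one parameter/dataset for one station"""

    def all(self):
        """Collect the values of every requested station and parameter"""
        rows = []
        for station_id in self.station_ids:
            for parameter, dataset in self.parameters:
                rows.extend(self._collect_station_parameter(station_id, parameter, dataset))
        return rows


class ExampleValues(ValuesCore):
    # made-up in-memory "service", keyed by (station id, dataset name)
    _store = {
        ("00001", "precipitation"): [
            ("2022-10-01T00:00", {"rs_01": 0.0}),
            ("2022-10-01T00:01", {"rs_01": 0.1}),
        ]
    }

    def _collect_station_parameter(self, station_id, parameter, dataset):
        records = self._store.get((station_id, dataset.value), [])
        return [
            {
                "station_id": station_id,
                "dataset": dataset.value,
                "parameter": parameter.name.lower(),
                "date": date,
                "value": values[parameter.value],
            }
            for date, values in records
        ]


values = ExampleValues(
    station_ids=["00001", "00002"],
    parameters=[(ExampleParameter.PRECIPITATION_HEIGHT, ExampleDataset.PRECIPITATION)],
)
rows = values.all()
```

The subclass only has to know how to fetch one parameter/dataset for one station; looping over stations and parameters stays in the base class.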
9 changes: 0 additions & 9 deletions docs/data/coverage/dwd/mosmix.rst
@@ -12,15 +12,6 @@ comes with a set of 40 parameters and is published every hour while MOSMIX-L has
of about 115 parameters and is released every 6 hours (3am, 9am, 3pm, 9pm). Both
versions have a forecast limit of 240h.

.. ipython:: python

    from wetterdienst.provider.dwd.mosmix import DwdMosmixRequest

    meta = DwdMosmixRequest.discover(flatten=False)

    # Selection of daily historical data
    print(meta)

.. _Mosmix: https://www.dwd.de/EN/ourservices/met_application_mosmix/met_application_mosmix.html

Structure