Improve documentation

earthobservations · Oct 1, 2022 · d5abcf7 · d5abcf7
1 parent 2bbc8b2
commit d5abcf7
Showing 37 changed files with 2,647 additions and 1,754 deletions.
diff --git a/CONTRIBUTING.rst b/CONTRIBUTING.rst
@@ -0,0 +1,19 @@
+Contributions
+#############
+
+Thank you for considering contributing to wetterdienst! We are an open community that works respectfully together on
+environmental data. We are a colorful mix of people: some of us have an environmental background, others come from
+computer science related fields. This also means that our contributions may differ in quality and precision; however,
+we welcome you to contribute in any possible way, whether it be
+
+- requesting the implementation of a dataset by providing URLs to the data/metadata source and rough descriptions of
+  the data itself and its meaning to you
+- sketching out new weather services and a possible way to access them
+- "my notebook is running hot and my system seems to have crashed"
+- you have spotted a very specific caching issue and know exactly how to handle it
+
+Contributions can therefore be issues as well as pull requests with specific code changes. When working on code, you
+may follow our development guide (documentation) to reduce the time needed to set up your environment.
+
+Whenever you reach out to us, probably Andreas Motl or me (Benjamin Gutzmann) will respond to you within a few days and
+try to resolve your problem quickly.
diff --git a/THIRD_PARTY_NOTICES b/THIRD_PARTY_NOTICES
diff --git a/docs/contribution/development.rst b/docs/contribution/development.rst
@@ -1,6 +1,10 @@
 Development
 ###########
 
+Whether you work on an issue, try to implement a new feature or work on adding a new weather service, you'll need a
+proper working environment. The following describes how to set up such an environment and how to satisfy our code
+quality rules, whose violation would otherwise fail on GitHub CI and block a PR.
+
 1. Clone the library and install the environment.
 
    This setup procedure will outline how to install the library and the minimum
@@ -76,8 +80,10 @@ Development
 
 5. Push your changes and submit them as pull request
 
-   Thank you in advance!
+   That's it, you're almost done! We'd already like to thank you for your commitment!
 
+6. Wait for our feedback. We'll probably come back to you in a few days and let you know if there's anything that may
+   need some more polishing.
 
 .. note::
 

diff --git a/docs/contribution/implementing_services.rst b/docs/contribution/implementing_services.rst
diff --git a/docs/contribution/index.rst b/docs/contribution/index.rst
@@ -3,6 +3,5 @@ Contribution
 .. toctree::
    :maxdepth: 1
 
-   introduction
    development
-   implementing_services
+   services
diff --git a/docs/contribution/introduction.rst b/docs/contribution/introduction.rst
diff --git a/docs/contribution/services.rst b/docs/contribution/services.rst
@@ -0,0 +1,263 @@
+Services
+########
+
+The core of wetterdienst is to provide data, but as we don't collect the data ourselves, we rely on consuming data from
+already existing services - mostly governmental services. To simplify the implementation of weather services we created
+enumerations and classes that should be used to adapt whatever API a service offers to the general scheme of
+wetterdienst, with a handful of attributes to define each API and streamline internal workflows. The following
+paragraphs describe how to implement a new weather service in wetterdienst as far as our own experience goes. We'll
+give examples based on the DwdObservationRequest implementation.
+
+
+Step 1: Enumerations
+********************
+
+The basis for the implementation of a new service is a set of enumerations. A weather service requires five of them:
+
+- parameter enumeration
+- unit enumeration
+- dataset enumeration
+- resolution enumeration and period enumeration
+
+Parameter enumeration
+=====================
+
+The parameter enumeration could look like this:
+
+.. code-block:: python
+
+    from enum import Enum
+    from wetterdienst.util.parameter import DatasetTreeCore
+
+    class DwdObservationParameter(DatasetTreeCore):
+        # the string "MINUTE_1" has to match the name of a resolution, here Resolution.MINUTE_1
+        class MINUTE_1(DatasetTreeCore):
+            # precipitation
+            class PRECIPITATION(Enum):
+                QUALITY = "qn"
+                PRECIPITATION_HEIGHT = "rs_01"
+                PRECIPITATION_HEIGHT_DROPLET = "rth_01"
+                PRECIPITATION_HEIGHT_ROCKER = "rwh_01"
+                PRECIPITATION_INDEX = "rs_ind_01"
+
+            PRECIPITATION_HEIGHT = PRECIPITATION.PRECIPITATION_HEIGHT
+            PRECIPITATION_HEIGHT_DROPLET = PRECIPITATION.PRECIPITATION_HEIGHT_DROPLET
+            PRECIPITATION_HEIGHT_ROCKER = PRECIPITATION.PRECIPITATION_HEIGHT_ROCKER
+            PRECIPITATION_INDEX = PRECIPITATION.PRECIPITATION_INDEX
+
+            class ANOTHER_DATASET(Enum):
+                QUALITY = "qn"
+                # this parameter can't be accessed via MINUTE_1.PRECIPITATION_HEIGHT
+                # but has to be queried with something like
+                # parameter=(MINUTE_1.ANOTHER_DATASET.PRECIPITATION_HEIGHT, MINUTE_1.ANOTHER_DATASET)
+                PRECIPITATION_HEIGHT = "rs_01"
+
+.. hint::
+
+    Here `MINUTE_1` represents the resolution of the data. Its name has to match one of the names of the core
+    resolution enumeration (here it matches Resolution.MINUTE_1), because the possible parameters are looked up via
+    the requested resolution.
+
+As the DWD observations are offered in datasets, `DwdObservationParameter` has two layers of parameters:
+
+- flat layer of all parameters available for a resolution, with a favorite where datasets share a parameter name
+- deep layer of a dataset and its own parameters
+
+Here we have a dataset `PRECIPITATION` in `MINUTE_1` resolution, which has four parameters and one quality column.
+Those parameters are flattened out by adding links to them on the resolution level. This way we can now access
+parameters as follows:
+
+.. code-block:: python
+
+    # PRECIPITATION_HEIGHT of PRECIPITATION dataset
+    DwdObservationRequest(
+        parameter=DwdObservationParameter.MINUTE_1.PRECIPITATION_HEIGHT
+    )
+
+    # same as above
+    DwdObservationRequest(
+        parameter=DwdObservationParameter.MINUTE_1.PRECIPITATION.PRECIPITATION_HEIGHT
+    )
+
+    # PRECIPITATION_HEIGHT of the exact PRECIPITATION dataset, assuming that there would be another dataset with the
+    # same parameter
+    DwdObservationRequest(
+        parameter=(DwdObservationParameter.MINUTE_1.PRECIPITATION.PRECIPITATION_HEIGHT, DwdObservationParameter.MINUTE_1.PRECIPITATION)
+    )
+
+.. hint::
+
+    The values of the enumerations should be the original parameter names so that renaming mappings can be built.
+
+Unit enumeration
+================
+
+The unit enumeration has to match the parameter enumeration except that it should only have deep levels. For the above
+example it should look like:
+
+.. code-block:: python
+
+    from wetterdienst.util.parameter import DatasetTreeCore
+    from wetterdienst.metadata.unit import OriginUnit, SIUnit, UnitEnum
+
+    class DwdObservationUnit(DatasetTreeCore):
+        # the string "MINUTE_1" has to match the name of a resolution, here Resolution.MINUTE_1
+        class MINUTE_1(DatasetTreeCore):
+            # precipitation
+            class PRECIPITATION(UnitEnum):
+                QUALITY = OriginUnit.DIMENSIONLESS.value, SIUnit.DIMENSIONLESS.value
+                PRECIPITATION_HEIGHT = (
+                    OriginUnit.MILLIMETER.value,
+                    SIUnit.KILOGRAM_PER_SQUARE_METER.value,
+                )
+                PRECIPITATION_HEIGHT_DROPLET = (
+                    OriginUnit.MILLIMETER.value,
+                    SIUnit.KILOGRAM_PER_SQUARE_METER.value,
+                )
+                PRECIPITATION_HEIGHT_ROCKER = (
+                    OriginUnit.MILLIMETER.value,
+                    SIUnit.KILOGRAM_PER_SQUARE_METER.value,
+                )
+                PRECIPITATION_INDEX = (
+                    OriginUnit.DIMENSIONLESS.value,
+                    SIUnit.DIMENSIONLESS.value,
+                )
+
+Each parameter is represented by a tuple with the original unit and the SI unit. General conversions are easily
+possible with the pint unit system, and for more complex conversions we may have to define special mappings.
+
+Other enumerations
+==================
+
+The remaining enumerations are simple enumerations. The only thing to consider here is that all names match the ones
+used in the parameter enumeration, the resolution enumeration and the period enumeration:
+
+.. code-block:: python
+
+    from enum import Enum
+    from wetterdienst import Resolution, Period
+
+    class DwdObservationDataset(Enum):
+        # 1_minute
+        PRECIPITATION = "precipitation"
+
+    class DwdObservationResolution(Enum):
+        # 1_minute
+        MINUTE_1 = Resolution.MINUTE_1.value
+
+    class DwdObservationPeriod(Enum):
+        # 1_minute
+        HISTORICAL = Period.HISTORICAL.value
+
+Step 2: Request class
+*********************
+
+The request class represents a request and carries all the required attributes as well as the values class that is
+responsible for acquiring the data later on. The implementation is based on `ScalarRequestCore` from `wetterdienst.core`.
+
+Attributes:
+
+.. code-block:: python
+
+    @property
+    @abstractmethod
+    def provider(self) -> Provider:
+        """Provider enumeration member that identifies the weather service"""
+        pass
+
+    @property
+    @abstractmethod
+    def kind(self) -> Kind:
+        """Kind enumeration member that describes the type of data offered"""
+        pass
+
+    @property
+    @abstractmethod
+    def _resolution_base(self) -> Optional[Resolution]:
+        """Optional enumeration for multiple resolutions"""
+        pass
+
+    @property
+    @abstractmethod
+    def _resolution_type(self) -> ResolutionType:
+        """Resolution type, multi, fixed, ..."""
+        pass
+
+    @property
+    @abstractmethod
+    def _period_type(self) -> PeriodType:
+        """Period type, fixed, multi, ..."""
+        pass
+
+    @property
+    @abstractmethod
+    def _period_base(self) -> Optional[Period]:
+        """Period base enumeration from which a period string can be parsed"""
+        pass
+
+    @property
+    @abstractmethod
+    def _parameter_base(self) -> Enum:
+        """Parameter base enumeration from which parameters can be parsed e.g.
+        DwdObservationParameter"""
+        pass
+
+    @property
+    @abstractmethod
+    def _data_range(self) -> DataRange:
+        """State whether data from this provider is given in fixed data chunks
+        or has to be defined over start and end date"""
+        pass
+
+    @property
+    @abstractmethod
+    def _has_datasets(self) -> bool:
+        """Boolean if weather service has datasets (when multiple parameters are stored
+        in one table/file)"""
+        pass
+
+    @property
+    def _unique_dataset(self) -> bool:
+        """If ALL parameters are stored in one dataset e.g. all daily data is stored in
+        one file"""
+        if self._has_datasets:
+            raise NotImplementedError("define if only one big dataset is available")
+        return False
+
+    @property
+    @abstractmethod
+    def _has_tidy_data(self) -> bool:
+        """If data is generally provided tidy, it should not be tidied again but
+        rather tabulated if non-tidy data is requested"""
+        pass
+
+    @property
+    @abstractmethod
+    def _unit_tree(self):
+        pass
+
+    @property
+    @abstractmethod
+    def _values(self):
+        """Class to get the values for a request"""
+        pass
+
+`ScalarRequestCore` has one abstract method, `_all`, which lists stations for the requested datasets/parameters with:
+
+- station_id
+- from_date
+- to_date
+- height
+- name
+- state
+- latitude
+- longitude
+
+The names can be mapped using the `Columns` enumeration.
+
+Step 3: Values class
+*********************
+
+The values class is based on `ScalarValuesCore` and manages the acquisition of actual data. The
+class is also part of the `ScalarRequestCore`, where it is accessed via the `_values` property. It has to implement the
+`_collect_station_parameter` method that takes care of getting values of a parameter/dataset for a station id.
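A note on the services.rst hunk above: the two-layer parameter tree from Step 1 can be tried out with the standard library alone. The sketch below is not wetterdienst code. `ToyParameter` is a hypothetical stand-in that uses plain classes where the documentation uses `DatasetTreeCore`, and it only assumes that the flat entries are ordinary aliases of the dataset members.

```python
from enum import Enum


class ToyParameter:  # hypothetical stand-in for a DatasetTreeCore subclass
    class MINUTE_1:  # resolution level, named like Resolution.MINUTE_1
        class PRECIPITATION(Enum):  # deep layer: one dataset and its columns
            QUALITY = "qn"
            PRECIPITATION_HEIGHT = "rs_01"

        # flat layer: an alias that points at the very same enum member
        PRECIPITATION_HEIGHT = PRECIPITATION.PRECIPITATION_HEIGHT


flat = ToyParameter.MINUTE_1.PRECIPITATION_HEIGHT
deep = ToyParameter.MINUTE_1.PRECIPITATION.PRECIPITATION_HEIGHT

# both access paths yield the same object, so the owning dataset stays recoverable
same_member = flat is deep
dataset_name = type(flat).__name__

# the enum values keep the original column names, which gives a renaming mapping
rename = {member.value: member.name.lower() for member in ToyParameter.MINUTE_1.PRECIPITATION}

print(same_member, dataset_name, rename)
```

In this toy version the alias and the dataset member are one object, so either access path identifies the same parameter and the dataset it belongs to; how the real request class resolves the two forms is not shown here.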
diff --git a/docs/data/coverage/dwd/mosmix.rst b/docs/data/coverage/dwd/mosmix.rst
@@ -12,15 +12,6 @@ comes with a set of 40 parameters and is published every hour while MOSMIX-L has
 of about 115 parameters and is released every 6 hours (3am, 9am, 3pm, 9pm). Both
 versions have a forecast limit of 240h.
 
-.. ipython:: python
-
-    from wetterdienst.provider.dwd.mosmix import DwdMosmixRequest
-
-    meta = DwdMosmixRequest.discover(flatten=False)
-
-    # Selection of daily historical data
-    print(meta)
-
 .. _Mosmix: https://www.dwd.de/EN/ourservices/met_application_mosmix/met_application_mosmix.html
 
 Structure