diff --git a/README.rst b/README.rst index 7878ee282..7fe012f5b 100644 --- a/README.rst +++ b/README.rst @@ -1,6 +1,6 @@ -################################################################# -Wetterdienst - Python library to ease access to open weather data -################################################################# +########################################### +Wetterdienst - Open weather data for humans +########################################### .. image:: https://github.com/earthobservations/wetterdienst/workflows/Tests/badge.svg :target: https://github.com/earthobservations/wetterdienst/actions?workflow=Tests @@ -27,14 +27,23 @@ Wetterdienst - Python library to ease access to open weather data :target: https://zenodo.org/badge/latestdoi/160953150 -Welcome to Wetterdienst, your friendly weather service library for Python from the -neighbourhood! We are a group of people who try to make access to weather data in +Introduction +************ +Welcome to Wetterdienst, your friendly weather service library for Python. + +We are a group of like-minded people trying to make access to weather data in Python feel like a warm summer breeze, similar to other projects like -`rdwd `_ -for the R language, which originally drew our interest in this project. +rdwd_ for the R language, which originally drew our interest in this project. + +While our long-term goal is to provide access to multiple weather services, +we are still stuck with the German Weather Service (DWD). Contributions are +always welcome! + +This program and its repository try to use modern Python technologies +all over the place. The library is based on Pandas across the board, +uses Poetry for package administration and GitHub Actions for +all things CI. -While our long-term goal is to provide you with data from multiple weather services, -we are still stuck with the German Weather Service (DWD). 
Features ******** @@ -50,7 +59,6 @@ The library currently covers To get better insight on which data we have currently made available, with this library take a look at `data coverage`_. -.. _data coverage: https://wetterdienst.readthedocs.io/en/latest/pages/data_coverage.html Details ======= @@ -115,31 +123,17 @@ documentation, which will be constantly updated. To stay up to date with the development, take a look at the changelog_. Also, don't miss out our examples_. -.. _Wetterdienst API: https://wetterdienst.readthedocs.io/en/latest/pages/api.html -.. _changelog: https://wetterdienst.readthedocs.io/en/latest/pages/api.html -.. _examples: https://github.com/earthobservations/wetterdienst/tree/master/example - - -Contribution -************ -Check out our contribution section in the documentation! For a successful PR passing -all tests, you have to run - -.. code-block:: bash - - nox -s tests - nox -s black - nox -s lint - -before committing. This will inform you in case of problems with tests and your code -format. - - Data license ************ -**CAUTION** Although the data is specified as being open, the DWD asks you to reference them as -Copyright owner. To check out further, take a look at the -`Open Data Strategy at the DWD `_ -and the -`Official Copyright `_. +copyright owner. Please take a look at the `Open Data Strategy at the DWD`_ and the +`Official Copyright`_ statements before using the data. + + +.. _rdwd: https://github.com/brry/rdwd +.. _Wetterdienst API: https://wetterdienst.readthedocs.io/en/latest/pages/api.html +.. _data coverage: https://wetterdienst.readthedocs.io/en/latest/pages/data_coverage.html +.. _changelog: https://wetterdienst.readthedocs.io/en/latest/pages/api.html +.. _examples: https://github.com/earthobservations/wetterdienst/tree/master/example +.. _Open Data Strategy at the DWD: https://www.dwd.de/EN/ourservices/opendata/opendata.html +.. 
_Official Copyright: https://www.dwd.de/EN/service/copyright/copyright_artikel.html?nn=495490&lsbId=627548 diff --git a/docs/index.rst b/docs/index.rst index 88a399e7b..16c54a248 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -3,30 +3,30 @@ You can adapt this file completely to your liking, but it should at least contain the root `toctree` directive. -################################################################# -Wetterdienst - Python library to ease access to open weather data -################################################################# +########################################### +Wetterdienst - Open weather data for humans +########################################### +***** +About +***** .. toctree:: :maxdepth: 1 - README + Introduction pages/installation pages/data_coverage - pages/api + API overview pages/cli + pages/license_and_citation +******** +Plumbing +******** .. toctree:: :maxdepth: 1 + pages/library/index pages/behind_the_scenes pages/development pages/changelog - pages/license_and_citation - -Indices and tables -################## - -* :ref:`genindex` -* :ref:`modindex` -* :ref:`search` diff --git a/docs/pages/api.rst b/docs/pages/api.rst index e68e72082..d77f57e0c 100644 --- a/docs/pages/api.rst +++ b/docs/pages/api.rst @@ -1,26 +1,47 @@ ### API ### -The API is divided amongst the data products as written in the :ref:`data-coverage` chapter. +The API offers access to different data products. They are +outlined in more detail within the :ref:`data-coverage` chapter. -*********************** -Historical Weather Data -*********************** -The API for the historical weather data mainly consists of the following functions: +.. contents:: + :local: + :depth: 1 -``discover_climate_observations`` -================================= -- Print out available time resolution, parameter, period type combinations and - subsets of it depending on the entered arguments. 
+************ +Observations +************ +Acquire historical weather data through requesting by +*parameter*, *time resolution* and *period type*. -``metadata_for_climate_observations`` -===================================== -- Discover what data for a set of parameters (parameter, time_resolution, - period_type) is available, especially which stations can be found. -- With **create_new_file_index**, the function can be forced to retrieve a new list - of files from the server, which is usually avoided as it rarely changes. +Request arguments +================= +The options *parameter*, *time resolution* and *period type* can be used in three ways: -Let's get station information for a given parameter, time resolution and period type: +- by using the exact enumeration e.g. + .. code-block:: python + + Parameter.CLIMATE_SUMMARY + +- by using the enumeration string e.g. + .. code-block:: python + + "climate_summary" or "CLIMATE_SUMMARY" + +- by using the originally defined parameter string e.g. + .. code-block:: python + + "kl" + +Use ``wetterdienst.discover_climate_observations()`` to discover available +time resolution, parameter, period type combinations and their subsets +based on the obtained filter arguments. + + +Station list +============ +Get station information for a given set of *parameter*, *time resolution* +and *period type* options. .. code-block:: python @@ -33,19 +54,22 @@ Let's get station information for a given parameter, time resolution and period period_type=PeriodType.HISTORICAL ) -The function returns a pandas DataFrame with information about the available stations, -including the column **HAS_FILE**, that indicates if the station has a file with data on -the server (which may not always be the case!). +The function returns a Pandas DataFrame with information about the available stations. +The column ``HAS_FILE`` indicates whether the station actually has a file with data on +the server. 
That might not always be the case for stations which have been phased out. +When using ``create_new_file_index=True``, the function can be forced to retrieve +a new list of files from the server. Otherwise, data will be served from the +cache because this information rarely changes. -``DWDStationRequest`` -===================== +Measurements +============ +Use the ``DWDStationRequest`` class in order to get hold of measurement information. -Synopsis --------- .. code-block:: python - from wetterdienst import DWDStationRequest, Parameter, PeriodType, TimeResolution + from wetterdienst import DWDStationRequest + from wetterdienst import Parameter, PeriodType, TimeResolution request = DWDStationRequest( station_ids=[3, 1048], @@ -66,74 +90,20 @@ This gives us the most options to work with the data, getting multiple parameter once, parsed nicely into column structure with improved parameter names and stored automatically on the drive if wanted. -Details -------- -- A class that can combine multiple periods/date ranges for any number of stations - and parameters of one time resolution. -- Wraps ``collect_climate_observations_data``: - - - Combines create_file_list_for_dwd_server, download_dwd_data and - parse_dwd_data for multiple stations - - Wraps the following three functions: - - - ``create_file_list_for_climate_observations`` - - is used with the help of the metadata to retrieve file paths to - files for a set of parameters + station id - - here also **create_new_file_index** can be used - - - ``download_climate_observations_data_parallel`` - - is used with the created file paths to download and store the data - (second os optionally, in a hdf) - - - ``parse_climate_observations_data`` - - is used to get the data into the Python environment in - shape of a pandas DataFrame. - - the data will be ready to be analyzed by you! 
- -Additionally, the following functions allow you to reset the cache of the file/meta index: - -- **reset_file_index_cache:** - - reset the cached file index to get latest list of files (only required for - constantly running system) - -- **reset_meta_index_cache:** - - reset the cached meta index to get latest list of files (only required for - constantly running system) - -Parameter, time resolution and period type can be entered in three ways: - -- by using the exact enumeration e.g. - .. code-block:: python - - Parameter.CLIMATE_SUMMARY - -- by using the enumeration string e.g. - .. code-block:: python - - "climate_summary" or "CLIMATE_SUMMARY" - -- by using the originally defined parameter string e.g. - .. code-block:: python - - "kl" - - -****************** Geospatial support -****************** +================== + +Inquire the list of stations by geographic coordinates. -``get_nearby_stations`` -======================= -- Calculate the close weather stations based on the coordinates for the requested data. -- Either selected by rank (n stations) or by distance in km. -- It returns a DataFrame with meta data, distances [in km] and station ids - that can be used to download the data. +- Calculate weather stations close to the given coordinates and set of parameters. +- Either select by rank (n stations) or by distance in km. .. code-block:: python from datetime import datetime - from wetterdienst import get_nearby_stations, DWDStationRequest, Parameter, PeriodType, TimeResolution + from wetterdienst import get_nearby_stations, DWDStationRequest + from wetterdienst import Parameter, PeriodType, TimeResolution stations = get_nearby_stations( 50.0, 8.9, @@ -145,14 +115,16 @@ Geospatial support num_stations_nearby=1 ) -The function returns a meta data DataFrame, where we can find -weather station ids and distances to get the observation data: +The function returns a DataFrame with the list of stations with distances [in km] +to the given coordinates. 
+ +The station ids within the DataFrame: .. code-block:: python station_ids = stations.STATION_ID.unique() -Use these station ids to retrieve weather information: +can be used to download the observation data: .. code-block:: python @@ -174,7 +146,7 @@ Use these station ids to retrieve weather information: Et voila: We just got the data we wanted for our location and are ready to analyse the temperature on historical developments. -Check out the more advanced examples in the +Please also check out more advanced examples in the `example `_ folder on Github. diff --git a/docs/pages/behind_the_scenes.rst b/docs/pages/behind_the_scenes.rst index 74f9a504e..7311f2911 100644 --- a/docs/pages/behind_the_scenes.rst +++ b/docs/pages/behind_the_scenes.rst @@ -1,5 +1,36 @@ -Behind The Scenes ################# +Behind the scenes +################# + +Details +------- +- The ``DWDStationRequest`` class can combine multiple periods/date ranges + for any number of stations and parameters of one time resolution. +- It wraps ``collect_climate_observations_data``, which in turn combines + ``create_file_list_for_climate_observations``, ``download_climate_observations_data_parallel`` + and ``parse_climate_observations_data`` for multiple stations. + + - ``create_file_list_for_climate_observations`` + - is used with the help of the metadata to retrieve file paths to + files for a set of parameters + station id + - here also **create_new_file_index** can be used + + - ``download_climate_observations_data_parallel`` + - is used with the created file paths to download and store the data + (storing is optional, in an HDF file) + + - ``parse_climate_observations_data`` + - is used to get the data into the Python environment in + shape of a pandas DataFrame. + - the data will be ready to be analyzed by you! 
+ + +Additionally, the following functions allow you to reset the cache of the file/meta index: + +- **reset_file_index_cache:** + - reset the cached file index to get latest list of files (only required for + constantly running system) -Let's look at some anomalies that happen to be on the file server and how we manage to -get along with them. +- **reset_meta_index_cache:** + - reset the cached meta index to get latest list of files (only required for + constantly running system) diff --git a/docs/pages/cli.rst b/docs/pages/cli.rst index f9f6d8806..13963ecbb 100644 --- a/docs/pages/cli.rst +++ b/docs/pages/cli.rst @@ -1,6 +1,7 @@ ###################### Command line interface ###################### + :: $ wetterdienst --help diff --git a/docs/pages/data_coverage.rst b/docs/pages/data_coverage.rst index b39257622..29609293b 100644 --- a/docs/pages/data_coverage.rst +++ b/docs/pages/data_coverage.rst @@ -1,7 +1,7 @@ .. _data-coverage: ############# -Data Coverage +Data coverage ############# The DWD offers various datasets including but not only: diff --git a/docs/pages/development.rst b/docs/pages/development.rst index b8f55c2f8..333df2905 100644 --- a/docs/pages/development.rst +++ b/docs/pages/development.rst @@ -2,24 +2,33 @@ Development ########### -I/We originally started rebuilding -`rdwd `_ -in Python as a starting project, but soon got accompanied by others to make this work -as flawless as we can. We are always looking for others to join and bring in their own -ideas so please consider writing us! Below you can find more about contribution and the + +************ +Introduction +************ +We originally started rebuilding rdwd_ in Python as a starting project, +but soon got accompanied by others to make this work as flawless as we can. + +We are always looking for others to join and bring in their own ideas so +please consider writing us! Below you can find more about contribution and the most recent changelog of the library. +.. 
_rdwd: https://github.com/brry/rdwd + + ************ Contribution ************ - As we are currently keeping development simple, so don't worry to much about style. If you want a PR to be merged, describe what you changed at best precision and we guarantee -a fast merge. Otherwise if you have an idea of a problem or even better a solution just +a fast merge. + +Otherwise, if you have an idea of a problem or even better a solution just let us know via an issue (you could also describe problem with words so we can figure out how to solve it with a suitable programming solution). -For development clone the repository and install developer dependencies via +For working on the code base, please clone the repository and install development +dependencies. .. code-block:: bash @@ -30,7 +39,8 @@ For development clone the repository and install developer dependencies via # or poetry install -Before committing, run black code formatter and lint to test for format. +Before committing, run the black code formatter and the linter to test for appropriate formatting. +This will inform you in case of problems with tests and your code format. .. code-block:: bash @@ -42,4 +52,4 @@ In order to run the tests more **quickly**:: poetry install --extras=excel poetry shell - pytest -vvvv -m "not (remote or slow) + pytest -vvvv -m "not (remote or slow)" diff --git a/docs/pages/installation.rst b/docs/pages/installation.rst index dcbe9ef3f..5f3108f1b 100644 --- a/docs/pages/installation.rst +++ b/docs/pages/installation.rst @@ -1,26 +1,31 @@ -Installation -############ +##### +Setup +##### +Wetterdienst can be used by either installing it on +your workstation or within a Docker container. -The installation of wetterdienst can happen via PyPi or directly from Github. The Github -version will always include most recent changes that may not have been released to PyPi. -PyPi +****** +Native +****** +The installation of ``wetterdienst`` can happen via PyPI or directly from GitHub. 
The GitHub +version will always include most recent changes that may not have been released to PyPI. + +PyPI .. code-block:: bash pip install wetterdienst -Github +GitHub .. code-block:: bash pip install git+https://github.com/earthobservations/wetterdienst -If you think that any constraints we have set for the library in the pyproject.toml -may have to be updated/improved, please come back to us via mail or place an issue on -Github. +****** Docker ****** @@ -35,29 +40,19 @@ To run the tests in the given environment, just call .. code-block:: bash - docker run -ti -v $(pwd):/app wetterdienst:latest poetry run pytest tests + docker run -ti -v $(pwd):/app wetterdienst:latest poetry run pytest -vvvv tests -from the main directory. To work in an iPython shell call +from the main directory. To work in an iPython shell, invoke .. code-block:: bash docker run -ti -v $(pwd):/app wetterdienst:latest poetry run ipython Command line script -******************* - -You can download data as csv files after building docker container. -Currently, only the `collect_dwd_data` is supported by this service. - -.. code-block:: bash - - docker run \ - -ti -v $(pwd):/app wetterdienst:latest poetry run python wetterdienst/run.py \ - collect_dwd_data "[1048]" "kl" "daily" "historical" /app/dwd_data/ False False True False True True - +=================== -The `wetterdienst` command is also available through Docker: +The ``wetterdienst`` command is also available through Docker: .. code-block:: bash - docker run -ti -v $(pwd):/app wetterdienst:latest poetry run wetterdienst + docker run -ti -v $(pwd):/app wetterdienst:latest poetry run wetterdienst --help diff --git a/docs/pages/library/api.rst b/docs/pages/library/api.rst new file mode 100644 index 000000000..0330b16d8 --- /dev/null +++ b/docs/pages/library/api.rst @@ -0,0 +1,30 @@ +### +API +### + +.. contents:: + :local: + :depth: 1 + +---- + +************ +Observations +************ + +.. 
autofunction:: wetterdienst.discover_climate_observations + +.. autofunction:: wetterdienst.metadata_for_climate_observations + +.. autoclass:: wetterdienst.api.DWDStationRequest + :members: + +.. autofunction:: wetterdienst.get_nearby_stations + + +******* +RADOLAN +******* + +.. autoclass:: wetterdienst.api.DWDRadolanRequest + :members: diff --git a/docs/pages/library/core.rst b/docs/pages/library/core.rst new file mode 100644 index 000000000..86b2e25b5 --- /dev/null +++ b/docs/pages/library/core.rst @@ -0,0 +1,57 @@ +#### +Core +#### + +.. contents:: + :local: + :depth: 1 + +---- + +.. todo:: Add more modules. + + +*********** +Additionals +*********** + +.. automodule:: wetterdienst.additionals.functions + :members: + +.. automodule:: wetterdienst.additionals.geo_location + :members: + +.. automodule:: wetterdienst.additionals.time_handling + :members: + + +*********** +Data models +*********** + +.. automodule:: wetterdienst.data_models.coordinates + :members: + + +******** +Download +******** + +.. automodule:: wetterdienst.download.download + :members: + +.. automodule:: wetterdienst.download.download_services + :members: + +.. automodule:: wetterdienst.download.https_handling + :members: + +************ +Enumerations +************ + +.. automodule:: wetterdienst.enumerations.column_names_enumeration + :members: + +.. automodule:: wetterdienst.enumerations.datetime_format_enumeration + :members: diff --git a/docs/pages/library/index.rst b/docs/pages/library/index.rst new file mode 100644 index 000000000..996c1c99a --- /dev/null +++ b/docs/pages/library/index.rst @@ -0,0 +1,18 @@ +############## +Module library +############## + + +* :ref:`genindex` +* :ref:`modindex` + + +********** +Subsystems +********** +.. 
toctree:: + :maxdepth: 1 + + api + machinery + core diff --git a/docs/pages/library/machinery.rst b/docs/pages/library/machinery.rst new file mode 100644 index 000000000..8c75bc8e1 --- /dev/null +++ b/docs/pages/library/machinery.rst @@ -0,0 +1,12 @@ +######### +Machinery +######### + +.. automodule:: wetterdienst.data_collection + :members: + +.. automodule:: wetterdienst.data_storing + :members: + +.. automodule:: wetterdienst.parse_metadata + :members: diff --git a/docs/pages/license_and_citation.rst b/docs/pages/license_and_citation.rst index 9baec82c6..8a589fcbf 100644 --- a/docs/pages/license_and_citation.rst +++ b/docs/pages/license_and_citation.rst @@ -1,4 +1,5 @@ -License And Citation +#################### +License and citation #################### .. include:: ../../LICENSE.rst diff --git a/wetterdienst/additionals/functions.py b/wetterdienst/additionals/functions.py index f9f80a6f3..5c9572733 100644 --- a/wetterdienst/additionals/functions.py +++ b/wetterdienst/additionals/functions.py @@ -294,14 +294,12 @@ def parse_enumeration_from_template( ) -> Union[Parameter, TimeResolution, PeriodType]: """ Function used to parse an enumeration(string) to a enumeration based on a template - Args: - enum_: enumeration as string or Enum - enum_template: base enumeration from which the enumeration is parsed - Returns: - parsed enumeration from template - Raises: - InvalidParameter if no matching enumeration found + :param "enum_": Enumeration as string or Enum + :param enum_template: Base enumeration from which the enumeration is parsed + + :return: Parsed enumeration from template + :raises InvalidParameter: if no matching enumeration found """ try: return enum_template[enum_.upper()] @@ -347,17 +345,18 @@ def discover_climate_observations( period_type: Optional[PeriodType] = None, ) -> str: """ - Function to print/discover available time_resolution/parameter/period_type - combinations. 
- - Args: - time_resolution: time_resolution to reduce the information - parameter: parameter to reduce the information - period_type: period_type to reduce the information - - Returns: - string of available combinations - + Function to print/discover available time_resolution/parameter/period_type + combinations. + + :param parameter: Observation measure + :type parameter: Parameter + :param time_resolution: Frequency/granularity of measurement interval + :type time_resolution: TimeResolution + :param period_type: Recent or historical files + :type period_type: PeriodType + + :return: JSON string of available combinations + :rtype: str """ if not time_resolution: time_resolution = [*TimeResolution] diff --git a/wetterdienst/additionals/geo_location.py b/wetterdienst/additionals/geo_location.py index 07bde30fe..f05c50888 100644 --- a/wetterdienst/additionals/geo_location.py +++ b/wetterdienst/additionals/geo_location.py @@ -38,24 +38,27 @@ def get_nearby_stations( ) -> pd.DataFrame: """ Provides a list of weather station ids for the requested data - Args: - latitude: latitude of location to search for nearest - weather station - longitude: longitude of location to search for nearest - weather station - minimal_available_date: Start date of timespan where measurements - should be available - maximal_available_date: End date of timespan where measurements - should be available - parameter: observation measure - time_resolution: frequency/granularity of measurement interval - period_type: recent or historical files - num_stations_nearby: Number of stations that should be nearby - max_distance_in_km: alternative filtering criteria, maximum - distance to location in km - Returns: - DataFrames with valid Stations in radius per requested location + :param latitude: Latitude of location to search for nearest + weather station + :param longitude: Longitude of location to search for nearest + weather station + :param minimal_available_date: Start date of timespan where 
measurements + should be available + :param maximal_available_date: End date of timespan where measurements + should be available + :param parameter: Observation measure + :type parameter: Parameter + :param time_resolution: Frequency/granularity of measurement interval + :type time_resolution: TimeResolution + :param period_type: Recent or historical files + :type period_type: PeriodType + :param num_stations_nearby: Number of stations that should be nearby + :param max_distance_in_km: Alternative filtering criteria, maximum + distance to location in km + + :return: DataFrames with valid stations in radius per requested location + :rtype: pandas.DataFrame """ if num_stations_nearby and max_distance_in_km: diff --git a/wetterdienst/api.py b/wetterdienst/api.py index c6b821737..366d012c5 100644 --- a/wetterdienst/api.py +++ b/wetterdienst/api.py @@ -57,23 +57,27 @@ def __init__( Special handling for period type. If start_date/end_date are given all period types are considered and merged together and the data is filtered for the given dates afterwards. 
- Args: - station_ids: definition of stations by str, int or list of str/int, - will be parsed to list of int - parameter: str or parameter enumeration defining the requested parameter - time_resolution: str or time resolution enumeration defining the requested - time resolution - period_type: str or period type enumeration defining the requested - period type - start_date: replacement for period type to define exact time of - requested data - end_date: replacement for period type to define exact time of requested data - prefer_local: definition if data should rather be taken from a local source - write_file: should data be written to a local file - folder: place where file lists (and station data) are stored - tidy_data: reshape DataFrame to a more tidy, row based version of data - humanize_column_names: replace column names by more meaningful ones - create_new_file_index: definition if the file index should be recreated + + :param station_ids: definition of stations by str, int or list of str/int, + will be parsed to list of int + :param parameter: Observation measure + :type parameter: Union[Parameter, str] + :param time_resolution: Frequency/granularity of measurement interval + :type time_resolution: Union[TimeResolution, str] + :param period_type: Recent or historical files + :type period_type: Union[PeriodType, str] + :param start_date: Replacement for period type to define exact time + of requested data + :param end_date: Replacement for period type to define exact time + of requested data + :param prefer_local: Definition if data should rather be taken from a + local source + :param write_file: Should data be written to a local file + :param folder: Place where file lists (and station data) are stored + :param tidy_data: Reshape DataFrame to a more tidy + and row-based version of data + :param humanize_column_names: Replace column names by more meaningful ones + :param create_new_file_index: Definition if the file index should be recreated """ if not 
(period_type or start_date or end_date): @@ -171,13 +175,9 @@ def collect_data(self) -> Generator[pd.DataFrame, None, None]: Method to collect data for a defined request. The function is build as generator in order to not cloak the memory thus if the user wants the data as one pandas DataFrame the generator has to be casted to a DataFrame manually via - pd.concat(list(request.collect_data([...])). - - Args: - same as init + pd.concat(list(request.collect_data())). - Returns: - via a generator per station a pandas.DataFrame + :return: A generator yielding a pandas.DataFrame per station. """ if self.create_new_file_index: reset_file_index_cache() @@ -259,16 +259,18 @@ def __init__( ) -> None: """ - Args: - time_resolution: time resolution enumeration, either hourly or daily - date_times: list of datetimes for which RADOLAN is requested, minutes have - to be defined (HOUR:50), otherwise rounded to 50 minutes as of its provision - start_date: alternative to datetimes, giving a start and end date - end_date: alternative to datetimes, giving a start and end date - prefer_local: boolean if RADOLAN should rather be loaded from disk, for - processing purposes - write_file: boolean if file should be stored on drive - folder: folder where to store RADOLAN data + :param time_resolution: Time resolution enumeration, either hourly or daily + :param date_times: List of datetimes for which RADOLAN is requested. + Minutes have to be defined (HOUR:50), otherwise rounded + to 50 minutes as of its provision. 
+        :param start_date: Alternative to datetimes, giving a start and end date
+        :param end_date: Alternative to datetimes, giving a start and end date
+        :param prefer_local: RADOLAN should rather be loaded from disk, for
+            processing purposes
+        :type prefer_local: bool
+        :param write_file: File should be stored on drive
+        :type write_file: bool
+        :param folder: Folder where to store RADOLAN data
         """
         time_resolution = parse_enumeration_from_template(
             time_resolution, TimeResolution
@@ -323,8 +325,7 @@ def collect_data(self) -> Generator[Tuple[datetime, BytesIO], None, None]:
         """
         Function used to get the data for the request returned as generator.
 
-        Returns:
-            for each datetime the same datetime and file in bytes
+        :return: For each datetime, the same datetime and file in bytes
         """
         for date_time in self.date_times:
             _, file_in_bytes = collect_radolan_data(
diff --git a/wetterdienst/data_collection.py b/wetterdienst/data_collection.py
index fe8c22dc6..40b61c333 100644
--- a/wetterdienst/data_collection.py
+++ b/wetterdienst/data_collection.py
@@ -74,25 +74,34 @@ def collect_climate_observations_data(
     station id and, given by the parameters, either tries to get data from local store
     and/or if fails tries to get data from the internet. Finally if wanted it will
     try to store the data in a hdf file.
-    Args:
-        station_ids: station ids that are trying to be loaded
-        parameter: parameter as enumeration
-        time_resolution: time resolution as enumeration
-        period_type: period type as enumeration
-        folder: folder for local file interaction
-        prefer_local: boolean for if local data should be preferred
-        write_file: boolean to write data to local storage
-        tidy_data: boolean to tidy up data so that there's only one set of values for
-        a datetime in a row
-        e.g. station_id, parameter, element, datetime, value, quality
-        humanize_column_names: boolean to yield column names better for
-        human consumption
-        run_download_only: boolean to run only the download and storing process
-        create_new_file_index: boolean if to create a new file index for the
-        data selection
-
-    Returns:
-        a pandas DataFrame with all the data given by the station ids
+
+    :param station_ids: station ids that are trying to be loaded
+    :type station_ids: List[int]
+    :param parameter: Parameter as enumeration
+    :type parameter: Parameter
+    :param time_resolution: Time resolution as enumeration
+    :type time_resolution: TimeResolution
+    :param period_type: Period type as enumeration
+    :type period_type: PeriodType
+    :param folder: Folder for local file interaction
+    :type folder: str
+    :param prefer_local: Local data should be preferred
+    :type prefer_local: bool
+    :param write_file: Write data to local storage
+    :type write_file: bool
+    :param tidy_data: Tidy up data so that there's only one set of values
+        for a datetime in a row, e.g. station_id, parameter,
+        element, datetime, value, quality.
+    :type tidy_data: bool
+    :param humanize_column_names: Yield column names for human consumption
+    :type humanize_column_names: bool
+    :param run_download_only: Run only the download and storing process
+    :type run_download_only: bool
+    :param create_new_file_index: Create a new file index for the data selection
+    :type create_new_file_index: bool
+
+    :return: All the data given by the station ids.
+    :rtype: pandas.DataFrame
     """
     parameter = parse_enumeration_from_template(parameter, Parameter)
     time_resolution = parse_enumeration_from_template(time_resolution, TimeResolution)
@@ -185,13 +194,11 @@ def _tidy_up_data(df: pd.DataFrame, parameter: Parameter) -> pd.DataFrame:
     Function to create a tidy DataFrame by reshaping it, putting quality in a separate
     column and setting an extra column with the parameter.
 
-    Args:
-        df: DataFrame to be tidied
-        parameter: the parameter that is written in a column to identify a set of
-        different parameters amongst each other
+    :param df: DataFrame to be tidied
+    :param parameter: the parameter that is written in a column to identify a set of
+        different parameters amongst each other
 
-    Returns:
-        the tidied DataFrame
+    :return: The tidied DataFrame
     """
     id_vars = []
     date_vars = []
@@ -250,15 +257,20 @@ def collect_radolan_data(
     """
     Function used to collect RADOLAN data for given datetimes and a time resolution.
     Additionally the file can be written to a local folder and read from there as well.
-    Args:
-        date_times: list of datetime objects for which RADOLAN shall be acquired
-        time_resolution: the time resolution for requested data, either hourly or daily
-        prefer_local: boolean if file should be read from local store instead
-        write_file: boolean if file should be stored on the drive
-        folder: path for storage
-
-    Returns:
-        list of tuples of a datetime and the corresponding file in bytes
+
+    :param date_times: List of datetime objects for which RADOLAN shall be acquired
+    :type date_times: List[datetime]
+    :param time_resolution: Time resolution for requested data, either hourly or daily
+    :type time_resolution: TimeResolution
+    :param prefer_local: File should be read from local store instead
+    :type prefer_local: bool
+    :param write_file: File should be stored on the drive
+    :type write_file: bool
+    :param folder: Path for storage
+    :type folder: str
+
+    :return: List of tuples of a datetime and the corresponding file in bytes
+    :rtype: List[Tuple[datetime, BytesIO]]
     """
     if time_resolution not in (TimeResolution.HOURLY, TimeResolution.DAILY):
         raise ValueError("RADOLAN is only offered in hourly and daily resolution.")
diff --git a/wetterdienst/data_models/coordinates.py b/wetterdienst/data_models/coordinates.py
index e0a805e8e..d6d2ab703 100644
--- a/wetterdienst/data_models/coordinates.py
+++ b/wetterdienst/data_models/coordinates.py
@@ -1,4 +1,3 @@
-"""Class for storing and retrieving coordinates"""
 import numpy as np
diff --git a/wetterdienst/data_storing.py b/wetterdienst/data_storing.py
index d4f6422bf..3301039cc 100644
--- a/wetterdienst/data_storing.py
+++ b/wetterdienst/data_storing.py
@@ -28,15 +28,20 @@ def store_climate_observations(
     DataFrame plus additionally the request parameters to identify data within the
     hdf file and another folder argument for the place where the file is stored.
 
-    Args:
-        station_data: the pandas DataFrame with the obtained data
-        station_id: the id of the station of which the data is stored in the DataFrame
-        parameter: the parameter enumeration
-        time_resolution: the time resolution enumeration
-        period_type: the period type as enumeration
-        folder: the folder where the hdf is stored
-    Returns:
-        None, prints information if data was not stored
+    :param station_data: The pandas DataFrame with the obtained data
+    :type station_data: pandas.DataFrame
+    :param station_id: The station id of the station to store
+    :type station_id: int
+    :param parameter: Observation measure
+    :type parameter: Parameter
+    :param time_resolution: Frequency/granularity of measurement interval
+    :type time_resolution: TimeResolution
+    :param period_type: Recent or historical files
+    :type period_type: PeriodType
+    :param folder: The folder where the hdf is stored
+    :type folder: str
+
+    :return: None, prints information if data was not stored
     """
     # Make sure that there is data that can be stored
     if station_data.empty:
@@ -64,16 +69,19 @@ def restore_climate_observations(
     Function to restore data from a local hdf file based on the place (folder) where
     the file is stored and parameters that define the request in particular.
 
-    Args:
-        station_id: the station id of which data should be restored
-        parameter: parameter as enumeration
-        time_resolution: time resolution as enumeration
-        period_type: period type as enumeration
-        folder: folder where the hdf file should be found as string
-
-    Returns:
-        a DataFrame holding the data or an empty DataFrame depending on if data
-        could be restored
+    :param station_id: Station id of which data should be restored
+    :type station_id: int
+    :param parameter: Observation measure
+    :type parameter: Parameter
+    :param time_resolution: Frequency/granularity of measurement interval
+    :type time_resolution: TimeResolution
+    :param period_type: Recent or historical files
+    :type period_type: PeriodType
+    :param folder: The folder where the hdf is stored
+    :type folder: str
+
+    :return: All the data
+    :rtype: pandas.DataFrame
     """
     request_string = _build_local_store_key(
         station_id, parameter, time_resolution, period_type
@@ -103,14 +111,17 @@ def _build_local_store_key(
     Function that builds a request string from defined parameters including a single
     station id
 
-    Args:
-        station_id: station id of data
-        parameter: parameter as enumeration
-        time_resolution: time resolution as enumeration
-        period_type: period type as enumeration
-
-    Returns:
-        a string building a key that is used to identify the request
+    :param station_id: Station id of data
+    :type station_id: int
+    :param parameter: Observation measure
+    :type parameter: Parameter
+    :param time_resolution: Frequency/granularity of measurement interval
+    :type time_resolution: TimeResolution
+    :param period_type: Recent or historical files
+    :type period_type: PeriodType
+
+    :return: A string building a key that is used to identify the request
+    :rtype: str
     """
     request_string = (
         f"{parameter.value}/{time_resolution.value}/"
diff --git a/wetterdienst/download/download.py b/wetterdienst/download/download.py
index 7f3635582..042fd6a61 100644
--- a/wetterdienst/download/download.py
+++ b/wetterdienst/download/download.py
@@ -1,4 +1,6 @@
-""" download scripts """
+"""
+**Download utilities**
+"""
 import gzip
 import tarfile
 from typing import List, Union, Tuple
diff --git a/wetterdienst/download/download_services.py b/wetterdienst/download/download_services.py
index e79adf2bb..acb76f70c 100644
--- a/wetterdienst/download/download_services.py
+++ b/wetterdienst/download/download_services.py
@@ -1,4 +1,6 @@
-""" helping functions for downloading german weather service data """
+"""
+**DWD download utilities**
+"""
 from io import BytesIO
 from pathlib import PurePosixPath
 from typing import Union
diff --git a/wetterdienst/enumerations/column_names_enumeration.py b/wetterdienst/enumerations/column_names_enumeration.py
index f1e89925c..31e6fcebb 100644
--- a/wetterdienst/enumerations/column_names_enumeration.py
+++ b/wetterdienst/enumerations/column_names_enumeration.py
@@ -60,10 +60,10 @@ class _DWDDataColumnBase(metaclass=_GetAttrMeta):
 class DWDOrigDataColumns(_DWDDataColumnBase):
     """
     Original data column names from DWD data
-    two anomalies:
-    daily/kl -> QN_3, QN_4
-    monthly/kl -> QN_4, QN_6
-    annual/kl -> QN_4, QN_6
+    Two anomalies:
+    - daily/kl -> QN_3, QN_4
+    - monthly/kl -> QN_4, QN_6
+    - annual/kl -> QN_4, QN_6
     """
 
     # 1_minute
@@ -415,10 +415,11 @@ class WEATHER_PHENOMENA(Enum):  # noqa
 class DWDDataColumns(_DWDDataColumnBase):
     """
     Original data column names from DWD data
-    two anomalies:
-    daily/kl -> QN_3, QN_4
-    monthly/kl -> QN_4, QN_6
-    annual/kl -> QN_4, QN_6
+
+    Two anomalies:
+    - daily/kl -> QN_3, QN_4
+    - monthly/kl -> QN_4, QN_6
+    - annual/kl -> QN_4, QN_6
     """
 
     # 1_minute
diff --git a/wetterdienst/parse_metadata.py b/wetterdienst/parse_metadata.py
index af259f135..ec571d1df 100644
--- a/wetterdienst/parse_metadata.py
+++ b/wetterdienst/parse_metadata.py
@@ -26,22 +26,27 @@ def metadata_for_climate_observations(
 ) -> pd.DataFrame:
     """
     A main function to retrieve metadata for a set of parameters that creates a
-    corresponding csv.
+    corresponding csv.
     STATE information is added to metadata for cases where there's no such named column
     (e.g. STATE) in the pandas.DataFrame. For this purpose we use daily precipitation
     data. That has two reasons:
-    daily precipitation data has a STATE information combined with a city
-    daily precipitation data is the most common data served by the DWD
-    Args:
-        parameter: observation measure
-        time_resolution: frequency/granularity of measurement interval
-        period_type: recent or historical files
-        create_new_meta_index: if true: a new meta index for metadata will
-        be created
-        create_new_file_index: if true: a new file index for metadata will
-        be created
-    Returns:
-        pandas.DataFrame with metadata for selected parameters
+
+    - daily precipitation data has a STATE information combined with a city
+    - daily precipitation data is the most common data served by the DWD
+
+    :param parameter: Observation measure
+    :type parameter: Parameter
+    :param time_resolution: Frequency/granularity of measurement interval
+    :type time_resolution: TimeResolution
+    :param period_type: Recent or historical files
+    :type period_type: PeriodType
+    :param create_new_meta_index: Create a new meta index for metadata
+    :type create_new_meta_index: bool
+    :param create_new_file_index: Create a new file index
+    :type create_new_file_index: bool
+
+    :return: List of stations for selected parameters
+    :rtype: pandas.DataFrame
     """
     if create_new_meta_index:
         reset_meta_index_cache()