Release v2.3.0 · glamod/cdm_reader_mapper

Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer) and Jan Marius Willruth (@JanWillruth)

New features and enhancements

mdf_reader.read_data now supports chunking (PR/360)
read and write both parquet and feather files including new parameter data_format (GH/353, PR/363):
- mdf_reader.read_data
- mdf_reader.write_data
- cdm_mapper.read_tables
- cdm_mapper.write_tables
introduce ParquetStreamReader to replace pd.io.parsers.TextFileReader (GH/8, PR/348)
cdm_mapper.map_model now supports both pd.DataFrame and ParquetStreamReader as output (PR/348)
common.replace_columns now supports both pd.DataFrame and ParquetStreamReader as output (PR/348)
cdm_mapper.utils.mapping_functions: new mapping function convert_to_decimal (PR/370)
test_data: add MAROB test data (PR/370)
mdf_reader.read_data: new parameter "delimiter" (PR/370)
cdm_mapper.map_model's output now has attribute "attrs" where columns are stored (PR/379)
ParquetStreamReader now supports item assignment (PR/383)
ParquetStreamReader now accepts both list and tuple as input data (PR/383)
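
The chunking and "delimiter" options above can be pictured with a small standard-library sketch. It is not the package's implementation: read_in_chunks, its parameters, and the sample data are hypothetical names used only to show how a reader might hand back a fixed number of rows at a time instead of the whole file.

```python
import csv
import io
from itertools import islice


def read_in_chunks(text, chunksize, delimiter=","):
    """Yield lists of parsed rows, at most `chunksize` rows per list.

    Hypothetical helper for illustration; not part of cdm_reader_mapper.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    while True:
        # islice pulls the next `chunksize` rows without reading the rest.
        chunk = list(islice(reader, chunksize))
        if not chunk:
            break
        yield chunk


# Five pipe-delimited records, read two at a time.
raw = "a|1\nb|2\nc|3\nd|4\ne|5\n"
chunks = list(read_in_chunks(raw, chunksize=2, delimiter="|"))

for number, chunk in enumerate(chunks):
    print(number, chunk)
```

Only one chunk is held in memory at a time while iterating, which is the usual motivation for chunked reading of large files.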

Breaking changes

DataBundle.stack_v and DataBundle.stack_h only support pd.DataFrame objects as input; any other input raises a ValueError (PR/360)
set default for extension from psv to specified data_format (PR/363):
- cdm_mapper.read_tables
- cdm_mapper.write_tables
set default for extension from csv to specified data_format in mdf_reader.write_data (PR/363)
mdf_reader.read_data: save dtypes in the returned DataBundle as pd.Series, not dict (PR/363)
remove common.pandas_TextParser_hdlr (GH/8, PR/348)
cdm_reader_mapper now raises errors instead of logging them (PR/348)
DataBundle now converts all iterables of pd.DataFrame/pd.Series to ParquetStreamReader when initialized (PR/348)
all main functions in common.select now return a tuple of 4 (selected values, rejected values, original indexes of selected values, original indexes of rejected values) (PR/348)
move ParquetStreamReader and all corresponding methods to common.iterables to handle chunking outside of mdf_reader/cdm_mapper/core/metmetpy (GH/349, PR/348)
cdm_mapper.read_tables: if "suffix" is None, no suffix is selected instead of the wildcard "*" (PR/379)
ParquetStreamReader.empty is now a property, not a class method (PR/379)
cdm_mapper.utils.mapping_functions.string_add no longer has the parameters zfill_col and zfill (PR/383)
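
The new 4-tuple return shape of the common.select functions can be illustrated with plain Python lists. This is a minimal sketch under assumptions: select_where, its arguments, and the sample values are hypothetical, and the real functions operate on pandas objects rather than lists.

```python
def select_where(values, predicate):
    """Split `values` by `predicate` and keep the original positions.

    Hypothetical stand-in that mirrors the documented return shape:
    (selected values, rejected values,
     original indexes of selected, original indexes of rejected).
    """
    selected, rejected = [], []
    selected_idx, rejected_idx = [], []
    for idx, value in enumerate(values):
        if predicate(value):
            selected.append(value)
            selected_idx.append(idx)
        else:
            rejected.append(value)
            rejected_idx.append(idx)
    return selected, rejected, selected_idx, rejected_idx


# -99.0 plays the role of a missing-value marker in this example.
temperatures = [12.5, -99.0, 14.1, -99.0, 13.3]
selected, rejected, selected_idx, rejected_idx = select_where(
    temperatures, lambda value: value != -99.0
)
print(selected, selected_idx)
print(rejected, rejected_idx)
```

Code that previously unpacked fewer return values from these functions needs to be updated to unpack all four.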

Bug fixes

replace "ICOADS-30-" with "ICOADS-300-" in icoads_r300 mapping tables (GH/385, PR/386)

Internal changes

re-work internal structure for more readability and better performance (PR/360)
use pre-defined Literal constants in cdm_reader_mapper.properties (PR/363)
mdf_reader.utils.utilities.read_csv: rename parameter columns to column_names (PR/363)
introduce post-processing decorator that handles both pd.DataFrame and ParquetStreamReader (PR/348)
cdm_mapper.mapper._map_data_model now returns a tuple of DataFrame and columns (PR/379)
delete unused function cdm_mapper.utils.mapping_functions.marob_location_quality (PR/383)
delete unreachable code snippets (PR/383)
increase test coverage (GH/365, PR/383)
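
The idea behind a post-processing decorator that serves both a single table and a stream of chunks can be sketched as follows. This is an assumption-laden illustration, not the package's decorator: handles_chunks and to_kelvin are hypothetical names, a plain list stands in for pd.DataFrame, and a Python iterator stands in for ParquetStreamReader.

```python
import functools
from collections.abc import Iterator


def handles_chunks(func):
    """Let `func` accept one table or an iterator of tables.

    Hypothetical decorator for illustration only.
    """
    @functools.wraps(func)
    def wrapper(data, *args, **kwargs):
        if isinstance(data, Iterator):
            # Stream case: apply func lazily, one chunk at a time.
            return (func(chunk, *args, **kwargs) for chunk in data)
        # Single-table case: apply func directly.
        return func(data, *args, **kwargs)

    return wrapper


@handles_chunks
def to_kelvin(values, offset=273.15):
    """Convert a list of Celsius values to Kelvin, rounded to 2 decimals."""
    return [round(value + offset, 2) for value in values]


single = to_kelvin([0.0, 10.0])
streamed = list(to_kelvin(iter([[0.0], [10.0, 20.0]])))
print(single)
print(streamed)
```

Writing the dispatch once in a decorator keeps each processing function free of its own branching on the input type.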