v2.3.0

@ludwiglierhammer ludwiglierhammer released this 12 Mar 11:25
· 213 commits to main since this release
5b917e4

Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer) and Jan Marius Willruth (@JanWillruth)

New features and enhancements

  • mdf_reader.read_data now supports chunking (PR/360)

  • read and write both parquet and feather files via the new parameter data_format (GH/353, PR/363):

    • mdf_reader.read_data
    • mdf_reader.write_data
    • cdm_mapper.read_tables
    • cdm_mapper.write_tables
  • introduce ParquetStreamReader to replace pd.io.parsers.TextFileReader (GH/8, PR/348)

  • cdm_mapper.map_model now supports both pd.DataFrame and ParquetStreamReader as output (PR/348)

  • common.replace_columns now supports both pd.DataFrame and ParquetStreamReader as output (PR/348)

  • cdm_mapper.utils.mapping_functions: new mapping function convert_to_decimal (PR/370)

  • test_data: add MAROB test data (PR/370)

  • mdf_reader.read_data: new parameter "delimiter" (PR/370)

  • cdm_mapper.map_model's output now has attribute "attrs" where columns are stored (PR/379)

  • ParquetStreamReader now supports item assignment (PR/383)

  • ParquetStreamReader now works with both list and tuple as input data (PR/383)
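
For readers unfamiliar with chunked reading, the plain-pandas sketch below shows the general pattern that the chunking entries above refer to: the data arrives as an iterator of DataFrames rather than as one DataFrame. It deliberately does not call cdm_reader_mapper, so it says nothing about the signatures of mdf_reader.read_data or ParquetStreamReader; the sample data and variable names are illustrative assumptions.

```python
import io

import pandas as pd

# Illustrative sample data; not taken from cdm_reader_mapper.
csv_text = "station,temp\nA,1.5\nB,2.0\nC,3.5\nD,4.0\nE,5.5\n"

# With chunksize set, pandas returns an iterator of DataFrames
# (a TextFileReader) instead of a single DataFrame.
reader = pd.read_csv(io.StringIO(csv_text), chunksize=2)

chunk_lengths = []
total_temp = 0.0
for chunk in reader:
    # Each chunk is an ordinary DataFrame holding at most 2 rows.
    chunk_lengths.append(len(chunk))
    total_temp += chunk["temp"].sum()
```

Only one chunk is held in memory at a time, which is the usual reason to read large files this way.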

Breaking changes

  • DataBundle.stack_v and DataBundle.stack_h only support pd.DataFrame objects as input and raise a ValueError otherwise (PR/360)

  • set default for extension from psv to specified data_format (PR/363):

    • cdm_mapper.read_tables
    • cdm_mapper.write_tables
  • set default for extension from csv to specified data_format in mdf_reader.write_data (PR/363)

  • mdf_reader.read_data: dtypes in the returned DataBundle are now stored as pd.Series, not dict (PR/363)

  • remove common.pandas_TextParser_hdlr (GH/8, PR/348)

  • cdm_reader_mapper now raises errors instead of logging them (PR/348)

  • DataBundle now converts all iterables of pd.DataFrame/pd.Series to ParquetStreamReader when initialized (PR/348)

  • all main functions in common.select now return a tuple of 4 (selected values, rejected values, original indexes of selected values, original indexes of rejected values) (PR/348)

  • move ParquetStreamReader and all corresponding methods to common.iterables to handle chunking outside of mdf_reader/cdm_mapper/core/metmetpy (GH/349, PR/348)

  • cdm_mapper.read_tables: if "suffix" is None, no suffix is selected instead of the wildcard "*" (PR/379)

  • ParquetStreamReader.empty is now a property, not a class method (PR/379)

  • cdm_mapper.utils.mapping_functions.string_add no longer has the parameters zfill_col and zfill (PR/383)
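
The four-element return value of the common.select functions listed above can be pictured with a plain-pandas sketch. The helper split_by_mask is hypothetical and is not part of cdm_reader_mapper; it only mirrors the order of the tuple described in the entry (selected values, rejected values, original indexes of selected values, original indexes of rejected values). The real functions may take different arguments.

```python
import pandas as pd


def split_by_mask(df, mask):
    """Hypothetical helper: split df by a boolean mask into a 4-tuple."""
    selected = df[mask]
    rejected = df[~mask]
    # The original index labels are preserved in both parts.
    return selected, rejected, selected.index, rejected.index


# Illustrative data with a non-default index to make the labels visible.
df = pd.DataFrame({"temp": [1.5, -99.0, 3.5, -99.0]}, index=[10, 11, 12, 13])

selected, rejected, selected_idx, rejected_idx = split_by_mask(
    df, df["temp"] > -99.0
)
```

Code that previously unpacked fewer return values from these functions needs to be updated to unpack four.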

Bug fixes

  • replace "ICOADS-30-" with "ICOADS-300-" in icoads_r300 mapping tables (GH/385, PR/386)

Internal changes

  • re-work internal structure for more readability and better performance (PR/360)
  • use pre-defined Literal constants in cdm_reader_mapper.properties (PR/363)
  • mdf_reader.utils.utilities.read_csv: rename parameter columns to column_names (PR/363)
  • introduce post-processing decorator that handles both pd.DataFrame and ParquetStreamReader (PR/348)
  • cdm_mapper.mapper._map_data_model now returns a tuple of DataFrame and columns (PR/379)
  • delete unused function cdm_mapper.utils.mapping_functions.marob_location_quality (PR/383)
  • delete unreachable code snippets (PR/383)
  • mainly increase test coverage (GH/365, PR/383)