v2.3.0
Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer) and Jan Marius Willruth (@JanWillruth)
New features and enhancements
- `mdf_reader.read_data` now supports chunking (PR/360)
- read and write both `parquet` and `feather` files, including new parameter `data_format` (GH/353, PR/363):
  - `mdf_reader.read_data`
  - `mdf_reader.write_data`
  - `cdm_mapper.read_tables`
  - `cdm_mapper.write_tables`
- introduce `ParquetStreamReader` to replace `pd.io.parsers.TextFileReader` (GH/8, PR/348)
- `cdm_reader.map_model` now supports both `pd.DataFrame` and `ParquetStreamReader` as output (PR/348)
- `common.replace_columns` now supports both `pd.DataFrame` and `ParquetStreamReader` as output (PR/348)
- `cdm_mapper.utils.mapping_functions`: new mapping function `convert_to_decimal` (PR/370)
- `test_data`: add MAROB test data (PR/370)
- `mdf_reader.read_data`: new parameter `delimiter` (PR/370)
- the output of `cdm_mapper.map_model` now has an attribute `attrs` in which the columns are stored (PR/379)
- `ParquetStreamReader` now supports item assignment (PR/383)
- `ParquetStreamReader` now works with both list and tuple as input data (PR/383)
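As a rough illustration of what chunked reading with a custom delimiter means in general, the sketch below splits delimited text into fixed-size chunks using only the Python standard library. It does not use `mdf_reader`; the function name `iter_chunks` and its parameters are hypothetical and chosen for this example only.

```python
import csv
import io
from itertools import islice


def iter_chunks(text, chunksize, delimiter=","):
    """Yield lists of at most `chunksize` parsed rows from delimited text.

    Hypothetical helper for illustration; not part of cdm_reader_mapper.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    while True:
        # Pull the next `chunksize` rows without loading the whole input.
        chunk = list(islice(reader, chunksize))
        if not chunk:
            break
        yield chunk


sample = "a|1\nb|2\nc|3\n"
chunks = list(iter_chunks(sample, chunksize=2, delimiter="|"))
```

Each chunk can be processed and discarded before the next one is read, which is the general motivation for streaming large inputs.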
Breaking changes
- `DataBundle.stack_v` and `DataBundle.stack_h` only support `pd.DataFrame` objects as input; any other input raises a `ValueError` (PR/360)
- change the default of `extension` from `psv` to the specified `data_format` (PR/363):
  - `cdm_mapper.read_tables`
  - `cdm_mapper.write_tables`
- change the default of `extension` from `csv` to the specified `data_format` in `mdf_reader.write_data` (PR/363)
- `mdf_reader.read_data`: save `dtypes` in the returned `DataBundle` as `pd.Series`, not `dict` (PR/363)
- `cdm_reader_mapper` now raises errors instead of logging them (PR/348)
- `DataBundle` now converts all iterables of `pd.DataFrame`/`pd.Series` to `ParquetStreamReader` when initialized (PR/348)
- all main functions in `common.select` now return a tuple of 4 (selected values, rejected values, original indexes of selected values, original indexes of rejected values) (PR/348)
- move `ParquetStreamReader` and all corresponding methods to `common.iterables` to handle chunking outside of `mdf_reader`/`cdm_mapper`/`core`/`metmetpy` (GH/349, PR/348)
- `cdm_mapper.read_tables`: if `suffix` is `None`, no suffix is selected instead of the wildcard `*` (PR/379)
- `ParquetStreamReader.empty` is now a property, not a class method (PR/379)
- `cdm_mapper.utils.mapping_functions.string_add` no longer has the parameters `zfill_col` and `zfill` (PR/383)
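The four-element return shape described for `common.select` can be pictured with a minimal sketch in plain Python. This is not the library's implementation: `select_where` is a hypothetical function, and plain lists stand in for the actual data structures.

```python
def select_where(rows, predicate):
    """Split `rows` by `predicate` and keep the original positions.

    Hypothetical stand-in that only mirrors the tuple-of-4 return shape:
    (selected, rejected, selected indexes, rejected indexes).
    """
    selected, rejected = [], []
    selected_idx, rejected_idx = [], []
    for idx, row in enumerate(rows):
        if predicate(row):
            selected.append(row)
            selected_idx.append(idx)
        else:
            rejected.append(row)
            rejected_idx.append(idx)
    return selected, rejected, selected_idx, rejected_idx


rows = [5, 12, 7, 30]
selected, rejected, selected_idx, rejected_idx = select_where(rows, lambda v: v > 10)
```

Calling code that previously unpacked fewer return values would need to unpack all four, or index into the tuple.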
Bug fixes
Internal changes
- re-work internal structure for more readability and better performance (PR/360)
- use pre-defined `Literal` constants in `cdm_reader_mapper.properties` (PR/363)
- `mdf_reader.utils.utilities.read_csv`: rename parameter `columns` to `column_names` (PR/363)
- introduce post-processing decorator that handles both `pd.DataFrame` and `ParquetStreamReader` (PR/348)
- `cdm_mapper.mapper._map_data_model` now returns a tuple of DataFrame and columns (PR/379)
- delete unused function `cdm_mapper.utils.mapping_functions.marob_location_quality` (PR/383)
- delete unreachable code snippets (PR/383)
- mainly increase test coverage (GH/365, PR/383)
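One way to picture a decorator that accepts either a single table or a sequence of chunks is sketched below. It is an assumption-laden illustration, not the decorator added in PR/348: `process_chunks` and `upper_columns` are hypothetical names, and a plain `dict` of columns stands in for `pd.DataFrame`.

```python
from functools import wraps


def process_chunks(func):
    """Apply `func` to one table, or lazily to each table in a list/tuple.

    Hypothetical sketch: a dict of column -> values plays the role of a
    single table; a list or tuple of such dicts plays the role of chunks.
    """
    @wraps(func)
    def wrapper(data, *args, **kwargs):
        if isinstance(data, dict):
            # Single table: process it directly.
            return func(data, *args, **kwargs)
        # Chunked input: return a generator so chunks are processed on demand.
        return (func(chunk, *args, **kwargs) for chunk in data)

    return wrapper


@process_chunks
def upper_columns(table):
    """Return a copy of `table` with upper-cased column names."""
    return {name.upper(): values for name, values in table.items()}


single = upper_columns({"lat": [1.0]})
chunked = list(upper_columns([{"lat": [1.0]}, {"lat": [2.0]}]))
```

The wrapped function is written once for a single table, and the decorator decides how to apply it, which keeps the chunk handling in one place.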