v0.5.0 (2024-05-23) -- Added class KeyedList; strip(), split_where(), multi_groupby(), reduce_dodaf_to_daf() and multi_groupby_reduce()

@raylutz raylutz released this 23 May 20:14
· 29 commits to main since this release
v0.5.0  (2024-05-23)
        Added split_where(self, where: Callable), which makes a single pass and splits the daf array into two:
            true_daf and false_daf.
        Added to Daffodil multi_groupby(), reduce_dodaf_to_daf() and multi_groupby_reduce()
        Added class KeyedList() to provide a new data item that functions like a dict but is stored as an hd plus a list.
            This can result in much better performance by not redistributing values into the dict structure.
            It is not yet fully integrated into Daffodil.
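
The two additions above can be illustrated with a minimal plain-Python sketch. This is not Daffodil's implementation: the names split_where_sketch and KeyedListSketch are hypothetical, and the sketch assumes that hd (the header dict) maps each column name to its position in the value list.

```python
from typing import Any, Callable, Dict, List, Tuple

Row = List[Any]


def split_where_sketch(rows: List[Row], where: Callable[[Row], bool]) -> Tuple[List[Row], List[Row]]:
    """Partition rows into (true_rows, false_rows) in a single pass."""
    true_rows: List[Row] = []
    false_rows: List[Row] = []
    for row in rows:
        # Each row is tested exactly once and appended to one of the two outputs.
        (true_rows if where(row) else false_rows).append(row)
    return true_rows, false_rows


class KeyedListSketch:
    """Hypothetical dict-like record: a shared header dict (hd) plus a list of values."""

    def __init__(self, hd: Dict[str, int], values: List[Any]) -> None:
        if len(hd) != len(values):
            raise ValueError("hd and values must have the same length")
        self.hd = hd          # assumed layout: column name -> index into values
        self.values = values  # kept by reference, not copied into a new dict

    def __getitem__(self, key: str) -> Any:
        return self.values[self.hd[key]]

    def __setitem__(self, key: str, value: Any) -> None:
        self.values[self.hd[key]] = value

    def to_dict(self) -> Dict[str, Any]:
        """Build an ordinary dict only when one is explicitly requested."""
        return {key: self.values[idx] for key, idx in self.hd.items()}
```

Because the sketch keeps a reference to the row's list rather than copying the values into a new dict, writing through the record (for example rec["qty"] = 4) updates the underlying row in place, which is the kind of saving the KeyedList entry describes.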
            
        Removed '_da' from many Daffodil method names and keyword parameters, to allow a future upgrade to KeyedList.
            select_record_da()      -> select_record()
            record_append()
            _basic_get_record_da    -> _basic_get_record
            assign_record_da()      -> assign_record()
            assign_record_da_irow   -> assign_record_irow
            update_by_keylist()
            update_record_da_irow   -> update_record_irow
        Changed test_daf accordingly.
            
        Added _build_hd() to consistently build header dict structure.
        Added to_json() and from_json() methods to allow generation of custom JSONEncoder.
        Changed nomenclature in KeyedList class from dex to hd.
        Added from_json and to_json to KeyedList class to allow custom JSONEncoder to be developed.
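
As a rough illustration of how to_json() and from_json() methods can support a custom JSONEncoder, here is a sketch built only on the standard json module. The class RecordSketch, its hd/values layout, and RecordEncoder are hypothetical stand-ins, not Daffodil's classes, and the actual method signatures in Daffodil may differ.

```python
import json
from typing import Any, Dict, List


class RecordSketch:
    """Hypothetical keyed record: header dict (hd) plus a list of values."""

    def __init__(self, hd: Dict[str, int], values: List[Any]) -> None:
        self.hd = hd
        self.values = values

    def to_json(self) -> str:
        """Serialize this record to a JSON string."""
        return json.dumps({"hd": self.hd, "values": self.values})

    @classmethod
    def from_json(cls, text: str) -> "RecordSketch":
        """Rebuild a record from the string produced by to_json()."""
        data = json.loads(text)
        return cls(data["hd"], data["values"])


class RecordEncoder(json.JSONEncoder):
    """Custom encoder that knows how to handle RecordSketch objects."""

    def default(self, obj: Any) -> Any:
        if isinstance(obj, RecordSketch):
            # default() must return a JSON-serializable object, not a string,
            # so the record's own JSON text is parsed back into a plain dict.
            return json.loads(obj.to_json())
        return super().default(obj)


rec = RecordSketch({"a": 0, "b": 1}, [1, "x"])
encoded = json.dumps({"rec": rec}, cls=RecordEncoder)
restored = RecordSketch.from_json(rec.to_json())
```

With the encoder in place, a record nested inside other structures can be passed directly to json.dumps(..., cls=RecordEncoder) without converting it by hand first.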
        
        select_record() silently returns {} if self is empty.
        
        Fixed _itermode vs. itermode inconsistency.
        Added .strip() method.
        Corrected icols handling when a single str column name is provided, and when column names have more than one character each.
        Added 'flatten' option to the '.to_list' method, which combines a list of lists (lol) into a single list.
        Added .num_rows(), which more robustly calculates the number of rows in edge cases.
        Fixed unflattening issue discovered when running edge_test_utils.py.
        Updated documentation to reflect new approach to dtypes and flattening.
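
The 'flatten' behavior noted above for '.to_list' (combining a list of lists into a single list) can be sketched in plain Python as follows. The function name flatten_lol is hypothetical and this is not Daffodil's code; it only shows the operation on an ordinary list of lists.

```python
from itertools import chain
from typing import Any, List


def flatten_lol(lol: List[List[Any]]) -> List[Any]:
    """Concatenate the rows of a list of lists into one flat list, in row order."""
    return list(chain.from_iterable(lol))


flat = flatten_lol([[1, 2], [3], []])
```

Only one level of nesting is removed, so any lists stored inside individual cells are left as they are.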