v0.5.5 -- Introduced PYON, indirect_col, __contains__, T_lor in indexing.

@raylutz raylutz released this 27 Dec 22:44

Largest improvements:

  • Introduction of the PYON format.

  • Added indirect_col, with support in apply and reduce for handling embedded PYON.

  • Added indexing with range and T_lor (list of range) types, for both column and row indexing.

  • Added __contains__ method to allow " if key in my_daf: " to test whether a given key exists.

  • Added daf_sql.py, mainly for testing at this point; it will be the basis for extension to sqlite backing with the same syntax.
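
To illustrate the indexing and membership items above, here is a minimal sketch in plain Python. It does not use the Daf class: TinyTable, select_rows and the keycol parameter are hypothetical names invented for this example, and the assumption is only that kd is a dict mapping each row key to its row index.

```python
from typing import Any, Dict, List, Union

# "List of range", mirroring the T_lor type name used in these notes.
T_lor = List[range]


class TinyTable:
    """Hypothetical stand-in for illustration only; not the Daf class."""

    def __init__(self, rows: List[List[Any]], keycol: int = 0) -> None:
        self.rows = rows
        # kd maps each row key to its row index (assumed role of kd).
        self.kd: Dict[Any, int] = {row[keycol]: i for i, row in enumerate(rows)}

    def __contains__(self, key: Any) -> bool:
        # Enables "if key in table:"; it relies on kd having been built.
        return key in self.kd

    def select_rows(self, sel: Union[range, T_lor]) -> List[List[Any]]:
        # Accept a single range or a list of ranges and visit them in order.
        ranges = [sel] if isinstance(sel, range) else sel
        return [self.rows[i] for r in ranges for i in r]


table = TinyTable([["a", 1], ["b", 2], ["c", 3], ["d", 4], ["e", 5]])
has_c = "c" in table
has_z = "z" in table
picked = table.select_rows([range(0, 2), range(3, 5)])
```

Here picked holds the rows keyed a, b, d and e, and the membership test is a dict lookup rather than a scan of the rows.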

          v0.5.5  (2024-12-27)
          Added daf_crm.py as a demonstration.
              Added op_import()
          Added get_csv_column_names() as a refactoring in daf_utils for reading csv.
          precheck_csv_cols()
          compare_lists() -- imported from utils
          Introduce ops_daf for running operations, can also use in audit-engine.
              operation descriptions taken from docstring.
          Added 'default_type' to apply_dtypes for any cols not specified in the passed dtypes.
          Improved preprocessing of csv files when a line is commented out and contains embedded newlines.
          Improved Daf.from_lod() to use the columns in the dtypes dict, if provided, instead of relying only on the first record of the lod.
          
          Added indexing with range and T_lor (list of range) types, for both column and row indexing.
          Added __contains__ method to allow " if key in my_daf: " to test whether a given key exists. Requires that kd exists.
          Revised .sum_da() based on feedback from user group.
          Improved formatting of README.md to include tables of examples.
          Improved daf_benchmarks.py to use objsize instead of pympler to evaluate memory use.
          Corrected set_keyfield in daffodil to do nothing if daf is empty.
          Added 'sparse_rows' to reduction 'by' type using an indirect_col.
          Improved daf_sum() to support indirect_col.
          Revised apply_in_place to support by='row_klist'. Func will modify row_klist and that will modify the array.
              Changed name of keyword parameter in apply_in_place() from keylist to rowkeys to avoid confusion.
          Added astype parameter for to_list() and to_value().
          Introduced standardization around PYON instead of JSON:
              - Easier to convert, especially during serialization using csv.writer().
              - Compatible with more Python data types.
              - Still easy to convert to JSON.
          Copied function create_index_at_cursor() for sql tables in daf_benchmarks.py
          Added daf_sql.py mainly to support benchmarks at this point.
          This will be the last release before sql enhancements.
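
The PYON points above can be sketched with the standard library alone. This is an illustration under one assumption, namely that PYON text is the repr() of a Python literal and can be parsed back with ast.literal_eval(); it is not taken from the daffodil source, and the sample row is made up.

```python
import ast
import csv
import io
import json

# A row whose cells include types JSON cannot represent exactly:
# a tuple, and a dict with integer keys.
row = [7, ("x", "y"), {1: 0.5, 2: 0.25}]

# Serialize each cell as Python-literal text and let csv.writer do the quoting.
buf = io.StringIO(newline="")
csv.writer(buf).writerow([repr(cell) for cell in row])

# Read the line back and parse each cell with ast.literal_eval.
buf.seek(0)
cells = next(csv.reader(buf))
restored = [ast.literal_eval(cell) for cell in cells]

# Converting to JSON still works, but the tuple becomes a list
# and the integer keys become strings.
as_json = json.dumps(restored)
```

After the round trip, restored compares equal to row with the tuple and the integer keys intact, whereas the JSON text no longer carries those types.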