All notable changes to this project will be documented in this file.
1.2.0 (2024-04-11)
1.1.1 (2024-03-14)
- statement: "Contain" report (8fd0987)
1.1.0 (2024-01-09)
- Limit output size to stdout (a41f31f)
1.0.1 (2022-10-21)
- Add libpq-dev as required package (0d508f8)
1.0.0 (2022-07-21)
- ColumnExpression report format
- Remove verbose from Contain statement in favor of report_limit. Null values are now valid
- Create attach_backend and get_backend
- Create support for multibackend resources + refactor
- Create attach_backend and get_backend (37d6e31)
- Create support for multibackend resources + refactor (8500721)
- Dask backend for NotNull stmt (b3fb691)
- Dask backend for RowCount stmt (a320201)
- Dask backend for Unique stmt (e3cdfd6)
- Implement Dask serialize in all treaters (e824d78)
- Improve ColumnExpression and implement Dask (a02e247)
- Improve Contain stmt and implement Dask (effa169)
- Standardize data readers (a42e9d3)
- Boolean treater (22b39b5)
- DataTreater treat support Iterables + force (ae84cad)
- fix tests for Dask (40cd1af)
- FutureWarning consistent across repo (60d5722)
- Missing lazy module usage in typing (0413433)
- Remove BaseStatement from statements map (d7e844c)
- Remove Python3.8-only features (0bb8dca)
- Use lru_cache without limits instead of cache (33f0a14)
- Validator behavior for Dask backend (9379f63)
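
The switch from cache to lru_cache (33f0a14) can be illustrated with the standard library alone. This is a minimal sketch, assuming the goal was to keep unbounded memoization on interpreters older than Python 3.9, where functools.cache is not available. The function load_treater is a hypothetical name used for illustration, not part of Deirokay.

```python
from functools import lru_cache

calls = []  # records every real (non-cached) invocation


@lru_cache(maxsize=None)  # unbounded cache, equivalent to functools.cache (3.9+)
def load_treater(name: str) -> str:
    """Hypothetical expensive lookup, used only for illustration."""
    calls.append(name)
    return name.upper()


load_treater("integer")
load_treater("integer")  # served from the cache, no second call is recorded
load_treater("float")
```

After these three calls, load_treater.cache_info() reports one hit and two misses, and the function body has run only once per distinct argument.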
0.8.0 (2022-06-28)
- Minor changes to exception message (0e00b38)
- Rewords list-like alias requirement exception (e526258)
0.7.2 (2022-04-07)
- Deal with pagination in S3 list_objects (ed5e10e)
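
For readers unfamiliar with why S3 listings need pagination: a single listing call returns at most one page of keys, so the caller has to follow continuation tokens until the listing is exhausted. The sketch below assumes a boto3-style client whose list_objects_v2 response carries Contents, IsTruncated and NextContinuationToken. The fix in ed5e10e refers to list_objects and may be implemented differently. iter_keys and FakeS3 are hypothetical names, and FakeS3 is only a stand-in so that the example runs without AWS access.

```python
from typing import Any, Dict, Iterator, List


def iter_keys(client: Any, bucket: str, prefix: str = "") -> Iterator[str]:
    """Yield every key under `prefix`, following continuation tokens."""
    kwargs: Dict[str, Any] = {"Bucket": bucket, "Prefix": prefix}
    while True:
        page = client.list_objects_v2(**kwargs)
        for obj in page.get("Contents", []):
            yield obj["Key"]
        if not page.get("IsTruncated"):
            break  # last page reached
        kwargs["ContinuationToken"] = page["NextContinuationToken"]


class FakeS3:
    """Stand-in client that returns a fixed number of keys per page."""

    def __init__(self, keys: List[str], page_size: int = 2) -> None:
        self.keys = keys
        self.page_size = page_size

    def list_objects_v2(
        self, Bucket: str, Prefix: str = "", ContinuationToken: str = "0"
    ) -> Dict[str, Any]:
        matching = [k for k in self.keys if k.startswith(Prefix)]
        start = int(ContinuationToken)
        end = start + self.page_size
        page: Dict[str, Any] = {
            "Contents": [{"Key": k} for k in matching[start:end]],
            "IsTruncated": end < len(matching),
        }
        if page["IsTruncated"]:
            page["NextContinuationToken"] = str(end)
        return page


# Three matching keys with a page size of two force a second request.
keys = list(iter_keys(FakeS3(["a/1", "a/2", "a/3", "b/1"]), "bucket", "a/"))
```

With a real boto3 client, the built-in paginator (client.get_paginator("list_objects_v2")) performs the same loop.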
2022-02-22
- Fix bugs when profiling entirely null columns
2022-02-22
- New ColumnExpression (column_expression) statement to evaluate an expression involving the columns from a scope.
- data_reader can now get table from SQL query or file
- Add reader_kwargs and validator_kwargs arguments in DeirokayOperator (to support feature above)
- Deprecation warning for path_to_file argument in DeirokayOperator in favor of data (to make API consistent with feature above)
2021-12-15
- New deirokay.DTypes.DECIMAL for exact decimal value representation.
- New distinct boolean parameter for row_count statement, which makes it possible to count only distinct values in a given scope.
- General improvements in templates generated by profile method of builtin statements.
- Better logs (stdout) for deirokay.validate failed statements. Now it also prints the scope the statements are applied to and the provided severity threshold level.
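
The motivation for an exact decimal type can be shown with Python's standard decimal module. This is only an illustration of the float versus decimal difference. It does not use Deirokay, and the DECIMAL treater itself may be implemented differently.

```python
from decimal import Decimal

# Binary floats cannot represent 0.1 or 0.2 exactly, so the sum drifts.
float_sum = 0.1 + 0.2
float_is_exact = float_sum == 0.3

# Decimals built from strings keep the written digits exactly.
decimal_sum = Decimal("0.1") + Decimal("0.2")
decimal_is_exact = decimal_sum == Decimal("0.3")
```

Here float_is_exact is False while decimal_is_exact is True, which is why monetary or otherwise fixed-precision columns are usually parsed as decimals.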
2021-12-09
- Fix Jinja templating for DeirokayOperator
2021-12-09
- Custom Jinja templates also in DeirokayOperator
2021-12-09
- Custom Jinja templates in deirokay.validate
- Improve exception error when there is a typo in column names
2021-12-08
- Fix for setup install
2021-12-08
- New contain statement (deirokay.statements.Contain)
- Serialization of DataFrame data into validation documents with get_dtype_treater and get_treater_instance (from deirokay.parser) and serialize methods (in Deirokay treaters)