Release v0.4.0

@JohnnyWilson16 released this 15 Jun 18:43

Release Notes — freshdata-cleaner v0.4.0 (Enterprise)

freshdata-cleaner 0.4.0 is now available. This release introduces the new Enterprise Layer, which extends our rule-based data cleaning engine to larger enterprise data-science workloads and adds data quality checking and lineage tracking.

What's New in v0.4.0

1. Enterprise Cleaning Layer (freshdata.enterprise)

  • enterprise.Cleaner: Orchestrates large-scale data cleaning pipelines across heterogeneous source types with high-performance execution.
  • Lineage Tracking (lineage.py): Automatically maps and tracks the flow and modifications of your columns and values as they pass through the cleaning steps.
  • Data Quality Metrics (metrics.py): Calculates statistical profiles, compares distributions, and exports machine-learning-ready data-quality dashboards.
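
To make the idea of column-level lineage concrete, here is a minimal, standard-library sketch of how a cleaning step can be recorded as it runs. It is not the code in lineage.py: the names LineageEvent, LineageLog, apply, and history are hypothetical and chosen only for illustration, and rows are assumed to be plain dictionaries.

```python
from dataclasses import dataclass, field
from typing import Callable

# Illustrative only: these names are NOT part of the freshdata API.
Row = dict[str, object]


@dataclass
class LineageEvent:
    step: str          # name of the cleaning step
    column: str        # column the step touched
    rows_changed: int  # how many values the step actually modified


@dataclass
class LineageLog:
    events: list[LineageEvent] = field(default_factory=list)

    def apply(self, step: str, column: str, rows: list[Row],
              fn: Callable[[object], object]) -> list[Row]:
        """Apply fn to one column and record what changed."""
        changed = 0
        cleaned = []
        for row in rows:
            new_value = fn(row[column])
            if new_value != row[column]:
                changed += 1
            # Copy the row so the input data is left untouched.
            cleaned.append({**row, column: new_value})
        self.events.append(LineageEvent(step, column, changed))
        return cleaned

    def history(self, column: str) -> list[str]:
        """Return the ordered list of steps that touched a column."""
        return [e.step for e in self.events if e.column == column]


rows = [{"name": " Ada ", "age": "36"}, {"name": "Grace", "age": "n/a"}]
log = LineageLog()
rows = log.apply("strip_whitespace", "name", rows, lambda v: v.strip())
rows = log.apply("parse_int", "age", rows,
                 lambda v: int(v) if v.isdigit() else None)
```

After these two steps, log.history("age") lists only the parse_int step, and each recorded event carries the number of values that step changed. A real lineage tracker would likely also record value-level detail, which this sketch leaves out.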

2. Command-Line Interface (CLI)

  • A new CLI tool, freshdata, is added to your PATH on installation.
  • Supported subcommands:
    • freshdata clean — Clean datasets directly from your terminal.
    • freshdata profile — Profile datasets and print formatting issues.
    • freshdata trust — Run sanity checks and verify schema configurations.

3. CI and Robustness Upgrades

  • Raised the required test coverage threshold to 93% (current coverage: 95.26% across 824 unit tests).
  • Adjusted warning filters to cleanly isolate third-party deprecations, ensuring a zero-warning bar for freshdata internal code.
  • Relaxed the tolerances of local performance benchmark baseline checks to keep test runs stable.
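
For readers curious how a zero-warning bar for first-party code can coexist with tolerated third-party deprecations, the sketch below shows one way to express it with Python's standard warnings module. It is an assumption about the general technique, not the project's actual test configuration: configure_warning_filters and emit_deprecation are hypothetical helpers, and warnings.warn_explicit is used only to simulate warnings coming from differently named modules.

```python
import warnings


def configure_warning_filters(own_package: str = "freshdata") -> None:
    """Ignore third-party deprecations, escalate first-party warnings."""
    warnings.resetwarnings()
    # Deprecations are ignored wherever they come from...
    warnings.filterwarnings("ignore", category=DeprecationWarning)
    # ...except inside our own package. filterwarnings() inserts at the
    # front of the filter list, so this later rule takes precedence.
    warnings.filterwarnings("error", module=rf"{own_package}(\.|$)")


def emit_deprecation(module: str) -> str:
    """Simulate a DeprecationWarning issued from the given module name."""
    try:
        warnings.warn_explicit(
            "old API", DeprecationWarning,
            filename=f"{module}.py", lineno=1, module=module,
        )
    except DeprecationWarning:
        return "raised"
    return "ignored"


# catch_warnings() restores the global filter list afterwards.
with warnings.catch_warnings():
    configure_warning_filters("freshdata")
    outcomes = {
        name: emit_deprecation(name)
        for name in ("freshdata.enterprise", "freshdata",
                     "some_dependency", "freshdata_extras")
    }
```

Here warnings from freshdata and freshdata.enterprise are raised as errors, while the same warning from some_dependency or from the similarly named freshdata_extras is ignored, because the module pattern only matches the package itself and its submodules. Test runners such as pytest offer equivalent filter settings in their configuration files.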

Installation

pip install freshdata-cleaner==0.4.0