v1.0.0
What's Changed
Full Changelog: v0.19.5...v1.0.0
Summary
This is the v1.0.0 general availability release of DataComPy, a major version bump from 0.19.x. It introduces a new comparator architecture, sensitive column handling, and several quality-of-life improvements across all backends.
Breaking Changes
- Comparator subpackage (datacompy/comparator/): New strategy-pattern architecture for column comparison logic. Numeric, string, and array comparators are now backend-specific classes (PandasNumericComparator, SparkStringComparator, etc.) rather than inline logic. Custom comparators can be injected into any backend compare class.
- validate_tolerance_parameter is now a public API (renamed from _validate_tolerance_parameter).
- Fugue integration removed.
- Python minimum is now 3.12 (CI/tooling); PySpark dependency split by Python version.
- Snowflake: snowflake-snowpark-python minimum bumped to 1.37.
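
The strategy-pattern idea behind the comparator subpackage can be sketched in plain Python. This is an illustration only: ColumnComparator, NumericComparator, StringComparator, and compare_columns are hypothetical names invented for this example, not DataComPy's classes or signatures, and plain lists stand in for DataFrame columns.

```python
from dataclasses import dataclass
from typing import Protocol, Sequence


class ColumnComparator(Protocol):
    """Hypothetical strategy interface (not DataComPy's actual base class)."""

    def compare(self, left: Sequence, right: Sequence) -> list[bool]: ...


@dataclass
class NumericComparator:
    """Element-wise numeric match within an absolute tolerance."""

    abs_tol: float = 0.0

    def compare(self, left: Sequence, right: Sequence) -> list[bool]:
        return [abs(a - b) <= self.abs_tol for a, b in zip(left, right)]


@dataclass
class StringComparator:
    """Element-wise string match, optionally ignoring case."""

    ignore_case: bool = False

    def compare(self, left: Sequence, right: Sequence) -> list[bool]:
        if self.ignore_case:
            return [a.lower() == b.lower() for a, b in zip(left, right)]
        return [a == b for a, b in zip(left, right)]


def compare_columns(left, right, comparators, default):
    """Apply the injected comparator for each column, else the default one.

    left and right map column names to lists of equal length.
    """
    return {
        col: comparators.get(col, default).compare(left[col], right[col])
        for col in left
    }


left = {"amount": [1.0, 2.0], "name": ["Ann", "Bob"]}
right = {"amount": [1.05, 2.5], "name": ["ann", "Bob"]}

result = compare_columns(
    left,
    right,
    comparators={
        "amount": NumericComparator(abs_tol=0.1),
        "name": StringComparator(ignore_case=True),
    },
    default=StringComparator(),
)
print(result)
```

Keeping the comparison rule behind a small interface is what lets a caller swap in custom logic per column without touching the code that walks the columns. Consult the DataComPy documentation for the real comparator classes and how they are passed to each backend.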
New Features
- Sensitive column masking across all four backends (Pandas, Polars, Spark, Snowflake) — columns can be hidden or hashed without modifying the original DataFrames.
- Custom comparators — pass a comparator instance per column to any backend for fully custom comparison logic.
- cols_with_mismatches method: programmatic access to the set of mismatching column names.
- cache_intermediates option for SparkSQLCompare: controls intermediate DataFrame caching to tune Spark job performance.
- Pandas 3 support.
- Per-column tolerance as a dict in addition to a global float.
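
Two of the features above, per-column tolerance and sensitive column masking, can be sketched without any backend. The helpers below are hypothetical and do not reproduce DataComPy's implementation or parameter names. In particular, the "default" key in the tolerance dict, the "hide"/"hash" mode names, and the use of SHA-256 are assumptions made for this example.

```python
import hashlib


def resolve_tolerance(tolerance, column):
    """Return the tolerance for one column from a float or a dict.

    Assumption: a dict may carry a "default" entry; columns that are
    missing and have no default fall back to 0.0 (exact match).
    """
    if isinstance(tolerance, dict):
        return float(tolerance.get(column, tolerance.get("default", 0.0)))
    return float(tolerance)


def mask_columns(table, sensitive):
    """Return a masked copy of the table; the input is left untouched.

    table maps column names to lists; sensitive maps column names to
    "hide" or "hash" (hypothetical mode names).
    """
    masked = {}
    for col, values in table.items():
        mode = sensitive.get(col)
        if mode == "hide":
            masked[col] = ["***"] * len(values)
        elif mode == "hash":
            masked[col] = [
                hashlib.sha256(str(v).encode("utf-8")).hexdigest()
                for v in values
            ]
        else:
            masked[col] = list(values)
    return masked


tolerances = {"amount": 0.5, "default": 0.1}
print(resolve_tolerance(tolerances, "amount"))  # per-column value
print(resolve_tolerance(tolerances, "price"))   # falls back to "default"
print(resolve_tolerance(0.01, "price"))         # global float applies to all

table = {"ssn": ["123-45-6789"], "email": ["a@example.com"], "amount": [10]}
report_view = mask_columns(table, {"ssn": "hide", "email": "hash"})
print(report_view["ssn"], table["ssn"])
```

Hashing rather than hiding keeps equal values comparable in the masked output, while building a new mapping leaves the caller's data as it was.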
Fixes
- SparkSQLCompare: forbid case-sensitive join columns.
- Snowflake: max_diff now correctly defaults to 0 when None.
- Join columns None handling fixed.
Other
- Jinja2 template-based report rendering.
- Copyright year updated to 2026.
- CLAUDE.md added.
- Dependency ranges updated across all backends.