uniqdiff 1.1.0: Streaming Field Diff Engine

@Hweded Hweded released this 07 May 18:19

uniqdiff 1.1.0

uniqdiff 1.1.0 is an additive engine release focused on structured diffs, streaming output, schema awareness, and production use in ETL and CI pipelines.

This release keeps the stable 1.0 comparison engine contract and extends it with new engine-level primitives for field-level diff, schema diff, sorted streaming diff, JSONL event output, and downstream UniqTools integrations.

Highlights

  • Added field-level diff by key.
  • Added schema-aware diff for columns, inferred value types, and nullability.
  • Added sorted streaming diff APIs for already sorted inputs.
  • Added production-grade uniqdiff.jsonl event stream output.
  • Added lazy JSONL/event readers and summary helpers.
  • Added explicit uniqdiff.engine facade for stable engine imports.
  • Improved mode="auto" metadata and disk strategy selection.
  • Improved JSONL/file output performance.
  • Optimized composite key token extraction and duplicate collection.
  • Added profiling and realistic cross-tool benchmark tooling.
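To make "field-level diff by key" concrete, here is a minimal standalone sketch of the idea in plain Python. It does not call uniqdiff and does not reflect its API or output format: the function name field_diff_by_key and the event shape ("key", "change", "fields") are assumptions chosen for illustration only.

```python
from typing import Any, Dict, Hashable, Iterable, List, Mapping

Row = Mapping[str, Any]


def field_diff_by_key(
    old_rows: Iterable[Row], new_rows: Iterable[Row], key: str
) -> List[Dict[str, Any]]:
    """Illustrative only: NOT uniqdiff's implementation or event format.

    Rows are matched on `key`; for matched rows, only the fields whose
    values differ are reported as (old_value, new_value) pairs.
    Assumes keys are unique within each input.
    """
    old_index: Dict[Hashable, Row] = {row[key]: row for row in old_rows}
    new_index: Dict[Hashable, Row] = {row[key]: row for row in new_rows}
    events: List[Dict[str, Any]] = []

    for k, old in old_index.items():
        if k not in new_index:
            events.append({"key": k, "change": "removed"})
            continue
        new = new_index[k]
        # Compare the union of field names so added/dropped fields show up too.
        fields = sorted(set(old) | set(new))
        changed = {
            f: (old.get(f), new.get(f)) for f in fields if old.get(f) != new.get(f)
        }
        if changed:
            events.append({"key": k, "change": "changed", "fields": changed})

    for k in new_index:
        if k not in old_index:
            events.append({"key": k, "change": "added"})
    return events


old = [{"id": 1, "name": "a", "qty": 1}, {"id": 2, "name": "b", "qty": 2}]
new = [{"id": 2, "name": "b", "qty": 5}, {"id": 3, "name": "c", "qty": 3}]
result = field_diff_by_key(old, new, key="id")
```

This version holds both inputs in memory, which is the trade-off the sorted streaming APIs listed above are meant to avoid for inputs that are already sorted.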

New Engine APIs

New documented APIs include:

  • compare_sorted_iter()
  • write_sorted_diff_file()
  • compare_fields()
  • compare_fields_sorted()
  • compare_fields_files()
  • compare_fields_files_sorted()
  • compare_file_fields()
  • compare_file_fields_sorted()
  • iter_field_diff_rows()
  • iter_field_diff_events()
  • iter_sorted_field_diff_events()
  • infer_schema()
  • compare_schema()
  • compare_file_schema()
  • iter_compare_events()
  • iter_event_rows()
  • summarize_events()
  • summarize_event_file()
  • uniqdiff.engine
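The signatures of these functions are not shown in these notes, so the sketch below does not call them. It only illustrates the general technique a sorted streaming diff relies on: a merge-style walk over two inputs already sorted by key, holding one row per side in memory. The name sorted_diff and the (change, key) tuples are hypothetical and not part of uniqdiff.

```python
from typing import Any, Hashable, Iterable, Iterator, Mapping, Tuple

Row = Mapping[str, Any]
_END = object()  # sentinel marking an exhausted input


def sorted_diff(
    old_rows: Iterable[Row], new_rows: Iterable[Row], key: str
) -> Iterator[Tuple[str, Hashable]]:
    """Illustrative merge-join diff; NOT uniqdiff's compare_sorted_iter().

    Assumes both inputs are sorted ascending by `key` and that keys are
    unique within each input. Yields ("removed" | "added" | "changed", key).
    """
    old_it, new_it = iter(old_rows), iter(new_rows)
    old = next(old_it, _END)
    new = next(new_it, _END)
    while old is not _END or new is not _END:
        if new is _END or (old is not _END and old[key] < new[key]):
            # Key exists only on the old side.
            yield ("removed", old[key])
            old = next(old_it, _END)
        elif old is _END or new[key] < old[key]:
            # Key exists only on the new side.
            yield ("added", new[key])
            new = next(new_it, _END)
        else:
            # Same key on both sides: report only if the rows differ.
            if old != new:
                yield ("changed", old[key])
            old = next(old_it, _END)
            new = next(new_it, _END)


old = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}, {"id": 4, "v": "d"}]
new = [{"id": 2, "v": "B"}, {"id": 3, "v": "c"}, {"id": 4, "v": "d"}]
changes = list(sorted_diff(old, new, key="id"))
```

Because the function is a generator, results can be written out as they are produced instead of being collected first.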

CLI Improvements

The CLI now supports:

  • --format jsonl
  • --field-diff
  • --sorted-input
  • --exclude-columns
  • --max-rows
  • --max-bytes
  • --max-output-rows
  • --max-output-bytes
  • --schema-diff
  • --schema-sample-size
  • --empty-string-not-null
  • --loose-numeric-types

Example:

uniqdiff diff old.csv new.csv \
  --key id \
  --field-diff \
  --format jsonl \
  --output diff.jsonl
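
A JSONL file such as diff.jsonl can also be consumed lazily with the Python standard library alone. The sketch below is a generic reader, not the uniqdiff event helpers listed above. The event schema is not documented in these notes, so the field name to group by is left to the caller, and the sample lines with a "type" field are made-up placeholders.

```python
import json
from collections import Counter
from typing import Any, Dict, Iterable, Iterator


def iter_jsonl(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield one parsed object per non-blank line, without loading the whole file."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def count_by(events: Iterable[Dict[str, Any]], field: str) -> Counter:
    """Count events by the value of `field`; events lacking it go under '<missing>'."""
    return Counter(str(event.get(field, "<missing>")) for event in events)


# Hypothetical sample lines: the real uniqdiff event fields may differ.
sample = [
    '{"type": "changed", "key": 2}',
    "",
    '{"type": "added", "key": 3}',
    '{"type": "changed", "key": 7}',
    '{"key": 9}',
]
counts = count_by(iter_jsonl(sample), "type")

# With a real file, pass the open file object instead of a list:
# with open("diff.jsonl", encoding="utf-8") as fh:
#     counts = count_by(iter_jsonl(fh), "type")  # "type" is an assumed field name
```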