Release 2.3.0

github-actions released this 02 Jul 19:03
44c86e6

Highlights

DataJoint 2.3 builds on the 2.2.x line with a set of features centered on provenance — knowing, and being able to trust, exactly which upstream data a computed result was derived from.

The headline is the provenance trinity. Diagram.trace constructs the upstream view of any result; self.upstream makes those declared ancestors ergonomic to read inside make(); and the opt-in strict_provenance flag turns the long-standing "read only from declared dependencies, write only to self" convention into something the framework actively checks. Together they move DataJoint's core provenance promise from a convention people hope to follow to one the framework helps construct and — when enabled — enforce.
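The idea behind the three pieces can be illustrated without DataJoint at all. The sketch below is plain Python and does not use or reproduce the real Diagram.trace, self.upstream, or strict_provenance APIs, whose signatures are not shown in these notes. The PARENTS graph, trace_upstream, and check_read are hypothetical names invented for illustration: one function walks declared parents to collect the upstream view, and the other rejects reads from outside that view only when a strict flag is switched on.

```python
from collections import deque

# Hypothetical dependency graph: each table maps to the parents it declares.
# These names are illustrative only and are not part of DataJoint.
PARENTS = {
    "Session": [],
    "Recording": ["Session"],
    "Params": [],
    "Spikes": ["Recording", "Params"],
    "Unrelated": [],
}


def trace_upstream(table, parents=PARENTS):
    """Return every ancestor of `table` via a breadth-first walk over declared parents."""
    seen = set()
    queue = deque(parents[table])
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        queue.extend(parents[node])
    return seen


def check_read(computing, source, strict=False, parents=PARENTS):
    """Decide whether computing `computing` may read from `source`.

    With strict=False the rule is only a convention, so every read passes.
    With strict=True a read outside the declared ancestry raises.
    """
    allowed = trace_upstream(computing, parents)
    if strict and source not in allowed:
        raise PermissionError(f"{computing} read from undeclared table {source}")
    return True
```

In this toy model, leaving strict off mirrors the default described above: an undeclared read goes unnoticed. Turning it on makes the same read fail loudly, which is the shift from convention to enforcement that the release describes.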

Around that:

  • SparkAdapter Codec Protocol — typed codecs can expose their decoded values as Spark-native types, opening lakehouse / Delta-Sharing consumers (e.g. Databricks) to columns that were previously opaque blobs.
  • dj.deploy.set_replica_identity — a new dj.deploy module configures PostgreSQL REPLICA IDENTITY for change-data-capture pipelines.
  • Cascade fix — part_integrity="cascade" is now correct across Part-of-Part and renamed-foreign-key chains (the same upward-propagation machinery Diagram.trace builds on).
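For readers unfamiliar with REPLICA IDENTITY, the setting is applied in PostgreSQL with an ALTER TABLE statement. The sketch below only builds that SQL text. It is not the dj.deploy.set_replica_identity implementation, whose signature is not given in these notes, and replica_identity_sql and quote_ident are hypothetical helper names.

```python
# Modes accepted by PostgreSQL's ALTER TABLE ... REPLICA IDENTITY clause
# (USING INDEX is handled separately through the `index` argument).
VALID_MODES = {"DEFAULT", "FULL", "NOTHING"}


def quote_ident(name):
    """Double-quote a PostgreSQL identifier, escaping embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def replica_identity_sql(schema, table, mode="FULL", index=None):
    """Build the ALTER TABLE statement (hypothetical helper, not DataJoint API)."""
    if index is not None:
        clause = f"USING INDEX {quote_ident(index)}"
    else:
        clause = mode.upper()
        if clause not in VALID_MODES:
            raise ValueError(f"unsupported replica identity mode: {mode!r}")
    target = f"{quote_ident(schema)}.{quote_ident(table)}"
    return f"ALTER TABLE {target} REPLICA IDENTITY {clause};"


print(replica_identity_sql("lab", "session"))
```

FULL makes PostgreSQL log the entire old row for updates and deletes, which change-data-capture consumers often need when a table's primary key alone does not identify the change. It also increases write-ahead log volume, so DEFAULT or USING INDEX may be preferable where a suitable key exists.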

No breaking changes. strict_provenance defaults off and everything else is additive, so existing pipelines are unaffected. See What's New in 2.3 for the full narrative.

🚀 Features

🐛 Bug Fixes

Full Changelog: v2.2.4...v2.3.0