We are excited to announce the availability of Delta Lake 0.3.0, which introduces new programmatic APIs for manipulating and managing data in Delta Lake tables. Here are the main features:
Scala/Java APIs for DML commands - You can now modify data in Delta Lake tables using programmatic APIs for Delete (#44), Update (#43) and Merge (#42). These APIs mirror the syntax and semantics of their corresponding SQL commands and are great for many workloads, e.g., Slowly Changing Dimension (SCD) operations, merging change data for replication, and upserts from streaming queries. See the documentation for more details.
Scala/Java APIs for querying commit history (#54) - You can now query a table’s commit history to see what operations modified the table. This enables you to audit data changes, run time travel queries on specific versions, and debug and recover data after accidental deletions. See the documentation for more details.
Scala/Java APIs for vacuuming old files (#48) - Delta Lake uses multi-version concurrency control (MVCC) to enable snapshot isolation and time travel. However, keeping all versions of a table forever can be prohibitively expensive. Stale snapshots (as well as uncommitted files left behind by aborted transactions) can be garbage collected by vacuuming the table. See the documentation for more details.
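As a rough illustration of how the three API groups above fit together, here is a minimal Scala sketch. It assumes Spark and the delta-core 0.3.0 artifact are on the classpath. The table location, the `eventId`/`eventType`/`date` schema, and the sample rows are hypothetical and exist only for this example.

```scala
import java.nio.file.Files

import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, lit}

// Local session for experimentation (assumption: Spark and delta-core 0.3.0 are available).
val spark = SparkSession.builder()
  .appName("delta-0.3.0-sketch")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Hypothetical table location and schema, used only for this sketch.
val path = Files.createTempDirectory("delta-events").toString
Seq((1, "clck", "2016-12-31"), (2, "view", "2019-08-01"), (3, "clck", "2019-08-02"))
  .toDF("eventId", "eventType", "date")
  .write.format("delta").mode("overwrite").save(path)

val deltaTable = DeltaTable.forPath(spark, path)

// Delete: remove rows that match a SQL predicate.
deltaTable.delete("date < '2017-01-01'")

// Update: correct a misspelled event type using Column expressions.
deltaTable.update(col("eventType") === "clck", Map("eventType" -> lit("click")))

// Merge (upsert): update rows whose eventId matches, insert the rest.
val updates = Seq((2, "purchase", "2019-08-03"), (4, "view", "2019-08-03"))
  .toDF("eventId", "eventType", "date")

deltaTable.as("events")
  .merge(updates.as("updates"), "events.eventId = updates.eventId")
  .whenMatched
  .updateExpr(Map("eventType" -> "updates.eventType", "date" -> "updates.date"))
  .whenNotMatched
  .insertExpr(Map(
    "eventId" -> "updates.eventId",
    "eventType" -> "updates.eventType",
    "date" -> "updates.date"))
  .execute()

// Commit history: one row per commit, most recent first.
deltaTable.history().select("version", "operation").show()

// Time travel: read the table as it was at its first commit.
val firstVersion = spark.read.format("delta").option("versionAsOf", 0).load(path)

// Vacuum: garbage collect files that are no longer referenced and are older
// than the default retention threshold. On a table this new, nothing is removed.
deltaTable.vacuum()
```

The string-based variants (`delete` with a SQL predicate, `updateExpr`, `insertExpr`) and the `Column`-based variants are interchangeable here; which one reads better depends on whether the conditions are built dynamically.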
To try out Delta Lake 0.3.0, please follow the Delta Lake Quickstart.
We are delighted to announce the availability of Delta Lake 0.2.0!
To try out Delta Lake 0.2.0, please follow the Delta Lake Quickstart.
This release introduces two main features:
Cloud storage support - In addition to HDFS, you can now configure Delta Lake to read and write data on cloud storage services such as Amazon S3 (issue #39) and Azure Blob Storage (issue #40). See the documentation for configuration instructions.
Improved concurrency (issue #69) - Delta Lake now allows concurrent append-only writes while still ensuring serializability. To be considered append-only, a writer must only add new data, without reading or modifying existing data in any way. See the documentation for more details.
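The two features above might be combined roughly as follows. This is a configuration sketch, not a complete setup: it assumes the S3 connector and credentials are already configured for Spark, and the helper names `buildSession` and `appendTo` are hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Builds a session whose Delta transaction log uses the S3 LogStore implementation.
// For Azure Blob Storage, the corresponding class is
// org.apache.spark.sql.delta.storage.AzureLogStore.
def buildSession(): SparkSession =
  SparkSession.builder()
    .appName("delta-0.2.0-s3-sketch")
    .config("spark.delta.logStore.class",
      "org.apache.spark.sql.delta.storage.S3SingleDriverLogStore")
    .getOrCreate()

// Append-only write: adds new rows without reading or modifying existing data,
// which is the kind of write that may now run concurrently with other appends.
def appendTo(tablePath: String, newRows: DataFrame): Unit =
  newRows.write.format("delta").mode("append").save(tablePath)
```

A call would then look like `appendTo("s3a://<your-bucket>/<path-to-table>", df)`, where the bucket and path are placeholders to replace with your own.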
We have also greatly expanded the test coverage as part of this release.