Problem
add_columns had two gaps compared with the main write path when working with Blob v2 datasets:
- It could not resolve Blob v2 external URIs during the update/write path.
- It could leave orphaned data files and Blob v2 sidecars behind when the operation failed after partial writes.
This made add_columns inconsistent with the expected Blob v2 behavior and could leave storage artifacts behind on failure.
Blob v2 external URI resolution
add_columns now opens its update writer with the same Blob v2 external base resolution needed by the normal write path, so Blob v2 reference values can resolve dataset-registered external URIs correctly.
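The idea of giving the update writer the same external base resolution as the normal write path can be sketched as below. This is a minimal illustration, not Lance code: `ExternalBaseResolver`, `UpdateWriter`, `encode_blob_reference`, the integer base ids, and the longest-prefix matching rule are all hypothetical names and assumptions introduced for this example.

```python
from typing import Dict, Optional, Tuple


class ExternalBaseResolver:
    """Hypothetical resolver: maps a URI to (base_id, relative_path)."""

    def __init__(self, registered_bases: Dict[int, str]) -> None:
        # base_id -> base URI registered on the dataset (assumed shape)
        self._bases = dict(registered_bases)

    def resolve(self, uri: str) -> Optional[Tuple[int, str]]:
        # Assumption: the longest matching registered base wins.
        best: Optional[Tuple[int, str]] = None
        best_len = -1
        for base_id, base_uri in self._bases.items():
            prefix = base_uri.rstrip("/") + "/"
            if uri.startswith(prefix) and len(prefix) > best_len:
                best = (base_id, uri[len(prefix):])
                best_len = len(prefix)
        return best


class UpdateWriter:
    """Hypothetical writer; only resolves references if given a resolver."""

    def __init__(self, resolver: Optional[ExternalBaseResolver] = None) -> None:
        self._resolver = resolver

    def encode_blob_reference(self, uri: str) -> Tuple[int, str]:
        if self._resolver is None:
            # Models the old gap: a writer opened without base resolution.
            raise ValueError("writer has no external base resolver")
        resolved = self._resolver.resolve(uri)
        if resolved is None:
            raise ValueError(f"URI is not under a registered external base: {uri}")
        return resolved
```

In this model, a writer constructed without a resolver fails on any external reference, which mirrors the gap described above; passing the same resolver to both write paths gives them the same resolution behavior.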
Failed write cleanup
add_columns now cleans up files created by the current failed attempt, including:
- unfinished data files from the current writer
- completed but uncommitted fragment outputs
- Blob v2 sidecar directories created for those files
The cleanup logic also preserves the intended safety boundaries:
- external-base files are never deleted
- fragments already recovered from, or owned by, checkpoint state are never deleted
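The cleanup rules and safety boundaries above can be sketched roughly as follows. This is a simplified model on a local filesystem, not the actual implementation: `cleanup_failed_attempt` and its parameters are hypothetical, "external-base" is approximated as "outside the dataset root", and the sidecar layout (a directory named after the data file without its extension) is an assumption made for illustration.

```python
import shutil
from pathlib import Path
from typing import Iterable, List, Set


def cleanup_failed_attempt(
    dataset_root: Path,
    created_files: Iterable[Path],
    checkpoint_owned: Set[Path],
) -> List[Path]:
    """Remove files created by the failed attempt and return what was removed."""
    root = dataset_root.resolve()
    owned = {p.resolve() for p in checkpoint_owned}
    removed: List[Path] = []
    for path in created_files:
        target = path.resolve()
        # Boundary 1: never touch files outside the dataset root
        # (stand-in for "external-base files").
        if root not in target.parents:
            continue
        # Boundary 2: never touch fragments owned by checkpoint state.
        if target in owned:
            continue
        if target.is_file():
            target.unlink()
            removed.append(target)
        # Assumed layout: sidecar directory sits next to the data file.
        sidecar = target.with_suffix("")
        if sidecar.is_dir():
            shutil.rmtree(sidecar)
            removed.append(sidecar)
    return removed
```

Tracking only the paths created by the current attempt, rather than scanning the data directory, is what keeps the cleanup from deleting files that belong to other commits or to the user.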
Scope
This issue covers the add_columns path and its Blob v2 / cleanup behavior.
Follow-up
alter_columns is not included in this change set.
alter_columns now shares some of the same lower-level machinery, but its commit-failure cleanup path should be handled in a separate follow-up PR to keep scope focused and reviewable.
Notes
This work is intended to keep add_columns behavior aligned with Lance Blob v2 design expectations:
- consistent URI resolution behavior across write paths
- no orphaned internal files on failed operations
- no accidental deletion of external user-managed data