Is your feature request related to a problem? Please describe.
Tables in the Hive catalog can be managed or external. Data for managed tables is stored in the warehouse directory and its lifecycle is managed by the catalog, e.g. dropping the table also deletes the data files on disk. External tables store their data in a custom directory that is not managed by the catalog, e.g. dropping the table does not delete the data files on disk.
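To make the distinction concrete, here is a minimal Spark sketch; the table names and the location path are hypothetical:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().getOrCreate()

// Managed table: data lives in the warehouse directory;
// DROP TABLE also deletes the underlying files.
spark.sql("CREATE TABLE managed_tbl (id INT, name STRING) USING delta")

// External table: data lives at the given LOCATION;
// DROP TABLE removes only the catalog entry, the files remain.
spark.sql("CREATE TABLE external_tbl (id INT, name STRING) USING delta LOCATION '/data/external_tbl'")
```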
Currently SDLB only supports writing to external tables for Hive & DeltaLake. Configuring a path is mandatory when writing to such a DataObject, which is good for data protection and flexibility. But managed tables are easier to use, and in some Databricks environments they are the only officially supported option.
SDLB should support writing to managed DeltaLake tables on Databricks.
Describe the solution you'd like
Add a configuration attribute DeltaLakeTableDataObject.isExternal = true (default).
If isExternal = false, configuring a path for writing to the table should not be needed. This implies that the Spark table API (saveAsTable/insertInto) is used instead of writer....save().
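A rough sketch of what this switch could look like in Spark code; the function and its parameters are hypothetical and not existing SDLB code, only isExternal mirrors the attribute proposed above:

```scala
import org.apache.spark.sql.{DataFrame, SaveMode}

// Hypothetical illustration of the proposed behavior.
def writeDeltaTable(df: DataFrame, tableName: String, isExternal: Boolean, path: Option[String]): Unit = {
  val writer = df.write.format("delta").mode(SaveMode.Overwrite)
  if (isExternal) {
    // Current behavior: external table, path is mandatory, path-based write.
    val p = path.getOrElse(sys.error("path is mandatory for external tables"))
    writer.save(p)
  } else {
    // Proposed behavior: managed table via the Spark table API; no path needed,
    // the catalog decides where the data lives.
    writer.saveAsTable(tableName)
  }
}
```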