
<div style="text-align: center; line-height: 0; padding-top: 9px;">
  <img src="https://databricks.com/wp-content/uploads/2018/03/db-academy-rgb-1200px.png" alt="Databricks Learning" style="width: 600px">
</div>

# Using Clone with Delta Lake

Delta Lake provides native support for copying existing tables with **`CLONE`**. This notebook explores both deep and shallow clones. See the <a href="https://docs.databricks.com/delta/delta-utility.html#clone-a-delta-table" target="_blank">feature documentation</a> and the <a href="https://docs.databricks.com/spark/latest/spark-sql/language-manual/delta-clone.html" target="_blank">full syntax reference</a> for details.

## Learning Objectives
By the end of this lesson, you should be able to:
* Describe the behavior of deep and shallow clones
* Use deep clones to create full incremental backups of tables
* Use shallow clones to create development datasets
* Describe expected behavior after performing common database operations on source and clone tables

## Configure the Environment
The following cell will create a database and source table that we'll use in this lesson, alongside some variables we'll use to control file locations.

In [0]:
%run ../Includes/Classroom-Setup-1.4

## Look at the Production Table Details
The production table we'll be using as our source is named **`sensors_prod`**.

Use the following cell to explore the table history. Note that four transactions in total (versions 0 through 3) have been run to create and load data into this table.

In [0]:
%sql
DESCRIBE HISTORY sensors_prod

version,timestamp,userId,userName,operation,operationParameters,job,notebook,clusterId,readVersion,isolationLevel,isBlindAppend,operationMetrics,userMetadata,engineInfo
3,2022-10-26T09:53:47.000+0000,8453215174696142,odl_user_771624@databrickslabs.com,WRITE,"Map(mode -> Append, partitionBy -> [])",,List(4382774100892981),1024-143331-6vol2yy0,2.0,WriteSerializable,True,"Map(numFiles -> 4, numOutputRows -> 1000, numOutputBytes -> 20988)",,Databricks-Runtime/10.4.x-scala2.12
2,2022-10-26T09:53:46.000+0000,8453215174696142,odl_user_771624@databrickslabs.com,WRITE,"Map(mode -> Append, partitionBy -> [])",,List(4382774100892981),1024-143331-6vol2yy0,1.0,WriteSerializable,True,"Map(numFiles -> 4, numOutputRows -> 1000, numOutputBytes -> 20939)",,Databricks-Runtime/10.4.x-scala2.12
1,2022-10-26T09:53:45.000+0000,8453215174696142,odl_user_771624@databrickslabs.com,WRITE,"Map(mode -> Append, partitionBy -> [])",,List(4382774100892981),1024-143331-6vol2yy0,0.0,WriteSerializable,True,"Map(numFiles -> 4, numOutputRows -> 1000, numOutputBytes -> 20945)",,Databricks-Runtime/10.4.x-scala2.12
0,2022-10-26T09:53:43.000+0000,8453215174696142,odl_user_771624@databrickslabs.com,CREATE TABLE,"Map(isManaged -> false, description -> null, partitionBy -> [], properties -> {})",,List(4382774100892981),1024-143331-6vol2yy0,,WriteSerializable,True,Map(),,Databricks-Runtime/10.4.x-scala2.12


Explore the table description to discover the schema and additional details. Note that comments have been added to describe each data field.

In [0]:
%sql
DESCRIBE FORMATTED sensors_prod

col_name,data_type,comment
time,bigint,event timestamp in ms since epoch
device_id,bigint,"device IDs, integer only"
sensor_type,string,sensor type identifier; single upper case letter
signal_strength,double,decimal value between 0 and 1
,,
# Partitioning,,
Not partitioned,,
,,
# Detailed Table Information,,
Catalog,spark_catalog,


The helper function **`DA.check_files`** was defined to accept a table name and return the count of underlying data files (as well as list the content of the table directory).

Recall that all Delta tables comprise:
1. Data files stored in Parquet format
1. Transaction logs stored in the **`_delta_log`** directory

The table name we're interacting with in the metastore is just a pointer to these underlying assets.

In [0]:
files = DA.check_files("sensors_prod")
display(files)

path,name,size,modificationTime
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/prod/sensors/_delta_log/,_delta_log/,0,1666778028000
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/prod/sensors/part-00000-6814a526-9367-46ba-ab3c-1394a01ef0a3-c000.snappy.parquet,part-00000-6814a526-9367-46ba-ab3c-1394a01ef0a3-c000.snappy.parquet,5253,1666778027000
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/prod/sensors/part-00000-aa2f07f8-49fd-41c7-ac6c-7227dddce34b-c000.snappy.parquet,part-00000-aa2f07f8-49fd-41c7-ac6c-7227dddce34b-c000.snappy.parquet,5237,1666778025000
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/prod/sensors/part-00000-d2f635cd-c559-4c2d-8714-bbeaeed660cd-c000.snappy.parquet,part-00000-d2f635cd-c559-4c2d-8714-bbeaeed660cd-c000.snappy.parquet,5248,1666778026000
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/prod/sensors/part-00001-4a2c72a8-d250-473f-81bb-7255f39b996b-c000.snappy.parquet,part-00001-4a2c72a8-d250-473f-81bb-7255f39b996b-c000.snappy.parquet,5235,1666778025000
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/prod/sensors/part-00001-5ba27b16-fb29-44e4-aba5-17152ba07153-c000.snappy.parquet,part-00001-5ba27b16-fb29-44e4-aba5-17152ba07153-c000.snappy.parquet,5246,1666778027000
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/prod/sensors/part-00001-6729b866-055d-4957-82a7-73db112ad0c2-c000.snappy.parquet,part-00001-6729b866-055d-4957-82a7-73db112ad0c2-c000.snappy.parquet,5234,1666778026000
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/prod/sensors/part-00002-6b69e0f9-8dfc-470e-82a9-0647eaffdb62-c000.snappy.parquet,part-00002-6b69e0f9-8dfc-470e-82a9-0647eaffdb62-c000.snappy.parquet,5232,1666778025000
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/prod/sensors/part-00002-a18d30ac-14b5-488c-aa6c-771bc1849a29-c000.snappy.parquet,part-00002-a18d30ac-14b5-488c-aa6c-771bc1849a29-c000.snappy.parquet,5229,1666778026000
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/prod/sensors/part-00002-cca0dff5-7933-4db0-809b-b1d7570a6134-c000.snappy.parquet,part-00002-cca0dff5-7933-4db0-809b-b1d7570a6134-c000.snappy.parquet,5245,1666778027000
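
As a rough illustration of the two kinds of assets described above, the sketch below separates a directory listing into Parquet data files and the transaction log directory. The function name `split_table_listing` is hypothetical and not part of the course helpers; the sample names are taken from the listing above.

```python
from typing import Iterable, List, Tuple


def split_table_listing(names: Iterable[str]) -> Tuple[List[str], bool]:
    """Split a table directory listing into Parquet data files and a log flag."""
    data_files: List[str] = []
    has_log = False
    for name in names:
        if name.rstrip("/") == "_delta_log":
            has_log = True  # the transaction log directory
        elif name.endswith(".parquet"):
            data_files.append(name)  # a Parquet data file
    return data_files, has_log


# Three of the entries shown in the listing above.
listing = [
    "_delta_log/",
    "part-00000-6814a526-9367-46ba-ab3c-1394a01ef0a3-c000.snappy.parquet",
    "part-00001-4a2c72a8-d250-473f-81bb-7255f39b996b-c000.snappy.parquet",
]
data_files, has_log = split_table_listing(listing)
print(len(data_files), has_log)
```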


## Create a Backup of Your Dataset with Deep Clone

Deep clone will copy all data and metadata files from your source table to a specified location, registering it with the declared table name.

In [0]:
%sql
CREATE OR REPLACE TABLE sensors_backup 
DEEP CLONE sensors_prod
LOCATION '${da.paths.working_dir}/backup/sensors'

source_table_size,source_num_of_files,num_removed_files,num_copied_files,removed_files_size,copied_files_size
62872,12,0,12,0,62872


You'll recall that our **`sensors_prod`** table had 4 versions associated with it. The clone operation created version 0 of the cloned table. 

The **`operationParameters`** field indicates the **`sourceVersion`** that was cloned.

The **`operationMetrics`** field will provide information about the files copied during this transaction.

In [0]:
%sql
DESCRIBE HISTORY sensors_backup

version,timestamp,userId,userName,operation,operationParameters,job,notebook,clusterId,readVersion,isolationLevel,isBlindAppend,operationMetrics,userMetadata,engineInfo
0,2022-10-26T10:00:54.000+0000,8453215174696142,odl_user_771624@databrickslabs.com,CLONE,"Map(source -> dbacademy_odl_user_771624_databrickslabs_com_adewd_1_4.sensors_prod, sourceVersion -> 3, isShallow -> false)",,List(4382774100892981),1024-143331-6vol2yy0,-1,Serializable,False,"Map(removedFilesSize -> 0, numRemovedFiles -> 0, sourceTableSize -> 62872, numCopiedFiles -> 12, copiedFilesSize -> 62872, sourceNumOfFiles -> 12)",,Databricks-Runtime/10.4.x-scala2.12
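
When the history is only available as exported text, as in the row above, the two map fields can be pulled apart with a few lines of plain Python. This is a minimal sketch that assumes the flat `Map(key -> value, ...)` rendering shown here; `parse_map_text` is a hypothetical helper, and it would not handle values that themselves contain `", "`. In a live notebook these columns are already map-typed and need no parsing.

```python
from typing import Dict


def parse_map_text(text: str) -> Dict[str, str]:
    """Parse a flat 'Map(k -> v, ...)' rendering into a dict of strings."""
    inner = text.strip()
    if inner.startswith("Map(") and inner.endswith(")"):
        inner = inner[len("Map("):-1]
    parsed: Dict[str, str] = {}
    for pair in inner.split(", "):
        if " -> " in pair:
            key, value = pair.split(" -> ", 1)
            parsed[key.strip()] = value.strip()
    return parsed


# Strings copied from the history row above.
params = parse_map_text(
    "Map(source -> dbacademy_odl_user_771624_databrickslabs_com_adewd_1_4.sensors_prod, "
    "sourceVersion -> 3, isShallow -> false)"
)
metrics = parse_map_text(
    "Map(removedFilesSize -> 0, numRemovedFiles -> 0, sourceTableSize -> 62872, "
    "numCopiedFiles -> 12, copiedFilesSize -> 62872, sourceNumOfFiles -> 12)"
)
print(params["sourceVersion"], params["isShallow"])
print(int(metrics["numCopiedFiles"]))
```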


Metadata like comments will also be cloned.

In [0]:
%sql
DESCRIBE FORMATTED sensors_backup

col_name,data_type,comment
time,bigint,event timestamp in ms since epoch
device_id,bigint,"device IDs, integer only"
sensor_type,string,sensor type identifier; single upper case letter
signal_strength,double,decimal value between 0 and 1
,,
# Partitioning,,
Not partitioned,,
,,
# Detailed Table Information,,
Catalog,spark_catalog,


In [0]:
%sql
DESCRIBE EXTENDED sensors_backup

col_name,data_type,comment
time,bigint,event timestamp in ms since epoch
device_id,bigint,"device IDs, integer only"
sensor_type,string,sensor type identifier; single upper case letter
signal_strength,double,decimal value between 0 and 1
,,
# Partitioning,,
Not partitioned,,
,,
# Detailed Table Information,,
Catalog,spark_catalog,


## Incremental Cloning

If you examine the files in your backup table, you'll see that you have the same number of files as your source table. Upon closer examination, you'll note that file names and sizes have also been preserved by the clone. 

This allows Delta Lake to incrementally apply changes to the backup table.

In [0]:
files = DA.check_files("sensors_backup")
display(files)

path,name,size,modificationTime
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/backup/sensors/__tmp_path_dir/,__tmp_path_dir/,0,1666778454000
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/backup/sensors/_delta_log/,_delta_log/,0,1666778456000
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/backup/sensors/part-00000-6814a526-9367-46ba-ab3c-1394a01ef0a3-c000.snappy.parquet,part-00000-6814a526-9367-46ba-ab3c-1394a01ef0a3-c000.snappy.parquet,5253,1666778454000
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/backup/sensors/part-00000-aa2f07f8-49fd-41c7-ac6c-7227dddce34b-c000.snappy.parquet,part-00000-aa2f07f8-49fd-41c7-ac6c-7227dddce34b-c000.snappy.parquet,5237,1666778454000
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/backup/sensors/part-00000-d2f635cd-c559-4c2d-8714-bbeaeed660cd-c000.snappy.parquet,part-00000-d2f635cd-c559-4c2d-8714-bbeaeed660cd-c000.snappy.parquet,5248,1666778454000
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/backup/sensors/part-00001-4a2c72a8-d250-473f-81bb-7255f39b996b-c000.snappy.parquet,part-00001-4a2c72a8-d250-473f-81bb-7255f39b996b-c000.snappy.parquet,5235,1666778453000
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/backup/sensors/part-00001-5ba27b16-fb29-44e4-aba5-17152ba07153-c000.snappy.parquet,part-00001-5ba27b16-fb29-44e4-aba5-17152ba07153-c000.snappy.parquet,5246,1666778453000
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/backup/sensors/part-00001-6729b866-055d-4957-82a7-73db112ad0c2-c000.snappy.parquet,part-00001-6729b866-055d-4957-82a7-73db112ad0c2-c000.snappy.parquet,5234,1666778454000
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/backup/sensors/part-00002-6b69e0f9-8dfc-470e-82a9-0647eaffdb62-c000.snappy.parquet,part-00002-6b69e0f9-8dfc-470e-82a9-0647eaffdb62-c000.snappy.parquet,5232,1666778453000
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/backup/sensors/part-00002-a18d30ac-14b5-488c-aa6c-771bc1849a29-c000.snappy.parquet,part-00002-a18d30ac-14b5-488c-aa6c-771bc1849a29-c000.snappy.parquet,5229,1666778453000


To see incremental clone in action, begin by committing a transaction to the **`sensors_prod`** table. Here, we'll delete all records where **`sensor_type`** is "C".

Remember that Delta Lake manages changes at the file level, so any file containing a matching record will be rewritten.

In [0]:
%sql
DELETE FROM sensors_prod WHERE sensor_type = 'C'

num_affected_rows
733
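
The file-level behavior described above can be pictured with a toy model. This is only a sketch in plain Python, not how Delta Lake is implemented: files are modeled as lists of rows, and the names `delete_rows`, `file-a`, and `file-b` are made up for illustration. In the real system the replacement files get new names and the old files stay on storage until they are vacuumed.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]


def delete_rows(
    files: Dict[str, List[Row]], predicate: Callable[[Row], bool]
) -> Tuple[Dict[str, List[Row]], List[str]]:
    """Return the new file set and the names of files that had to be rewritten."""
    new_files: Dict[str, List[Row]] = {}
    rewritten: List[str] = []
    for name, rows in files.items():
        kept = [row for row in rows if not predicate(row)]
        if len(kept) == len(rows):
            new_files[name] = rows  # no matching record: file is left untouched
        else:
            rewritten.append(name)  # at least one match: whole file is replaced
            if kept:
                new_files[name + ".rewritten"] = kept
    return new_files, rewritten


files = {
    "file-a": [{"sensor_type": "A"}, {"sensor_type": "C"}],
    "file-b": [{"sensor_type": "B"}],
}
new_files, rewritten = delete_rows(files, lambda row: row["sensor_type"] == "C")
print(rewritten)
print(sorted(new_files))
```

Only `file-a` contains a matching record, so it is the only file replaced; `file-b` is carried over unchanged.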


When we re-execute our deep clone command, we only copy those files that were written during our most recent transaction.

In [0]:
%sql
CREATE OR REPLACE TABLE sensors_backup 
DEEP CLONE sensors_prod
LOCATION '${da.paths.working_dir}/backup/sensors'

source_table_size,source_num_of_files,num_removed_files,num_copied_files,removed_files_size,copied_files_size
38592,4,12,4,62872,38592
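
One way to read these metrics is as a set difference between the files the source currently references and the files the backup already holds. The sketch below is a conceptual model only, with made-up file names; it assumes (as the output above suggests) that the delete replaced all 12 original files with 4 new ones.

```python
from typing import Set, Tuple


def plan_incremental_clone(source: Set[str], backup: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (files to copy into the backup, files the backup no longer references)."""
    to_copy = source - backup    # present in the source, missing from the backup
    to_remove = backup - source  # held by the backup, dropped from the source
    return to_copy, to_remove


# Hypothetical names: 12 files cloned at version 3, 4 files after the DELETE.
backup_files = {f"old-{i}" for i in range(12)}
source_files = {f"new-{i}" for i in range(4)}

to_copy, to_remove = plan_incremental_clone(source_files, backup_files)
print(len(to_copy), len(to_remove))
```

The counts line up with `num_copied_files` and `num_removed_files` in the output above. If only some files had changed, the unchanged ones would appear in neither set and would not be copied again.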


We can review our history to confirm this.

In [0]:
%sql
DESCRIBE HISTORY sensors_backup

version,timestamp,userId,userName,operation,operationParameters,job,notebook,clusterId,readVersion,isolationLevel,isBlindAppend,operationMetrics,userMetadata,engineInfo
1,2022-10-26T10:07:42.000+0000,8453215174696142,odl_user_771624@databrickslabs.com,CLONE,"Map(source -> dbacademy_odl_user_771624_databrickslabs_com_adewd_1_4.sensors_prod, sourceVersion -> 4, isShallow -> false)",,List(4382774100892981),1024-143331-6vol2yy0,0,Serializable,False,"Map(removedFilesSize -> 62872, numRemovedFiles -> 12, sourceTableSize -> 38592, numCopiedFiles -> 4, copiedFilesSize -> 38592, sourceNumOfFiles -> 4)",,Databricks-Runtime/10.4.x-scala2.12
0,2022-10-26T10:00:54.000+0000,8453215174696142,odl_user_771624@databrickslabs.com,CLONE,"Map(source -> dbacademy_odl_user_771624_databrickslabs_com_adewd_1_4.sensors_prod, sourceVersion -> 3, isShallow -> false)",,List(4382774100892981),1024-143331-6vol2yy0,-1,Serializable,False,"Map(removedFilesSize -> 0, numRemovedFiles -> 0, sourceTableSize -> 62872, numCopiedFiles -> 12, copiedFilesSize -> 62872, sourceNumOfFiles -> 12)",,Databricks-Runtime/10.4.x-scala2.12


## Creating Development Datasets with Shallow Clone

Whereas deep clone copies both data and metadata, shallow clone just copies the metadata and creates a pointer to the existing data files.

Note that the cloned table will have read-only permissions on the source data files. This makes it easy to create development datasets using a production dataset without fear of table corruption.

Here, we'll also specify using version 2 of our source production table.

In [0]:
%sql
CREATE OR REPLACE TABLE sensors_dev
SHALLOW CLONE sensors_prod VERSION AS OF 2
LOCATION '${da.paths.working_dir}/dev/sensors'

source_table_size,source_num_of_files,num_removed_files,num_copied_files,removed_files_size,copied_files_size
41884,8,0,0,0,0
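
The deep and shallow clone statements used so far differ only in the clone keyword and the optional version clause. The sketch below assembles both forms from Python; `build_clone_sql` is a hypothetical helper and the location is a placeholder path. It uses plain string interpolation, so it is suitable only for trusted, hard-coded names.

```python
from typing import Optional


def build_clone_sql(
    target: str,
    source: str,
    location: str,
    shallow: bool = False,
    version: Optional[int] = None,
) -> str:
    """Assemble a CREATE OR REPLACE TABLE ... CLONE statement as text."""
    kind = "SHALLOW" if shallow else "DEEP"
    version_clause = f" VERSION AS OF {version}" if version is not None else ""
    return (
        f"CREATE OR REPLACE TABLE {target}\n"
        f"{kind} CLONE {source}{version_clause}\n"
        f"LOCATION '{location}'"
    )


# Placeholder location; the notebook uses a course-specific working directory.
dev_sql = build_clone_sql(
    "sensors_dev", "sensors_prod", "/tmp/dev/sensors", shallow=True, version=2
)
backup_sql = build_clone_sql("sensors_backup", "sensors_prod", "/tmp/backup/sensors")
print(dev_sql)
```

In a notebook, the resulting string could be passed to `spark.sql(...)`.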


When we look at the target directory, we'll note that no data files exist. 

The metadata for this table just points to those data files in the source table's data directory.

In [0]:
files = DA.check_files("sensors_dev")
display(files)

path,name,size,modificationTime
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/dev/sensors/_delta_log/,_delta_log/,0,1666778937000


## Apply Changes to Development Data
But what happens if you want to test modifications to your development table?

The code below inserts only those records from version 3 of our production table that don't have the value "C" as a **`sensor_type`**.

In [0]:
%sql
MERGE INTO sensors_dev dev
USING (SELECT * FROM sensors_prod@v3 WHERE sensor_type != 'C') prod
ON dev.device_id = prod.device_id AND dev.time = prod.time
WHEN NOT MATCHED THEN INSERT *

num_affected_rows,num_updated_rows,num_deleted_rows,num_inserted_rows
745,0,0,745


The operation is successful and new rows are inserted. If we check the contents of our table location, we'll see that data files now exist.

In [0]:
files = DA.check_files("sensors_dev")
display(files)

path,name,size,modificationTime
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/dev/sensors/_delta_log/,_delta_log/,0,1666779000000
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/dev/sensors/part-00000-dcd922e7-d5c3-49b1-9b05-ac5a0f32128a-c000.snappy.parquet,part-00000-dcd922e7-d5c3-49b1-9b05-ac5a0f32128a-c000.snappy.parquet,6920,1666778999000
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/dev/sensors/part-00001-468b41db-9cd1-402d-85ba-54add4e67d2c-c000.snappy.parquet,part-00001-468b41db-9cd1-402d-85ba-54add4e67d2c-c000.snappy.parquet,7081,1666778999000


Any changes made to a shallow cloned table will write new data files to the specified target directory, meaning that you can safely test writes, updates, and deletes without risking corruption of your original table. The Delta logs will automatically reference the correct files (from the source table and this clone directory) to materialize the current view of your dev table.
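
To picture how one table can draw on two directories, the sketch below models the set of files a shallow clone references after a write and splits it by location. The paths and the helper `files_written_by_clone` are hypothetical; this is a conceptual model, not a reading of the actual Delta log.

```python
from typing import List

SOURCE_DIR = "/prod/sensors"  # hypothetical source table directory
CLONE_DIR = "/dev/sensors"    # hypothetical shallow clone directory


def files_written_by_clone(referenced: List[str], clone_dir: str) -> List[str]:
    """Return the referenced files that live in the clone's own directory."""
    prefix = clone_dir.rstrip("/") + "/"
    return [path for path in referenced if path.startswith(prefix)]


# Files the clone references after the MERGE: two inherited, one newly written.
referenced = [
    f"{SOURCE_DIR}/part-00000-src.parquet",
    f"{SOURCE_DIR}/part-00001-src.parquet",
    f"{CLONE_DIR}/part-00000-new.parquet",
]
own = files_written_by_clone(referenced, CLONE_DIR)
borrowed = [path for path in referenced if path not in own]
print(len(own), len(borrowed))
```

Writes only ever add to `own`; the `borrowed` files remain under the source table's control.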

## File Retention and Cloned Tables

It's important to understand how cloned tables behave with file retention actions.

Recall the files in our **`sensors_prod`** table:

In [0]:
files = DA.check_files("sensors_prod")
display(files)

path,name,size,modificationTime
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/prod/sensors/_delta_log/,_delta_log/,0,1666778846000
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/prod/sensors/part-00000-0099dfcf-b01f-4914-8504-bf82af8b09c1-c000.snappy.parquet,part-00000-0099dfcf-b01f-4914-8504-bf82af8b09c1-c000.snappy.parquet,9778,1666778845000
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/prod/sensors/part-00000-6814a526-9367-46ba-ab3c-1394a01ef0a3-c000.snappy.parquet,part-00000-6814a526-9367-46ba-ab3c-1394a01ef0a3-c000.snappy.parquet,5253,1666778027000
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/prod/sensors/part-00000-aa2f07f8-49fd-41c7-ac6c-7227dddce34b-c000.snappy.parquet,part-00000-aa2f07f8-49fd-41c7-ac6c-7227dddce34b-c000.snappy.parquet,5237,1666778025000
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/prod/sensors/part-00000-d2f635cd-c559-4c2d-8714-bbeaeed660cd-c000.snappy.parquet,part-00000-d2f635cd-c559-4c2d-8714-bbeaeed660cd-c000.snappy.parquet,5248,1666778026000
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/prod/sensors/part-00001-4a2c72a8-d250-473f-81bb-7255f39b996b-c000.snappy.parquet,part-00001-4a2c72a8-d250-473f-81bb-7255f39b996b-c000.snappy.parquet,5235,1666778025000
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/prod/sensors/part-00001-5ba27b16-fb29-44e4-aba5-17152ba07153-c000.snappy.parquet,part-00001-5ba27b16-fb29-44e4-aba5-17152ba07153-c000.snappy.parquet,5246,1666778027000
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/prod/sensors/part-00001-6729b866-055d-4957-82a7-73db112ad0c2-c000.snappy.parquet,part-00001-6729b866-055d-4957-82a7-73db112ad0c2-c000.snappy.parquet,5234,1666778026000
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/prod/sensors/part-00001-da614bd9-0703-493f-89c5-204968690a43-c000.snappy.parquet,part-00001-da614bd9-0703-493f-89c5-204968690a43-c000.snappy.parquet,9668,1666778845000
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/prod/sensors/part-00002-00e5c222-f5cc-4038-9bd8-e824d6341d36-c000.snappy.parquet,part-00002-00e5c222-f5cc-4038-9bd8-e824d6341d36-c000.snappy.parquet,9635,1666778845000


Run the cell below to **`VACUUM`** your source production table (removing all files not referenced in the most recent version).

In [0]:
spark.conf.set("spark.databricks.delta.retentionDurationCheck.enabled", False)  # allow RETAIN 0 HOURS
try:
    spark.sql("VACUUM sensors_prod RETAIN 0 HOURS")
finally:
    # Restore the safety check even if VACUUM fails
    spark.conf.set("spark.databricks.delta.retentionDurationCheck.enabled", True)

We see that there are now fewer total data files associated with this table.

In [0]:
files = DA.check_files("sensors_prod")
display(files)

path,name,size,modificationTime
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/prod/sensors/_delta_log/,_delta_log/,0,1666779086000
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/prod/sensors/part-00000-0099dfcf-b01f-4914-8504-bf82af8b09c1-c000.snappy.parquet,part-00000-0099dfcf-b01f-4914-8504-bf82af8b09c1-c000.snappy.parquet,9778,1666778845000
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/prod/sensors/part-00001-da614bd9-0703-493f-89c5-204968690a43-c000.snappy.parquet,part-00001-da614bd9-0703-493f-89c5-204968690a43-c000.snappy.parquet,9668,1666778845000
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/prod/sensors/part-00002-00e5c222-f5cc-4038-9bd8-e824d6341d36-c000.snappy.parquet,part-00002-00e5c222-f5cc-4038-9bd8-e824d6341d36-c000.snappy.parquet,9635,1666778845000
dbfs:/user/odl_user_771624@databrickslabs.com/dbacademy/adewd/1.4/prod/sensors/part-00003-6f00c15a-dac4-4929-9e4a-da07fc2a05a4-c000.snappy.parquet,part-00003-6f00c15a-dac4-4929-9e4a-da07fc2a05a4-c000.snappy.parquet,9511,1666778845000


You'll recall that our **`sensors_dev`** table was initialized against version 2 of our production table. As such, it still references data files associated with that table version.

Because these data files have been removed by our vacuum operation, we should expect the following query against our shallow cloned table to fail.

Run the cell below to give it a try:

In [0]:
%sql 
SELECT * FROM sensors_dev
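
The reason for the failure can be sketched as a lookup of referenced files against what is left on storage. The model below uses made-up paths and a hypothetical helper, `missing_references`; it assumes the shallow clone still points at version 2 files that the vacuum removed from the source directory.

```python
from typing import Iterable, List, Set


def missing_references(referenced: Iterable[str], on_storage: Set[str]) -> List[str]:
    """Return the referenced files that no longer exist on storage."""
    return sorted(path for path in referenced if path not in on_storage)


# Hypothetical state after VACUUM: version 2 files are gone from the source directory.
referenced = [
    "/prod/sensors/v2-a.parquet",
    "/prod/sensors/v2-b.parquet",
    "/dev/sensors/new-a.parquet",
]
on_storage = {"/prod/sensors/v4-a.parquet", "/dev/sensors/new-a.parquet"}

missing = missing_references(referenced, on_storage)
print(missing)
```

Any non-empty result means a full scan of the clone cannot be served, even though the files the clone wrote itself are intact.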

Because deep clone created a full copy of our files and associated metadata, we still have access to our **`sensors_backup`** table. Here, we'll query the original version of this backup (which corresponds to version 3 of our source table).

In [0]:
%sql
SELECT * FROM sensors_backup@v0

time,device_id,sensor_type,signal_strength
1666778146120,41,C,0.725857295862095
1666778148373,2,B,0.2340557213476499
1666778150017,50,C,0.2327073071152596
1666778151343,68,B,0.1416979119722508
1666778149666,52,B,0.5527502492078951
1666778142355,26,B,0.6810106603945704
1666778139205,81,A,0.1619922854785235
1666778143300,18,C,0.1645352697163677
1666778152559,72,B,0.1004115987548484
1666778154022,89,A,0.0838841174337486


One of the useful features of deep cloning is the ability to set different table properties for file and log retention. This allows production tables to have optimized performance while maintaining files for auditing and regulatory compliance. 

The cell below sets the log and deleted file retention periods to 3,650 days (roughly 10 years).

In [0]:
%sql
ALTER TABLE sensors_backup
SET TBLPROPERTIES (
  delta.logRetentionDuration = '3650 days',
  delta.deletedFileRetentionDuration = '3650 days'
)

## Wrapping Up

In this notebook, we explored the basic syntax and behavior of deep and shallow clones. We saw what happens when source and clone tables change, including the ability to incrementally clone changes to keep a backup table in sync with its source. We saw that shallow clone can be used to create temporary development tables based on production data, but noted that removing source data files will lead to errors when querying the shallow clone.

Run the following cell to delete the tables and files associated with this lesson.

In [0]:
DA.cleanup()

&copy; 2022 Databricks, Inc. All rights reserved.<br/>
Apache, Apache Spark, Spark and the Spark logo are trademarks of the <a href="https://www.apache.org/">Apache Software Foundation</a>.<br/>
<br/>
<a href="https://databricks.com/privacy-policy">Privacy Policy</a> | <a href="https://databricks.com/terms-of-use">Terms of Use</a> | <a href="https://help.databricks.com/">Support</a>