<div style="text-align: center; line-height: 0; padding-top: 9px;">
  <img src="https://databricks.com/wp-content/uploads/2018/03/db-academy-rgb-1200px.png" alt="Databricks Learning" style="width: 600px; height: 163px">
</div>

# 3.7 Managed and Unmanaged Tables

## ![Spark Logo Tiny](https://files.training.databricks.com/images/105/logo_spark_tiny.png) In this notebook you:<br>
* Write to managed and unmanaged tables
* Explore the effect of dropping tables on the metadata and underlying data

In [0]:
%run ../Includes/Classroom-Setup

## ![Spark Logo Tiny](https://files.training.databricks.com/images/105/logo_spark_tiny.png) Managed and Unmanaged Tables<br>

A **managed table** is a table for which Spark manages both the data and the metadata.  In this case, a `DROP TABLE` command removes both the metadata for the table and the data itself.

For **unmanaged tables**, Spark manages only the metadata, such as the schema and data location. The data itself sits in a different location, often backed by a blob store such as Azure Blob Storage or Amazon S3. Dropping an unmanaged table drops only the metadata associated with the table, while the data itself remains in place.

<div><img src="https://files.training.databricks.com/images/eLearning/ETL-Part-2/managed-and-unmanaged-tables.png" style="height: 400px; margin: 20px"/></div>
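
The difference in `DROP TABLE` behavior can be sketched with a toy model in plain Python. This is not how Spark or the metastore is implemented; `ToyCatalog`, `create_managed`, `create_external`, and `drop` are hypothetical names invented for illustration, and local temporary directories stand in for DBFS or blob storage.

```python
import shutil
import tempfile
from pathlib import Path


class ToyCatalog:
    """Toy stand-in for a metastore (hypothetical, for illustration only)."""

    def __init__(self, warehouse_dir):
        self.warehouse_dir = Path(warehouse_dir)
        self.tables = {}  # table name -> {"location": Path, "managed": bool}

    def create_managed(self, name):
        # A managed table gets a directory inside the warehouse that the catalog owns.
        location = self.warehouse_dir / name.lower()
        location.mkdir(parents=True, exist_ok=True)
        self.tables[name.lower()] = {"location": location, "managed": True}
        return location

    def create_external(self, name, location):
        # An external table only records a pointer to a directory it does not own.
        location = Path(location)
        location.mkdir(parents=True, exist_ok=True)
        self.tables[name.lower()] = {"location": location, "managed": False}
        return location

    def drop(self, name):
        # Dropping always removes the metadata entry...
        entry = self.tables.pop(name.lower())
        # ...but deletes the files only when the catalog owns them.
        if entry["managed"]:
            shutil.rmtree(entry["location"])


def demo():
    root = Path(tempfile.mkdtemp())
    try:
        catalog = ToyCatalog(root / "warehouse")
        managed_dir = catalog.create_managed("tableManaged")
        external_dir = catalog.create_external(
            "tableUnmanaged", root / "tmp" / "unmanagedTable"
        )

        # Write one placeholder data file per table.
        (managed_dir / "part-00000").write_text("1,1\n2,2\n")
        (external_dir / "part-00000").write_text("1,1\n2,2\n")

        catalog.drop("tableManaged")
        catalog.drop("tableUnmanaged")

        return {
            "managed_files_exist": managed_dir.exists(),
            "external_files_exist": (external_dir / "part-00000").exists(),
            "tables_left": sorted(catalog.tables),
        }
    finally:
        # Clean up the temporary directories used by the demo.
        shutil.rmtree(root, ignore_errors=True)


print(demo())
```

After both drops, the catalog holds no tables. The managed table's directory has been deleted, while the external table's file is still on disk.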

Start with a managed table.

In [0]:
%sql
USE default;

DROP TABLE IF EXISTS tableManaged;

CREATE TABLE tableManaged (
  var1 INT,
  var2 INT
);

INSERT INTO tableManaged
  VALUES (1, 1), (2, 2)

Use `DESCRIBE EXTENDED` to view the table's schema and metadata.  Scroll down to see the table `Type`.

Notice also that the `Location` is under `dbfs:/user/hive/warehouse/`. For the `default` database used here, it is `dbfs:/user/hive/warehouse/tablemanaged`.

In [0]:
%sql
DESCRIBE EXTENDED tableManaged

col_name,data_type,comment
var1,int,
var2,int,
,,
# Detailed Table Information,,
Database,default,
Table,tablemanaged,
Owner,root,
Created Time,Mon Nov 23 00:26:18 UTC 2020,
Last Access,Thu Jan 01 00:00:00 UTC 1970,
Created By,Spark 2.4.3,
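
Scrolling through the output for one field gets tedious, so the detailed rows can be gathered into a dictionary instead. The sketch below is a minimal example. `detailed_table_info` is a hypothetical helper, and it assumes each row is a `(col_name, data_type, comment)` tuple like the output above.

```python
def detailed_table_info(rows):
    """Collect the key/value rows that follow '# Detailed Table Information'."""
    info = {}
    in_details = False
    for row in rows:
        key, value = row[0], row[1]
        if key == "# Detailed Table Information":
            in_details = True
            continue
        # Skip blank separator rows; keep everything after the marker.
        if in_details and key:
            info[key] = value
    return info


# Rows copied from the output above, as (col_name, data_type, comment) tuples.
sample_rows = [
    ("var1", "int", None),
    ("var2", "int", None),
    ("", "", ""),
    ("# Detailed Table Information", "", ""),
    ("Database", "default", ""),
    ("Table", "tablemanaged", ""),
    ("Owner", "root", ""),
]

info = detailed_table_info(sample_rows)
print(info["Database"], info["Table"])
# The sample rows stop before the Type row, so this lookup finds nothing here.
print(info.get("Type"))
```

In a notebook with an active `spark` session, the same helper could be fed `spark.sql("DESCRIBE EXTENDED tableManaged").collect()`, since PySpark `Row` objects support positional indexing. `info.get("Type")` and `info.get("Location")` would then return the values described above.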


Now create an external, or unmanaged, table.

In [0]:
%sql
DROP TABLE IF EXISTS tableUnmanaged;

CREATE EXTERNAL TABLE tableUnmanaged (
  var1 INT,
  var2 INT
)
STORED AS parquet
LOCATION '/tmp/unmanagedTable'

Describe the table and look for the `Type`.

In [0]:
%sql
DESCRIBE EXTENDED tableUnmanaged

col_name,data_type,comment
var1,int,
var2,int,
,,
# Detailed Table Information,,
Database,default,
Table,tableunmanaged,
Owner,root,
Created Time,Mon Nov 23 00:26:23 UTC 2020,
Last Access,Thu Jan 01 00:00:00 UTC 1970,
Created By,Spark 2.4.3,


This is an external, or unmanaged, table.  If we shut down our cluster, this data will persist.  Now insert values into the table.

In [0]:
%sql
INSERT INTO tableUnmanaged
  VALUES (1, 1), (2, 2)

Take a look at the result.

In [0]:
%sql
SELECT * FROM tableUnmanaged

var1,var2
1,1
2,2


Now view the underlying files in the location where the data was persisted.

In [0]:
%fs ls /tmp/unmanagedTable

path,name,size
dbfs:/tmp/unmanagedTable/_SUCCESS,_SUCCESS,0
dbfs:/tmp/unmanagedTable/_committed_2431590344566543184,_committed_2431590344566543184,226
dbfs:/tmp/unmanagedTable/_started_2431590344566543184,_started_2431590344566543184,0
dbfs:/tmp/unmanagedTable/part-00000-tid-2431590344566543184-c7e72c43-3753-435b-84f0-8e4cb687fa62-1066-1-c000.snappy.parquet,part-00000-tid-2431590344566543184-c7e72c43-3753-435b-84f0-8e4cb687fa62-1066-1-c000.snappy.parquet,615
dbfs:/tmp/unmanagedTable/part-00001-tid-2431590344566543184-c7e72c43-3753-435b-84f0-8e4cb687fa62-1067-1-c000.snappy.parquet,part-00001-tid-2431590344566543184-c7e72c43-3753-435b-84f0-8e4cb687fa62-1067-1-c000.snappy.parquet,615


## ![Spark Logo Tiny](https://files.training.databricks.com/images/105/logo_spark_tiny.png) Dropping Managed and Unmanaged Tables<br>

Confirm that the underlying files exist for the managed table.

In [0]:
%fs ls dbfs:/user/hive/warehouse/tablemanaged

path,name,size
dbfs:/user/hive/warehouse/tablemanaged/_SUCCESS,_SUCCESS,0
dbfs:/user/hive/warehouse/tablemanaged/_committed_1955761497764427627,_committed_1955761497764427627,196
dbfs:/user/hive/warehouse/tablemanaged/_started_1955761497764427627,_started_1955761497764427627,0
dbfs:/user/hive/warehouse/tablemanaged/part-00000-tid-1955761497764427627-d32d8671-4f33-4b5a-bc4c-8fb427773846-1064-1-c000,part-00000-tid-1955761497764427627-d32d8671-4f33-4b5a-bc4c-8fb427773846-1064-1-c000,4
dbfs:/user/hive/warehouse/tablemanaged/part-00001-tid-1955761497764427627-d32d8671-4f33-4b5a-bc4c-8fb427773846-1065-1-c000,part-00001-tid-1955761497764427627-d32d8671-4f33-4b5a-bc4c-8fb427773846-1065-1-c000,4


Now drop the managed table.

In [0]:
%sql
DROP TABLE tableManaged

Take a look--the files are gone!

In [0]:
%fs ls dbfs:/user/hive/warehouse/tablemanaged

Now drop the unmanaged, or external, table.

In [0]:
%sql
DROP TABLE tableUnmanaged

Now take a look at the underlying files.

In [0]:
%fs ls /tmp/unmanagedTable

path,name,size
dbfs:/tmp/unmanagedTable/_SUCCESS,_SUCCESS,0
dbfs:/tmp/unmanagedTable/_committed_2431590344566543184,_committed_2431590344566543184,226
dbfs:/tmp/unmanagedTable/_started_2431590344566543184,_started_2431590344566543184,0
dbfs:/tmp/unmanagedTable/part-00000-tid-2431590344566543184-c7e72c43-3753-435b-84f0-8e4cb687fa62-1066-1-c000.snappy.parquet,part-00000-tid-2431590344566543184-c7e72c43-3753-435b-84f0-8e4cb687fa62-1066-1-c000.snappy.parquet,615
dbfs:/tmp/unmanagedTable/part-00001-tid-2431590344566543184-c7e72c43-3753-435b-84f0-8e4cb687fa62-1067-1-c000.snappy.parquet,part-00001-tid-2431590344566543184-c7e72c43-3753-435b-84f0-8e4cb687fa62-1067-1-c000.snappy.parquet,615


They're still there!

## Summary
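- Use external/unmanaged tables when the data must outlive the table definition: `DROP TABLE` removes only the metadata, and the files stay in place
- Use managed tables when the data should live and die with the table: `DROP TABLE` removes both the metadata and the files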

&copy; 2020 Databricks, Inc. All rights reserved.<br/>
Apache, Apache Spark, Spark and the Spark logo are trademarks of the <a href="http://www.apache.org/">Apache Software Foundation</a>.<br/>
<br/>
<a href="https://databricks.com/privacy-policy">Privacy Policy</a> | <a href="https://databricks.com/terms-of-use">Terms of Use</a> | <a href="http://help.databricks.com/">Support</a>