### Install delta-spark

```
pip install delta-spark
```

In [1]:
#
import pyspark
from delta import configure_spark_with_delta_pip
from pyspark.sql import Row
from pyspark.sql import SparkSession
#
#
_builder =	(	SparkSession.builder.master('local[1]') \
				.appName('pyspark-deltalake-local-testing') \
				.config(	'spark.sql.extensions'
						,	'io.delta.sql.DeltaSparkSessionExtension')
				.config(	'spark.sql.catalog.spark_catalog'
						,	'org.apache.spark.sql.delta.catalog.DeltaCatalog'))
#
_spark = configure_spark_with_delta_pip(_builder).enableHiveSupport().getOrCreate()
#

In [2]:
_spark
#
# try to stop?
### _spark.stop()
#

In [3]:
#
_spdf_salesterr = _spark.read \
					.format('csv') \
					.option('inferSchema', True) \
					.option('header',True) \
					.load("file:///C:/Users/Administrator/Documents/Code/Python/Spark/DimSalesTerritory.csv")
#
_spdf_salesterr.show(truncate=False)
#

+-----------------+--------------------------+--------------------+---------------------+-------------------+
|SalesTerritoryKey|SalesTerritoryAlternateKey|SalesTerritoryRegion|SalesTerritoryCountry|SalesTerritoryGroup|
+-----------------+--------------------------+--------------------+---------------------+-------------------+
|1                |1                         |Northwest           |United States        |North America      |
|2                |2                         |Northeast           |United States        |North America      |
|3                |3                         |Central             |United States        |North America      |
|4                |4                         |Southwest           |United States        |North America      |
|5                |5                         |Southeast           |United States        |North America      |
|6                |6                         |Canada              |Canada               |North America      |
|7        

In [5]:
#
_path = 'file:///C:/Users/Administrator/Documents/Code/Python/Spark/MyDeltaLake/SalesTerritory/'
#
_spdf_salesterr.write \
				.mode('overwrite') \
				.format('delta') \
				.saveAsTable(	\
								'DimSalesTerritory'	\
							,	path	=	_path	\
							,	format	=	'delta'	\
							,	mode	=	'overwrite')
#

## DESCRIBE table schema
### check the detailed info of the table

In [5]:
#
_resultsAsSQL_DataFrame = _spark.sql('DESCRIBE EXTENDED DimSalesTerritory')
#
_resultsAsSQL_DataFrame.show(truncate=False)
#

+----------------------------+-----------------------------------------------------------------------------------+-------+
|col_name                    |data_type                                                                          |comment|
+----------------------------+-----------------------------------------------------------------------------------+-------+
|SalesTerritoryKey           |int                                                                                |NULL   |
|SalesTerritoryAlternateKey  |int                                                                                |NULL   |
|SalesTerritoryRegion        |string                                                                             |NULL   |
|SalesTerritoryCountry       |string                                                                             |NULL   |
|SalesTerritoryGroup         |string                                                                             |NULL   |
|               

In [6]:
#
_resultsAsSQL_DataFrame = _spark.sql('SELECT * FROM DimSalesTerritory ORDER BY SalesTerritoryKey DESC;')
#
_resultsAsSQL_DataFrame.show(truncate=False)
#

+-----------------+--------------------------+--------------------+---------------------+-------------------+
|SalesTerritoryKey|SalesTerritoryAlternateKey|SalesTerritoryRegion|SalesTerritoryCountry|SalesTerritoryGroup|
+-----------------+--------------------------+--------------------+---------------------+-------------------+
|11               |0                         |NA                  |NA                   |NA                 |
|10               |10                        |United Kingdom      |United Kingdom       |Europe             |
|9                |9                         |Australia           |Australia            |Pacific            |
|8                |8                         |Germany             |Germany              |Europe             |
|7                |7                         |France              |France               |Europe             |
|6                |6                         |Canada              |Canada               |North America      |
|5        

## DESCRIBE HISTORY

### you can check the history of the table

In [7]:
#
_resultsAsSQL_DataFrame = _spark.sql('DESCRIBE HISTORY DimSalesTerritory')
#
_resultsAsSQL_DataFrame.show(truncate=False)
#

+-------+-----------------------+------+--------+---------------------------------+-----------------------------------------------------------------------------------------------+----+--------+---------+-----------+--------------+-------------+--------------------------------------------------------------------------------------------------------+------------+-----------------------------------+
|version|timestamp              |userId|userName|operation                        |operationParameters                                                                            |job |notebook|clusterId|readVersion|isolationLevel|isBlindAppend|operationMetrics                                                                                        |userMetadata|engineInfo                         |
+-------+-----------------------+------+--------+---------------------------------+-----------------------------------------------------------------------------------------------+----+--------+---------

In [8]:
#
#from pyspark.dbutils import DBUtils
#
#_dbutils = DBUtils(_spark)
#_dbutils.fs.ls('file:///C:/Users/Administrator/Documents/Code/Python/Spark/SalesTerritory/')
#

### load transactions into data-frame

In [9]:
#
_spdf_salesterr_trans = _spark.read \
						.format('csv') \
						.option('inferSchema', True) \
						.option('header',True) \
						.load("file:///C:/Users/Administrator/Documents/Code/Python/Spark/DimSalesTerritoryTransactions1.csv")
#
_spdf_salesterr_trans.show(truncate=False)
#

+-----------------+--------------------------+--------------------+---------------------+-------------------+
|SalesTerritoryKey|SalesTerritoryAlternateKey|SalesTerritoryRegion|SalesTerritoryCountry|SalesTerritoryGroup|
+-----------------+--------------------------+--------------------+---------------------+-------------------+
|1                |1                         |Northwest           |United States        |Federation         |
|2                |2                         |Northeast           |United States        |Klingon Empire     |
|99               |99                        |Kalos               |Cronos               |Klingon Empire     |
|3                |3                         |DELETE              |DELETE               |NULL               |
+-----------------+--------------------------+--------------------+---------------------+-------------------+



In [10]:
#
_path = 'file:///C:/Users/Administrator/Documents/Code/Python/Spark/MyDeltaLake/SalesTerritoryTransactions/'
#
_spdf_salesterr_trans.write \
					.mode('overwrite') \
					.format('delta') \
					.saveAsTable(	\
									'DimSalesTerritoryTransactions'	\
								,	path	=	_path	\
								,	format	=	'delta'	\
								,	mode	=	'overwrite')
#

In [11]:
#
_resultsAsSQL_DataFrame = _spark.sql('SELECT * FROM DimSalesTerritoryTransactions ORDER BY SalesTerritoryKey ASC;')
#
_resultsAsSQL_DataFrame.show(truncate=False)
#

+-----------------+--------------------------+--------------------+---------------------+-------------------+
|SalesTerritoryKey|SalesTerritoryAlternateKey|SalesTerritoryRegion|SalesTerritoryCountry|SalesTerritoryGroup|
+-----------------+--------------------------+--------------------+---------------------+-------------------+
|1                |1                         |Northwest           |United States        |Federation         |
|2                |2                         |Northeast           |United States        |Klingon Empire     |
|3                |3                         |DELETE              |DELETE               |NULL               |
|99               |99                        |Kalos               |Cronos               |Klingon Empire     |
+-----------------+--------------------------+--------------------+---------------------+-------------------+



In [12]:
#
_resultsAsSQL_DataFrame = _spark.sql( \
'''
MERGE INTO DimSalesTerritory			terr
USING DimSalesTerritoryTransactions		trans
ON trans.SalesTerritoryKey = terr.SalesTerritoryKey
WHEN MATCHED AND trans.SalesTerritoryRegion = 'DELETE' THEN
	DELETE
WHEN MATCHED THEN
	UPDATE SET SalesTerritoryGroup = trans.SalesTerritoryGroup
WHEN NOT MATCHED THEN
	INSERT *
;''')
#
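
To make the three clauses of the MERGE concrete, here is a minimal pure-Python sketch that applies the same rules to plain dictionaries. It only illustrates the semantics and is not how Delta Lake executes a merge. The helper names `simulate_merge` and `as_rows`, the short column names, and the four-row subset of the target table are assumptions made for this example.

```python
# Illustration only: mimics the MERGE clauses above on plain Python data.
# simulate_merge and as_rows are hypothetical helpers, not part of Delta Lake.

COLUMNS = ("key", "alt_key", "region", "country", "group")


def as_rows(tuples):
    """Turn positional tuples into dictionaries keyed by COLUMNS."""
    return [dict(zip(COLUMNS, t)) for t in tuples]


def simulate_merge(target, transactions):
    """Return a new dict keyed by SalesTerritoryKey after applying the rules."""
    result = {key: dict(row) for key, row in target.items()}
    for trans in transactions:
        key = trans["key"]
        if key in result:
            if trans["region"] == "DELETE":
                # WHEN MATCHED AND region = 'DELETE' THEN DELETE
                del result[key]
            else:
                # WHEN MATCHED THEN UPDATE SET SalesTerritoryGroup = ...
                result[key]["group"] = trans["group"]
        else:
            # WHEN NOT MATCHED THEN INSERT *
            result[key] = dict(trans)
    return result


# Subset of DimSalesTerritory (keys 1 to 4) and the transactions table.
target = {r["key"]: r for r in as_rows([
    (1, 1, "Northwest", "United States", "North America"),
    (2, 2, "Northeast", "United States", "North America"),
    (3, 3, "Central", "United States", "North America"),
    (4, 4, "Southwest", "United States", "North America"),
])}
transactions = as_rows([
    (1, 1, "Northwest", "United States", "Federation"),
    (2, 2, "Northeast", "United States", "Klingon Empire"),
    (99, 99, "Kalos", "Cronos", "Klingon Empire"),
    (3, 3, "DELETE", "DELETE", None),
])

merged = simulate_merge(target, transactions)
for key in sorted(merged):
    print(merged[key])
```

Key 3 is removed, keys 1 and 2 get a new group, key 4 is untouched, and key 99 is inserted as a new row.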

In [13]:
#
_resultsAsSQL_DataFrame = _spark.sql('DESCRIBE HISTORY DimSalesTerritory')
#
_resultsAsSQL_DataFrame.show(truncate=False)
#

+-------+-----------------------+------+--------+---------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----+--------+---------+-----------+--------------+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

In [14]:
#
_resultsAsSQL_Query_DataFrame = _spark.sql('SELECT * FROM DimSalesTerritory ORDER BY SalesTerritoryKey ASC;')
#
_resultsAsSQL_Query_DataFrame.show(truncate=False)
#

+-----------------+--------------------------+--------------------+---------------------+-------------------+
|SalesTerritoryKey|SalesTerritoryAlternateKey|SalesTerritoryRegion|SalesTerritoryCountry|SalesTerritoryGroup|
+-----------------+--------------------------+--------------------+---------------------+-------------------+
|1                |1                         |Northwest           |United States        |Federation         |
|2                |2                         |Northeast           |United States        |Klingon Empire     |
|4                |4                         |Southwest           |United States        |North America      |
|5                |5                         |Southeast           |United States        |North America      |
|6                |6                         |Canada              |Canada               |North America      |
|7                |7                         |France              |France               |Europe             |
|8        

## Delta Lake transactions can change only a SINGLE TABLE; the following statement is <b>INVALID</b> on Delta Lake

```SQL
BEGIN TRANSACTION;
--
INSERT INTO Table1 VALUES (1,2);
INSERT INTO Table2 VALUES (3,4);
--
COMMIT;
```

### Change a setting to allow querying the Delta logs directly

In [15]:
#
## ## _sc = SparkContext('local[1]', 'pyspark-deltalake-local-testing')
## ## _spark.conf.set('spark.databricks.delta.formatCheck.enabled', True)
## ## _spark.sparkContext.getConf().getAll()
#
from pyspark import SparkContext
#
_sc = _spark.sparkContext
_scConf = _sc.getConf()
#
_scConf.set('spark.databricks.delta.formatCheck.enabled', False)
_confValue = _scConf.get('spark.databricks.delta.formatCheck.enabled')
#
print('value to replace = {0}'.format(_confValue))
#
## restart spark context
#
_spark.sparkContext.stop()
#
## ## ## _spark = SparkSession.builder.config(conf=_scConf).getOrCreate() ### <<<<<<<<<<<<
#
_builder =	(	SparkSession.builder.master('local[1]') \
				.appName('pyspark-deltalake-local-testing') \
				.config(conf=_scConf) \
				.config(	'spark.sql.extensions'
						,	'io.delta.sql.DeltaSparkSessionExtension')
				.config(	'spark.sql.catalog.spark_catalog'
						,	'org.apache.spark.sql.delta.catalog.DeltaCatalog'))
#
_spark = configure_spark_with_delta_pip(_builder).getOrCreate()
#

value to replace = False


In [16]:
#
_sc = _spark.sparkContext
_scConf = _sc.getConf()
#
_confValue = _scConf.get('spark.databricks.delta.formatCheck.enabled')
#
print('new value = {0}'.format(_confValue))
#

new value = False


## Let's run a merge using a DataFrame as the source

In [17]:
#
import pyspark
from delta import *
from delta.tables import *
from pyspark.sql.types import *
from pyspark.sql.functions import *
#
_structSchema = StructType([	\
								StructField('SalesTerritoryKey',			IntegerType(),	True)	\
							,	StructField('SalesTerritoryAlternateKey',	IntegerType(),	True)	\
							,	StructField('SalesTerritoryRegion',			StringType(),	True)	\
							,	StructField('SalesTerritoryCountry',		StringType(),	True)	\
							,	StructField('SalesTerritoryGroup',			StringType(),	True)	\
							])
#
_tableForMerge_AsList = [(11,0,'San Juan','Puerto Rico','Caribbean'),(98,-1,'King','United States','Klingon Empire')]
_tableForMerge_AsDataFrame = _spark.createDataFrame(_tableForMerge_AsList, _structSchema)
#
_path = 'file:///C:/Users/Administrator/Documents/Code/Python/Spark/MyDeltaLake/SalesTerritory'
_DimSalesTerritory_AsDeltaTable = DeltaTable.forPath(_spark, _path)
_DimSalesTerritory_AsDeltaTable.toDF().show(truncate=False)
#

+-----------------+--------------------------+--------------------+---------------------+-------------------+
|SalesTerritoryKey|SalesTerritoryAlternateKey|SalesTerritoryRegion|SalesTerritoryCountry|SalesTerritoryGroup|
+-----------------+--------------------------+--------------------+---------------------+-------------------+
|1                |1                         |Northwest           |United States        |Federation         |
|2                |2                         |Northeast           |United States        |Klingon Empire     |
|4                |4                         |Southwest           |United States        |North America      |
|5                |5                         |Southeast           |United States        |North America      |
|6                |6                         |Canada              |Canada               |North America      |
|7                |7                         |France              |France               |Europe             |
|8        

In [18]:
#
## ref : https://medium.com/@ansabiqbal/delta-lake-introduction-with-examples-using-pyspark-cb2a0d7a549d
## ref : https://microsoftlearning.github.io/mslearn-fabric/Instructions/Labs/03-delta-lake.html
#
_DimSalesTerritory_AsDeltaTable	\
	.alias('target') \
	.merge(	\
		_tableForMerge_AsDataFrame.alias('origin'), \
			'target.SalesTerritoryKey = origin.SalesTerritoryKey and target.SalesTerritoryAlternateKey = origin.SalesTerritoryAlternateKey') \
	.whenMatchedUpdateAll().whenNotMatchedInsertAll().execute()
#

DataFrame[num_affected_rows: bigint, num_updated_rows: bigint, num_deleted_rows: bigint, num_inserted_rows: bigint]
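
The same builder can also express the conditional delete/update/insert logic of the SQL MERGE from cell 12. The sketch below assumes the `whenMatchedDelete`, `whenMatchedUpdate`, and `whenNotMatchedInsertAll` methods of the Delta merge builder; the wrapper name `merge_with_delete` is hypothetical, and the function expects a `DeltaTable` plus a source DataFrame that has the same columns.

```python
# Sketch only: the SQL MERGE from the earlier cell expressed with the
# DeltaTable merge builder. merge_with_delete is a hypothetical helper name.

def merge_with_delete(target_table, source_df):
    """Apply delete / update / insert rules to a delta.tables.DeltaTable."""
    return (
        target_table.alias('terr')
        .merge(
            source_df.alias('trans'),
            'trans.SalesTerritoryKey = terr.SalesTerritoryKey')
        # clause order matters: the first WHEN MATCHED clause that applies wins
        .whenMatchedDelete(condition="trans.SalesTerritoryRegion = 'DELETE'")
        .whenMatchedUpdate(set={'SalesTerritoryGroup': 'trans.SalesTerritoryGroup'})
        .whenNotMatchedInsertAll()
        .execute()
    )
```

The delete clause has to come before the unconditional update clause; otherwise the update would catch every matched row first.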

In [19]:
#
_path = 'file:///C:/Users/Administrator/Documents/Code/Python/Spark/MyDeltaLake/SalesTerritory/'
#
_DimSalesTerritory_AsDataFrame = _spark.read.format('delta').load(path=_path)
#
_DimSalesTerritory_AsDataFrame.write \
								.mode('overwrite') \
								.format('delta') \
								.saveAsTable(	\
												'DimSalesTerritory'	\
											,	path	=	_path	\
											,	format	=	'delta'	\
											,	mode	=	'overwrite')
#
_resultsAsSQL_DataFrame = _spark.sql('DESCRIBE EXTENDED DimSalesTerritory')
#
_resultsAsSQL_DataFrame.show(truncate=False)
#

+----------------------------+-----------------------------------------------------------------------------------+-------+
|col_name                    |data_type                                                                          |comment|
+----------------------------+-----------------------------------------------------------------------------------+-------+
|SalesTerritoryKey           |int                                                                                |NULL   |
|SalesTerritoryAlternateKey  |int                                                                                |NULL   |
|SalesTerritoryRegion        |string                                                                             |NULL   |
|SalesTerritoryCountry       |string                                                                             |NULL   |
|SalesTerritoryGroup         |string                                                                             |NULL   |
|               

In [20]:
#
_resultsAsSQL_DataFrame = _spark.sql('DESCRIBE HISTORY DimSalesTerritory')
#
_resultsAsSQL_DataFrame.show(truncate=False)
#

+-------+-----------------------+------+--------+---------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----+--------+---------+-----------+--------------+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

In [21]:
#
_path = 'file:///C:/Users/Administrator/Documents/Code/Python/Spark/MyDeltaLake/SalesTerritory/'
#
# Get the version 0 data
_original_data = _spark.read.format('delta').option('versionAsOf', 0).load(_path)
_original_data.show(truncate=False)
#

+-----------------+--------------------------+--------------------+---------------------+-------------------+
|SalesTerritoryKey|SalesTerritoryAlternateKey|SalesTerritoryRegion|SalesTerritoryCountry|SalesTerritoryGroup|
+-----------------+--------------------------+--------------------+---------------------+-------------------+
|1                |1                         |Northwest           |United States        |North America      |
|2                |2                         |Northeast           |United States        |North America      |
|3                |3                         |Central             |United States        |North America      |
|4                |4                         |Southwest           |United States        |North America      |
|5                |5                         |Southeast           |United States        |North America      |
|6                |6                         |Canada              |Canada               |North America      |
|7        

In [22]:
#
# Get the version 3 data
_original_data = _spark.read.format('delta').option('versionAsOf', 3).load(_path)
_original_data.show(truncate=False)
#

+-----------------+--------------------------+--------------------+---------------------+-------------------+
|SalesTerritoryKey|SalesTerritoryAlternateKey|SalesTerritoryRegion|SalesTerritoryCountry|SalesTerritoryGroup|
+-----------------+--------------------------+--------------------+---------------------+-------------------+
|1                |1                         |Northwest           |United States        |Federation         |
|2                |2                         |Northeast           |United States        |Klingon Empire     |
|4                |4                         |Southwest           |United States        |North America      |
|5                |5                         |Southeast           |United States        |North America      |
|6                |6                         |Canada              |Canada               |North America      |
|7                |7                         |France              |France               |Europe             |
|8        
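
The two cells above differ only in the version number, so the read can be wrapped in a small helper. This is a sketch under assumptions: `read_delta_as_of` is a hypothetical name, and it relies on the `versionAsOf` option used above plus the `timestampAsOf` reader option for reading by timestamp.

```python
# Sketch only: read_delta_as_of is a hypothetical helper wrapping the
# versionAsOf / timestampAsOf reader options.

def read_delta_as_of(spark, path, version=None, timestamp=None):
    """Load a Delta table at a given version or timestamp (exactly one of them)."""
    if (version is None) == (timestamp is None):
        raise ValueError('pass exactly one of version or timestamp')
    reader = spark.read.format('delta')
    if version is not None:
        # version numbers are the ones listed by DESCRIBE HISTORY
        reader = reader.option('versionAsOf', version)
    else:
        reader = reader.option('timestampAsOf', timestamp)
    return reader.load(path)
```

In this notebook it could be called as `read_delta_as_of(_spark, _path, version=0).show(truncate=False)`.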

In [89]:
import os
import json
#
def show_log_file(logpath:str, logfilename:str):
	logFile = os.path.join(logpath,logfilename)
	#
	print('Processing LogFile : {0}{1}'.format(logFile,'\n'))
	#
	logLine = []
	try:
		# each line of a Delta commit file is one JSON action
		with open(logFile,'r') as _fh:
			for line in _fh:
				try:
					logLine.append(json.loads(line))
				except json.JSONDecodeError as e:
					print('Error parsing invalid JSON: {0}'.format(line))
	except Exception as eX:
		print('Error Reading File : {0}'.format(eX))
	#
	# print at most the first four actions; a commit file may hold fewer
	for _idx, _entry in enumerate(logLine[:4]):
		_label = 'commit Info' if _idx == 0 else 'log Block {0}'.format(_idx + 1)
		print('{0} : {1}\n'.format(_label, _entry))
	#
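
As a complement to printing the raw entries, it can help to summarize which kinds of actions a commit contains. The sketch below counts the top-level keys of each JSON line (for example `commitInfo`, `metaData`, `add`, `remove`). The function name `count_actions` and the shortened sample commit file are assumptions made for this example, not real log content.

```python
import json
import os
import tempfile
from collections import Counter


def count_actions(log_file):
    """Count the action types (one JSON object per line) in a commit file."""
    counts = Counter()
    with open(log_file, 'r') as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            action = json.loads(line)
            # each action is an object with a single top-level key
            counts.update(action.keys())
    return counts


# Hypothetical, heavily shortened commit file written to a temp directory.
sample_lines = [
    {'commitInfo': {'operation': 'MERGE'}},
    {'add': {'path': 'part-00000-new.parquet'}},
    {'remove': {'path': 'part-00000-old.parquet'}},
]
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, '00000000000000000001.json')
    with open(sample_path, 'w') as handle:
        for entry in sample_lines:
            handle.write(json.dumps(entry) + '\n')
    action_counts = count_actions(sample_path)

print(dict(action_counts))
```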

### Read Delta Log(s)

In [90]:
import os
#
_resultsAsSQL_DataFrame = _spark.sql('DESCRIBE EXTENDED DimSalesTerritory')
#
_resultsAsSQL_DataFrame = _resultsAsSQL_DataFrame.where(col('col_name') == 'Location').select(col('data_type').alias('value'))
#
_resultsAsSQL_AsPandas = _resultsAsSQL_DataFrame.toPandas()
_resultsAsSQL_AsDict = _resultsAsSQL_AsPandas.to_dict()
#
print('Path = {0}\n\n'.format(_resultsAsSQL_AsDict['value'][0]))
#
_fixedPath = str('{0}/_delta_log'.format((_resultsAsSQL_AsDict['value'][0]).replace('file:/','')))
_filesonPath = os.listdir(_fixedPath)
_json_files = sorted(f for f in _filesonPath if f.endswith('.json'))	# process commits in version order
#
for _cF in _json_files:
	show_log_file('{0}/'.format(_fixedPath), _cF)
#

Path = file:/C:/Users/Administrator/Documents/Code/Python/Spark/MyDeltaLake/SalesTerritory


Processing LogFile : C:/Users/Administrator/Documents/Code/Python/Spark/MyDeltaLake/SalesTerritory/_delta_log/00000000000000000000.json

commit Info : {'commitInfo': {'timestamp': 1750877412128, 'operation': 'CREATE OR REPLACE TABLE AS SELECT', 'operationParameters': {'partitionBy': '[]', 'clusterBy': '[]', 'description': None, 'isManaged': 'false', 'properties': '{}'}, 'isolationLevel': 'Serializable', 'isBlindAppend': False, 'operationMetrics': {'numFiles': '1', 'numRemovedFiles': '0', 'numRemovedBytes': '0', 'numOutputRows': '11', 'numOutputBytes': '2143'}, 'engineInfo': 'Apache-Spark/4.0.0 Delta-Lake/4.0.0', 'txnId': '13a67fb5-4a25-409f-b18d-419efc1a1631'}}

log Block 2 : {'metaData': {'id': 'c78f28d6-7ee6-4386-8960-1871df589752', 'format': {'provider': 'parquet', 'options': {}}, 'schemaString': '{"type":"struct","fields":[{"name":"SalesTerritoryKey","type":"integer","nullable":true,"metada
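
The `.replace('file:/','')` call above works for the Windows drive path shown, but on a POSIX path it would also strip the leading slash. A more portable sketch uses the standard library to convert the URI, and parses the zero-padded commit file names into version numbers. The helper names `uri_to_local_path` and `commit_versions`, and the sample listing, are assumptions for illustration.

```python
import re
from urllib.parse import urlparse
from urllib.request import url2pathname


def uri_to_local_path(location):
    """Convert a file: URI such as 'file:/C:/data/tbl' to a local path."""
    parsed = urlparse(location)
    if parsed.scheme != 'file':
        raise ValueError('not a file URI: {0}'.format(location))
    return url2pathname(parsed.path)


# commit files are named with a 20-digit, zero-padded version number
_COMMIT_NAME = re.compile(r'^(\d{20})\.json$')


def commit_versions(filenames):
    """Return the commit versions found in a _delta_log listing, ascending."""
    versions = []
    for name in filenames:
        match = _COMMIT_NAME.match(name)
        if match:
            versions.append(int(match.group(1)))
    return sorted(versions)


# Hypothetical directory listing, deliberately out of order.
listing = [
    '00000000000000000002.json',
    '00000000000000000000.json',
    '00000000000000000001.json',
    '_last_checkpoint',
]
print(commit_versions(listing))
print(uri_to_local_path('file:/tmp/MyDeltaLake/SalesTerritory'))
```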

### DESCRIBE HISTORY using the table <b>path</b>

In [103]:
#
_fixedPath = str('{0}'.format((_resultsAsSQL_AsDict['value'][0]).replace('file:/','')))
_query = "DESCRIBE HISTORY '{0}'".format(_fixedPath)
_resultsAsSQL_DataFrame = _spark.sql(_query)
#
_resultsAsSQL_DataFrame.sort(col('version').asc()).show(truncate=False)
#

+-------+-----------------------+------+--------+---------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----+--------+---------+-----------+--------------+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

### SELECT from delta path

In [125]:
## a Delta path can also be queried directly in a SELECT statement
#
_path = 'file:///C:/Users/Administrator/Documents/Code/Python/Spark/MyDeltaLake/SalesTerritory/'
_query = "SELECT * FROM delta.`{0}`".format(_path)
#
print('QUERY : {0}\n'.format(_query))
_resultsAsSQL_DataFrame = _spark.sql(_query)
#
_resultsAsSQL_DataFrame.show(truncate=False)
#

QUERY : SELECT * FROM delta.`file:///C:/Users/Administrator/Documents/Code/Python/Spark/MyDeltaLake/SalesTerritory/`

+-----------------+--------------------------+--------------------+---------------------+-------------------+
|SalesTerritoryKey|SalesTerritoryAlternateKey|SalesTerritoryRegion|SalesTerritoryCountry|SalesTerritoryGroup|
+-----------------+--------------------------+--------------------+---------------------+-------------------+
|1                |1                         |Northwest           |United States        |Federation         |
|2                |2                         |Northeast           |United States        |Klingon Empire     |
|4                |4                         |Southwest           |United States        |North America      |
|5                |5                         |Southeast           |United States        |North America      |
|6                |6                         |Canada              |Canada               |North America      |
|7

### CREATE table FROM path

In [133]:
#
_path = 'file:///C:/Users/Administrator/Documents/Code/Python/Spark/MyDeltaLake/SalesTerritory/'
_query = "CREATE TABLE IF NOT EXISTS DimSalesTerritory USING delta LOCATION '{0}'".format(_path)
#
print('QUERY : {0}'.format(_query))
_resultsAsSQL_DataFrame = _spark.sql(_query)
#
_resultsAsSQL_DataFrame = _spark.sql('SELECT * FROM DimSalesTerritory')
#
_resultsAsSQL_DataFrame.show(50, truncate=False)
#

QUERY : CREATE TABLE IF NOT EXISTS DimSalesTerritory USING delta LOCATION 'file:///C:/Users/Administrator/Documents/Code/Python/Spark/MyDeltaLake/SalesTerritory/'
+-----------------+--------------------------+--------------------+---------------------+-------------------+
|SalesTerritoryKey|SalesTerritoryAlternateKey|SalesTerritoryRegion|SalesTerritoryCountry|SalesTerritoryGroup|
+-----------------+--------------------------+--------------------+---------------------+-------------------+
|1                |1                         |Northwest           |United States        |Federation         |
|2                |2                         |Northeast           |United States        |Klingon Empire     |
|4                |4                         |Southwest           |United States        |North America      |
|5                |5                         |Southeast           |United States        |North America      |
|6                |6                         |Canada              |

## Databricks Delta Lake Tables: Managed vs Unmanaged

### Managed vs Unmanaged Delta Lake Table

Delta Lake is a powerful storage layer for big data processing workloads in Databricks. In the previous article, we discussed [Delta Lake on Databricks: Python Installation and Setup Guide](/delta-lake-on-databricks-python-installation-and-setup-guide-25c9a9bd11ed).
When working with Delta Lake tables, you can choose between two types of tables: managed and unmanaged. In this article, we’ll explore the key differences between these two types of tables, and how they’re used in Databricks.

Managed Delta Table
-------------------

Managed Delta tables are tables whose metadata and data files are both managed by the catalog (the metastore). You can create a managed Delta table using the SQL API or Python API in Databricks. Because no location is specified, the catalog decides where the data is stored and owns its lifecycle. Here’s an example of how to create a managed Delta table using the SQL API:

```SQL
CREATE TABLE my_table (
  id INT,
  name STRING
)
USING DELTA;
```


Unmanaged Delta Table
---------------------

Unmanaged (external) Delta tables, on the other hand, are tables whose metadata is registered in the catalog while the data files live at a location that you specify and manage yourself. You can create unmanaged Delta tables using the SQL API or Python API in Databricks. Here’s an example of how to create an unmanaged Delta table using the SQL API:

```SQL
CREATE TABLE my_table
USING DELTA
LOCATION '/mnt/delta/my_table';
```
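
With the Python API, the difference comes down to whether a path is supplied when saving the table, which is also how the notebook cells above create their tables. The following is a minimal sketch; `save_delta_table` is a hypothetical helper name, and `df` is assumed to be a Spark DataFrame.

```python
# Sketch only: save_delta_table is a hypothetical helper name.

def save_delta_table(df, table_name, path=None, mode='overwrite'):
    """Save df as a Delta table: managed when path is None, unmanaged otherwise."""
    writer = df.write.format('delta').mode(mode)
    if path is not None:
        # an explicit path makes the table unmanaged (external)
        writer = writer.option('path', path)
    writer.saveAsTable(table_name)
```

Calling `save_delta_table(df, 'my_table')` would create a managed table, while `save_delta_table(df, 'my_table', path='/mnt/delta/my_table')` would create an unmanaged one at that location.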


Managed vs Unmanaged Tables
---------------------------

There are several differences between managed and unmanaged tables in Delta Lake. Here are a few key differences:

*   **Storage Location**: Managed tables are stored in a location chosen by the catalog, while unmanaged tables are stored in an external location specified and managed by the user.
*   **Data Management**: For managed tables, the catalog manages both the metadata and the data files; for unmanaged tables, it manages only the metadata.
*   **Schema Management**: The table schema is tracked in the same way for both managed and unmanaged tables.
*   **Performance**: Both kinds of table use the same Delta format, so being managed does not by itself make queries faster; performance depends mainly on the underlying storage and file layout.
*   **Dropping Table**: When you drop a managed Delta table, both the table metadata and data are deleted from the storage layer. However, when you drop an unmanaged Delta table, only the table metadata is deleted, and the data remains intact in the external storage layer. Therefore, be careful when dropping managed tables, because their data is deleted with them; after dropping an unmanaged table, the files stay in place and must be removed manually if they are no longer needed.
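
Before dropping a table, it can be worth checking which kind it is. The sketch below assumes that the `DESCRIBE EXTENDED` output contains a `Type` row whose value is `MANAGED` or `EXTERNAL`; the helper name `table_kind` and the sample rows are hypothetical.

```python
# Sketch only: table_kind is a hypothetical helper; rows are
# (col_name, data_type) pairs as returned by DESCRIBE EXTENDED.

def table_kind(describe_rows):
    """Return 'MANAGED', 'EXTERNAL', or None if no Type row is present."""
    # search from the end: table details follow the column list, so a data
    # column that happens to be named 'Type' is not picked up by mistake
    for col_name, data_type in reversed(list(describe_rows)):
        if (col_name or '').strip() == 'Type':
            return (data_type or '').strip().upper()
    return None


# Hypothetical, shortened DESCRIBE EXTENDED output for an unmanaged table.
sample_rows = [
    ('id', 'int'),
    ('name', 'string'),
    ('', ''),
    ('Type', 'EXTERNAL'),
    ('Location', 'file:/mnt/delta/my_table'),
]
print(table_kind(sample_rows))
```

With a Spark session, the rows could be collected with something like `[(r['col_name'], r['data_type']) for r in spark.sql('DESCRIBE EXTENDED my_table').collect()]`.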

In conclusion, Delta Lake tables in Databricks can be managed or unmanaged, and understanding the differences between the two is crucial to optimizing your big data processing workflows. Whether you need the flexibility of unmanaged tables or the power of managed tables, Delta Lake has you covered.