# Upgrading S3 Parquet to Delta format

In this demo, you will see a simulation of taking existing Parquet data in a non-Unity Catalog location (in our case a `dbfs` directory) and creating a Delta Lake table from it with Change Data Feed enabled.

## Prerequisite for using S3 buckets

To use a real S3 bucket, you will first need to configure an [external location for the S3 bucket](https://docs.databricks.com/aws/en/sql/language-manual/sql-ref-external-locations), either in Catalog Explorer or with SQL.

## Configure the catalog and schema where the demo can create tables

In [0]:
%python
catalog = dbutils.widgets.get("catalog")
schema = dbutils.widgets.get("schema")
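
Because the widget values are later substituted into `USE CATALOG` and `USE SCHEMA`, it can help to validate and quote them first. The helper below is a minimal sketch and not part of the original demo: `quote_identifier` is a hypothetical name, and it assumes the usual Spark SQL convention that an identifier is delimited with backticks and an embedded backtick is escaped by doubling it. The catalog and schema names in the example are placeholders.

```python
def quote_identifier(name: str) -> str:
    """Return `name` wrapped in backticks for use in a Spark SQL statement.

    Hypothetical helper for illustration. It rejects empty values so that a
    missing widget value fails early instead of producing invalid SQL.
    """
    stripped = name.strip()
    if not stripped:
        raise ValueError("catalog/schema name must not be empty")
    # Escape embedded backticks by doubling them, then delimit the identifier.
    return "`" + stripped.replace("`", "``") + "`"


# Placeholder values; in the notebook these would come from
# dbutils.widgets.get("catalog") and dbutils.widgets.get("schema").
print(f"USE CATALOG {quote_identifier('main')}")
print(f"USE SCHEMA {quote_identifier('demo-schema')}")
```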

## Create Mock S3 location and write a sample Parquet file

In [0]:
%fs mkdirs mock_s3_bucket

In [0]:
%python
spark.table("samples.tpch.orders").write.mode("overwrite").format("parquet").save("dbfs:/mock_s3_bucket/orders")

## Convert Parquet to Delta without copying data

In [0]:
%python
# spark.table("samples.tpch.orders").printSchema()

In [0]:
-- CREATE EXTERNAL TABLE test_ext
-- (
--  o_orderkey long,
--  o_custkey long,
--  o_orderstatus string,
--  o_totalprice double,
--  o_orderdate string,
--  o_orderpriority string,
--  o_clerk string,
--  o_shippriority integer,
--  o_comment string
-- )
-- STORED AS PARQUET 
-- LOCATION 'dbfs:/mock_s3_bucket/orders'

In [0]:
%sql
CONVERT TO DELTA parquet.`dbfs:/mock_s3_bucket/orders`
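
`CONVERT TO DELTA` works in place: it lists the Parquet files already in the directory and writes a transaction log that tracks them, so no data files are copied. If you prefer to issue the same statement from a Python cell, something like the following sketch could be used. The function names are hypothetical, and `spark` is assumed to be the session object that Databricks notebooks provide.

```python
def build_convert_statement(path: str) -> str:
    """Build the CONVERT TO DELTA statement for a Parquet directory."""
    if "`" in path:
        # The path is delimited with backticks, so one inside it would break the SQL.
        raise ValueError("path must not contain a backtick")
    return f"CONVERT TO DELTA parquet.`{path}`"


def convert_parquet_to_delta(spark, path: str) -> None:
    """Run the conversion through an existing Spark session (assumed argument)."""
    spark.sql(build_convert_statement(path))


# Printing the statement keeps this block runnable outside a notebook.
# In the notebook you would call: convert_parquet_to_delta(spark, "dbfs:/mock_s3_bucket/orders")
print(build_convert_statement("dbfs:/mock_s3_bucket/orders"))
```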

## Confirm that the `_delta_log` directory was created

In [0]:
%fs ls mock_s3_bucket/orders
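
Inside `_delta_log`, each commit is stored as a JSON file named with a zero-padded 20-digit version number, for example `00000000000000000000.json`. The sketch below lists the committed versions found in such a directory. It assumes the table directory is reachable as a local path (on clusters that expose the `/dbfs` mount this might be `/dbfs/mock_s3_bucket/orders`, which is an assumption about your workspace). `list_delta_versions` is a hypothetical helper, and the demo at the bottom builds a throwaway directory so the block runs anywhere.

```python
import re
import tempfile
from pathlib import Path
from typing import List

# Delta commit files are named <20-digit version>.json inside _delta_log.
COMMIT_FILE = re.compile(r"^(\d{20})\.json$")


def list_delta_versions(table_dir: str) -> List[int]:
    """Return the sorted commit versions found in <table_dir>/_delta_log."""
    log_dir = Path(table_dir) / "_delta_log"
    if not log_dir.is_dir():
        return []  # no transaction log, so the directory is not a Delta table
    versions = []
    for entry in log_dir.iterdir():
        match = COMMIT_FILE.match(entry.name)
        if match:  # skips checkpoints, .crc files, and anything else
            versions.append(int(match.group(1)))
    return sorted(versions)


# Self-contained demo on a temporary directory that mimics a converted table.
with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "_delta_log"
    log.mkdir()
    (log / f"{0:020d}.json").write_text("{}")
    print(list_delta_versions(tmp))
```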

## Use parameters to create an External Table in a schema from Mock S3 Bucket

In [0]:
%sql
USE CATALOG ${catalog};
USE SCHEMA ${schema};

In [0]:

CREATE TABLE orders_delta
USING DELTA LOCATION 'dbfs:/mock_s3_bucket/orders'

In [0]:
DESCRIBE EXTENDED orders_delta

## Add Change Data Feed Support

In [0]:
%sql
ALTER TABLE orders_delta
SET TBLPROPERTIES (delta.enableChangeDataFeed = true);

## Make changes to data 

In [0]:
-- Each statement below commits a new table version that the change feed records.
UPDATE orders_delta SET o_comment = 'updated' WHERE o_orderkey = 1;
DELETE FROM orders_delta WHERE o_orderkey = 1;
INSERT INTO ${catalog}.${schema}.orders_delta VALUES (
 1,
 184501,
 'O',
 203010.51,
 '1996-02-01',
 '5-LOW',
 'Clerk#000004753',
 0,
 'nstructions sleep furiously among ');

## View raw CDF files

Note the `_change_data` directory, which Delta Lake creates to hold change data files.

In [0]:
%fs ls mock_s3_bucket/orders

## View CDF history

In [0]:
DESCRIBE HISTORY orders_delta

In [0]:
-- 2 is the starting table version; check DESCRIBE HISTORY above for the version of the UPDATE.
SELECT * FROM table_changes('orders_delta', 2)
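
The change feed can also be read through the DataFrame reader with the `readChangeFeed` and `startingVersion` options, and each returned row carries a `_change_type` of `insert`, `update_preimage`, `update_postimage`, or `delete`. The following sketch shows one way to read and summarize it. Both function names are hypothetical, `spark` is assumed to be the notebook's session, and `sample_rows` is illustrative stand-in data (assuming the three statements landed as versions 2, 3, and 4), not output captured from the demo.

```python
from collections import Counter


def read_change_feed(spark, table_name: str, starting_version: int):
    """Return a DataFrame of change rows starting at the given table version."""
    return (
        spark.read.format("delta")
        .option("readChangeFeed", "true")
        .option("startingVersion", starting_version)
        .table(table_name)
    )


def count_change_types(rows) -> dict:
    """Count rows per _change_type; works on dicts or PySpark Row objects."""
    return dict(Counter(row["_change_type"] for row in rows))


# Stand-in rows shaped like the metadata columns the change feed adds.
sample_rows = [
    {"_change_type": "update_preimage", "_commit_version": 2},
    {"_change_type": "update_postimage", "_commit_version": 2},
    {"_change_type": "delete", "_commit_version": 3},
    {"_change_type": "insert", "_commit_version": 4},
]
print(count_change_types(sample_rows))
```

In the notebook, the two pieces would combine as `count_change_types(read_change_feed(spark, "orders_delta", 2).collect())`. Collecting is reasonable here only because the demo changes a handful of rows.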