### Load Source Data
Load the existing e-commerce transactions data from the Databricks catalog into a Spark DataFrame for further processing.

In [0]:
events = spark.read.table("workspace.default.ecommerce_transactions")

### Convert Data to Delta Format (PySpark)
Write the DataFrame as a managed Delta table using format("delta") and saveAsTable(). mode("overwrite") replaces the table if it already exists.

In [0]:
events.write \
  .format("delta") \
  .mode("overwrite") \
  .saveAsTable("workspace.default.events_table")


### Verify Delta Table
Confirm that the newly created table is stored in Delta format.

In [0]:

# DESCRIBE DETAIL returns one row of table metadata; its format column should read "delta"
display(
  spark.sql(
    """
    DESCRIBE DETAIL workspace.default.events_table 
    """
  )
)

format,id,name,description,location,createdAt,lastModified,partitionColumns,clusteringColumns,numFiles,sizeInBytes,properties,minReaderVersion,minWriterVersion,tableFeatures,statistics,clusterByAuto
delta,6bcdcc6d-400f-4de8-99c5-7fca557d4e2d,workspace.default.events_table,,,2026-01-12T04:29:46.175Z,2026-01-12T14:35:22.000Z,List(),List(),1,460193,Map(delta.enableDeletionVectors -> true),3,7,"List(appendOnly, deletionVectors, invariants)","Map(numRowsDeletedByDeletionVectors -> 0, numDeletionVectors -> 0)",False
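
The visual check above can also be made programmatic, so that a later cell fails fast if the table is not Delta. This is a minimal sketch, assuming a live `spark` session like the one used in the cells above; `table_format` is a hypothetical helper name, not part of Spark or Delta.

```python
def table_format(spark, table_name):
    """Return the storage format reported by DESCRIBE DETAIL, lower-cased."""
    # DESCRIBE DETAIL returns a single row; its "format" column holds e.g. "delta".
    detail = spark.sql(f"DESCRIBE DETAIL {table_name}").first()
    return detail["format"].lower()


# In a notebook cell (needs a real Spark session):
# assert table_format(spark, "workspace.default.events_table") == "delta"
```

Taking `spark` as a parameter keeps the helper easy to exercise with a stub session outside Databricks.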


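### Create Delta Table Using SQL
Create a second Delta table, events_delta, from the existing one with a CREATE TABLE ... AS SELECT statement. IF NOT EXISTS makes the cell safe to re-run: if the table already exists, the statement does nothing.
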
In [0]:
%sql
CREATE TABLE IF NOT EXISTS workspace.default.events_delta  
USING DELTA
AS
SELECT * FROM workspace.default.events_table;


num_affected_rows,num_inserted_rows
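
Because IF NOT EXISTS skips the copy when events_delta is already present, the two tables can drift apart without any error. One way to notice this is to compare row counts. The sketch below assumes a live `spark` session; `row_counts` is a hypothetical helper name introduced only for illustration.

```python
def row_counts(spark, *table_names):
    """Map each table name to its current row count."""
    return {name: spark.table(name).count() for name in table_names}


# In a notebook cell (needs a real Spark session):
# counts = row_counts(
#     spark,
#     "workspace.default.events_table",
#     "workspace.default.events_delta",
# )
# assert len(set(counts.values())) == 1, counts
```

Equal counts do not prove the contents are identical, but a mismatch is a cheap signal that the copy is stale.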


### Test Schema Enforcement
Append a DataFrame whose columns (x, y, z) do not match the table's schema. Delta should reject the write rather than silently alter the table.

In [0]:
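try:
    # Build a one-row DataFrame whose columns (x, y, z) do not exist in the table
    wrong_schema = spark.createDataFrame(
        [("a","b","c")],
        ["x","y","z"]
    )

    # Appending without mergeSchema should be rejected by Delta
    wrong_schema.write \
      .format("delta") \
      .mode("append") \
      .saveAsTable("workspace.default.events_table")

except Exception as e:
    # Broad catch keeps the demo short; read the message to confirm it is a schema mismatch
    print("Schema enforcement worked")
    print(e)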


Schema enforcement worked 
[_LEGACY_ERROR_TEMP_DELTA_0007] A schema mismatch detected when writing to the Delta table (Table ID: 6bcdcc6d-400f-4de8-99c5-7fca557d4e2d).
To enable schema migration using DataFrameWriter or DataStreamWriter, please set:
'.option("mergeSchema", "true")'.
For other operations, set the session configuration
spark.databricks.delta.schema.autoMerge.enabled to "true". See the documentation
specific to the operation for details.

Table schema:
root
-- Transaction_ID: long (nullable = true)
-- User_Name: string (nullable = true)
-- Age: long (nullable = true)
-- Country: string (nullable = true)
-- Product_Category: string (nullable = true)
-- Purchase_Amount: double (nullable = true)
-- Payment_Method: string (nullable = true)
-- Transaction_Date: date (nullable = true)


Data schema:
root
-- x: string (nullable = true)
-- y: string (nullable = true)
-- z: string (nullable = true)

         
Table ACLs are enabled in this cluster, so automatic schema migration is
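
The two schemas printed in the error can be compared before writing, which gives a shorter message than the full exception. The following is a simplified pre-check in plain Python, not a reimplementation of Delta's own enforcement rules; `schema_diff` is a hypothetical helper, and the type strings are written the way `DataFrame.dtypes` reports them (long appears as "bigint").

```python
def schema_diff(table_cols, data_cols):
    """Compare two {column_name: type_string} mappings.

    Returns (missing, unexpected, mismatched):
      missing    - columns the table has but the data lacks
      unexpected - columns the data has but the table lacks
      mismatched - shared columns whose type strings differ
    """
    missing = sorted(set(table_cols) - set(data_cols))
    unexpected = sorted(set(data_cols) - set(table_cols))
    mismatched = sorted(
        name for name in set(table_cols) & set(data_cols)
        if table_cols[name] != data_cols[name]
    )
    return missing, unexpected, mismatched


# Column names and types copied from the error message above.
table_cols = {
    "Transaction_ID": "bigint", "User_Name": "string", "Age": "bigint",
    "Country": "string", "Product_Category": "string",
    "Purchase_Amount": "double", "Payment_Method": "string",
    "Transaction_Date": "date",
}
data_cols = {"x": "string", "y": "string", "z": "string"}

missing, unexpected, mismatched = schema_diff(table_cols, data_cols)
print(unexpected)  # ['x', 'y', 'z']
```

In a notebook, the two mappings could be built from live objects with `dict(spark.table("workspace.default.events_table").dtypes)` and `dict(wrong_schema.dtypes)`.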

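### Handle Duplicate Inserts
Remove duplicate transactions from the source DataFrame, keeping one row per Transaction_ID, and overwrite the table with the deduplicated data. The query that follows confirms that no Transaction_ID appears more than once.
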
In [0]:
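# Keep one row per Transaction_ID (which duplicate survives is not guaranteed)
removedup = events.dropDuplicates(["Transaction_ID"])

# Replace the table contents with the deduplicated rows
removedup.write \
  .format("delta") \
  .mode("overwrite") \
  .option("overwriteSchema", "true") \
  .saveAsTable("workspace.default.events_table")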


In [0]:
%sql
SELECT Transaction_ID, COUNT(*)
FROM workspace.default.events_table
GROUP BY Transaction_ID
HAVING COUNT(*) > 1;

Transaction_ID,COUNT(*)
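
Overwriting the table removes duplicates that already exist, but it does not stop a later append from reintroducing them. A common alternative, which is not what the cells above do, is a Delta MERGE that inserts only rows whose key is not yet in the table. The sketch below only builds the SQL text; `build_insert_if_absent_sql` and the `new_events` view name are hypothetical, introduced for illustration.

```python
def build_insert_if_absent_sql(target, source, key):
    """Build a Delta MERGE statement that inserts only rows whose key is new."""
    return (
        f"MERGE INTO {target} AS t\n"
        f"USING {source} AS s\n"
        f"ON t.{key} = s.{key}\n"
        "WHEN NOT MATCHED THEN INSERT *"
    )


sql = build_insert_if_absent_sql(
    "workspace.default.events_table",
    "new_events",  # hypothetical temp view holding the incoming batch
    "Transaction_ID",
)
print(sql)

# In a notebook cell (needs a real Spark session and the new_events view):
# spark.sql(sql)
```

The incoming batch should itself be deduplicated on the key first, for example with dropDuplicates(["Transaction_ID"]), because source rows that share a new key would all be inserted.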
