### Bronze Layer
Raw e-commerce event data is ingested from Amazon S3 using Unity Catalog External Locations.
Data is stored in Delta format without business transformations to preserve raw fidelity.
Ingestion metadata, including load timestamp and source file path (`_metadata.file_path`), is captured to enable auditability and lineage within Unity Catalog.

## Create schema

In [0]:
%sql
CREATE SCHEMA IF NOT EXISTS e_commerce_capstone.bronze;


## Set catalog and schema

In [0]:
%sql
USE CATALOG e_commerce_capstone;
USE SCHEMA bronze;


## Read raw CSVs from S3

In [0]:
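# Read every CSV file at the bucket root into one DataFrame.
# inferSchema makes Spark scan the data an extra time to guess column types.
raw_df = (
    spark.read
         .option("header", "true")
         .option("inferSchema", "true")
         .csv("s3://ecommerce-capstone-raw/*.csv")
)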
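
Because the Bronze layer aims to preserve raw fidelity, one alternative worth considering is to skip type inference so that every column lands as a string and typing is deferred to a later layer. The following is only a sketch under that assumption, not the method used in this notebook; the helper name `read_raw_csv_as_strings` is hypothetical, and `spark` is assumed to be the session that the notebook provides.

```python
def read_raw_csv_as_strings(spark, path):
    """Read CSV files without type inference, so every column is a string.

    `spark` is an existing SparkSession (assumed to be provided by the notebook).
    `path` is a file path or glob, such as the S3 glob used above.
    """
    return (
        spark.read
             .option("header", "true")
             # "false" is Spark's default for CSV; it is set explicitly to make
             # the intent visible. No extra pass over the data is needed.
             .option("inferSchema", "false")
             .csv(path)
    )

# Usage in a notebook cell (assumes `spark` is already defined):
# raw_df = read_raw_csv_as_strings(spark, "s3://ecommerce-capstone-raw/*.csv")
```

The trade-off is that downstream layers must cast columns explicitly, while the Bronze table no longer depends on whatever types Spark happens to infer from the files.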



## Add ingestion metadata (best practice)

In [0]:
from pyspark.sql.functions import current_timestamp, col

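# Tag each row with the load time and the file it was read from.
# _metadata is a hidden column that Spark exposes for file-based sources.
bronze_df = (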
    raw_df
    .withColumn("ingestion_timestamp", current_timestamp())
    .withColumn("source_file", col("_metadata.file_path"))
)



## Write Bronze Delta table

In [0]:
(
    bronze_df
    .write
    .format("delta")
    .mode("overwrite")  # replaces the table contents on every run
    .saveAsTable("e_commerce_capstone.bronze.raw_events")
)


## Validate Bronze

In [0]:
%sql
SELECT COUNT(*) FROM e_commerce_capstone.bronze.raw_events;


COUNT(*)
411709736


In [0]:
%sql
SELECT source_file, ingestion_timestamp
FROM e_commerce_capstone.bronze.raw_events
LIMIT 10;


source_file,ingestion_timestamp
s3://ecommerce-capstone-raw/2019-Dec.csv,2026-01-31T17:09:52.595Z
s3://ecommerce-capstone-raw/2019-Dec.csv,2026-01-31T17:09:52.595Z
s3://ecommerce-capstone-raw/2019-Dec.csv,2026-01-31T17:09:52.595Z
s3://ecommerce-capstone-raw/2019-Dec.csv,2026-01-31T17:09:52.595Z
s3://ecommerce-capstone-raw/2019-Dec.csv,2026-01-31T17:09:52.595Z
s3://ecommerce-capstone-raw/2019-Dec.csv,2026-01-31T17:09:52.595Z
s3://ecommerce-capstone-raw/2019-Dec.csv,2026-01-31T17:09:52.595Z
s3://ecommerce-capstone-raw/2019-Dec.csv,2026-01-31T17:09:52.595Z
s3://ecommerce-capstone-raw/2019-Dec.csv,2026-01-31T17:09:52.595Z
s3://ecommerce-capstone-raw/2019-Dec.csv,2026-01-31T17:09:52.595Z
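
The ten sampled rows above all come from a single file, so on their own they do not show whether every CSV in the bucket was loaded. A per-file row count is one way to check that. The sketch below assumes the `source_file` column written earlier and the notebook-provided `spark` session; the helper name `rows_per_source_file` is hypothetical.

```python
def rows_per_source_file(spark, table_name):
    """Return one row per source file with the number of records loaded from it.

    `spark` is an existing SparkSession (assumed to be provided by the notebook).
    `table_name` is the Bronze table to inspect.
    """
    return (
        spark.table(table_name)
             .groupBy("source_file")   # one group per ingested file
             .count()                  # adds a `count` column
             .orderBy("source_file")   # stable, readable ordering
    )

# Usage in a notebook cell (assumes `spark` and `display` are available):
# display(rows_per_source_file(spark, "e_commerce_capstone.bronze.raw_events"))
```

Each file matched by the glob should appear once in the result, and the per-file counts should add up to the total returned by the `COUNT(*)` query.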
