### Bronze Pipeline - Raw Data Ingestion
Ingested raw source files into the Bronze layer using structured, repeatable workflows that preserve data fidelity and lineage.

In [0]:
# Load October and November datasets from a Unity Catalog volume (/Volumes/...)
df_oct = spark.read.option("header", "true").csv("/Volumes/workspace/ecommerce/ecommerce_data/2019-Oct.csv")
df_nov = spark.read.option("header", "true").csv("/Volumes/workspace/ecommerce/ecommerce_data/2019-Nov.csv")

# Display basic stats for October
print(f"October 2019 - Total Events: {df_oct.count():,}")
print("="*60)
print("SCHEMA:")
print("="*60)
df_oct.printSchema()

# Display basic stats for November
print(f"November 2019 - Total Events: {df_nov.count():,}")
print("="*60)
print("SCHEMA:")
print("="*60)
df_nov.printSchema()

# Count of Rows in Oct and Nov
print(f"Total rows in df_oct is: {df_oct.count()}")
print(f"Total rows in df_nov is: {df_nov.count()}")

# Overview of Data
display(df_oct.limit(5))
display(df_nov.limit(5))


October 2019 - Total Events: 42,448,764
SCHEMA:
root
 |-- event_time: string (nullable = true)
 |-- event_type: string (nullable = true)
 |-- product_id: string (nullable = true)
 |-- category_id: string (nullable = true)
 |-- category_code: string (nullable = true)
 |-- brand: string (nullable = true)
 |-- price: string (nullable = true)
 |-- user_id: string (nullable = true)
 |-- user_session: string (nullable = true)

November 2019 - Total Events: 67,501,979
SCHEMA:
root
 |-- event_time: string (nullable = true)
 |-- event_type: string (nullable = true)
 |-- product_id: string (nullable = true)
 |-- category_id: string (nullable = true)
 |-- category_code: string (nullable = true)
 |-- brand: string (nullable = true)
 |-- price: string (nullable = true)
 |-- user_id: string (nullable = true)
 |-- user_session: string (nullable = true)

Total rows in df_oct is: 42448764
Total rows in df_nov is: 67501979


event_time,event_type,product_id,category_id,category_code,brand,price,user_id,user_session
2019-10-01 00:00:00 UTC,view,44600062,2103807459595387724,,shiseido,35.79,541312140,72d76fde-8bb3-4e00-8c23-a032dfed738c
2019-10-01 00:00:00 UTC,view,3900821,2053013552326770905,appliances.environment.water_heater,aqua,33.2,554748717,9333dfbd-b87a-4708-9857-6336556b0fcc
2019-10-01 00:00:01 UTC,view,17200506,2053013559792632471,furniture.living_room.sofa,,543.1,519107250,566511c2-e2e3-422b-b695-cf8e6e792ca8
2019-10-01 00:00:01 UTC,view,1307067,2053013558920217191,computers.notebook,lenovo,251.74,550050854,7c90fc70-0e80-4590-96f3-13c02c18c713
2019-10-01 00:00:04 UTC,view,1004237,2053013555631882655,electronics.smartphone,apple,1081.98,535871217,c6bd7419-2748-4c56-95b4-8cec9ff8b80d


event_time,event_type,product_id,category_id,category_code,brand,price,user_id,user_session
2019-11-01 00:00:00 UTC,view,1003461,2053013555631882655,electronics.smartphone,xiaomi,489.07,520088904,4d3b30da-a5e4-49df-b1a8-ba5943f1dd33
2019-11-01 00:00:00 UTC,view,5000088,2053013566100866035,appliances.sewing_machine,janome,293.65,530496790,8e5f4f83-366c-4f70-860e-ca7417414283
2019-11-01 00:00:01 UTC,view,17302664,2053013553853497655,,creed,28.31,561587266,755422e7-9040-477b-9bd2-6a6e8fd97387
2019-11-01 00:00:01 UTC,view,3601530,2053013563810775923,appliances.kitchen.washer,lg,712.87,518085591,3bfb58cd-7892-48cc-8020-2f17e6de6e7f
2019-11-01 00:00:01 UTC,view,1004775,2053013555631882655,electronics.smartphone,xiaomi,183.27,558856683,313628f1-68b8-460d-84f6-cec7a8796ef2
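
Every column in the schema output above is a string because the CSV files were read without a schema. If typed columns are wanted at read time, one option is to pass an explicit schema. The sketch below is an assumption rather than part of the original pipeline: the column types are guesses based on the sample rows, `event_time` is deliberately left as a string so that no timestamp parsing happens in Bronze, and `read_events` is a hypothetical helper name.

```python
# Hypothetical typed schema for the event files. The types are inferred by eye
# from the sample rows above and should be checked against the real data.
BRONZE_COLUMNS = [
    ("event_time", "STRING"),     # kept raw, e.g. "2019-10-01 00:00:00 UTC"
    ("event_type", "STRING"),
    ("product_id", "BIGINT"),
    ("category_id", "BIGINT"),    # 19-digit ids need a 64-bit integer
    ("category_code", "STRING"),
    ("brand", "STRING"),
    ("price", "DOUBLE"),
    ("user_id", "BIGINT"),
    ("user_session", "STRING"),
]

# DataFrameReader.schema() accepts a DDL-formatted string such as "a STRING, b BIGINT".
BRONZE_DDL = ", ".join(f"{name} {dtype}" for name, dtype in BRONZE_COLUMNS)


def read_events(spark, path):
    """Read one monthly CSV with the explicit schema instead of all-string columns."""
    return spark.read.option("header", "true").schema(BRONZE_DDL).csv(path)
```

In a notebook this could be called as `read_events(spark, "/Volumes/workspace/ecommerce/ecommerce_data/2019-Oct.csv")`. Keeping everything as strings, as the original cell does, is also a reasonable Bronze choice, since it defers all type decisions to the cleansing layer.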


### Combining the datasets df_oct and df_nov
Combining the two months into a single DataFrame keeps downstream processing consistent, enables month-over-month analysis, and avoids duplicating logic across separate DataFrames.

In [0]:
# Create a combined dataset with union(), which matches columns by position
# (both files share the same header, so the columns line up)
df = df_oct.union(df_nov)
print(f"Total Events: {df.count():,}")


Total Events: 109,950,743
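
The combined total matches the sum of the monthly counts reported earlier (42,448,764 + 67,501,979 = 109,950,743), so the union neither dropped nor duplicated rows. As a sketch of a possible variant, `unionByName` matches columns by name rather than by position, and a literal `source_month` column records which file each row came from. The column name `source_month` and the helper `combine_months` are illustrative assumptions, not part of the original notebook.

```python
from functools import reduce

# Row counts copied from the notebook output above.
OCT_ROWS = 42_448_764
NOV_ROWS = 67_501_979
EXPECTED_TOTAL = OCT_ROWS + NOV_ROWS


def combine_months(frames_by_month):
    """Union monthly DataFrames by column name, tagging each row with its month.

    frames_by_month maps a label such as "2019-10" to a DataFrame.
    """
    tagged = [
        # Add a constant lineage column next to all existing columns.
        df.selectExpr("*", f"'{month}' AS source_month")
        for month, df in frames_by_month.items()
    ]
    return reduce(lambda left, right: left.unionByName(right), tagged)
```

A call such as `combine_months({"2019-10": df_oct, "2019-11": df_nov})` would return the combined DataFrame, and comparing its `count()` with `EXPECTED_TOTAL` gives a cheap consistency check before writing.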


### Save as Table

In [0]:
# Save to table
df.write.format("delta").mode("overwrite").saveAsTable("df_bronze_events")
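
Because `mode("overwrite")` replaces any existing table contents, a read-back check after the write can confirm that the table holds what was written. This is a minimal sketch, assuming the `spark` session that Databricks notebooks provide; `verify_row_count` is a hypothetical helper, and the expected count is the union total printed above.

```python
def verify_row_count(spark, table_name, expected_rows):
    """Read the saved table back and compare its row count to the expected total."""
    actual_rows = spark.table(table_name).count()
    if actual_rows != expected_rows:
        raise ValueError(
            f"{table_name}: expected {expected_rows:,} rows, found {actual_rows:,}"
        )
    return actual_rows
```

For this notebook the call would be `verify_row_count(spark, "df_bronze_events", 109_950_743)`.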

Landed all datasets in Delta format with minimal transformations, establishing a reliable foundation for downstream cleansing and enrichment.