## Create a Databricks notebook to load a JSON file into the product Delta table

In [0]:
# Read product catalog JSON file into a Spark DataFrame
filePath = "dbfs:/FileStore/GlobalRetail/bronze_layer/product_catalog/products.json"
df = spark.read.json(filePath)
display(df)

- Whenever a new file arrives, we copy its data into the bronze layer unchanged.
- In this layer, we only append the incoming data and add one extra column that records the ingestion timestamp.

In [0]:
# Add ingestion timestamp column to the DataFrame
from pyspark.sql.functions import current_timestamp
df_new = df.withColumn("ingestion_timestamp", current_timestamp())
display(df_new)

- We store the data in a Delta Lake table, which serves as the foundational table format.
- Delta Lake lets us insert, update, merge, and delete data, and it supports ACID transactions.
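This notebook only appends to the table, but the merge capability mentioned above can be illustrated with a minimal sketch. It builds a Delta Lake `MERGE INTO` statement as a string. The source view `product_updates` and the key column `product_id` are hypothetical names chosen for illustration; they do not appear in this notebook.

```python
def build_merge_sql(target_table: str, source_view: str, key_column: str) -> str:
    """Return a Delta Lake MERGE statement that upserts source rows into the target."""
    return (
        f"MERGE INTO {target_table} AS t "
        f"USING {source_view} AS s "
        f"ON t.{key_column} = s.{key_column} "
        "WHEN MATCHED THEN UPDATE SET * "
        "WHEN NOT MATCHED THEN INSERT *"
    )


# Assumption: `product_updates` (view) and `product_id` (key) are hypothetical names.
merge_sql = build_merge_sql("bronze_products", "product_updates", "product_id")
print(merge_sql)
# In a Databricks notebook, the statement could then be run with: spark.sql(merge_sql)
```

`UPDATE SET *` and `INSERT *` assume that the source and target have matching columns, so the source would also need an `ingestion_timestamp` column.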

In [0]:
# Write the DataFrame to a Delta table in append mode
spark.sql("use globalretail_bronze")
df_new.write.format("delta").mode("append").saveAsTable("bronze_products")

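In [0]:
# Preview the table; display() renders all returned rows, whereas .show() prints only 20 by default
display(spark.sql("select * from bronze_products limit 180"))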

- After loading the data from our JSON file into the Delta Lake table, we move the processed file from the current folder to an archive folder so that it is not processed again.

In [0]:
# Generate an archive file path with the current timestamp for the processed product data
import datetime
archive_folder = "dbfs:/FileStore/GlobalRetail/bronze_layer/product_catalog/archive/"
# %S is the two-digit seconds code; lowercase %s is not a standard strftime code
archive_filepath = archive_folder + '_' + datetime.datetime.now().strftime("%Y%m%d%H%M%S")
dbutils.fs.mv(filePath, archive_filepath)
print(archive_filepath)
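The archive path built above consists only of an underscore and a timestamp, so the original file name is lost. As a possible variation, the sketch below keeps the file name and extension and inserts the timestamp between them. The helper `build_archive_path` is a hypothetical name, not part of the notebook or of `dbutils`.

```python
import datetime
import posixpath


def build_archive_path(source_path, archive_folder, now=None):
    """Return an archive path that keeps the original file name and adds a timestamp."""
    now = now or datetime.datetime.now()
    stamp = now.strftime("%Y%m%d%H%M%S")
    file_name = posixpath.basename(source_path)   # last path segment, e.g. "products.json"
    stem, ext = posixpath.splitext(file_name)     # split into name and extension
    return archive_folder.rstrip("/") + "/" + f"{stem}_{stamp}{ext}"


# A fixed timestamp is passed here only to make the example reproducible.
example = build_archive_path(
    "dbfs:/FileStore/GlobalRetail/bronze_layer/product_catalog/products.json",
    "dbfs:/FileStore/GlobalRetail/bronze_layer/product_catalog/archive/",
    now=datetime.datetime(2024, 1, 15, 9, 30, 5),
)
print(example)
```

In the notebook, the result could be passed to `dbutils.fs.mv(filePath, ...)` in place of `archive_filepath`.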