# 🥉 Bronze Layer - Raw Ingestion

**Annie's Magic Numbers Medallion Architecture**

This notebook reads the raw CSV files from the `raw/` zone of an ADLS Gen2 account and writes each one to the Bronze layer as a Delta table.

### 🔐 CELL 0 - ADLS Gen2 Authentication (Storage Account Key)

In [None]:
# ============================================================
# CELL 0 - Azure Data Lake Gen2 Authentication
# ============================================================
# This cell configures authentication so Databricks can access
# the ADLS Gen2 account using a Storage Account Key.

spark.conf.set(
    "fs.azure.account.key.anniedatalake123.dfs.core.windows.net",
    "<PASTE_STORAGE_ACCOUNT_KEY_1_HERE>"
)
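
A key pasted into a cell ends up in the notebook's revision history and in every export. The sketch below shows one way to avoid that, assuming the key is already stored in a Databricks secret scope. The scope name `annie-kv` and key name `adls-account-key` are hypothetical, and `spark` and `dbutils` are passed in as arguments only so the helper can be exercised outside a cluster.

```python
def account_key_conf_name(storage_account: str) -> str:
    """Return the Spark conf name that holds the key for one ADLS Gen2 account."""
    return f"fs.azure.account.key.{storage_account}.dfs.core.windows.net"


def configure_adls_key(spark, dbutils, storage_account: str,
                       scope: str, key_name: str) -> str:
    """Read the account key from a secret scope and register it with Spark.

    Returns the conf name that was set, never the secret itself.
    """
    secret = dbutils.secrets.get(scope=scope, key=key_name)
    conf_name = account_key_conf_name(storage_account)
    spark.conf.set(conf_name, secret)
    return conf_name


# Usage inside a Databricks notebook (scope and key names are hypothetical):
# configure_adls_key(spark, dbutils, "anniedatalake123",
#                    scope="annie-kv", key_name="adls-account-key")
```

Returning the conf name rather than the secret keeps the key out of cell output if the call is the last expression in a cell.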

### 🟦 CELL 1 - Azure Data Lake Base Paths

In [None]:
# ============================================================
# CELL 1 - Azure Data Lake Gen2 Base Paths
# ============================================================
container_name = "annie-data"
storage_account = "anniedatalake123"

base_path = f"abfss://{container_name}@{storage_account}.dfs.core.windows.net/"
raw_path = base_path + "raw/"
bronze_path = base_path + "bronze/"
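
String concatenation works here because every segment is written with exactly one trailing slash. As a hedged alternative, the hypothetical helper `abfss_path` below normalises slashes between segments, so a stray or missing `/` does not produce a malformed URI. It is meant for folder paths only, since it always appends a trailing slash.

```python
def abfss_path(container: str, storage_account: str, *parts: str) -> str:
    """Build an abfss:// folder URI, normalising slashes between segments."""
    root = f"abfss://{container}@{storage_account}.dfs.core.windows.net"
    # Drop empty segments and strip slashes so joins never double up.
    cleaned = [p.strip("/") for p in parts if p.strip("/")]
    return "/".join([root, *cleaned]) + "/"


raw_zone = abfss_path("annie-data", "anniedatalake123", "raw")
bronze_sales = abfss_path("annie-data", "anniedatalake123", "bronze/", "/sales")
print(raw_zone)
print(bronze_sales)
```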

### 🟦 CELL 2 - Validate RAW Zone Accessibility

In [None]:
# ============================================================
# CELL 2 - Validate RAW Zone Accessibility
# ============================================================
dbutils.fs.ls(raw_path)
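
Listing the folder confirms access but leaves the comparison against the expected files to the reader. A minimal sketch of an explicit check follows. The helper name `missing_raw_files` is hypothetical, the file names are the six loaded later in this notebook, and the sample listing is illustrative.

```python
EXPECTED_RAW_FILES = [
    "SalesFINAL12312016.csv",
    "PurchasesFINAL12312016.csv",
    "2017PurchasePricesDec.csv",
    "BegInvFINAL12312016.csv",
    "EndInvFINAL12312016.csv",
    "InvoicePurchases12312016.csv",
]


def missing_raw_files(listed_names, expected=EXPECTED_RAW_FILES):
    """Return the expected file names that are absent from a folder listing."""
    present = set(listed_names)
    return [name for name in expected if name not in present]


# In the notebook the listing would come from:
#   listed = [f.name for f in dbutils.fs.ls(raw_path)]
listed = ["SalesFINAL12312016.csv", "PurchasesFINAL12312016.csv"]  # illustrative
print(missing_raw_files(listed))
```

Raising an error when the returned list is non-empty would stop the notebook before any partial load reaches Bronze.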

### 🟦 CELL 3 - Generic CSV Reader Function

In [None]:
# ============================================================
# CELL 3 - Generic CSV Reader Function
# ============================================================
def read_csv(filename):
    """Read a headered CSV from the RAW zone, letting Spark infer column types."""
    return (
        spark.read
             .option("header", True)
             .option("inferSchema", True)
             .csv(raw_path + filename)
    )
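
With `inferSchema` enabled, Spark makes an extra pass over each file and may choose different types if the data changes between loads. The sketch below is a variant that accepts an optional explicit schema and falls back to inference only when none is given. The function name `read_csv_with_schema` is hypothetical, `spark` is passed in explicitly, and the column names in the usage comment are placeholders rather than the real file layout.

```python
def read_csv_with_schema(spark, path: str, schema=None):
    """Read a headered CSV; use `schema` when given, otherwise infer types."""
    reader = spark.read.option("header", True)
    if schema is None:
        # No schema supplied: keep the original behaviour.
        reader = reader.option("inferSchema", True)
    else:
        # `schema` may be a StructType or a DDL string such as "id INT, name STRING".
        reader = reader.schema(schema)
    return reader.csv(path)


# Usage in the notebook (column names are placeholders, not the real layout):
# df = read_csv_with_schema(spark, raw_path + "SalesFINAL12312016.csv",
#                           schema="some_id STRING, some_quantity INT")
```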

### 🟦 CELL 4 - Load RAW CSV Files into DataFrames

In [None]:
# ============================================================
# CELL 4 - Load RAW CSV Files into DataFrames
# ============================================================
sales_df = read_csv("SalesFINAL12312016.csv")
purchases_df = read_csv("PurchasesFINAL12312016.csv")
prices_df = read_csv("2017PurchasePricesDec.csv")
begin_inventory_df = read_csv("BegInvFINAL12312016.csv")
end_inventory_df = read_csv("EndInvFINAL12312016.csv")
invoices_df = read_csv("InvoicePurchases12312016.csv")
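
The six assignments above can also be driven from a single mapping, which keeps each Bronze table name next to its source file. This is a sketch only: `RAW_FILES` and `load_all` are hypothetical names, and a stand-in reader replaces `read_csv` so the snippet runs without a Spark session.

```python
RAW_FILES = {
    "sales": "SalesFINAL12312016.csv",
    "purchases": "PurchasesFINAL12312016.csv",
    "prices": "2017PurchasePricesDec.csv",
    "begin_inventory": "BegInvFINAL12312016.csv",
    "end_inventory": "EndInvFINAL12312016.csv",
    "invoices": "InvoicePurchases12312016.csv",
}


def load_all(reader, files=RAW_FILES):
    """Apply `reader` to every file name and return {table_name: result}."""
    return {table: reader(filename) for table, filename in files.items()}


# In the notebook: frames = load_all(read_csv), then frames["sales"], etc.
# A stand-in reader is used here so the example runs anywhere.
frames = load_all(lambda filename: f"<DataFrame from {filename}>")
print(sorted(frames))
```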

### 🟦 CELL 5 - Data Inspection & Schema Validation

In [None]:
# ============================================================
# CELL 5 - Data Inspection & Schema Validation
# ============================================================
display(sales_df)
sales_df.printSchema()
display(purchases_df)
purchases_df.printSchema()
display(prices_df)
prices_df.printSchema()

### 🟦 CELL 6 - Bronze Delta Writer Function

In [None]:
# ============================================================
# CELL 6 - Bronze Delta Writer Function
# ============================================================
def write_bronze(df, table_name):
    """Write df to the Bronze zone as a Delta table, replacing any existing data."""
    (
        df.write
          .format("delta")
          .mode("overwrite")
          .save(bronze_path + table_name)
    )
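
Because the column types are inferred on each run, a rerun can produce a schema that differs from the one already stored. Delta rejects an overwrite with a different schema unless the `overwriteSchema` option is set. The sketch below makes that choice explicit. The function name `write_delta` and its `overwrite_schema` flag are hypothetical.

```python
def write_delta(df, path: str, overwrite_schema: bool = False) -> None:
    """Overwrite the Delta table at `path`, optionally replacing its schema too."""
    writer = df.write.format("delta").mode("overwrite")
    if overwrite_schema:
        # Lets an overwrite succeed when the incoming schema has changed.
        writer = writer.option("overwriteSchema", "true")
    writer.save(path)


# Usage in the notebook:
# write_delta(sales_df, bronze_path + "sales", overwrite_schema=True)
```

Leaving the flag off by default means an unexpected schema change fails loudly and is not silently accepted.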

### 🟦 CELL 7 - Persist DataFrames to Bronze Layer

In [None]:
# ============================================================
# CELL 7 - Persist DataFrames to the Bronze Layer
# ============================================================
write_bronze(sales_df, "sales")
write_bronze(purchases_df, "purchases")
write_bronze(prices_df, "prices")
write_bronze(begin_inventory_df, "begin_inventory")
write_bronze(end_inventory_df, "end_inventory")
write_bronze(invoices_df, "invoices")

### 🟦 CELL 8 - Validate Bronze Folder Structure

In [None]:
# ============================================================
# CELL 8 - Validate Bronze Folder Structure
# ============================================================
dbutils.fs.ls(bronze_path)
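
A folder in Bronze is a Delta table only if it contains a `_delta_log` directory. As a hedged sketch, the hypothetical helper `tables_without_delta_log` takes a mapping from table name to the entry names found in its folder and reports the tables that lack the log. The sample listings are illustrative, not real output.

```python
def tables_without_delta_log(table_listings):
    """Return, sorted, the tables whose folder listing has no _delta_log entry."""
    def has_log(names):
        # Directory names may or may not carry a trailing slash.
        return any(name.rstrip("/") == "_delta_log" for name in names)

    return sorted(t for t, names in table_listings.items() if not has_log(names))


# In the notebook the mapping could be built with:
#   {t: [f.name for f in dbutils.fs.ls(bronze_path + t)] for t in table_names}
sample = {  # illustrative listings
    "sales": ["_delta_log/", "part-00000.snappy.parquet"],
    "prices": ["part-00000.snappy.parquet"],
}
print(tables_without_delta_log(sample))
```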

### 🟦 CELL 9 - Validate Delta Read from Bronze

In [None]:
# ============================================================
# CELL 9 - Validate Delta Read from Bronze
# ============================================================
sales_bronze_df = (
    spark.read
         .format("delta")
         .load(bronze_path + "sales")
)

display(sales_bronze_df)
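
Displaying the table shows that it can be read, but not that every row arrived. One possible follow-up is to compare row counts between the source DataFrames and the Bronze tables. In the sketch below, `count_mismatches` is a hypothetical helper and the numbers are made up for illustration.

```python
def count_mismatches(source_counts, bronze_counts):
    """Return {table: (source, bronze)} for every table whose counts differ.

    A table missing from one side is reported with None for that side.
    """
    tables = sorted(set(source_counts) | set(bronze_counts))
    return {
        t: (source_counts.get(t), bronze_counts.get(t))
        for t in tables
        if source_counts.get(t) != bronze_counts.get(t)
    }


# In the notebook the inputs could be built from DataFrame.count(), e.g.
#   {"sales": sales_df.count()} and {"sales": sales_bronze_df.count()}
source = {"sales": 100, "prices": 20}   # made-up numbers
bronze = {"sales": 100, "prices": 19}   # made-up numbers
print(count_mismatches(source, bronze))
```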