# Load Silver Table to Gold Table - Account

## Overview
Load Account data from the Silver lakehouse table into the Gold lakehouse table, adding a load timestamp and a default `CreatedBy` value along the way.

## Data Flow
- **Source**: MAAG_LH_Silver.finance.Account (Silver lakehouse table)
- **Target**: MAAG_LH_Gold.finance.Account (Gold lakehouse - attached as default)
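- **Process**: Read the Silver table, add a `GoldLoadTimestamp` audit column, default null or blank `CreatedBy` values, run duplicate and null checks, and overwrite the Gold Delta table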
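
The Silver source is addressed by an absolute OneLake `abfss://` path made up of the workspace, the lakehouse item, and `Tables/<schema>/<table>`. A minimal sketch of that composition follows, assuming a helper named `build_onelake_table_path` (hypothetical, not part of the notebook) that mirrors the f-string used in the configuration cell:

```python
def build_onelake_table_path(workspace: str, lakehouse: str, schema: str, table: str) -> str:
    """Compose the absolute OneLake path to a lakehouse table.

    Hypothetical helper: it only formats the string and does not
    check that the workspace, lakehouse, or table actually exists.
    """
    return (
        f"abfss://{workspace}@onelake.dfs.fabric.microsoft.com/"
        f"{lakehouse}.Lakehouse/Tables/{schema}/{table}"
    )


# Values taken from the notebook's configuration block
source_path = build_onelake_table_path("Fabric_MAAG", "MAAG_LH_Silver", "finance", "account")
print(source_path)
```

Keeping the path in one function makes it easier to point the same notebook at another schema or table without editing the string by hand.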


In [1]:
# Only the Spark functions used in this notebook are imported
from pyspark.sql.functions import col, sum as spark_sum, current_timestamp

# Configuration - Silver to Gold data flow
WORKSPACE_NAME = "Fabric_MAAG"
SOURCE_LAKEHOUSE_NAME = "MAAG_LH_Silver"
SOURCE_SCHEMA = "finance"
SOURCE_TABLE = "account"

# Source: Absolute path to Silver lakehouse table
SOURCE_TABLE_PATH = f"abfss://{WORKSPACE_NAME}@onelake.dfs.fabric.microsoft.com/{SOURCE_LAKEHOUSE_NAME}.Lakehouse/Tables/{SOURCE_SCHEMA}/{SOURCE_TABLE}"

# Target: Gold lakehouse (attached as default)
TARGET_SCHEMA = "finance"
TARGET_TABLE = "account"
TARGET_FULL_PATH = f"{TARGET_SCHEMA}.{TARGET_TABLE}"

print(f"🔄 Loading Account from Silver to Gold")
print(f"📂 Source: {SOURCE_TABLE_PATH}")
print(f"🎯 Target: {TARGET_FULL_PATH}")
print("="*50)

# Read from Silver lakehouse table
df = spark.read.format("delta").load(SOURCE_TABLE_PATH)

print(f"✅ Data loaded from Silver table")
print(f"📊 Records: {df.count()}")
print(f"📋 Columns: {df.columns}")

# Display sample data
print(f"\n📖 Sample data from Silver:")
df.show(10, truncate=False)

🔄 Loading Account from Silver to Gold
📂 Source: abfss://Fabric_MAAG@onelake.dfs.fabric.microsoft.com/MAAG_LH_Silver.Lakehouse/Tables/finance/account
🎯 Target: finance.account
✅ Data loaded from Silver table
📊 Records: 1026
📋 Columns: ['AccountId', 'AccountNumber', 'CustomerId', 'AccountType', 'AccountStatus', 'CreatedDate', 'CreatedBy']

📖 Sample data from Silver:
+------------------------------------+-------------+----------+-----------+-------------+-----------+---------+
|AccountId                           |AccountNumber|CustomerId|AccountType|AccountStatus|CreatedDate|CreatedBy|
+------------------------------------+-------------+----------+-----------+-------------+-----------+---------+
|375977d2-a88e-4c47-abe7-9a8c2ac011bf|ACC-ADB-1000 |CID-001   |Receivable |Overdue      |2018-01-10 |         |
|d22a862c-1102-4d80-b1c1-7524b764ef39|ACC-ADB-1001 |CID-002   |Receivable |Active       |2018-01-10 |         |
|5966d3c9-d76e-486d-be52-4850dceae572|ACC-ADB-1002 |CID-003   |Receivable

In [2]:
# --- Gold layer transformations and data quality ---
print(f"🔧 Applying Gold layer transformations...")

# Add a Gold audit timestamp and default CreatedBy when it is null or blank
from pyspark.sql.functions import when, trim

# True when CreatedBy is null or contains only spaces
created_by_is_blank = col("CreatedBy").isNull() | (trim(col("CreatedBy")) == "")

df_gold = (
    df.withColumn("GoldLoadTimestamp", current_timestamp())
      .withColumn("CreatedBy", when(created_by_is_blank, "Sample script").otherwise(col("CreatedBy")))
)

# Data quality checks for Gold layer
print(f"\n🔍 Gold layer data quality validation...")

# Check for duplicates
duplicate_count = df_gold.groupBy("AccountId").count().filter(col("count") > 1).count()
if duplicate_count > 0:
    print(f"⚠️ Found {duplicate_count} duplicate AccountId values")
else:
    print(f"✅ No duplicates found")

# Check for nulls in key fields
null_checks = df_gold.select(
    spark_sum(col("AccountId").isNull().cast("int")).alias("null_accountid"),
    spark_sum(col("AccountType").isNull().cast("int")).alias("null_accounttype")
).collect()[0]

if null_checks["null_accountid"] > 0 or null_checks["null_accounttype"] > 0:
    print(f"⚠️ Found nulls: AccountId={null_checks['null_accountid']}, AccountType={null_checks['null_accounttype']}")
else:
    print(f"✅ No nulls in key fields")

print(f"\n📖 Sample Gold data:")
df_gold.show(10, truncate=False)

🔧 Applying Gold layer transformations...

🔍 Gold layer data quality validation...
✅ No duplicates found
✅ No nulls in key fields

📖 Sample Gold data:
+------------------------------------+-------------+----------+-----------+-------------+-----------+-------------+-------------------------+
|AccountId                           |AccountNumber|CustomerId|AccountType|AccountStatus|CreatedDate|CreatedBy    |GoldLoadTimestamp        |
+------------------------------------+-------------+----------+-----------+-------------+-----------+-------------+-------------------------+
|375977d2-a88e-4c47-abe7-9a8c2ac011bf|ACC-ADB-1000 |CID-001   |Receivable |Overdue      |2018-01-10 |Sample script|2025-08-25 20:30:36.06085|
|d22a862c-1102-4d80-b1c1-7524b764ef39|ACC-ADB-1001 |CID-002   |Receivable |Active       |2018-01-10 |Sample script|2025-08-25 20:30:36.06085|
|5966d3c9-d76e-486d-be52-4850dceae572|ACC-ADB-1002 |CID-003   |Receivable |Active       |2018-01-10 |Sample script|2025-08-25 20:30:36.06085
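
The `CreatedBy` rule applied in this cell can be restated in plain Python to check its edge cases without a Spark session. This is only a sketch: `default_created_by` is a hypothetical name, `etl_user` is an illustrative value, and `strip(" ")` is used on the assumption that Spark's `trim` removes only space characters by default.

```python
from typing import Optional

DEFAULT_CREATED_BY = "Sample script"  # same literal the notebook uses


def default_created_by(value: Optional[str]) -> str:
    """Return the default when value is None or only spaces, else value unchanged."""
    if value is None or value.strip(" ") == "":
        return DEFAULT_CREATED_BY
    # Non-blank values are kept as-is, including surrounding spaces,
    # matching .otherwise(col("CreatedBy")) in the Spark expression.
    return value


for raw in [None, "", "   ", "etl_user", " etl_user "]:
    print(repr(raw), "->", repr(default_created_by(raw)))
```

The sample output above is consistent with this rule: every blank `CreatedBy` shown in the Silver rows appears as `Sample script` in the Gold rows.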

In [3]:
# --- Load data to Gold table ---
print(f"💾 Loading data to Gold table: {TARGET_FULL_PATH}")

try:
    # Write to Gold Delta table (default lakehouse)
    df_gold.write \
      .format("delta") \
      .mode("overwrite") \
      .option("overwriteSchema", "true") \
      .saveAsTable(TARGET_FULL_PATH)

    print(f"✅ Data loaded successfully to Gold table")

    # Verify the load
    result_count = spark.sql(f"SELECT COUNT(*) as count FROM {TARGET_FULL_PATH}").collect()[0]["count"]
    print(f"📊 Records in Gold table: {result_count}")

    # Show sample of loaded Gold data
    print(f"\n📖 Sample from Gold table:")
    spark.sql(f"SELECT * FROM {TARGET_FULL_PATH} ORDER BY AccountId").show(10, truncate=False)

    print(f"🎉 Silver to Gold data load complete!")

except Exception as e:
    print(f"❌ Error loading data to Gold table: {str(e)}")
    raise

💾 Loading data to Gold table: finance.account
✅ Data loaded successfully to Gold table
📊 Records in Gold table: 1026

📖 Sample from Gold table:
+------------------------------------+---------------+----------+-----------+-------------+-----------+-------------+-------------------------+
|AccountId                           |AccountNumber  |CustomerId|AccountType|AccountStatus|CreatedDate|CreatedBy    |GoldLoadTimestamp        |
+------------------------------------+---------------+----------+-----------+-------------+-----------+-------------+-------------------------+
|002e2644-6d23-4955-9b59-7d8f8ca5e0bc|ACC-ADB-1423   |CID-424   |Receivable |Overdue      |2018-01-10 |Sample script|2025-08-25 20:31:16.32844|
|00af1be6-574a-42bc-a4c8-efbd0820ab63|ACC-Fabric-1081|CID-082   |Receivable |Overdue      |2020-06-05 |Sample script|2025-08-25 20:31:16.32844|
|0140f8c3-df3d-426b-b5f9-d67df82c2d08|ACC-Fabric-1216|CID-217   |Receivable |Active       |2024-03-02 |Sample script|2025-08-25 20:31:16
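
The load cell prints the Gold row count but leaves the comparison with the Silver count to the reader (both are 1026 in the run above). A minimal sketch of an explicit reconciliation step, assuming a hypothetical helper `assert_row_counts_match` that receives the two counts as plain integers, could be:

```python
def assert_row_counts_match(source_count: int, target_count: int) -> None:
    """Raise ValueError if the target row count differs from the source row count."""
    if source_count != target_count:
        raise ValueError(
            f"Row count mismatch: source={source_count}, target={target_count}"
        )


# In the notebook these values would come from df.count() and the
# SELECT COUNT(*) query; the literals below are the counts printed above.
assert_row_counts_match(1026, 1026)
print("Row counts match")
```

Raising instead of printing lets a scheduled pipeline run fail visibly when the overwrite writes fewer rows than were read.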