# Load Silver Table to Gold Table - Payment

## Overview
Load Payment data from the Silver lakehouse table into the Gold lakehouse table.

## Data Flow
- **Source**: MAAG_LH_Silver.finance.Payment (Silver lakehouse table)
- **Target**: MAAG_LH_Gold.finance.Payment (Gold lakehouse - attached as default)
- **Process**: Read the Silver table, add a `GoldLoadTimestamp` audit column, default blank `CreatedBy` values, run duplicate and null checks, then overwrite the Gold Delta table
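
As a minimal sketch of how the Source entry above maps to a OneLake path, the helper below assembles the `abfss://` URI from its parts. The function name `onelake_table_path` is hypothetical (it is not part of any Fabric SDK); the URI layout is the one this notebook itself uses.

```python
def onelake_table_path(workspace: str, lakehouse: str, schema: str, table: str) -> str:
    """Build the abfss:// URI of a lakehouse table (hypothetical helper)."""
    return (
        f"abfss://{workspace}@onelake.dfs.fabric.microsoft.com/"
        f"{lakehouse}.Lakehouse/Tables/{schema}/{table}"
    )


# Values taken from the notebook's configuration cell.
silver_path = onelake_table_path("Fabric_MAAG", "MAAG_LH_Silver", "finance", "payment")
print(silver_path)
```

Keeping the four parts separate makes it easier to point the same notebook at another workspace or schema without editing the URI by hand.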


In [1]:
from pyspark.sql.functions import col, sum as spark_sum, current_timestamp

# Configuration - Silver to Gold data flow
WORKSPACE_NAME = "Fabric_MAAG"
SOURCE_LAKEHOUSE_NAME = "MAAG_LH_Silver"
SOURCE_SCHEMA = "finance"
SOURCE_TABLE = "payment"

# Source: Absolute path to Silver lakehouse table
SOURCE_TABLE_PATH = f"abfss://{WORKSPACE_NAME}@onelake.dfs.fabric.microsoft.com/{SOURCE_LAKEHOUSE_NAME}.Lakehouse/Tables/{SOURCE_SCHEMA}/{SOURCE_TABLE}"

# Target: Gold lakehouse (attached as default)
TARGET_SCHEMA = "finance"
TARGET_TABLE = "payment"
TARGET_FULL_PATH = f"{TARGET_SCHEMA}.{TARGET_TABLE}"

print("🔄 Loading Payment from Silver to Gold")
print(f"📂 Source: {SOURCE_TABLE_PATH}")
print(f"🎯 Target: {TARGET_FULL_PATH}")
print("=" * 50)

# Read from Silver lakehouse table
df = spark.read.format("delta").load(SOURCE_TABLE_PATH)

print("✅ Data loaded from Silver table")
print(f"📊 Records: {df.count()}")
print(f"📋 Columns: {df.columns}")

# Display sample data
print("\n📖 Sample data from Silver:")
df.show(10, truncate=False)

🔄 Loading Payment from Silver to Gold
📂 Source: abfss://Fabric_MAAG@onelake.dfs.fabric.microsoft.com/MAAG_LH_Silver.Lakehouse/Tables/finance/payment
🎯 Target: finance.payment
✅ Data loaded from Silver table
📊 Records: 3616
📋 Columns: ['PaymentId', 'PaymentNumber', 'InvoiceId', 'OrderId', 'PaymentDate', 'PaymentAmount', 'PaymentStatus', 'PaymentMethod', 'CreatedBy']

📖 Sample data from Silver:
+------------------------------------+-------------+------------------------------------+-------+-----------+-------------+-------------+-------------+---------+
|PaymentId                           |PaymentNumber|InvoiceId                           |OrderId|PaymentDate|PaymentAmount|PaymentStatus|PaymentMethod|CreatedBy|
+------------------------------------+-------------+------------------------------------+-------+-----------+-------------+-------------+-------------+---------+
|7c82dbb4-99bb-4f83-a7d6-f9aa62282c1a|PM-F100000   |547fb079-2c37-456d-b7fc-e0a081bbf04f|       |2

In [3]:
# --- Gold layer transformations and data quality ---
print("🔧 Applying Gold layer transformations...")

# Add audit columns for Gold layer and set default for CreatedBy if blank or null
from pyspark.sql.functions import when, trim

df_gold = df.withColumn("GoldLoadTimestamp", current_timestamp())\
    .withColumn("CreatedBy", when(trim(col("CreatedBy")).isNull() | (trim(col("CreatedBy")) == ""), "Sample script").otherwise(col("CreatedBy")))

# Data quality checks for Gold layer
print("\n🔍 Gold layer data quality validation...")

# Check for duplicates: count PaymentId values that occur more than once
duplicate_count = df_gold.groupBy("PaymentId").count().filter(col("count") > 1).count()
if duplicate_count > 0:
    print(f"⚠️ Found {duplicate_count} duplicate PaymentId values")
else:
    print("✅ No duplicates found")

# Check for nulls in key fields
null_checks = df_gold.select(
    spark_sum(col("PaymentId").isNull().cast("int")).alias("null_paymentid"),
    spark_sum(col("PaymentMethod").isNull().cast("int")).alias("null_paymentmethod")
).collect()[0]

if null_checks["null_paymentid"] > 0 or null_checks["null_paymentmethod"] > 0:
    print(f"⚠️ Found nulls: PaymentId={null_checks['null_paymentid']}, PaymentMethod={null_checks['null_paymentmethod']}")
else:
    print("✅ No nulls in key fields")

print("\n📖 Sample Gold data:")
df_gold.show(10, truncate=False)

🔧 Applying Gold layer transformations...

🔍 Gold layer data quality validation...
⚠️ Found 1808 duplicate PaymentId values
✅ No nulls in key fields

📖 Sample Gold data:
+------------------------------------+-------------+------------------------------------+-------+-----------+-------------+-------------+-------------+-------------+-------------------------+
|PaymentId                           |PaymentNumber|InvoiceId                           |OrderId|PaymentDate|PaymentAmount|PaymentStatus|PaymentMethod|CreatedBy    |GoldLoadTimestamp        |
+------------------------------------+-------------+------------------------------------+-------+-----------+-------------+-------------+-------------+-------------+-------------------------+
|7c82dbb4-99bb-4f83-a7d6-f9aa62282c1a|PM-F100000   |547fb079-2c37-456d-b7fc-e0a081bbf04f|       |2024-03-05 |16696.36     |Completed    |MC           |Sample script|2025-08-25 20:28:07.73467|
|97240d28-bf79-4ef0-b5de-9cfc733ab033|PM-F100001
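
The duplicate check above counts PaymentId values that occur more than once, not the number of surplus rows. The plain-Python sketch below mirrors that logic on a small made-up list to show the difference; the function name `duplicate_key_stats` and the sample IDs are illustrative assumptions, not part of the notebook.

```python
from collections import Counter


def duplicate_key_stats(keys):
    """Return (keys seen more than once, rows beyond the first for those keys)."""
    counts = Counter(keys)
    duplicate_keys = sum(1 for n in counts.values() if n > 1)
    surplus_rows = sum(n - 1 for n in counts.values() if n > 1)
    return duplicate_keys, surplus_rows


# Made-up sample: "a" and "b" each appear twice, "c" appears once.
sample_ids = ["a", "a", "b", "b", "c"]
dup_keys, surplus = duplicate_key_stats(sample_ids)
print(dup_keys, surplus)  # 2 2
```

Applied to this run: 1808 duplicated keys with at least two rows each already account for all 3616 rows, so every PaymentId in the Silver table must appear exactly twice.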

In [4]:
# --- Load data to Gold table ---
print(f"💾 Loading data to Gold table: {TARGET_FULL_PATH}")

try:
    # Write to Gold Delta table (default lakehouse)
    df_gold.write \
      .format("delta") \
      .mode("overwrite") \
      .option("overwriteSchema", "true") \
      .saveAsTable(TARGET_FULL_PATH)

    print("✅ Data loaded successfully to Gold table")

    # Verify the load
    result_count = spark.sql(f"SELECT COUNT(*) as count FROM {TARGET_FULL_PATH}").collect()[0]["count"]
    print(f"📊 Records in Gold table: {result_count}")

    # Show sample of loaded Gold data
    print("\n📖 Sample from Gold table:")
    spark.sql(f"SELECT * FROM {TARGET_FULL_PATH} ORDER BY PaymentId").show(10, truncate=False)

    print("🎉 Silver to Gold data load complete!")

except Exception as e:
    print(f"❌ Error loading data to Gold table: {e}")
    raise

💾 Loading data to Gold table: finance.payment
✅ Data loaded successfully to Gold table
📊 Records in Gold table: 3616

📖 Sample from Gold table:
+------------------------------------+-------------+------------------------------------+-------+-----------+-------------+-------------+-------------+-------------+--------------------------+
|PaymentId                           |PaymentNumber|InvoiceId                           |OrderId|PaymentDate|PaymentAmount|PaymentStatus|PaymentMethod|CreatedBy    |GoldLoadTimestamp         |
+------------------------------------+-------------+------------------------------------+-------+-----------+-------------+-------------+-------------+-------------+--------------------------+
|00099c6b-7a82-4d8f-92da-a240f6a7c938|PM-F100053   |ef8fedbd-4650-4727-b274-60bc47f9e0a6|       |2019-09-01 |22182.5      |Completed    |MC           |Sample script|2025-08-25 20:28:47.690893|
|00099c6b-7a82-4d8f-92da-a240f6a7c938|PM-F100053   |ef8fedbd-4650-4727-b2
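
The sample above shows the same PaymentId loaded twice, so the duplicates flagged by the quality check reach the Gold table unchanged. One possible remedy, sketched here in plain Python on made-up rows, is to keep a single row per key before writing. The helper `keep_first_per_key` is hypothetical; in PySpark the analogous call is `DataFrame.dropDuplicates(["PaymentId"])`, which the notebook does not currently use.

```python
def keep_first_per_key(rows, key):
    """Keep the first row seen for each key value, preserving input order."""
    seen = set()
    unique_rows = []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            unique_rows.append(row)
    return unique_rows


# Made-up rows for illustration only.
rows = [
    {"PaymentId": "p1", "PaymentAmount": 10.0},
    {"PaymentId": "p1", "PaymentAmount": 10.0},
    {"PaymentId": "p2", "PaymentAmount": 25.5},
]
deduped = keep_first_per_key(rows, "PaymentId")
print(len(rows), "->", len(deduped))  # 3 -> 2
```

Unlike this sketch, Spark does not guarantee which of the duplicate rows `dropDuplicates` keeps, so that shortcut is only safe when the duplicated rows are identical in every column that matters.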