# Silver to Gold - Column Name Transformation

This notebook reads cleaned Delta Lake tables from the **Silver** layer, renames columns
from PascalCase to snake_case, and writes the curated data as **Delta Lake** tables to the **Gold** layer.

**Transformation**: All column names are converted from `PascalCase` to `snake_case`
(e.g., `ModifiedDate` becomes `modified_date`).

## 1. Explore Silver and Gold Containers

In [None]:
# List tables in the Silver layer
silver_path = "wasbs://silver@intechdataproject.blob.core.windows.net/SalesLT/"
silver_tables = dbutils.fs.ls(silver_path)

for t in silver_tables:
    print(t.name)

In [None]:
# List contents of the Gold layer
gold_path = "wasbs://gold@intechdataproject.blob.core.windows.net/"
gold_contents = dbutils.fs.ls(gold_path)

for g in gold_contents:
    print(g.name)

## 2. Test with a Single Table (Address)

In [None]:
# Read Address table from Silver
df = spark.read.format("delta").load(silver_path + "Address/")
print("Original columns:", df.columns)
display(df)

In [None]:
def to_snake_case(name):
    """
    Convert a PascalCase or camelCase string to snake_case.
    Example: 'ModifiedDate' -> 'modified_date'

    Consecutive capitals are kept together, so 'AddressID' -> 'address_id'.
    """
    result = ""
    for i, char in enumerate(name):
        if char.isupper() and i > 0 and not name[i - 1].isupper():
            result += "_"
        result += char.lower()
    return result
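
The character loop fuses an acronym with the word that follows it (`HTMLParser` becomes `htmlparser`). If Silver ever contains such names, a two-pass regular expression is one possible alternative. The sketch below is illustrative only: `to_snake_case_regex` is a hypothetical name, not part of the original notebook, and the sample names are chosen for demonstration.

```python
import re

def to_snake_case_regex(name):
    """Hypothetical regex-based variant; not part of the original notebook."""
    # Pass 1: split before a capitalised word, e.g. 'HTMLParser' -> 'HTML_Parser'
    step = re.sub(r"(.)([A-Z][a-z]+)", r"\1_\2", name)
    # Pass 2: split between a lowercase letter or digit and a capital
    step = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", step)
    return step.lower()

# Illustrative names; only the first three resemble the SalesLT columns
for sample in ["ModifiedDate", "AddressID", "AddressLine1", "HTMLParser"]:
    print(f"{sample}  ->  {to_snake_case_regex(sample)}")
```

For names without embedded acronyms, both approaches should give the same result, so switching is only worthwhile if the acronym case actually occurs.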

In [None]:
# Test: show the column name conversion
for col_name in df.columns:
    print(f"{col_name}  ->  {to_snake_case(col_name)}")

In [None]:
# Rename all columns to snake_case in one step
df = df.toDF(*[to_snake_case(col_name) for col_name in df.columns])

print("Renamed columns:", df.columns)
display(df)
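
Renaming can make two distinct source columns collide on the same snake_case name. A minimal sketch of a pre-check follows; `find_collisions` and the sample column list are hypothetical, and `to_snake_case` is repeated only so the block runs on its own.

```python
from collections import defaultdict

def to_snake_case(name):
    # Same logic as the notebook's function, repeated so the sketch runs alone
    result = ""
    for i, char in enumerate(name):
        if char.isupper() and i > 0 and not name[i - 1].isupper():
            result += "_"
        result += char.lower()
    return result

def find_collisions(columns):
    """Return {new_name: [original names]} for names that clash after renaming."""
    grouped = defaultdict(list)
    for col_name in columns:
        grouped[to_snake_case(col_name)].append(col_name)
    return {new: olds for new, olds in grouped.items() if len(olds) > 1}

# Hypothetical column list, chosen to force a clash
print(find_collisions(["CustomerID", "CustomerId", "ModifiedDate"]))
```

In the notebook this could be called as `find_collisions(df.columns)` before renaming; an empty dict means no two columns map to the same name.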

## 3. Transform All Tables (Silver to Gold)

Loop through every table in the Silver layer:
1. Read the Delta table
2. Rename all columns from PascalCase to snake_case
3. Write to Gold as Delta Lake

In [None]:
# Get all table names from the Silver layer (folder names without the trailing slash)
table_names = [folder.name.rstrip("/") for folder in dbutils.fs.ls(silver_path)]

print("Tables found:", table_names)
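
The listing may also contain entries that are not table folders, and the loop in the next cell would fail on them. A minimal sketch of a filter, assuming directory entries carry the same trailing `/` that the cell above strips. `Entry` is a stand-in for the objects returned by `dbutils.fs.ls`, and `table_names_from_listing` is a hypothetical helper.

```python
from collections import namedtuple

# Stand-in for the listing objects (assumption: only the .name field is needed)
Entry = namedtuple("Entry", ["name"])

def table_names_from_listing(entries):
    """Keep directory entries only and strip the trailing slash."""
    return [e.name.rstrip("/") for e in entries if e.name.endswith("/")]

# Hypothetical listing with one stray file
listing = [Entry("Address/"), Entry("Customer/"), Entry("_SUCCESS")]
print(table_names_from_listing(listing))
```

In the notebook, the same function could be applied directly to `dbutils.fs.ls(silver_path)`.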

In [None]:
for table in table_names:
    # Read from Silver
    input_path = silver_path + table + "/"
    df = spark.read.format("delta").load(input_path)

    # Rename all columns to snake_case in one step
    df = df.toDF(*[to_snake_case(col_name) for col_name in df.columns])

    # Write to Gold as Delta, reusing gold_path from Section 1
    output_path = gold_path + "SalesLT/" + table + "/"
    df.write.format("delta").mode("overwrite").save(output_path)

    # Note: count() triggers a second read of the Silver table
    print(f"Processed: {table} ({df.count()} rows)")

print("\nSilver to Gold transformation complete!")

In [None]:
# Verify: read the last processed table back from Gold and display it
display(spark.read.format("delta").load(output_path))
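
To check the result by name rather than by eye, a small validator can flag any column that is still not snake_case. This is a sketch under stated assumptions: `SNAKE_CASE` and `non_snake_case` are hypothetical names, and the pattern (lowercase alphanumeric words joined by single underscores) is an assumed definition of snake_case.

```python
import re

# Assumed definition: lowercase alphanumeric words joined by single underscores
SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(_[a-z0-9]+)*")

def non_snake_case(columns):
    """Return the column names that do not match the assumed pattern."""
    return [c for c in columns if not SNAKE_CASE.fullmatch(c)]

# Illustrative column list with one name left unconverted
print(non_snake_case(["address_id", "address_line1", "ModifiedDate", "rowguid"]))
```

Calling `non_snake_case` on the columns of a DataFrame loaded from the Gold path should return an empty list if every column was renamed.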