# Silver Layer Scripting: Transformation Notebook

This notebook transforms the ERP customer location dataset from the Bronze layer into a clean, trusted Silver table. The transformations trim string columns, strip hyphens from customer IDs, normalize country values, and rename columns to descriptive names so the table is consistent and ready for analytics.

**Dataset full name**: `bike_lakehouse.bronze.erp_loc_a101`

### Load Libraries and Functions

In [0]:
import pyspark.sql.functions as F
from pyspark.sql.functions import col, trim
from pyspark.sql.types import StringType

### Load Bronze Table
Read the Bronze table into a Spark DataFrame to begin transformations.

In [0]:
df = spark.table('bike_lakehouse.bronze.erp_loc_a101')

In [0]:
df.limit(10).display()

CID,CNTRY
AW-00011000,Australia
AW-00011001,Australia
AW-00011002,Australia
AW-00011003,Australia
AW-00011004,Australia
AW-00011005,Australia
AW-00011006,Australia
AW-00011007,Australia
AW-00011008,Australia
AW-00011009,Australia


### Trim String Columns
Automatically remove leading/trailing spaces from all string columns.

In [0]:
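# Trim only columns whose data type is string; other types are left untouched.
# The check must use field.dataType: field.name is a plain Python str and
# would never be an instance of StringType.
for field in df.schema.fields:
    if isinstance(field.dataType, StringType):
        df = df.withColumn(field.name, trim(col(field.name)))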
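To see what the trimming rule does to a single record without starting a Spark session, here is a plain-Python sketch. `trim_string_values` is a hypothetical helper and `ROW_ID` is an invented non-string column, neither of which is part of this notebook. The sketch strips only the space character, on the assumption that Spark's `trim` removes spaces by default rather than all whitespace.

```python
def trim_string_values(row):
    """Return a copy of `row` with leading/trailing spaces removed from str values only."""
    return {
        # Non-string values (ints, None, ...) pass through unchanged,
        # mirroring the StringType check in the Spark loop.
        key: value.strip(" ") if isinstance(value, str) else value
        for key, value in row.items()
    }


# ROW_ID is a made-up column used only to show that non-strings are skipped.
sample = {"CID": "  AW-00011000 ", "CNTRY": " Australia", "ROW_ID": 7}
trimmed = trim_string_values(sample)
print(trimmed)
```

A helper like this can be handy for unit-testing the rule on a few hand-written records before running the Spark job.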

### Customer ID Cleanup
Remove hyphens from the customer ID so that, for example, `AW-00011000` becomes `AW00011000`.

In [0]:
df = df.withColumn('CID', F.regexp_replace(col('CID'), '-', ''))
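The same cleanup can be sketched in plain Python to check the pattern against a few values. `clean_customer_id` is a hypothetical name used only for illustration. It passes `None` through, on the assumption that a null `CID` stays null in Spark.

```python
import re


def clean_customer_id(cid):
    """Remove every hyphen from a customer ID; None is returned unchanged."""
    if cid is None:
        return None
    # Same pattern and replacement as the regexp_replace call: '-' -> ''.
    return re.sub("-", "", cid)


for raw in ["AW-00011000", "AW00011001", None]:
    print(raw, "->", clean_customer_id(raw))
```

Because the pattern is a literal hyphen, every hyphen in the value is removed, not only the first one.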

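### Country Normalization
Map country codes to full names (`US`/`USA` to `United States`, `DE` to `Germany`) and replace empty or null values with `n/a`. All other values are kept as they are.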

In [0]:

df = df.withColumn(
    'CNTRY',
    # Conditions are evaluated top to bottom; the first match wins.
    F.when(F.upper(col('CNTRY')).isin('US', 'USA'), 'United States')
    .when(F.upper(col('CNTRY')) == 'DE', 'Germany')
    .when((col('CNTRY') == '') | col('CNTRY').isNull(), 'n/a')
    .otherwise(col('CNTRY'))
)
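The branch order of the `when` chain can be mirrored in plain Python, which makes the mapping easy to test on individual values. `normalize_country` is a hypothetical function written for this illustration. It assumes the value has already been trimmed, as in the earlier step.

```python
def normalize_country(cntry):
    """Apply the same rules, in the same order, as the Spark when/otherwise chain."""
    # A null never matches the first two comparisons, so guard against None.
    if cntry is not None and cntry.upper() in ("US", "USA"):
        return "United States"
    if cntry is not None and cntry.upper() == "DE":
        return "Germany"
    # Empty strings and nulls are replaced by a placeholder.
    if cntry is None or cntry == "":
        return "n/a"
    # Anything else is kept as-is.
    return cntry


for raw in ["us", "USA", "de", "", None, "Australia"]:
    print(repr(raw), "->", normalize_country(raw))
```

Note that a value consisting only of spaces reaches the `n/a` branch only because the trim step has already reduced it to an empty string.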

### Renaming Columns
Rename the source columns to descriptive snake_case names.

In [0]:

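# Keys match the Bronze column names exactly, so the rename does not
# depend on Spark's case-insensitive column resolution.
RENAME_MAP = {
    "CID": "customer_number",
    "CNTRY": "country"
}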

In [0]:
for old_name, new_name in RENAME_MAP.items():
    df = df.withColumnRenamed(old_name, new_name)

### Sanity Check of the DataFrame
Preview the first rows to confirm that the hyphens are gone and the columns are renamed.

In [0]:
df.limit(10).display()

customer_number,country
AW00011000,Australia
AW00011001,Australia
AW00011002,Australia
AW00011003,Australia
AW00011004,Australia
AW00011005,Australia
AW00011006,Australia
AW00011007,Australia
AW00011008,Australia
AW00011009,Australia
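A visual preview can be backed by a few explicit checks. The sketch below runs on plain dictionaries, such as a small sample collected from the DataFrame. `find_violations` is a hypothetical helper, and treating a null `customer_number` as a violation is an assumption, since the notebook does not say how nulls should be handled. The second sample row is invented to trigger the checks.

```python
def find_violations(rows):
    """Return (row_index, column) pairs for values that break the Silver rules."""
    problems = []
    for i, row in enumerate(rows):
        number = row.get("customer_number")
        country = row.get("country")
        # Customer numbers must be present (assumption) and hyphen-free.
        if number is None or "-" in number:
            problems.append((i, "customer_number"))
        # Countries must be non-empty and free of leading/trailing spaces.
        if country is None or country == "" or country != country.strip(" "):
            problems.append((i, "country"))
    return problems


sample_rows = [
    {"customer_number": "AW00011000", "country": "Australia"},
    {"customer_number": "AW-00011001", "country": ""},  # invented bad row
]
violations = find_violations(sample_rows)
print(violations)
```

An empty result means the sampled rows satisfy the rules applied in the earlier steps.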


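### Writing Silver Table
Save the transformed DataFrame as a Delta table in the Silver layer, overwriting any existing data.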

In [0]:
df.write.mode('overwrite').format('delta').saveAsTable('bike_lakehouse.silver.erp_customer_location')