# Silver Layer Scripting: Transformation Notebook

This notebook transforms the product category dataset from the Bronze layer into a clean, trusted Silver table. Each transformation step improves data quality, consistency, and analytics readiness.

**Dataset full name**: bike_lakehouse.bronze.erp_px_cat_g1v2

### Load libraries and functions

In [0]:
import pyspark.sql.functions as F
from pyspark.sql.functions import col, trim
from pyspark.sql.types import StringType

### Load Bronze Table
Read the Bronze table into a Spark DataFrame to begin transformations.



In [0]:
df = spark.table('bike_lakehouse.bronze.erp_px_cat_g1v2')

In [0]:
df.limit(10).display()

ID,CAT,SUBCAT,MAINTENANCE
AC_BR,Accessories,Bike Racks,Yes
AC_BS,Accessories,Bike Stands,No
AC_BC,Accessories,Bottles and Cages,No
AC_CL,Accessories,Cleaners,Yes
AC_FE,Accessories,Fenders,No
AC_HE,Accessories,Helmets,Yes
AC_HP,Accessories,Hydration Packs,No
AC_LI,Accessories,Lights,Yes
AC_LO,Accessories,Locks,Yes
AC_PA,Accessories,Panniers,No


### Trim String Columns
Automatically remove leading/trailing spaces from all string columns.

In [0]:
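for field in df.schema.fields:
    # The column type lives on field.dataType (field.name is just the name string)
    if isinstance(field.dataType, StringType):
        # Replace the column with its trimmed version
        df = df.withColumn(field.name, trim(col(field.name)))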

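### Normalization
Normalize the maintenance flag to a Boolean: "Yes" becomes true, "No" becomes false (case-insensitive), and any other value becomes null.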

In [0]:
df = df.withColumn(
    'MAINTENANCE' ,
    F
    .when(F.upper(col("maintenance")) == "YES", F.lit(True))
    .when(F.upper(col("maintenance")) == "NO", F.lit(False))
    .otherwise(None)
)
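The three possible outcomes of this expression (true, false, or null) can be easier to see outside Spark. The sketch below mirrors the same logic in plain Python; `normalize_flag` is a hypothetical helper name that is not part of this notebook, and it assumes the raw value is either a string or `None`.

```python
from typing import Optional

def normalize_flag(raw: Optional[str]) -> Optional[bool]:
    """Plain-Python mirror of the when/otherwise expression (illustrative only)."""
    if raw is None:
        return None      # a null input stays null
    value = raw.upper()  # same case-insensitive comparison as F.upper
    if value == "YES":
        return True
    if value == "NO":
        return False
    return None          # any other value is treated as unknown

print([normalize_flag(v) for v in ["Yes", "no", "N/A", None]])
# prints [True, False, None, None]
```

Unmapped values become null rather than false, so a later null count on the flag column shows how many source rows carried unexpected values.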

### Renaming Columns

In [0]:

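# Keys are lowercase while the Bronze columns are uppercase; this works because
# Spark resolves column names case-insensitively by default (spark.sql.caseSensitive=false).
RENAME_MAP = {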
    "id": "category_id",
    "cat": "category",
    "subcat": "subcategory",
    "maintenance": "maintenance_flag"
}

In [0]:
for old_name , new_name in RENAME_MAP.items() :

    df = df.withColumnRenamed(old_name , new_name)
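`withColumnRenamed` is a no-op when the source column does not exist, so a typo in the map would pass silently. A minimal sketch of a pre-check in plain Python, assuming case-insensitive name matching (Spark's default); `missing_rename_keys` is a hypothetical helper, and the column list is copied from the Bronze preview shown earlier rather than read from a live DataFrame.

```python
from typing import Dict, List

def missing_rename_keys(columns: List[str], rename_map: Dict[str, str]) -> List[str]:
    """Return rename-map keys that match no existing column (case-insensitive)."""
    existing = {name.lower() for name in columns}
    return [old for old in rename_map if old.lower() not in existing]

# Column names as shown in the Bronze preview; in the notebook this would be df.columns
bronze_columns = ["ID", "CAT", "SUBCAT", "MAINTENANCE"]
rename_map = {
    "id": "category_id",
    "cat": "category",
    "subcat": "subcategory",
    "maintenance": "maintenance_flag",
}

print(missing_rename_keys(bronze_columns, rename_map))  # prints []
```

Running a check like this before the rename loop and failing on a non-empty result turns a silent miss into an explicit error.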

### Sanity Check of the DataFrame

In [0]:
df.limit(10).display()

category_id,category,subcategory,maintenance_flag
AC_BR,Accessories,Bike Racks,True
AC_BS,Accessories,Bike Stands,False
AC_BC,Accessories,Bottles and Cages,False
AC_CL,Accessories,Cleaners,True
AC_FE,Accessories,Fenders,False
AC_HE,Accessories,Helmets,True
AC_HP,Accessories,Hydration Packs,False
AC_LI,Accessories,Lights,True
AC_LO,Accessories,Locks,True
AC_PA,Accessories,Panniers,False
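Beyond eyeballing the preview, a few explicit checks can make this step repeatable. The sketch below works on plain Python dictionaries, assuming a small sample has been collected to the driver (for example with `[r.asDict() for r in df.limit(10).collect()]`); `check_rows` is a hypothetical helper, and the sample rows are taken from the preview above.

```python
from typing import Dict, List

def check_rows(rows: List[Dict[str, object]]) -> List[str]:
    """Return human-readable problems found in the sampled rows (empty if none)."""
    problems = []
    ids = [row["category_id"] for row in rows]
    if len(ids) != len(set(ids)):
        problems.append("duplicate category_id values")
    for row in rows:
        # After normalization the flag should be a real boolean; None is reported too
        if not isinstance(row["maintenance_flag"], bool):
            problems.append(f"non-boolean maintenance_flag for {row['category_id']}")
        # After trimming, string columns should have no leading/trailing spaces
        for key in ("category_id", "category", "subcategory"):
            value = row[key]
            if isinstance(value, str) and value != value.strip(" "):
                problems.append(f"untrimmed {key} for {row['category_id']}")
    return problems

sample = [
    {"category_id": "AC_BR", "category": "Accessories",
     "subcategory": "Bike Racks", "maintenance_flag": True},
    {"category_id": "AC_BS", "category": "Accessories",
     "subcategory": "Bike Stands", "maintenance_flag": False},
]
print(check_rows(sample))  # prints []
```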


### Writing Silver Table

In [0]:
df.write.mode('overwrite').format('delta').saveAsTable('bike_lakehouse.silver.erp_product_category')

In [0]:
%sql
select * from bike_lakehouse.silver.erp_product_category limit 10 ;

category_id,category,subcategory,maintenance_flag
AC_BR,Accessories,Bike Racks,True
AC_BS,Accessories,Bike Stands,False
AC_BC,Accessories,Bottles and Cages,False
AC_CL,Accessories,Cleaners,True
AC_FE,Accessories,Fenders,False
AC_HE,Accessories,Helmets,True
AC_HP,Accessories,Hydration Packs,False
AC_LI,Accessories,Lights,True
AC_LO,Accessories,Locks,True
AC_PA,Accessories,Panniers,False
