# Silver Layer: ERP Locations Transformation
Processing customer geography data from `erp_loc_a101`.
- **Framework**: `silver_engine` for automated string cleaning.
- **Transformations**:
    - Derives `customer_number` from `cid` by removing hyphens and converting to upper case, so it matches the CRM format.
    - Standardizes country names into `country_name` (DE/Germany -> Germany, US/USA/United States -> United States).
    - Maps empty or null `cntry` values to 'n/a'.
- **Output**: Delta table `workspace.silver.erp_locations`.
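
The rules above can be sketched in plain Python as a reference for checking expected values by hand. This is only an illustration: `clean_customer_number` and `normalize_country` are hypothetical helper names, not part of `silver_engine`, and trimming the pass-through value is an assumption of this sketch.

```python
from typing import Optional

# Hypothetical reference helpers (not part of silver_engine). They mirror the
# transformation rules in plain Python so expected values are easy to reason about.

def clean_customer_number(cid: Optional[str]) -> Optional[str]:
    """Remove hyphens and upper-case the ID; None stays None, like a SQL NULL."""
    if cid is None:
        return None
    return cid.replace("-", "").upper()


def normalize_country(cntry: Optional[str]) -> str:
    """Map country codes and blank values to standardized names."""
    # Spark's trim removes spaces only; str.strip() also removes tabs and
    # newlines, so this is an approximation of the Spark behavior.
    if cntry is None or cntry.strip() == "":
        return "n/a"
    key = cntry.strip().upper()
    if key in {"DE", "GERMANY"}:
        return "Germany"
    if key in {"US", "USA", "UNITED STATES"}:
        return "United States"
    # Assumption: unmatched values pass through with whitespace trimmed.
    return cntry.strip()
```

Under these rules, `normalize_country(" de ")` yields `"Germany"`, and both `None` and a whitespace-only string yield `"n/a"`.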


In [0]:
%run ../../helpers/silver_engine.ipynb

In [0]:
%python
import pyspark.sql.functions as F

def logic(df):
    return (
        df
        # 1. ID Cleaning: Remove hyphens and convert to Upper Case to match CRM format
        .withColumn("customer_number", F.upper(F.regexp_replace(F.col("cid"), "-", "")))
        
        # 2. Country Normalization: Consolidating country codes into full descriptive names
        .withColumn("country_name", 
            F.when(F.trim(F.upper(F.col("cntry"))).isin("DE", "GERMANY"), "Germany")
             .when(F.trim(F.upper(F.col("cntry"))).isin("US", "USA", "UNITED STATES"), "United States")
             .when((F.trim(F.col("cntry")) == "") | (F.col("cntry").isNull()), "n/a")
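             # Pass unmatched countries through, trimmed so stray whitespace does not create duplicate values
             .otherwise(F.trim(F.col("cntry")))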
        )
        
        # 3. Final Selection: Keeping only necessary columns for the Gold Layer join
        .select("customer_number", "country_name")
    )

# Executing the standardized silver pipeline
run_silver_pipeline("erp_loc_a101", "erp_locations", logic)

In [0]:
%sql
-- Validation of normalized geographic data
SELECT * FROM workspace.silver.erp_locations;