## Transformations (Silver layer)

To give your data a different structure in the silver layer, specify the new schema in the `{Schema}` placeholder below.

In [None]:
new_schema = {Schema}
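
As an illustration only, the `{Schema}` placeholder could be filled in with a DDL-formatted string, which `from_json` accepts in place of a `StructType`. The field names `id` and `name` are hypothetical. The array wrapper is an assumption, made so that the later `explode` step has an array to work on.

```python
# Hypothetical schema: a JSON array of objects, each with an integer id and a string name.
# Replace the field names and types with the ones your data actually uses.
new_schema = "ARRAY<STRUCT<id: INT, name: STRING>>"
```

An equivalent `ArrayType(StructType([...]))` built from `pyspark.sql.types` should work as well if you prefer an explicit schema object.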

```import dlt``` - Imports **dlt**, the Python module for Delta Live Tables on Databricks. It provides the decorators and functions, such as `@dlt.table` and `dlt.read()`, used to declare pipeline tables on top of Apache Spark.

```from pyspark.sql.functions import *``` - Imports every function from the **pyspark.sql.functions** module, which provides a wide range of functions for manipulating and transforming data in Spark SQL.

In [None]:
import dlt
from pyspark.sql.functions import *

**dlt.read()** reads a dataset defined in the same pipeline, here the corresponding raw table.

**.select("*")** selects all columns from the DataFrame that was read.

**{Tranformations}** - This placeholder stands for any additional transformations you want to apply to the data. A few examples are shown in the code cells below.

> Note: All the transformations you apply to a table must be consistent with the new schema declared at the beginning.

In [None]:
@dlt.table
def ToBeDimension1_cleansed():
    return (
        dlt.read("ToBeDimension1_raw")
        .select("*")
        # Rename a column
        .withColumnRenamed("OldColumnName", "New_Column_Name")
        # Replace every regex match within a string column
        .withColumn("column_name", regexp_replace("column_name", "string_value", "new_string_value"))
        # Convert Unix epoch seconds to a 'yyyy-MM-dd HH:mm:ss' timestamp string
        .withColumn("datetime_column", from_unixtime("unix_time_column"))
        # Parse a JSON string column using new_schema
        .withColumn("column_name", from_json(col("column_name"), new_schema))
        # Explode the array into one row per element
        .withColumn("column_name", explode("column_name"))
        {Tranformations}
    )
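
To make the effect of each example step in the cell above concrete, here is a plain-Python analogue applied to a single row. This is not Spark code and is not part of the pipeline. It mimics the behaviour of `regexp_replace`, `from_unixtime`, `from_json`, and `explode` using the standard library. The row contents and key names are hypothetical, and UTC is assumed where Spark would use the session time zone.

```python
import json
import re
from datetime import datetime, timezone

# Hypothetical input row; keys and values are illustrative only.
row = {
    "label": "string_value-001",
    "epoch_seconds": 0,
    "payload": '[{"id": 1}, {"id": 2}]',
}

# regexp_replace: substitute every regex match inside a string.
label = re.sub("string_value", "new_string_value", row["label"])

# from_unixtime: Unix epoch seconds -> 'yyyy-MM-dd HH:mm:ss' string (UTC assumed here).
event_time = datetime.fromtimestamp(
    row["epoch_seconds"], tz=timezone.utc
).strftime("%Y-%m-%d %H:%M:%S")

# from_json: parse a JSON string into a structured value (here, a list of dicts).
items = json.loads(row["payload"])

# explode: emit one output row per array element, repeating the other columns.
exploded = [
    {"label": label, "event_time": event_time, "item": item}
    for item in items
]

for out_row in exploded:
    print(out_row)
```

The single input row becomes two output rows, one per element of the parsed array. This is why `explode` changes the row count of a table and has to be reflected in the schema you declare.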

In [None]:
@dlt.table
def ToBeDimension2_cleansed():
    return (
        dlt.read("ToBeDimension2_raw")
        .select("*")
        # Rename a column
        .withColumnRenamed("OldColumnName", "New_Column_Name")
        # Convert Unix epoch seconds to a 'yyyy-MM-dd HH:mm:ss' timestamp string
        .withColumn("datetime_column", from_unixtime("unix_time_column"))
        # Replace every regex match within a string column
        .withColumn("column_name", regexp_replace("column_name", "string_value", "new_string_value"))
        # Parse a JSON string column using new_schema
        .withColumn("column_name", from_json(col("column_name"), new_schema))
        # Explode the array into one row per element
        .withColumn("column_name", explode("column_name"))
        {Tranformations}
    )

In [None]:
@dlt.table
def ToBeFact_cleansed():
    return (
        dlt.read("ToBeFact_raw")
        .select("*")
        # Rename a column
        .withColumnRenamed("OldColumnName", "New_Column_Name")
        # Convert Unix epoch seconds to a 'yyyy-MM-dd HH:mm:ss' timestamp string
        .withColumn("datetime_column", from_unixtime("unix_time_column"))
        # Replace every regex match within a string column
        .withColumn("column_name", regexp_replace("column_name", "string_value", "new_string_value"))
        # Parse a JSON string column using new_schema
        .withColumn("column_name", from_json(col("column_name"), new_schema))
        # Explode the array into one row per element
        .withColumn("column_name", explode("column_name"))
        {Tranformations}
    )