## Create Dimension and Fact tables (Gold layer)

The gold layer is the final layer in the data pipeline, where the refined and transformed data resides for business intelligence and analytics purposes.
1. The code creates three DataFrames, `dim1`, `dim2`, and `fact_cleansed`, by reading the `Dim1_cleansed`, `Dim2_cleansed`, and `Fact_cleansed` tables.
2. The code then builds the fact table by joining `fact_cleansed` with `dim1` and `dim2`. It performs inner joins on `key_column` across the three DataFrames, so fact rows without a matching key in both dimensions are dropped, and it selects specific columns from each DataFrame. It also computes new columns: one by casting an existing column to a date, and one from an arithmetic expression over two fact columns.
3. The resulting DataFrame is written out as a Delta table named `Fact`.

Overall, the code loads the dimension DataFrames (`dim1` and `dim2`) and builds a fact table by joining them with the data read from `Fact_cleansed`. The result is stored as a Delta table.
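To make the inner-join behaviour described above concrete without needing a Spark session, here is a minimal plain-Python sketch. It is only an illustration, not the notebook's implementation: the sample rows, the `build_fact` helper, and the `computed_column` name are all hypothetical, and it assumes each dimension has at most one row per `key_column` value.

```python
# Toy stand-ins for the three cleansed tables; all values are made up.
dim1 = {1: {"column2": "A"}, 2: {"column2": "B"}}
dim2 = {1: {"column3": "X"}, 3: {"column3": "Z"}}
fact_cleansed = [
    {"key_column": 1, "column1": "f1", "column4": 2, "column5": 10},
    {"key_column": 2, "column1": "f2", "column4": 3, "column5": 20},  # no match in dim2
    {"key_column": 3, "column1": "f3", "column4": 4, "column5": 30},  # no match in dim1
]


def build_fact(fact_rows, dim1_by_key, dim2_by_key):
    """Inner-join fact rows to both dimensions and add a derived column."""
    joined = []
    for row in fact_rows:
        key = row["key_column"]
        # Inner join: keep the row only if the key exists in BOTH dimensions.
        if key in dim1_by_key and key in dim2_by_key:
            joined.append({
                "column1": row["column1"],
                "column2": dim1_by_key[key]["column2"],
                "column3": dim2_by_key[key]["column3"],
                "column4": row["column4"],
                "column5": row["column5"],
                # Same arithmetic as the notebook's expr() call
                # (hypothetical column name).
                "computed_column": row["column4"] * row["column5"],
            })
    return joined


fact = build_fact(fact_cleansed, dim1, dim2)
print(len(fact_cleansed), "fact rows in,", len(fact), "out")
```

Only the row with `key_column` 1 survives, because it is the only key present in both dimensions. If dropping unmatched fact rows is not what you want, a left join is the usual alternative, at the cost of null dimension attributes.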

In [None]:
from pyspark.sql.functions import col, expr # used for data manipulation and transformation in Spark SQL

In [None]:
dim1 = spark.read.table("Dim1_cleansed")
dim2 = spark.read.table("Dim2_cleansed")
fact_cleansed = spark.read.table("Fact_cleansed")

In [None]:
# Alias each DataFrame so the qualified names used in select() and expr() resolve.
f = fact_cleansed.alias("fact_cleansed")
d1 = dim1.alias("dim1")
d2 = dim2.alias("dim2")

fact_df = (
    f
    # Inner-join the three DataFrames on the common column key_column.
    .join(d1, col("fact_cleansed.key_column") == col("dim1.key_column"), "inner")
    .join(d2, col("fact_cleansed.key_column") == col("dim2.key_column"), "inner")
    # Pick the output columns; col(), alias(), and expr() build the derived ones.
    .select(
        "fact_cleansed.column1",
        "dim1.column2",
        "dim2.column3",
        col("fact_cleansed.old_column_name").cast("date").alias("new_column_name"),
        "fact_cleansed.column4",
        "fact_cleansed.column5",
        # Placeholder name; it must differ from new_column_name, since output column names must be unique.
        expr("fact_cleansed.column4 * fact_cleansed.column5").alias("computed_column"),
    )
)
# Create a Delta table from the DataFrame
fact_df.write.format("delta").saveAsTable("Fact")