# DimFecha – Silver → Gold

**Objective:** Build the date dimension for time-based analysis from the dates present in the Silver Orders table.

**Process:**
1. Compute the minimum and maximum dates present in Orders (OrderDate, DueDate, ShipDate).  
2. Generate a daily date range between the min and max.  
3. Compute the time-analysis columns: DateKey, Year, Quarter, Month, Day, WeekOfYear, IsWeekend.  
4. Save as a Gold Delta table: `Tables/DimFecha`.  
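
To make steps 1 to 3 concrete without a Spark session, here is a minimal plain-Python sketch of the same logic. The sample rows and the helper names (`sample_orders`, `parse_mdy`, `date_dimension_rows`) are hypothetical and exist only for illustration; the notebook itself does this work with PySpark. The sketch assumes the conventions Spark uses: `weekofyear` follows ISO 8601 week numbering, and `dayofweek` returns 1 for Sunday and 7 for Saturday, so the weekend test below is equivalent to `isin([1, 7])`.

```python
from datetime import datetime, timedelta

# Hypothetical sample rows (OrderDate, DueDate, ShipDate) as M/d/yyyy text.
# These values are made up for illustration, not read from the Orders table.
sample_orders = [
    ("5/31/2011", "6/12/2011", "6/7/2011"),
    ("6/1/2011", "6/13/2011", "6/8/2011"),
]

def parse_mdy(text):
    """Parse an M/d/yyyy string into a date."""
    return datetime.strptime(text, "%m/%d/%Y").date()

# Step 1: minimum and maximum over all three date columns.
all_dates = [parse_mdy(value) for row in sample_orders for value in row]
min_date, max_date = min(all_dates), max(all_dates)

def date_dimension_rows(start, end):
    """Steps 2 and 3: one row per calendar day from start to end, inclusive."""
    rows = []
    current = start
    while current <= end:
        rows.append({
            "FullDate": current,
            "DateKey": current.year * 10000 + current.month * 100 + current.day,
            "Year": current.year,
            "Quarter": (current.month - 1) // 3 + 1,
            "Month": current.month,
            "Day": current.day,
            "WeekOfYear": current.isocalendar()[1],  # ISO 8601 week number
            "IsWeekend": int(current.isoweekday() >= 6),  # Saturday or Sunday
        })
        current += timedelta(days=1)
    return rows

rows = date_dimension_rows(min_date, max_date)
```

Taking the minimum and maximum over all three columns matters because the latest date in the data does not have to be a ship date; in the sample above it is a due date.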


In [None]:
from pyspark.sql import functions as F
from pyspark.sql.functions import to_date, col

# === 1. Load the Orders table from Silver ===
df_orders = spark.table("AdventureWorks_SilverLayer.AdventureWorks_Silver_Orders")

# === 2. Convert the text columns to dates (format M/d/yyyy) ===
df_orders = df_orders \
    .withColumn("OrderDate", to_date(col("OrderDate"), "M/d/yyyy")) \
    .withColumn("DueDate", to_date(col("DueDate"), "M/d/yyyy")) \
    .withColumn("ShipDate", to_date(col("ShipDate"), "M/d/yyyy"))

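# === 3. Compute the minimum and maximum date across all three date columns ===
# least/greatest ignore nulls, so one empty column does not blank out the range.
date_cols = ["OrderDate", "DueDate", "ShipDate"]
min_date, max_date = df_orders.select(
    F.least(*[F.min(c) for c in date_cols]),
    F.greatest(*[F.max(c) for c in date_cols]),
).first()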

#print(f"Detected date range: {min_date} → {max_date}")

# === 4. Generate the date range and the DimFecha columns ===

df_dates = spark.sql(f"""
SELECT sequence(to_date('{min_date}'), to_date('{max_date}'), interval 1 day) as DateSeq
""")

df_dates = df_dates.withColumn("FullDate", F.explode("DateSeq")) \
  .withColumn("DateKey", F.date_format("FullDate", "yyyyMMdd").cast("int")) \
  .withColumn("Year", F.year("FullDate")) \
  .withColumn("Quarter", F.quarter("FullDate")) \
  .withColumn("Month", F.month("FullDate")) \
  .withColumn("Day", F.dayofmonth("FullDate")) \
  .withColumn("WeekOfYear", F.weekofyear("FullDate")) \
  .withColumn("IsWeekend", (F.dayofweek("FullDate").isin([1,7])).cast("int"))

# === 5. Inspect the first records ===
#df_dates.show(10)

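# === 6. Save the final table to the Gold layer ===
# Drop the helper array column so each row stores only its own date attributes.
df_dates.drop("DateSeq").write.mode("overwrite").format("delta").option("overwriteSchema", "true").save("Tables/DimFecha")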


Date range: 2011-05-31 → 2014-05-08
+--------------------+----------+--------+----+-------+-----+---+----------+---------+
|             DateSeq|  FullDate| DateKey|Year|Quarter|Month|Day|WeekOfYear|IsWeekend|
+--------------------+----------+--------+----+-------+-----+---+----------+---------+
|[2011-05-31, 2011...|2011-05-31|20110531|2011|      2|    5| 31|        22|        0|
|[2011-05-31, 2011...|2011-06-01|20110601|2011|      2|    6|  1|        22|        0|
|[2011-05-31, 2011...|2011-06-02|20110602|2011|      2|    6|  2|        22|        0|
|[2011-05-31, 2011...|2011-06-03|20110603|2011|      2|    6|  3|        22|        0|
|[2011-05-31, 2011...|2011-06-04|20110604|2011|      2|    6|  4|        22|        1|
|[2011-05-31, 2011...|2011-06-05|20110605|2011|      2|    6|  5|        22|        1|
|[2011-05-31, 2011...|2011-06-06|20110606|2011|      2|    6|  6|        23|        0|
|[2011-05-31, 2011...|2011-06-07|20110607|2011|      2|    6|  7|        23|        0|
|[