# Gold Marts: NASA Predictive Maintenance

## Purpose
Engineering analytics marts for monitoring turbofan engine health.

## Source
- Gold: `fact_nasa_engines`
- Silver: `nasa_turbofan_train` (for trends)

## Mart Tables
1. **mart_engine_degradation** - Degradation patterns by dataset
2. **mart_sensor_health** - Sensor anomaly detection

**Author:** Kevin  
**Date:** Feb 9, 2026



In [0]:
# Note: this import shadows the Python built-ins max and min for the rest of the notebook.
from pyspark.sql.functions import col, avg, stddev, max, min, count, current_timestamp, round as spark_round

storage_account_name = "stgolistmigration"
account_key = ""  # Left blank here; supply the key at run time rather than committing it.

spark.conf.set(
    f"fs.azure.account.key.{storage_account_name}.dfs.core.windows.net",
    account_key
)

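def get_silver_path(table):
    """Return the abfss:// path of a table in the silver container."""
    return f"abfss://silver@{storage_account_name}.dfs.core.windows.net/{table}/"

def get_gold_path(table):
    """Return the abfss:// path of a table in the gold container."""
    return f"abfss://gold@{storage_account_name}.dfs.core.windows.net/{table}/"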

print("✅ Config loaded")


✅ Config loaded
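
The config cell above leaves `account_key` as an empty string. One way to avoid hardcoding it is to read it from the environment at run time. This is a minimal sketch, assuming a hypothetical environment variable named `STORAGE_ACCOUNT_KEY` and a hypothetical helper `load_account_key`, neither of which is part of the original notebook. A managed secret store would serve the same purpose.

```python
import os

# Hypothetical variable name, chosen for illustration only.
KEY_ENV_VAR = "STORAGE_ACCOUNT_KEY"


def load_account_key(env=os.environ, var_name=KEY_ENV_VAR):
    """Return the storage account key from the environment, failing fast if it is absent."""
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"{var_name} is not set; refusing to configure storage access")
    return key
```

With this in place, `account_key = load_account_key()` could replace the empty string literal before the `spark.conf.set` call.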


In [0]:
print("📖 Loading Gold tables...")

fact_engines = spark.read.format("delta").load(get_gold_path("fact_nasa_engines"))
print(f"✅ fact_nasa_engines: {fact_engines.count():,}")


📖 Loading Gold tables...
✅ fact_nasa_engines: 709
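
The next cell groups `fact_nasa_engines` by `dataset_name` and summarizes `total_cycles`. As a rough plain-Python illustration of what those lifetime aggregates compute, the sketch below uses made-up rows, not data from the table, and a hypothetical helper `summarize_lifetimes`. It uses `statistics.stdev` because Spark's `stddev` is the sample standard deviation (an alias of `stddev_samp`). Rounding is omitted. Run it outside the notebook session, since the notebook's import shadows the built-in `min` and `max`.

```python
from collections import defaultdict
from statistics import mean, stdev

# Made-up (dataset_name, total_cycles) rows, for illustration only.
rows = [("FD001", 190), ("FD001", 210), ("FD001", 230), ("FD002", 150), ("FD002", 170)]


def summarize_lifetimes(rows):
    """Group rows by dataset and compute count, mean, min, max, and sample stddev."""
    grouped = defaultdict(list)
    for dataset_name, total_cycles in rows:
        grouped[dataset_name].append(total_cycles)

    summary = {}
    for dataset_name in sorted(grouped):  # mirrors orderBy("dataset_name")
        cycles = grouped[dataset_name]
        summary[dataset_name] = {
            "engine_count": len(cycles),
            "avg_lifetime_cycles": mean(cycles),
            "min_lifetime_cycles": min(cycles),
            "max_lifetime_cycles": max(cycles),
            # Sample standard deviation (n - 1 denominator); needs at least two rows.
            "stddev_lifetime_cycles": stdev(cycles),
        }
    return summary


summary = summarize_lifetimes(rows)
```

Each key of `summary` corresponds to one output row of the mart, restricted to the lifetime columns.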


In [0]:
print("📊 Building mart_engine_degradation...")

# One summary row per dataset: engine count, lifetime statistics, and mean sensor readings.
mart_degradation = (
    fact_engines
    .groupBy("dataset_name")
    .agg(
        count("*").alias("engine_count"),
        avg("total_cycles").alias("avg_lifetime_cycles"),
        min("total_cycles").alias("min_lifetime_cycles"),
        max("total_cycles").alias("max_lifetime_cycles"),
        stddev("total_cycles").alias("stddev_lifetime_cycles"),  # sample standard deviation
        avg("sensor_1_avg").alias("avg_fan_temp"),
        avg("sensor_7_avg").alias("avg_hpc_pressure"),
        avg("sensor_11_avg").alias("avg_static_pressure"),
        avg("sensor_1_stddev").alias("avg_fan_temp_variation"),
        avg("sensor_7_stddev").alias("avg_pressure_variation")
    )
    # Round selected columns; stddev_lifetime_cycles, avg_static_pressure, and
    # avg_pressure_variation are left at full precision.
    .withColumn("avg_lifetime_cycles", spark_round(col("avg_lifetime_cycles"), 0))
    .withColumn("avg_fan_temp", spark_round(col("avg_fan_temp"), 2))
    .withColumn("avg_hpc_pressure", spark_round(col("avg_hpc_pressure"), 2))
    .withColumn("avg_fan_temp_variation", spark_round(col("avg_fan_temp_variation"), 2))
    .withColumn("mart_created_at", current_timestamp())
    .orderBy("dataset_name")
)

print(f"✅ Created: {mart_degradation.count()} dataset summaries")
mart_degradation.show(truncate=False)

# Write the mart to the gold container as a Delta table, replacing any previous version
mart_degradation.write.format("delta").mode("overwrite").save(get_gold_path("mart_engine_degradation"))
print("💾 Saved to: mart_engine_degradation")

print("\n🎉 NASA Predictive Maintenance Marts Complete!")


📊 Building mart_engine_degradation...
✅ Created: 4 dataset summaries
+------------+------------+-------------------+-------------------+-------------------+----------------------+------------+----------------+-------------------+----------------------+----------------------+--------------------------+
|dataset_name|engine_count|avg_lifetime_cycles|min_lifetime_cycles|max_lifetime_cycles|stddev_lifetime_cycles|avg_fan_temp|avg_hpc_pressure|avg_static_pressure|avg_fan_temp_variation|avg_pressure_variation|mart_created_at           |
+------------+------------+-------------------+-------------------+-------------------+----------------------+------------+----------------+-------------------+----------------------+----------------------+--------------------------+
|FD001       |100         |206.0              |128                |362                |46.34274920675729     |1590.61     |47.55           |8.443000000000005  |5.79                  |0.23739999999999997   |2026-02-09 13:34:35.138