# 🚀 02_10_EnvironmentalSystems_Data_Generation.ipynb

This notebook generates **synthetic environmental and external weather data** for the Databricks AeroDemo pipeline.
It creates aircraft-level records capturing environmental impacts relevant to operations and component stress.

---

### 📋 What this notebook does:

✅ **Sets up aircraft fleet**
- Uses the same fleet as earlier generators, one aircraft per type: `A320_101`, `A330_201`, `B737_301`

✅ **Generates environmental records**
- For each aircraft (100 records by default, timestamped at random within the past 7 days):
  - Simulates **external temperature** and **humidity**
  - Captures **wind speed** and **crosswind impact**
  - Includes **precipitation type** (clear, rain, snow, storm) and **atmospheric pressure**
  - Adds random extreme events (turbulence, severe wind, thunderstorm)
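
The fields listed above can also be drawn in one vectorized pass. The following is a minimal sketch, not the notebook's own implementation: it assumes a seeded `numpy.random.default_rng` so that runs are reproducible (the notebook's generator is unseeded), and the helper name `sample_environment` is hypothetical. The value ranges and category lists are copied from the generator in the code cell.

```python
import numpy as np
import pandas as pd

def sample_environment(n, seed=42):
    """Draw n synthetic readings with the same ranges as the notebook's generator."""
    rng = np.random.default_rng(seed)  # assumption: fixed seed for repeatable output
    return pd.DataFrame({
        "external_temperature": rng.uniform(-60, 40, n).round(2),    # °C
        "humidity": rng.uniform(0, 100, n).round(2),                 # percentage
        "wind_speed": rng.uniform(0, 150, n).round(2),               # knots
        "crosswind_impact": rng.uniform(0, 30, n).round(2),
        "precipitation": rng.choice(["CLEAR", "RAIN", "SNOW", "STORM"], n),
        "atmospheric_pressure": rng.uniform(900, 1100, n).round(2),  # hPa
        "extreme_event": rng.choice(
            ["NONE", "TURBULENCE", "SEVERE_WIND", "THUNDERSTORM"], n
        ),
    })

sample = sample_environment(5)
print(sample.head())
```

Calling `sample_environment` twice with the same seed should return identical frames, which can make downstream tests easier to write.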

✅ **Creates a structured dataset**
- Writes the records to `environmental_systems_sample.csv` under the Auto Loader path (`/Volumes/arao/aerodemo/tmp/environmental`)

✅ **Saves output safely**
- Removes the previous sample file, if one exists, before writing the new one
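
The notebook deletes a single, fixed file name. If several sample files could accumulate in the output folder, a pattern-based cleanup is one option. This is a sketch under stated assumptions: the helper `clear_old_samples` and the `environmental_systems_*.csv` pattern are hypothetical, and the demonstration runs in a temporary directory rather than the real volume path.

```python
import glob
import os
import tempfile

def clear_old_samples(directory, pattern="environmental_systems_*.csv"):
    """Delete files in `directory` that match `pattern`; return the removed paths."""
    removed = []
    for path in sorted(glob.glob(os.path.join(directory, pattern))):
        os.remove(path)
        removed.append(path)
    return removed

# Demonstration in a throwaway directory, not the Auto Loader path.
with tempfile.TemporaryDirectory() as tmp:
    names = (
        "environmental_systems_sample.csv",
        "environmental_systems_old.csv",
        "notes.txt",
    )
    for name in names:
        with open(os.path.join(tmp, name), "w") as fh:
            fh.write("placeholder\n")
    removed = clear_old_samples(tmp)
    remaining = sorted(os.listdir(tmp))

print(f"removed {len(removed)} file(s); remaining: {remaining}")
```

Files that do not match the pattern are left untouched, so unrelated content in the same folder is not affected.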

---

### 🛠 Key points:
- **Focus area:** Environmental and weather impacts on aircraft systems
- **Data diversity:** Includes normal, adverse, and extreme weather conditions; each field is sampled uniformly and independently, so combinations are not physically correlated
- **Downstream integration:** Feeds DLT tables like `twin_environmental_systems` and operational risk analysis
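
Because these records feed downstream tables, a quick range check before saving can catch generator regressions early. The following is a minimal sketch, assuming the column names, value ranges, and category lists used by the generator in this notebook; `validate_environmental` is a hypothetical helper and not part of the pipeline.

```python
import pandas as pd

# Expected bounds, copied from the generator's uniform ranges.
NUMERIC_BOUNDS = {
    "external_temperature": (-60, 40),
    "humidity": (0, 100),
    "wind_speed": (0, 150),
    "crosswind_impact": (0, 30),
    "atmospheric_pressure": (900, 1100),
}
ALLOWED = {
    "precipitation": {"CLEAR", "RAIN", "SNOW", "STORM"},
    "extreme_event": {"NONE", "TURBULENCE", "SEVERE_WIND", "THUNDERSTORM"},
}

def validate_environmental(df):
    """Return a list of human-readable problems; an empty list means the frame passed."""
    problems = []
    for col, (low, high) in NUMERIC_BOUNDS.items():
        # between() is inclusive on both ends; NaN values count as failures.
        if not df[col].between(low, high).all():
            problems.append(f"{col} outside [{low}, {high}]")
    for col, allowed in ALLOWED.items():
        if not set(df[col]).issubset(allowed):
            problems.append(f"{col} has unexpected values")
    return problems

good = pd.DataFrame({
    "external_temperature": [12.5], "humidity": [40.0], "wind_speed": [20.0],
    "crosswind_impact": [5.0], "atmospheric_pressure": [1013.25],
    "precipitation": ["RAIN"], "extreme_event": ["NONE"],
})
bad = good.assign(humidity=[140.0])

print(validate_environmental(good))
print(validate_environmental(bad))
```

Returning a list of problems rather than raising on the first one makes it easier to report every failing column at once.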

---

### 🔗 Where this fits:
This notebook is part of the **02_ series** synthetic data generators:
- `02_01_Sensor_Data_Generation.ipynb`
- `02_02_Engine_Data_Generation.ipynb`
- `02_03_CabinPressurization_Data_Generation.ipynb`
- `02_04_Airframe_Synthetic_Data_Generation.ipynb`
- `02_05_LandingGear_Data_Generation.ipynb`
- `02_06_Avionics_Data_Generation.ipynb`
- `02_07_ElectricalSystems_Data_Generation.ipynb`
- `02_08_FuelSystems_Data_Generation.ipynb`
- `02_09_HydraulicSystems_Data_Generation.ipynb`
- `02_10_EnvironmentalSystems_Data_Generation.ipynb` (this file)

In [0]:
import os
import numpy as np
import pandas as pd
from datetime import datetime, timedelta
import random

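def generate_environmental_systems_data(num_records_per_aircraft=100):
    """Return a DataFrame of synthetic environmental records, num_records_per_aircraft rows per aircraft."""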
    aircraft_ids = ["A320_101", "A330_201", "B737_301"]
    precipitation_options = ['CLEAR', 'RAIN', 'SNOW', 'STORM']
    extreme_events = ['NONE', 'TURBULENCE', 'SEVERE_WIND', 'THUNDERSTORM']

    data = {
        'environmental_id': [],
        'aircraft_id': [],
        'event_timestamp': [],
        'external_temperature': [],
        'humidity': [],
        'wind_speed': [],
        'crosswind_impact': [],
        'precipitation': [],
        'atmospheric_pressure': [],
        'extreme_event': []
    }

    for aircraft_id in aircraft_ids:
        for i in range(num_records_per_aircraft):
            random_days_ago = random.randint(0, 6)
            random_time = datetime.now() - timedelta(
                days=random_days_ago,
                hours=random.randint(0, 23),
                minutes=random.randint(0, 59)
            )
            data['environmental_id'].append(f'EN_{aircraft_id}_{i:03d}')
            data['aircraft_id'].append(aircraft_id)
            data['event_timestamp'].append(random_time.strftime("%Y-%m-%d %H:%M:%S"))
            data['external_temperature'].append(round(np.random.uniform(-60, 40), 2))  # °C
            data['humidity'].append(round(np.random.uniform(0, 100), 2))  # percentage
            data['wind_speed'].append(round(np.random.uniform(0, 150), 2))  # knots
            data['crosswind_impact'].append(round(np.random.uniform(0, 30), 2))  # degrees
            data['precipitation'].append(random.choice(precipitation_options))
            data['atmospheric_pressure'].append(round(np.random.uniform(900, 1100), 2))  # hPa
            data['extreme_event'].append(random.choice(extreme_events))

    return pd.DataFrame(data)

# Generate DataFrame
df = generate_environmental_systems_data()

# Save to Auto Loader-compatible path
output_path = "/Volumes/arao/aerodemo/tmp/environmental"
os.makedirs(output_path, exist_ok=True)
output_file = f"{output_path}/environmental_systems_sample.csv"
if os.path.exists(output_file):
    os.remove(output_file)  # remove the previous sample so only the fresh file remains

df.to_csv(output_file, index=False)

print(f"✅ Environmental systems data generated: {len(df)} rows saved to {output_file}")