# 📈 05_Model_Inference.ipynb

This notebook demonstrates how to use the trained and registered aircraft anomaly prediction model for inference.

It covers:
- Loading the model by version number and by alias (the alias is the recommended approach)
- Predicting anomaly likelihood on new sensor feature data
- Writing high-risk predictions to the `anomaly_alerts` Delta table

This completes the end-to-end machine learning lifecycle, from training through scoring new data.


In [0]:
import mlflow
import pandas as pd
from pyspark.sql.functions import col
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

## 🔢 Load model using version number

This method explicitly loads a specific version of the model registered in Unity Catalog.

In [0]:
model_uri_version = "models:/AircraftAnomalyPredictor/2"
model_v2 = mlflow.pyfunc.load_model(model_uri_version)
print("✅ Loaded model version 2")

## 🏷️ Load model using alias

This preferred method loads whichever model version currently carries the `champion` alias, so a new version can be promoted by reassigning the alias, with no code changes.

In [0]:
model_uri_alias = "models:/AircraftAnomalyPredictor@champion"
model_champion = mlflow.pyfunc.load_model(model_uri_alias)
print("✅ Loaded model with alias @champion")
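
The two URI forms used above differ only in their suffix: `/<version>` pins a version, `@<alias>` follows an alias. A minimal sketch of a helper that builds either form is shown below; `build_model_uri` is a hypothetical name introduced for illustration and is not part of MLflow.

```python
from typing import Optional, Union


def build_model_uri(
    name: str,
    version: Optional[Union[int, str]] = None,
    alias: Optional[str] = None,
) -> str:
    """Return a `models:/` URI for either a pinned version or an alias.

    Hypothetical helper: it only formats the string and does not contact
    the model registry.
    """
    # Exactly one of the two selectors must be given.
    if (version is None) == (alias is None):
        raise ValueError("Pass exactly one of `version` or `alias`.")
    if version is not None:
        return f"models:/{name}/{version}"
    return f"models:/{name}@{alias}"


pinned_uri = build_model_uri("AircraftAnomalyPredictor", version=2)
champion_uri = build_model_uri("AircraftAnomalyPredictor", alias="champion")
print(pinned_uri)
print(champion_uri)
```

Either string can then be passed to `mlflow.pyfunc.load_model`, as in the cells above.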

## 🧪 Prepare input features for prediction

The sample below must match the schema used during model training, including the engineered features.

In [0]:
sample_data = pd.DataFrame([{
    "engine_temp": 620.5,
    "fuel_efficiency": 73.2,
    "vibration": 5.1,
    "altitude": 30500.0,
    "airspeed": 455.5,
    "oil_pressure": 60.0,
    "engine_rpm": 3900,
    "battery_voltage": 24.8,
    "prev_anomaly": 1.0
}])
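
A schema mismatch is easier to diagnose before calling `predict` than from the error it raises. The sketch below compares one input record against an expected column list. `EXPECTED_COLUMNS` and `compare_columns` are hypothetical names, and the column list is assumed from the feature names used elsewhere in this notebook. The authoritative schema is the signature logged with the model, which should be readable from the loaded pyfunc model's `metadata.get_input_schema()`.

```python
# Assumed column list, copied from the feature names used in this notebook.
# The model's logged signature is the source of truth.
EXPECTED_COLUMNS = [
    "engine_temp", "fuel_efficiency", "vibration", "altitude", "airspeed",
    "oil_pressure", "engine_rpm", "battery_voltage", "prev_anomaly",
    "days_since_maint", "avg_engine_temp_7d", "avg_vibration_7d", "avg_rpm_7d",
]


def compare_columns(record, expected):
    """Return (missing, unexpected) column names for a single input record."""
    present = set(record)
    # Keep `missing` in the expected order so the report is easy to read.
    missing = [c for c in expected if c not in present]
    unexpected = sorted(present - set(expected))
    return missing, unexpected


# Illustrative record holding only a few raw sensor readings.
raw_only = {"engine_temp": 620.5, "vibration": 5.1, "engine_rpm": 3900}
missing, unexpected = compare_columns(raw_only, EXPECTED_COLUMNS)
print("missing:", missing)
print("unexpected:", unexpected)
```

Running a check like this on each record before building the DataFrame shows which engineered features still need to be computed.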

## 🔍 Run inference using the Registered Model Version

This cell runs inference using a **specific registered model version** (`version 7`) from Unity Catalog.
Pinning a version number gives you full control over which model runs, which is especially useful for repeatable experiments.

In [0]:
import pandas as pd
import numpy as np
import mlflow.pyfunc

# 🧪 Load the pinned version through the pyfunc interface (reliable for inference)
model_uri = "models:/AircraftAnomalyPredictor/7"
model_pinned = mlflow.pyfunc.load_model(model_uri)

# ✅ Prepare sample input (engine_rpm cast to int32)
sample_data = pd.DataFrame([{
    "engine_temp": 612.5,
    "fuel_efficiency": 76.0,
    "vibration": 5.1,
    "altitude": 31000.0,
    "airspeed": 460.0,
    "oil_pressure": 58.5,
    "engine_rpm": np.int32(3900),
    "battery_voltage": 25.0,
    "prev_anomaly": 0.0,
    "days_since_maint": 20,
    "avg_engine_temp_7d": 608.3,
    "avg_vibration_7d": 5.05,
    "avg_rpm_7d": 3850
}])

# ✅ Predict with the pinned model
prediction = model_pinned.predict(sample_data)
print("Predicted Anomaly (0 = Normal, 1 = Anomalous):")
print(prediction)

## 🔍 Run inference using the `@champion` alias

Scoring through the alias uses whichever version currently holds `@champion`, that is, the most recently promoted model version.

In [0]:
import pandas as pd
import numpy as np
import mlflow.pyfunc

# 🧪 Load the model through the @champion alias
model_uri = "models:/AircraftAnomalyPredictor@champion"
model = mlflow.pyfunc.load_model(model_uri)

# ✅ Define sample input
sample_data = pd.DataFrame([{
    "engine_temp": "612.5",
    "fuel_efficiency": "76.0",
    "vibration": "5.1",
    "altitude": "31000.0",
    "airspeed": "460.0",
    "oil_pressure": "58.5",
    "engine_rpm": "3900",
    "battery_voltage": "25.0",
    "prev_anomaly": "0.0",  # optional
    "avg_engine_temp_7d": 608.3,
    "avg_vibration_7d": 5.05,
    "avg_rpm_7d": 3850.0,
    "days_since_maint": np.int32(20)  # 👈 force int32
}])

# ✅ Explicitly cast column types
sample_data = sample_data.astype({
    "engine_temp": str,
    "fuel_efficiency": str,
    "vibration": str,
    "altitude": str,
    "airspeed": str,
    "oil_pressure": str,
    "engine_rpm": str,
    "battery_voltage": str,
    "prev_anomaly": str,
    "avg_engine_temp_7d": float,
    "avg_vibration_7d": float,
    "avg_rpm_7d": float,
    "days_since_maint": np.int32  # 👈 required fix
})

# ✅ Run prediction
prediction = model.predict(sample_data)
print("Predicted Anomaly (0 = Normal, 1 = Anomalous):")
print(prediction)

## 💾 Save inference results to Delta table

This allows downstream applications or alerts to monitor high-risk events.

In [0]:
inference_df = sample_data.copy()
inference_df["predicted_anomaly"] = prediction  # output of the predict call above
spark_df = spark.createDataFrame(inference_df)

# Overwrite for demo; use append in production
spark_df.write.format("delta").mode("overwrite").saveAsTable("arao.aerodemo.anomaly_alerts")
print("✅ Inference results written to 'anomaly_alerts' Delta table")
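
The cell above writes every scored row, while the goal stated at the top of the notebook is to record high-risk predictions. A minimal sketch of that filtering step in plain Python follows. It assumes label `1` means anomalous, as in the printed legend, and `select_alerts` is a hypothetical helper name.

```python
def select_alerts(rows, predictions, alert_label=1):
    """Pair each input row with its prediction and keep only flagged rows."""
    if len(rows) != len(predictions):
        raise ValueError("rows and predictions must have the same length")
    alerts = []
    for row, label in zip(rows, predictions):
        # int() also accepts NumPy integer scalars returned by a model.
        if int(label) == alert_label:
            alerts.append({**row, "predicted_anomaly": int(label)})
    return alerts


rows = [
    {"engine_temp": 612.5, "vibration": 5.1},
    {"engine_temp": 655.0, "vibration": 7.9},
]
predictions = [0, 1]  # illustrative labels, not real model output
alerts = select_alerts(rows, predictions)
print(alerts)
```

With a pandas DataFrame, the equivalent is a boolean mask on the `predicted_anomaly` column, applied before the frame is converted to Spark and written.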