# Validate Model

Scores the trained model on the holdout test set, then prints a classification report and the ROC AUC score.

**Pipeline context:** This notebook is the missing "Validate" step. Add it to the pipeline visually in the **Elyra pipeline editor**, between Train Model and Upload Model.

**Quality gate:** If AUC falls below 0.7, the model should be retrained with different hyperparameters or more data.
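
The gate cell in this notebook only prints a warning, so the run still completes when AUC is low. One possible way to make the gate binding is to raise an exception instead. The sketch below assumes the pipeline runner treats an uncaught exception in a notebook as a failed step. `QualityGateError` and `enforce_quality_gate` are hypothetical names and not part of this notebook.

```python
class QualityGateError(RuntimeError):
    """Raised when a model metric misses its minimum threshold (hypothetical helper)."""


def enforce_quality_gate(auc: float, threshold: float = 0.7) -> str:
    """Return a quality label, or raise if AUC is below the threshold."""
    if auc < threshold:
        # An uncaught exception stops the notebook run; this is assumed
        # to mark the pipeline step as failed so later steps do not run.
        raise QualityGateError(
            f"AUC {auc:.4f} is below the {threshold} threshold; retrain before uploading"
        )
    return "GOOD" if auc >= 0.9 else "ACCEPTABLE"


# Example calls with made-up AUC values
print(enforce_quality_gate(0.93))
print(enforce_quality_gate(0.75))
```

Calling `enforce_quality_gate(auc)` in the last cell would replace the print-only check with one that fails loudly.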

In [None]:
import numpy as np
import pickle
from sklearn.metrics import roc_auc_score, classification_report

In [None]:
# Load the trained model and test data from the previous pipeline step
MODEL_PATH = "model.pkl"
TEST_DATA_PATH = "test_data.npz"

print("Loading trained model...")
with open(MODEL_PATH, "rb") as f:
    clf = pickle.load(f)

print("Loading test data...")
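# np.load returns a lazy NpzFile; the context manager closes the file once the arrays are read
with np.load(TEST_DATA_PATH) as test_data:
    X_test, y_test = test_data["X_test"], test_data["y_test"]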

print(f"Test samples: {len(X_test)}")

In [None]:
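# Run predictions and compute metrics
y_pred = clf.predict(X_test)
# Column 1 holds the probability of clf.classes_[1], which is Fraud if labels are encoded 0/1
y_prob = clf.predict_proba(X_test)[:, 1]
# ROC AUC is computed from the probabilities, not the hard predictions
auc = roc_auc_score(y_test, y_prob)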

print(classification_report(y_test, y_pred, target_names=["Legitimate", "Fraud"]))
print(f"ROC AUC Score: {auc:.4f}")
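
To illustrate what the ROC AUC number means, here is a minimal pure-Python sketch of its pairwise interpretation: the fraction of (positive, negative) pairs in which the positive sample gets the higher score, with ties counted as half. `pairwise_auc` is a hypothetical helper and the labels and scores are made up. The notebook itself relies on `roc_auc_score`.

```python
def pairwise_auc(y_true, y_score):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC is undefined unless both classes are present")
    # Compare every positive score against every negative score
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = Fraud) and scores, for illustration only
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(pairwise_auc(labels, scores))  # 0.75
```

This version is quadratic in the number of samples, so it is only suitable for building intuition on small inputs.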

In [None]:
# Quality gate (prints a verdict only; nothing is raised, so the notebook still completes)
if auc < 0.7:
    print("WARNING: AUC below 0.7 threshold -- model may need retraining")
elif auc < 0.9:
    print("Model quality: ACCEPTABLE (0.7 <= AUC < 0.9)")
else:
    print("Model quality: GOOD (AUC >= 0.9)")