# Ensemble Classification Pipeline Example

This notebook demonstrates how to use the trained domain classification pipeline to make predictions.  
It supports both label prediction and probability estimation, with optional SHAP explanations.  
The first section classifies a small batch of domains interactively;  
the second computes performance metrics across the entire test dataset.


In [None]:
# Import necessary modules
import sys

from core.validator import load_saved_split, load_train_split, load_random_sample
from pipeline import DomainClassifier


### Set label and dataset

In [None]:
MALICIOUS_LABEL = "phishing"  # phishing / malware
STAGE = 3                     # 1 / 2 / 3
VERIFICATION = True           # True / False, use the verification dataset or the validation dataset

### Classification pipeline demonstration

In [None]:


# Load the saved split (verification or validation, depending on VERIFICATION)
x_test, y_test = load_saved_split(STAGE, MALICIOUS_LABEL, folder="./data/", verification=VERIFICATION)

# Initialize classifier (a separate instance name avoids shadowing the DomainClassifier class,
# which would break this cell when it is re-run)
classifier = DomainClassifier(data_sample=x_test, label=MALICIOUS_LABEL)
classifier.determine_stage(x_test)

# Initialize confusion matrix counters
fp, fn, tp, tn = 0, 0, 0, 0

# Classify domains one by one
for domain, expected_label in zip(x_test, y_test):
    # Get the final predicted probability
    final_proba = classifier.classify_proba(domain)['final_proba']
    print(f"Final probability: {final_proba} Expected: {expected_label}")
    input()  # Pause until Enter is pressed; remove for non-interactive runs
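
The confusion matrix counters initialized above are not updated inside the loop. The following is a minimal sketch of how such counters could be filled from predicted probabilities. It assumes that `final_proba` is a float in [0, 1] giving the probability of the malicious class and that expected labels are encoded as 1 (malicious) and 0 (benign). The helper name `confusion_counts` and the 0.5 threshold are hypothetical choices for illustration, not part of the pipeline.

```python
from typing import Iterable, Tuple


def confusion_counts(
    probas: Iterable[float],
    labels: Iterable[int],
    threshold: float = 0.5,  # assumed decision threshold, not taken from the pipeline
) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) after thresholding each probability."""
    tp = fp = tn = fn = 0
    for proba, label in zip(probas, labels):
        predicted = 1 if proba >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1  # malicious, correctly flagged
        elif predicted == 1 and label == 0:
            fp += 1  # benign, wrongly flagged
        elif predicted == 0 and label == 0:
            tn += 1  # benign, correctly passed
        else:
            fn += 1  # malicious, missed
    return tp, fp, tn, fn


# Toy values for illustration only; they are not drawn from the dataset.
toy_probas = [0.9, 0.2, 0.7, 0.4]
toy_labels = [1, 0, 0, 1]
tp, fp, tn, fn = confusion_counts(toy_probas, toy_labels)
print(f"TP={tp} FP={fp} TN={tn} FN={fn}")
```

In the notebook, the probabilities collected in the loop could be appended to a list and passed to this helper together with `y_test`, provided the label encoding matches the assumption above.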
