# Results

This notebook loads three trained ensemble models, runs them on new data, and evaluates their predictions with four metrics: accuracy, recall, precision, and F1 score. Each ensemble combines the outputs of its individual learners into a final prediction, and the metrics summarize how well those final predictions match the true labels of the given dataset.


## Import Required Libraries

First, we import the required libraries: NumPy for array handling, joblib for loading the saved models, and scikit-learn for the ensemble class and the metric functions. We also suppress warnings to keep the notebook output readable. Note that this hides every warning, including the one scikit-learn raises when a model is unpickled under a different library version than the one it was saved with.

In [None]:
import warnings
warnings.filterwarnings('ignore')
from sklearn.ensemble import VotingClassifier
from sklearn.metrics import accuracy_score, recall_score, precision_score, f1_score
import joblib
import numpy as np


## Load Ensemble Models

Here, we load the pre-trained ensemble models from their respective files. The models were saved earlier with joblib, which serializes Python objects to disk and restores them. Ensure that the paths to the model files ('ensemble-hard.pkl', 'ensemble-soft.pkl', 'ensemble-stacking.pkl') are correct and accessible.

In [None]:
hard_ensemble = joblib.load('ensemble-hard.pkl')
soft_ensemble = joblib.load('ensemble-soft.pkl')
stack_ensemble = joblib.load('ensemble-stacking.pkl')

ensembles = [hard_ensemble, soft_ensemble, stack_ensemble]
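
Before loading, it can help to confirm that the model files are actually present, since `joblib.load` fails on a missing path. The following is a minimal sketch using only the standard library. The helper name `find_missing_models` and the throwaway demo directory are assumptions made for illustration; only the three file names come from the cell above.

```python
from pathlib import Path
import tempfile

# File names taken from the loading cell above.
MODEL_FILES = ["ensemble-hard.pkl", "ensemble-soft.pkl", "ensemble-stacking.pkl"]


def find_missing_models(model_dir, file_names=MODEL_FILES):
    """Return the file names that are not regular files inside model_dir."""
    base = Path(model_dir)
    return [name for name in file_names if not (base / name).is_file()]


# Demonstration in a temporary directory that contains only one of the files.
with tempfile.TemporaryDirectory() as tmp_dir:
    (Path(tmp_dir) / "ensemble-hard.pkl").write_bytes(b"")
    missing_in_demo = find_missing_models(tmp_dir)

print(missing_in_demo)  # ['ensemble-soft.pkl', 'ensemble-stacking.pkl']
```

In the notebook itself, calling `find_missing_models(".")` before the loading cell would list any files that still need to be copied into the working directory.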

## Define Input Data and True Labels

In this cell, we define the input data to predict on: a list of text samples. We also specify the true label of each sample, in the same order as the samples, because the metrics below compare predictions against these labels. Replace the placeholder labels with the actual labels for your input data.

In [None]:
quotes = [
    "Gago ka putang ina", 
    'OIDAjodiajisjdai', 
    "You are a fucking bitch", 
    "NAKO  NAHIYA  YUNG  KAPAL  NG  PERA  NI  BINAY ", 
    "fuck you binay gago ka",
]

true_labels = np.array([1, 0, 1, 1, 1])  # Example placeholder labels

## Predictions and Metrics Evaluation

For each of the loaded ensemble models, we perform the following steps:
- Obtain final predictions for the input data. For a hard voting classifier, we call the 'predict' method directly. For a soft voting classifier or any other ensemble type, we call 'predict_proba' and take the class with the highest probability for each sample.
- Evaluate the predictions using four metrics. Accuracy is the share of all samples classified correctly. Recall is the share of actual positives that the model found, and precision is the share of predicted positives that are correct. The F1 score is the harmonic mean of precision and recall, so it is high only when both are high.
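
To make the difference between the two prediction paths concrete, here is a minimal sketch in plain Python. It assumes three hypothetical base learners that each output a probability for class 1; the numbers are invented for illustration and do not come from the loaded models. Hard voting thresholds each learner first and counts votes, while soft voting averages the probabilities before thresholding.

```python
# Hypothetical class-1 probabilities: one row per learner, one column per sample.
learner_probs = [
    [0.90, 0.20],
    [0.40, 0.30],
    [0.40, 0.60],
]


def hard_vote(probs):
    """Each learner votes with its own label; a strict majority of 1-votes wins."""
    n_samples = len(probs[0])
    labels = []
    for j in range(n_samples):
        votes = [int(row[j] > 0.5) for row in probs]
        labels.append(int(sum(votes) * 2 > len(votes)))
    return labels


def soft_vote(probs):
    """Average the learners' probabilities, then threshold the mean."""
    n_samples = len(probs[0])
    labels = []
    for j in range(n_samples):
        mean_prob = sum(row[j] for row in probs) / len(probs)
        labels.append(int(mean_prob > 0.5))
    return labels


hard_labels = hard_vote(learner_probs)
soft_labels = soft_vote(learner_probs)
print(hard_labels)  # [0, 0]
print(soft_labels)  # [1, 0]
```

The first sample shows how the two schemes can disagree: two of three learners vote for class 0, but one very confident learner pulls the average probability above 0.5.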

In [None]:
for idx, ensemble in enumerate(ensembles, start=1):
    print(f"=== Ensemble Model {idx} ===")
    
    # Final ensemble predictions
    if isinstance(ensemble, VotingClassifier) and ensemble.voting == 'hard':
        predictions = ensemble.predict(quotes)
    else:
        # Pick the highest-probability column and map it back to a class label;
        # a bare argmax equals the label only when the classes are exactly 0..n-1.
        predictions_proba = ensemble.predict_proba(quotes)
        predictions = ensemble.classes_[np.argmax(predictions_proba, axis=1)]
    
    # Metrics evaluation
    accuracy = accuracy_score(true_labels, predictions)
    recall = recall_score(true_labels, predictions)
    precision = precision_score(true_labels, predictions)
    f1 = f1_score(true_labels, predictions)
    
    print(f"Accuracy: {accuracy}")
    print(f"Recall: {recall}")
    print(f"Precision: {precision}")
    print(f"F1 Score: {f1}\n")
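
As a cross-check on the reported numbers, the four metrics can be computed by hand from the confusion counts. The sketch below reuses the placeholder labels from the input cell; the predictions in `y_pred` are hypothetical and only serve to make the arithmetic visible. It treats class 1 as the positive class, which is also the default for scikit-learn's binary metrics.

```python
y_true = [1, 0, 1, 1, 1]  # placeholder labels from the input cell
y_pred = [1, 1, 1, 0, 1]  # hypothetical predictions, for illustration only

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

accuracy = (tp + tn) / len(pairs)
precision = tp / (tp + fp) if (tp + fp) else 0.0
recall = tp / (tp + fn) if (tp + fn) else 0.0
f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0

print(tp, fp, fn, tn)                   # 3 1 1 0
print(accuracy, precision, recall, f1)  # 0.6 0.75 0.75 0.75
```

With only five samples, one of them negative, a single changed prediction shifts every metric noticeably, so results on this placeholder set are best read as a smoke test rather than a performance estimate.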