**Another way to look at evaluation: instead of splitting the data and predicting again, use the saved predictions**

In [34]:
import pandas as pd
import torch
import numpy as np
import torch.nn.functional as F
import os
import json  # Needed for loading the mappings
import ast  # Needed to parse the stringified logits safely

# === Define path to the saved run ===
save_path = r"C:\Users\corne\OneDrive - KU Leuven\Thesis\Working Code\SAVED-Models\GroNLP\Run_2025-04-10_21-40"

# ✅ Load the saved test predictions
df = pd.read_csv(os.path.join(save_path, "test_predictions.csv"))

# ✅ Recreate logits tensor from the CSV
# (ast.literal_eval parses the list strings without executing arbitrary code, unlike eval)
logits = torch.tensor(df["logits"].apply(ast.literal_eval).tolist())

# ✅ Apply softmax to get prediction probabilities
probabilities = F.softmax(logits, dim=1)

# ✅ Extract raw values
texts = df["text"].tolist()
true_labels_ids = df["true_label"].tolist()
predicted_label_ids = df["predicted_label"].tolist()

# ✅ Convert label IDs to themes using the mappings
with open(os.path.join(save_path, "label_mappings.json"), "r", encoding="utf-8") as f:
    mappings = json.load(f)

theme_to_id = mappings["theme_to_id"]
id_to_theme = {int(k): v for k, v in mappings["id_to_theme"].items()}  # convert keys back to int
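
The loading steps above can be checked on toy data without the saved run. This is a minimal sketch, assuming the `logits` column holds Python list literals such as `"[2.0, 0.0, -1.0]"`. All names and values here (`raw_cells`, `softmax`, `toy_id_to_theme`) are hypothetical, and the softmax is written in plain Python only to show what `F.softmax(logits, dim=1)` computes per row.

```python
import ast
import json
import math


def softmax(row):
    """Numerically stable softmax over one row of logits."""
    shift = max(row)
    exps = [math.exp(x - shift) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical cells as they would come back from the CSV (strings, not lists)
raw_cells = ["[2.0, 0.0, -1.0]", "[0.1, 0.2, 0.3]"]

# literal_eval only accepts Python literals, so it cannot run arbitrary code
parsed = [ast.literal_eval(cell) for cell in raw_cells]

probs = [softmax(row) for row in parsed]
confs = [max(p) for p in probs]             # confidence = highest class probability
preds = [p.index(max(p)) for p in probs]    # predicted class id = argmax

# JSON object keys are always strings, so integer ids must be converted back
saved = json.dumps({"id_to_theme": {0: "Toerisme", 1: "Welzijn en Gezondheid"}})
loaded = json.loads(saved)["id_to_theme"]
toy_id_to_theme = {int(k): v for k, v in loaded.items()}

print(preds, [round(c, 3) for c in confs])
print(list(loaded.keys()), list(toy_id_to_theme.keys()))
```

The last line shows why the `int(k)` conversion is needed: after a JSON round trip the keys are `"0"` and `"1"`, so a lookup with an integer label id would fail without it.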


**Unknowns**

In [35]:
# Find most confidently wrong predictions
probs_np = probabilities.numpy()
confidences = probs_np.max(axis=1)

errors = []
for i in range(len(texts)):
    if true_labels_ids[i] != predicted_label_ids[i]:
        errors.append((confidences[i], texts[i], id_to_theme[true_labels_ids[i]], id_to_theme[predicted_label_ids[i]]))

# Sort by confidence descending
errors.sort(reverse=True)

# Show top 5
for confidence, text, true_theme, predicted_theme in errors[:5]:
    print(f"🧠 Confidence: {confidence:.2f}")
    print(f"❌ True: {true_theme} | Predicted: {predicted_theme}")
    print(f"💬 Text: {text}")
    print("-" * 50)


🧠 Confidence: 1.00
❌ True: Toerisme | Predicted: Onderwijs en Samenleving
💬 Text: Kan de minister het voorstel van de sector steunen om de vraag te richten tot de bevoegde federale instanties om in navolging van de zorgsector ook voor de toeristische sector de werkuren van jobstudenten in het derde en vierde kwartaal van 2022 te neutraliseren zodat deze uren niet meetellen voor de berekening van de 475 uren die elk jaar door jobstudenten gepresteerd mogen worden tegen solidariteitsbijdragen?
--------------------------------------------------
🧠 Confidence: 1.00
❌ True: Brussel en de Vlaamse Rand | Predicted: Welzijn en Gezondheid
💬 Text: Hoeveel van die gemiste kinderen hebben uiteindelijk een participatietoeslag gekregen?
--------------------------------------------------
🧠 Confidence: 1.00
❌ True: Welzijn en Gezondheid | Predicted: Onderwijs en Samenleving
💬 Text: Zal het makkelijker worden voor instellingen om toegang te krijgen tot de schooltoelage, teneinde 
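
Sorting the error tuples with `reverse=True` compares the whole tuple, so when two confidences are exactly equal the order falls back to comparing the texts. The following is a hedged sketch of a variant that sorts on confidence alone. The toy values are hypothetical and not taken from the saved run.

```python
# Hypothetical toy data standing in for the notebook's lists
toy_texts = ["q1", "q2", "q3", "q4"]
toy_true_ids = [0, 1, 1, 0]
toy_pred_ids = [0, 0, 1, 1]
toy_confidences = [0.99, 0.97, 0.80, 0.97]
toy_id_to_theme = {0: "Toerisme", 1: "Welzijn en Gezondheid"}

# Keep only the misclassified rows
toy_errors = [
    (conf, text, toy_id_to_theme[t], toy_id_to_theme[p])
    for conf, text, t, p in zip(toy_confidences, toy_texts, toy_true_ids, toy_pred_ids)
    if t != p
]

# Sort on confidence only; Python's sort is stable, so rows with equal
# confidence keep their original order instead of being ordered by text
toy_errors.sort(key=lambda e: e[0], reverse=True)

for conf, text, true_theme, pred_theme in toy_errors:
    print(f"{conf:.2f} | {text} | true={true_theme} | predicted={pred_theme}")
```

With real softmax outputs exact ties are rare, so this mainly matters when confidences have been rounded before sorting.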

In [36]:
true_labels = [id_to_theme[i] for i in true_labels_ids]
predicted_labels = [id_to_theme[i] for i in predicted_label_ids]
correct = [true == pred for true, pred in zip(true_labels, predicted_labels)]

output_df = pd.DataFrame({
    "text": texts,
    "true_label": true_labels,
    "predicted_label": predicted_labels,
    "is_correct": correct,
    "confidence": confidences
})

# ✅ Sort: incorrect first, then by highest confidence
output_df = output_df.sort_values(by=["is_correct", "confidence"], ascending=[True, False])

# ✅ Optional: add ranking
output_df["rank"] = range(1, len(output_df) + 1)

# ✅ Save to Excel
excel_path = os.path.join(os.getcwd(), "prediction_confidence_report.xlsx")
output_df.to_excel(excel_path, index=False)
print(f"✅ Sorted report saved to: {excel_path}")


✅ Sorted report saved to: c:\Users\corne\Documents\thesis-question-classification\ConfidenceHandling\prediction_confidence_report.xlsx
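
The ordering used for the report (incorrect rows first, then highest confidence first) works because `False` sorts before `True`. Below is a small sketch of the same ordering on hypothetical rows, written with plain Python instead of pandas so it can be checked in isolation.

```python
# Hypothetical rows mimicking the columns of the report
toy_rows = [
    {"text": "q1", "is_correct": True, "confidence": 0.99},
    {"text": "q2", "is_correct": False, "confidence": 0.70},
    {"text": "q3", "is_correct": False, "confidence": 0.95},
    {"text": "q4", "is_correct": True, "confidence": 0.60},
]

# is_correct ascending (False before True), confidence descending (negated key)
ranked = sorted(toy_rows, key=lambda r: (r["is_correct"], -r["confidence"]))

# Add a 1-based rank, as the report does
for rank, row in enumerate(ranked, start=1):
    row["rank"] = rank

order = [row["text"] for row in ranked]
print(order)
```

The most confident mistakes end up at the top of the sheet, which is where manual review of the report would normally start.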
