# Top-k Method For In-Depth Model Performance Evaluation

In [1]:
import sys
if '../' not in sys.path:
    sys.path.append('../')
import pandas as pd
pd.options.display.max_colwidth = 500
import joblib
from helper_functions import (
    format_raw_df,
    get_split_by_author,
    add_features,
    get_vectorised_series,
    get_feature_vectors_and_labels,
    get_top_k,
)



## Data Pre-Processing & Preparation

In [2]:
datascience_posts_df = pd.read_csv('../Data Science Posts.csv')
formatted_posts_df = format_raw_df(datascience_posts_df.copy())
questions_df = formatted_posts_df[formatted_posts_df['is_question']]
processed_df = add_features(questions_df.copy())
train_df, test_df = get_split_by_author(processed_df.copy(), test_size = 0.2)
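
The implementation of `get_split_by_author` is not shown in this notebook. As a rough illustration of the idea its name suggests (keeping every question from a given author on one side of the split, so the test set contains only unseen authors), a minimal sketch might look like the following. The function name `sketch_split_by_author`, the `OwnerUserId` column name, and the `seed` parameter are assumptions made for this example and may not match the real helper.

```python
import numpy as np
import pandas as pd


def sketch_split_by_author(df, author_col="OwnerUserId", test_size=0.2, seed=40):
    """Hypothetical author-grouped split; not the notebook's actual helper."""
    rng = np.random.default_rng(seed)
    # Shuffle the distinct authors, then reserve a fraction of them for testing.
    authors = rng.permutation(df[author_col].unique())
    n_test = max(1, int(round(len(authors) * test_size)))
    test_authors = authors[:n_test]
    # A row goes to the test set if and only if its author was reserved.
    in_test = df[author_col].isin(test_authors)
    return df[~in_test], df[in_test]
```

Splitting on authors rather than on rows means `test_size` is the share of authors held out, so the share of rows in the test set will only be approximately 0.2.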

### Loading The Fitted Vectoriser

In [3]:
vectoriser = joblib.load('../Models/Vectoriser_v1.pkl')
train_df['vectors'] = get_vectorised_series(train_df['full_text'], vectoriser)
test_df['vectors'] = get_vectorised_series(test_df['full_text'], vectoriser)

In [4]:
feature_columns = ['question_mark_full', 'question_word_full', 'action_verb_full', 'normalised_text_length']
train_features, train_labels = get_feature_vectors_and_labels(train_df, feature_columns)
test_features, test_labels = get_feature_vectors_and_labels(test_df, feature_columns)

### Loading The Pre-Trained Random Forest Classifier

In [5]:
randforest_classifier = joblib.load('../Models/Model_v1.pkl')

In [6]:
evaluation_df = test_df.copy()
evaluation_df['predicted_probabilities'] = randforest_classifier.predict_proba(test_features)[:, 1]
evaluation_df['actual_label'] = test_labels

In [7]:
columns_to_display = ['actual_label', 'predicted_probabilities', 'Score', 'Title', 'body_text',
                      'question_mark_full', 'question_word_full', 'action_verb_full', 'text_length']
best_positive, best_negative, worst_positive, worst_negative, most_uncertain = get_top_k(
    evaluation_df, 'actual_label', 'predicted_probabilities', k = 3)
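
`get_top_k` comes from `helper_functions` and its source is not shown here. Judging only from the five tables it returns below, it appears to rank examples within each true class by predicted probability and to pick the examples closest to 0.5 as the most uncertain. A minimal sketch under those assumptions follows; the name `sketch_top_k` and the `midpoint` parameter are hypothetical, and the real helper may use different selection rules or tie-breaking.

```python
import pandas as pd


def sketch_top_k(df, label_col, proba_col, k=3, midpoint=0.5):
    """Hypothetical re-creation of a top-k selection; assumes a unique index."""
    positives = df[df[label_col].astype(bool)]
    negatives = df[~df[label_col].astype(bool)]

    # Best: true positives scored highest, true negatives scored lowest.
    best_positive = positives.nlargest(k, proba_col)
    best_negative = negatives.nsmallest(k, proba_col)

    # Worst: true positives scored lowest, true negatives scored highest.
    worst_positive = positives.nsmallest(k, proba_col)
    worst_negative = negatives.nlargest(k, proba_col)

    # Most uncertain: predicted probability closest to the midpoint.
    distance = (df[proba_col] - midpoint).abs()
    most_uncertain = df.loc[distance.nsmallest(k).index]

    return best_positive, best_negative, worst_positive, worst_negative, most_uncertain
```

Under these assumptions the call would mirror the one above, for example `sketch_top_k(evaluation_df, 'actual_label', 'predicted_probabilities', k=3)`.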

## Top-k Best Performing Examples

In [8]:
best_positive[columns_to_display]

Id,actual_label,predicted_probabilities,Score,Title,body_text,question_mark_full,question_word_full,action_verb_full,text_length
11400,True,0.77,3,What is difference between Bayesian Networks and Belief Networks?,"While reading some articles about Bayesian Networks, I came across many occurrences of Belief Networks.\nDo both of these terms mean the same thing or is there any difference between Bayesian Networks and Belief Networks?\n",True,True,False,287
22408,True,0.75,5,What is the difference between C and lambda in the context of an SVM?,I don't understand the difference between the parameter $C$ and $\lambda$ in terms of the SVM. It seems to me that they are both involved in regulating over-fitting of the data. \nWhat difference between $C$ and $\lambda$?\n,True,True,False,292
36368,True,0.71,10,What is the difference between Perceptron and ADALINE?,What is the difference between Perceptron and ADALINE?\n,True,True,False,110


In [9]:
best_negative[columns_to_display]

Id,actual_label,predicted_probabilities,Score,Title,body_text,question_mark_full,question_word_full,action_verb_full,text_length
121726,False,0.08,0,val_accuracy and val_loss not changing while training transformer,recently i have been trying to learn transformer and using it in caption-generator model.\nWhile training for 4 hours val_loss and val_accuracy did not change. loss and accuracy for train_data was atleast moving a little.\n(this output is from different training session but is quite similar to previous one with 4 hour training)\nEpoch 1/10\n100/100 [==============================] - 123s 566ms/step - loss: 12.8864 - masked_accuracy: 0.0140 - val_loss: 12.9553 - val_masked_accuracy: 0.0216\nE...,False,True,False,12931
98334,False,0.09,0,Extra feature on test set,Suppose I convert categorical data into dummy variables with get_dummies and I get these columns in the training dataset:\n\n\n\n\nx_A\nx_B\nx_C\n\n\n\n\n0\n1\n0\n\n\n0\n0\n1\n\n\n1\n1\n0\n\n\n\n\nBut in the test dataset I have the following columns:\n\n\n\n\nx_A\nx_B\nx_C\nx_D\n\n\n\n\n0\n1\n0\n1\n\n\n0\n0\n1\n0\n\n\n1\n1\n0\n1\n\n\n\n\nShould I create a 'D' column in the training set with all the values set to zero to apply later the model on the test set?? Or what should I do?\n,True,True,False,449
60011,False,0.1,1,How to upload SOFT files in Orange?,"I have a dataset which I obtained from NCBI, but it is a SOFT file. When I upload the SOFT file to the file widget in Orange, it cannot be uploaded. Is there a way to upload SOFT files in Orange? \n",True,True,True,233


## Top-k Worst Performing Examples

In [10]:
worst_positive[columns_to_display]

Id,actual_label,predicted_probabilities,Score,Title,body_text,question_mark_full,question_word_full,action_verb_full,text_length
713,True,0.09,3,Cloudera QuickStart VM Error,I have installed cloudera CDH5 Quick start VM on VM player. When I login through HUE in the first page I am the following error\n“Potential misconfiguration detected. Fix and restart Hue.”\nHow to solve this issue.\n,False,True,False,242
81279,True,0.1,2,How do I deploy a model when using Stratified K fold?,"I have used Stratified K fold for learning the model . Below is the python code:\n>def stratified_cv_v1(X, y, clf, shuffle=True, n=10,):\n> stratified_k_fold = StratifiedKFold(n_splits=n,shuffle=shuffle)\n> y_pred_v1 = y.copy()\n> for ii, jj in stratified_k_fold.split(X,y): \n> X_train, X_test = X[ii], X[jj]\n> y_train = y[ii]\n> clf_v2 = clf()\n> clf_v2.fit(X_train,y_train)\n> y_pred[jj] = clf.predict(X_test)\n> return y_pred_v1\n\n\n>print(cla...",True,True,False,695
53023,True,0.12,4,Import data from google drive to Kaggle Kernel,"I want to import a csv file from google drive . I tried using the link in add dataset tab but it is taking some thing else as ""Open"". Please see the image.\n\n",False,False,False,204


In [11]:
worst_negative[columns_to_display]

Id,actual_label,predicted_probabilities,Score,Title,body_text,question_mark_full,question_word_full,action_verb_full,text_length
128374,False,0.66,0,How to derive the formula 13 in the Xavier Initialization paper,How to derive the formula 13 in the Xavier Initialization paper Understanding the difficulty of training deep feedforward neural networks from the formula 6?\n\n,True,True,False,223
16580,False,0.64,1,What is the difference between data-driven methods and machine learning?,"I was wondering (about a more semantic question), is there a difference between data-driven methods and machine learning? Or is it more correct to state that machine learning is a category of data-driven methods (and what then are other categories)? \n",True,True,False,324
42482,False,0.64,1,What is the difference between symmetric bipartite graphs and a complete bipartite graph?,"I am studying Restricted Boltzmann Machines (RBMs), and it is described as a symmetrical bipartite graph. Link \nHow is this different from a Complete bipartite graph? They seem to be the same to me, which is why I'm curious to why there is such a clear difference in terminology.\n",True,True,False,370


## Top-k Most Uncertain Examples

In [12]:
most_uncertain[columns_to_display]

Id,actual_label,predicted_probabilities,Score,Title,body_text,question_mark_full,question_word_full,action_verb_full,text_length
9746,True,0.5,3,Simple Explanation of Apache Kafka,"Can anybody explain Apache Kafka for me in a plain language? I'd appreciate an explanation with a practical example instead of abstract theoretical definitions, then I can understand better. \nWhat is it used for? What does messaging mean? Messaging between what?!! At which stage of a BigData analysis is it used?\nAnd what are prerequisites for learning it?\nPS: \nPlease explain as you would explain for a non-technical person\n",True,True,True,461
23832,False,0.5,1,Training a Graph model like an Artificial Neural Network,"I currently have a Graph model whereby I am mapping connections of different types between entities and attributing a weight to these connections based upon my own personal experience. Also, I would like to understand the connections between these entities in relation to a particular outcome. Looking at this problem, I can't help but notice it's similarity to a typical Artificial Neural Network (ANN) and am wondering if/how I can bake some of the theory there into my model.\nLet me explain f...",True,True,True,1835
66343,True,0.5,4,Drawing Neural Network diagram for academic papers,Is there any tool that one can use to draw neural network architecture diagram for research papers?\nExample diagram: \n\n,True,False,True,170
