# Bibliography Categorization: 'BibCat'
## Tutorial: Estimating performance of classifiers in bibcat.



---


## Introduction.

In this tutorial, we will use bibcat to estimate the performance of classifiers on sets of texts.


---

## User Workflow: Estimating the performance of classifiers.


The `Performance` class contains user-friendly methods for estimating the performance of given classifiers and outputting that performance as, e.g., confusion matrices.  The code blocks below walk through how to use this class.

For this tutorial, we assume that the user has already run the trainML tutorial, and so has generated and saved a machine learning model.
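
Since confusion matrices are the main output of this tutorial, a small standalone illustration may help. The sketch below is not bibcat code: the helper `build_confusion_matrix`, the class names, and the labels are all hypothetical, and the row/column convention (rows for true classes, columns for predicted classes) is an assumption that may differ from what bibcat saves.

```python
import numpy as np

def build_confusion_matrix(true_labels, pred_labels, class_names):
    """Count how often each true class (row) received each predicted class (column)."""
    index_of = {name: ii for ii, name in enumerate(class_names)}
    matrix = np.zeros((len(class_names), len(class_names)), dtype=int)
    for curr_true, curr_pred in zip(true_labels, pred_labels):
        matrix[index_of[curr_true], index_of[curr_pred]] += 1
    return matrix

#Hypothetical class names and labels, purely for illustration
class_names = ["science", "mention"]
true_labels = ["science", "science", "mention", "mention", "mention"]
pred_labels = ["science", "mention", "mention", "mention", "science"]

matrix = build_confusion_matrix(true_labels, pred_labels, class_names)
print(matrix)
```

Entries on the diagonal count correct classifications; every off-diagonal entry counts one kind of mistake.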

In [19]:
#Import external packages
import re
import os
import sys
import json
import numpy as np
sys.path.append("./../main/")
#
#Import bibcat packages
import bibcat_classes as bibcat
import bibcat_config as config
import bibcat_constants as preset

In [9]:
#Set some global variables
seed_test = 10 #Random seed for shuffling text dataset
do_shuffle = True #Whether or not to shuffle the text dataset
max_tests = 100 #Number of text entries to test the performance for; None for all tests available
is_text_processed = True #We are using preprocessed text for this tutorial (previously generated by trainML in a test set directory)
mode_modif = None #We are using preprocessed data in this tutorial, so we do not need a processing mode at all
buffer = 0
#
#Prepare some Keyword objects
kobj_hubble = bibcat.Keyword(
                keywords=["Hubble", "Hubble Telescope",
                          "Hubble Space Telescope"],
                acronyms=["hst", "ht"])
all_kobjs = [kobj_hubble]

In [10]:
#Fetch filepaths for model and data
name_model = config.name_model
filepath_json = config.path_json
dir_model = os.path.join(config.dir_allmodels, name_model)
#
#Set some directories
dir_info = dir_model
dir_test = os.path.join(dir_model, "dir_test")

Let's build a set of classifiers for which we'd like to test the performance.  We'll then feed each of those classifiers into an instance of the `Operator` class to handle them.

In [11]:
#Create a list of classifiers
#This can be modified to use whatever classifiers you'd like.
#Load a previously trained ML model
filepath_model = os.path.join(dir_model, (name_model+".npy"))
fileloc_ML = os.path.join(dir_model, (preset.tfoutput_prefix+name_model))
classifier_ML = bibcat.Classifier_ML(filepath_model=filepath_model, fileloc_ML=fileloc_ML, do_verbose=True)
#

#Load a rule-based classifier
classifier_rules = bibcat.Classifier_Rules()
#





In [17]:
#Load models into instances of the Operator class
operator_1 = bibcat.Operator(classifier=classifier_ML, mode=mode_modif, keyword_objs=all_kobjs,
                            name="Operator_1", do_verbose=True, load_check_truematch=True, do_verbose_deep=False)
operator_2 = bibcat.Operator(classifier=classifier_rules,
                            name="Operator_2", mode=mode_modif, keyword_objs=all_kobjs, do_verbose=True, do_verbose_deep=False)
list_operators = [operator_1, operator_2] #Feel free to add or remove operators here.
#

Instance of Operator successfully initialized!
Keyword objects:
0: Keyword Object:
Name: Hubble
Keywords: ['Hubble Space Telescope', 'Hubble Telescope', 'Hubble']
Acronyms: ['hst', 'ht']

Instance of Operator successfully initialized!
Keyword objects:
0: Keyword Object:
Name: Hubble
Keywords: ['Hubble Space Telescope', 'Hubble Telescope', 'Hubble']
Acronyms: ['hst', 'ht']



Now, let's fetch some text for our classifiers to classify. For this tutorial, we'll load previously processed texts from the directory containing the test set for the ML classifier.

In [21]:
#Load information for the processed text
dict_allinfo = np.load(os.path.join(dir_info, "dict_textinfo.npy"), allow_pickle=True).item()

#Prepare filepaths for each text
list_filenames = [item for item in os.listdir(dir_test) if item.endswith(".txt")]

#Shuffle the tests, if so requested
if do_shuffle:
    np.random.seed(seed_test)
    np.random.shuffle(list_filenames)
#

#Truncate the number of tests, if so requested
if (max_tests is not None):
    list_filenames = list_filenames[0:max_tests]
#

dict_texts = {}
#Process the tests into a dictionary of texts
for ii in range(0, len(list_filenames)):
    curr_filename = list_filenames[ii]
    curr_fileroot = re.sub(r"\.txt$", "", curr_filename) #Remove extension (raw string avoids an invalid-escape warning)
    curr_info = dict_allinfo[curr_fileroot]

    #Load the text from this file
    with open(os.path.join(dir_test, curr_filename), 'r') as openfile:
        curr_text = openfile.read()
    #
    
    #Store info for this current text entry
    curr_dict = {"text":curr_text, "mission":curr_info["mission"], "forest":curr_info["forest"],
                "class":curr_info["class"], "id":curr_info["id"]}
    dict_texts[str(ii)] = curr_dict
#
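
Before evaluating, it can be useful to check how the loaded texts are distributed across classes, since a small or shuffled subset may be unbalanced. The sketch below uses a hypothetical stand-in (`example_texts`) with the same keys as `dict_texts`; the class names and values are made up for illustration.

```python
from collections import Counter

#Hypothetical stand-in for dict_texts, using the same keys as the cell above
example_texts = {
    "0": {"text": "We observed the target with HST.", "mission": "Hubble",
          "forest": None, "class": "science", "id": "paper_a"},
    "1": {"text": "HST is mentioned in passing.", "mission": "Hubble",
          "forest": None, "class": "mention", "id": "paper_b"},
    "2": {"text": "New HST imaging is presented.", "mission": "Hubble",
          "forest": None, "class": "science", "id": "paper_c"},
}

#Tally the number of texts per true class
class_counts = Counter(entry["class"] for entry in example_texts.values())
for name, count in class_counts.most_common():
    print(f"{name}: {count} of {len(example_texts)} texts")
```

To run the same check on the real data, replace `example_texts` with `dict_texts`.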

Next, let's prepare some additional information for each of these classifiers.  We'll need to set, for example, the uncertainty thresholds for accepting or rejecting each classifier's output.

In [22]:
#Set parameters for each operator and its internal classifier
#Global parameters
do_verify_truematch = True
do_raise_innererror = False

#For operator 1
mapper_1 = None #Mapper to mask classifications; None if no masking
dict_texts_1 = dict_texts #Dictionary of texts to classify
threshold_1 = 0.70 #Uncertainty threshold for this classifier

#For operator 2
mapper_2 = None #Mapper to mask classifications; None if no masking
dict_texts_2 = dict_texts #Dictionary of texts to classify
threshold_2 = 0.70 #Uncertainty threshold for this classifier

#Gather parameters into lists
list_mappers = [mapper_1, mapper_2]
list_thresholds = [threshold_1, threshold_2]
list_dict_texts = [dict_texts_1, dict_texts_2]
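
To make the role of a threshold concrete, here is a minimal sketch of one common way such a cutoff is applied. It assumes the classifier produces a score per class, and that a verdict is accepted only when the top score reaches the threshold. The helper `apply_threshold`, the score format, and the fallback label are hypothetical; bibcat's internal logic may differ.

```python
def apply_threshold(class_scores, threshold, fallback="low_confidence"):
    """Return the top-scoring class if its score meets the threshold, else a fallback label."""
    best_class = max(class_scores, key=class_scores.get)
    if class_scores[best_class] >= threshold:
        return best_class
    return fallback

#Hypothetical per-class scores for two texts
accepted = apply_threshold({"science": 0.85, "mention": 0.15}, threshold=0.70)
rejected = apply_threshold({"science": 0.55, "mention": 0.45}, threshold=0.70)
print(accepted, rejected)
```

Under this assumption, raising the threshold trades coverage for reliability: fewer texts receive a verdict, but the verdicts that remain carry higher scores.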

Now, let's evaluate the performance of these classifiers.

In [23]:
#Create an instance of the Performance class
performer = bibcat.Performance()

NameError: name 'do_verbose' is not defined

In [None]:
#Set where, and under what names, to save the outputs
#NOTE: These three values are illustrative placeholders (they are not defined elsewhere in this tutorial); change them to suit your setup
filepath_output = dir_model #Location for saved outputs
fileroot_evaluation = "performance_evaluation" #Root of the filename for the saved evaluation
fileroot_misclassif = "performance_misclassif" #Root of the filename for the saved misclassifications
#
#Run the pipeline for a basic evaluation of model performance
performer.evaluate_performance_basic(operators=list_operators, dicts_texts=list_dict_texts, mappers=list_mappers,
                                     thresholds=list_thresholds, is_text_processed=is_text_processed,
                                     do_verify_truematch=do_verify_truematch, do_raise_innererror=do_raise_innererror,
                                     do_save_evaluation=True, do_save_misclassif=True, filepath_output=filepath_output,
                                     fileroot_evaluation=fileroot_evaluation, fileroot_misclassif=fileroot_misclassif,
                                     print_freq=25, do_verbose=True, do_verbose_deep=False)

And with that, you should have new confusion matrices summarizing the basic performance of these classifiers, along with a record of the misclassified texts, saved in your requested directory!
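
Once a confusion matrix is in hand, per-class summary metrics follow directly from it. The sketch below uses a hypothetical matrix and class names, and assumes rows hold true classes and columns hold predicted classes; check the orientation of the saved matrices before applying it to real output.

```python
import numpy as np

#Hypothetical confusion matrix: rows are true classes, columns are predicted classes
class_names = ["science", "mention"]
matrix = np.array([[40, 10],
                   [5, 45]])

true_positives = np.diag(matrix)
recall = true_positives / matrix.sum(axis=1) #Fraction of each true class that was recovered
precision = true_positives / matrix.sum(axis=0) #Fraction of each predicted class that was correct
accuracy = true_positives.sum() / matrix.sum() #Overall fraction of correct classifications

for ii, name in enumerate(class_names):
    print(f"{name}: precision={precision[ii]:.2f}, recall={recall[ii]:.2f}")
print(f"accuracy={accuracy:.2f}")
```

Comparing these values across operators gives a quick numerical counterpart to the visual comparison of the matrices.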

---