# Inference

In this final notebook we show how:
 - [Our pre-trained models can be loaded in from the Hugging Face Hub](#load)
 - [We can use these models locally to run inference](#local)

In [1]:
# Imports
import pickle

import joblib
import pandas as pd
from datasets import Dataset
from sentence_transformers.losses import CosineSimilarityLoss
from setfit import SetFitModel, SetFitTrainer, sample_dataset
from sklearn.metrics import confusion_matrix, f1_score
from tqdm.auto import tqdm
## Workaround for dashes in name
from importlib import import_module
nlbse_statistics = import_module('code-comment-classification.nlbse_statistics') 

tqdm.pandas()

In [8]:
# Load data
langs = ['java', 'python', 'pharo']
lan_cats = []  # names of all language/category pairs, e.g. 'java_deprecation'
datasets = {}  # maps '<language>_<category>' to its train and test splits
for lan in langs: # for each language
    df = pd.read_csv(f'./code-comment-classification/{lan}/input/{lan}.csv')
    df['label'] = df.instance_type
    categories = list(set(df.category))
    lan_cats += [f'{lan}_{cat}' for cat in categories]
    for cat in categories: # for each category
        filtered = df[df.category == cat]
        # Rows with partition == 0 form the training split, rows with partition == 1 the test split
        train_data = Dataset.from_pandas(filtered[filtered.partition == 0])
        test_data = Dataset.from_pandas(filtered[filtered.partition == 1])
        datasets[f'{lan}_{cat}'] = {'train_data': train_data, 'test_data': test_data}

<a id='load'></a>

## Load Model

For simplicity, we show how to use one of the classifiers; the same steps can be repeated for the other models.

In [12]:
# Upon publication the repos will be made public and auth is no longer needed
token = 'hf_XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX' 

model = SetFitModel.from_pretrained("aalkaswan/java-deprecation-classifier", 
#                                     device='cpu', #Use this if you don't have a GPU 
                                    use_auth_token=token)
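While the repositories are still private, the token does not have to be pasted into the notebook. The following is a minimal sketch, not part of the original notebook, that reads it from an environment variable instead. The helper name `read_hub_token` and the variable name `HF_TOKEN` are assumptions made for illustration.

```python
import os


def read_hub_token(env_var="HF_TOKEN"):
    """Return the token stored in `env_var`, or None if it is unset or empty.

    Both the helper and the default variable name are illustrative assumptions.
    """
    token = os.environ.get(env_var, "").strip()
    return token or None


token = read_hub_token()
# `token` can then be passed as `use_auth_token=token` in the `from_pretrained` call above.
```

Keeping the token out of the notebook means it cannot end up in version control by accident.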

<a id='local'></a>

## Local Inference
Now we use the model to run inference locally. We first try a few custom examples and then check that the model reproduces the test-set scores from the previous notebook.

In [13]:
#Try some examples 
model(['This method will be removed in version 4.42', 'Init method to initialize object', 
       'SentenceTransformers are awesome!', 'I want a pet Capybara 🦫'])

array([1, 0, 0, 0])
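The raw output is easier to read when each prediction is shown next to its sentence. The sketch below pairs the two; `pair_predictions` is a hypothetical helper, and the `predictions` list is copied from the output above as a stand-in for the model call so that the snippet runs without downloading the model. Judging from these examples, a label of 1 appears to mark a sentence classified as a deprecation comment.

```python
def pair_predictions(sentences, predictions):
    """Pair each sentence with its predicted label (hypothetical helper)."""
    if len(sentences) != len(predictions):
        raise ValueError("sentences and predictions must have the same length")
    return [(sentence, int(label)) for sentence, label in zip(sentences, predictions)]


sentences = ['This method will be removed in version 4.42',
             'Init method to initialize object',
             'SentenceTransformers are awesome!',
             'I want a pet Capybara 🦫']
predictions = [1, 0, 0, 0]  # copied from the output above; stands in for model(sentences)

for sentence, label in pair_predictions(sentences, predictions):
    print(f'{label}  {sentence}')
```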

In [14]:
# Score the test set again 
test_data = datasets['java_deprecation']['test_data']
y_hat = model(test_data['comment_sentence'])
y = test_data['label']
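# Note: scikit-learn documents the signature as confusion_matrix(y_true, y_pred). The call
# below passes the predictions first, so the names fp and fn (and with them precision and
# recall) are exchanged relative to that convention. It is kept as-is so that the printed
# output stays reproducible.
_, fp, fn, tp = confusion_matrix(y_hat, y).ravel()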
wf1 = f1_score(y, y_hat, average='weighted')
precision, recall, f1 = nlbse_statistics.get_precision_recall_f1(tp, fp, fn)
print(f'precision: {precision}, recall: {recall}, f1 {f1} weighted f1: {wf1}')

precision: 0.7037037037037037, recall: 0.8636363636363636, f1 0.7755102040816326 weighted f1: 0.9763213659957574
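To make the printed numbers easier to interpret, here is a minimal pure-Python sketch of how precision, recall, and F1 follow from confusion counts. The internals of `nlbse_statistics.get_precision_recall_f1` are not shown in this notebook, so the standard definitions are assumed. The toy labels are hypothetical, and the counts `tp=19, fp=8, fn=3` are inferred from the printed ratios (19/27 and 19/22), not read from the actual confusion matrix. The sketch also shows that exchanging the two arguments exchanges `fp` and `fn`, which matters because scikit-learn documents `confusion_matrix(y_true, y_pred)` while the cell above calls it as `confusion_matrix(y_hat, y)`.

```python
def binary_counts(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels, with 1 as the positive class."""
    tn = fp = fn = tp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def precision_recall_f1(tp, fp, fn):
    """Standard definitions; assumed, not taken from nlbse_statistics."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical toy labels, chosen so that fp and fn differ
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 0, 0, 1, 0, 0]
print(binary_counts(y_true, y_pred))  # ground truth first
print(binary_counts(y_pred, y_true))  # arguments exchanged: fp and fn trade places

# Counts inferred from the printed ratios above (an assumption, see the text)
print(precision_recall_f1(tp=19, fp=8, fn=3))
```

F1 is symmetric in `fp` and `fn`, so it is unaffected by the argument order; only precision and recall trade places.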


## Conclusion

In this notebook we have shown how the classifiers, which we designed in [Notebook 1](./1-Model_selection.ipynb) and created in [Notebook 2](./2-Creating_classifiers.ipynb), can be loaded and used for inference. While a GPU was required for training, inference is still quick on a CPU, and at around 420 MB the model fits on almost any GPU.

![display image](https://github.com/jglovier/gifs/blob/gh-pages/done/hand-wipe.gif?raw=true)

