# Testing the Machine Learning Model

I will keep this notebook up to date so that it can always be used to test the newest version of the NLP model for classifying AI headlines.



### Setup

1. Train the model on your computer beforehand, or ask Kornel for the model weights. You can train the model by running `classifier.ipynb`. Training takes a couple of hours on a normal laptop without a GPU; with a GPU it should take about 10 minutes.
    - **NOTE**: You might need to change the `MODEL_PATH` variable.
2. If you run all cells of this notebook in order, the last cell launches an interactive interface that can be used to test the model or issue API calls to it.
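Before running the remaining cells, it can help to confirm that the checkpoint directory exists. The snippet below is a minimal sketch, not part of the original notebook: `checkpoint_ready` is a hypothetical helper, and checking for `config.json` is an assumed heuristic (the file is written when a Hugging Face model is saved), not an official validation step.

```python
from pathlib import Path


def checkpoint_ready(model_path):
    """Return True if model_path looks like a saved Hugging Face checkpoint."""
    model_path = Path(model_path)
    # Assumption: a saved checkpoint directory contains config.json.
    return model_path.is_dir() and (model_path / "config.json").is_file()


# Same path as the notebook's MODEL_PATH default; adjust it to your setup.
if not checkpoint_ready("test_trainer/checkpoint-500/"):
    print("Checkpoint not found: train the model or adjust MODEL_PATH.")
```

If the check fails, either run `classifier.ipynb` first or point `MODEL_PATH` at the directory that holds the weights you received.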

### Kornel TODO

TODO: Check out the [multi-label case](https://github.com/abhimishra91/transformers-tutorials/blob/master/transformers_multi_label_classification.ipynb).

[This multi-class BERT notebook](https://github.com/Dirkster99/PyNotes/blob/master/Transformers/LocalModelUsage_Finetuning/30%20MultiClass%20Classification%20in%2010%20Minutes%20with%20BERT-TensorFlow-SoftMax-LocalModel.ipynb) may help too.

### Imports

In [1]:
# Installs or updates the transformers and gradio libraries to their latest versions.
# WARNING: If you don't have transformers installed yet, the download is a couple of GB.
!pip install -Uqq transformers
!pip install -Uqq gradio

^C
ERROR: Operation cancelled by user


In [9]:
import json
from pathlib import Path

import gradio as gr
from transformers import (
    AutoTokenizer,
    BertForSequenceClassification,
    TextClassificationPipeline,
)

### Set Model Path

In [10]:
MODEL_PATH = Path("test_trainer/checkpoint-500/")

### Loading Data

In [11]:
names = {}

with open("data/data.json") as f:
    d = json.load(f)
    for i, name in enumerate(d.keys()):
        names[f"LABEL_{i}"] = name

names

{'LABEL_0': 'agency',
 'LABEL_1': 'humanComparison',
 'LABEL_2': 'hyperbole',
 'LABEL_3': 'historyComparison',
 'LABEL_4': 'unjustClaims',
 'LABEL_5': 'deepSounding',
 'LABEL_6': 'skeptics',
 'LABEL_7': 'deEmphasize',
 'LABEL_8': 'performanceNumber',
 'LABEL_9': 'inscrutable',
 'LABEL_10': 'objective'}
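The mapping above depends only on the order of the top-level keys in `data/data.json`: `json.load` preserves that order, and the i-th key becomes `LABEL_i`. Whether this order matches the label ids used during training in `classifier.ipynb` is an assumption worth checking. As a small sketch with a hypothetical helper name, `build_label_names`, the same logic can be isolated and tried on a toy list:

```python
def build_label_names(categories):
    """Map pipeline labels ("LABEL_0", ...) to category names, in the given order."""
    return {f"LABEL_{i}": name for i, name in enumerate(categories)}


# Hypothetical two-category input; the real keys come from data/data.json.
sample = build_label_names(["agency", "humanComparison"])
print(sample)
```

Reordering the keys in the JSON file would silently shift every label name, so the file should be kept in the same order that was used for training.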

### Loading Model

In [12]:
model = BertForSequenceClassification.from_pretrained(MODEL_PATH, local_files_only=True)
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
pipe = TextClassificationPipeline(
    model=model, tokenizer=tokenizer, return_all_scores=True
)



In [13]:
def predict(text):
    # The pipeline returns one list of {"label", "score"} dicts per input text.
    preds = pipe(text)[0]
    # Replace the raw labels ("LABEL_0", ...) with the readable category names.
    return {names[pred["label"]]: pred["score"] for pred in preds}

In [14]:
# Predict scores
predict("Machine Learning is at the forefront of education, replacing human jobs")
# Note the odd prediction: the model is almost certain that this is a deep-sounding headline, even though the
# headline also attributes agency, compares AI with humans, uses hyperbole, and makes unjust claims.
# TODO: Find a model suitable for multi-label classification?
# Check out https://discuss.huggingface.co/t/fine-tune-for-multiclass-or-multilabel-multiclass/4035
# Even better: https://github.com/abhimishra91/transformers-tutorials/blob/master/transformers_multi_label_classification.ipynb

{'agency': 0.0001564613776281476,
 'humanComparison': 0.0009414521045982838,
 'hyperbole': 0.0003454808611422777,
 'historyComparison': 0.0010318453423678875,
 'unjustClaims': 0.0004964807303622365,
 'deepSounding': 0.99512779712677,
 'skeptics': 0.0005015040514990687,
 'deEmphasize': 8.166562474798411e-05,
 'performanceNumber': 0.00028639606898650527,
 'inscrutable': 0.00028850772650912404,
 'objective': 0.0007424494251608849}
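The scores above sum to roughly 1, which is what a softmax over the categories produces: the categories compete, so only one can receive a high score. A multi-label setup usually scores each category independently with a sigmoid instead. The snippet below is a plain-Python sketch of that difference and does not use the trained model; the logits and the 0.5 threshold are made-up values for illustration.

```python
import math


def softmax(logits):
    # Subtract the maximum for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(logits):
    # Each category is scored on its own, independent of the others.
    return [1.0 / (1.0 + math.exp(-x)) for x in logits]


# Hypothetical logits for three categories; the first two are both strongly indicated.
example_logits = [4.0, 3.5, -2.0]
single_label = softmax(example_logits)  # scores compete and sum to 1
multi_label = sigmoid(example_logits)   # scores are independent

# Multi-label decision: keep every category above an assumed threshold.
THRESHOLD = 0.5
selected = [i for i, p in enumerate(multi_label) if p >= THRESHOLD]
print(single_label, multi_label, selected)
```

With softmax the second category is pushed below 0.5 even though its logit is high, while the sigmoid scores keep both of the first two categories above the threshold.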

### Launching the interactive interface

In [15]:
examples = [
    "Machine Learning is at the forefront of education, replacing human jobs",
    "AI model leaves scientists confused",
    "This model is not really cool",
]

intf = gr.Interface(fn=predict, inputs="textbox", outputs="label", examples=examples)
intf.launch(inline=False)

Running on local URL:  http://127.0.0.1:7861

To create a public link, set `share=True` in `launch()`.


