# Assess predictions of a Hugging Face transformers model on the blbooksgenre binary text classification data


This notebook demonstrates how to use the `responsibleai` API to assess a Hugging Face transformers text classification model trained on the blbooksgenre dataset (see https://huggingface.co/datasets/blbooksgenre for more information about the dataset). It walks through the API calls needed to create a widget with model analysis insights, then guides a visual analysis of the model.

* [Launch Responsible AI Toolbox](#Launch-Responsible-AI-Toolbox)
    * [Load Model and Data](#Load-Model-and-Data)
    * [Create Model and Data Insights](#Create-Model-and-Data-Insights)

## Launch Responsible AI Toolbox

The following section examines the code necessary to create datasets and a model. It then generates insights using the `responsibleai` API that can be visually analyzed.

### Load Model and Data
*The following section can be skipped. It loads a dataset and trains a model for illustrative purposes.*

First, we import all necessary dependencies.

In [1]:
import datasets
import pandas as pd
import zipfile
from sklearn.model_selection import train_test_split
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          pipeline)

from raiutils.common.retries import retry_function

try:
    from urllib import urlretrieve
except ImportError:
    from urllib.request import urlretrieve

Next, we load the blbooksgenre dataset from Hugging Face datasets.

In [2]:
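# Number of held-out rows passed to the dashboard
NUM_TEST_SAMPLES = 50

def load_dataset(split):
    # The config name is spelled exactly as in the original notebook;
    # check the dataset card before "correcting" it.
    config_kwargs = {"name": "title_genre_classifiction"}
    dataset = datasets.load_dataset("blbooksgenre", split=split, **config_kwargs)
    # Keep only the book title (input text) and its label
    return pd.DataFrame({"text": dataset["title"], "label": dataset["label"]})

pd_data = load_dataset("train")

# Hold out 20% of the rows; random_state makes the split reproducible
pd_data, pd_valid_data = train_test_split(
    pd_data, test_size=0.2, random_state=0)

train_data = pd_data[NUM_TEST_SAMPLES:].reset_index(drop=True)
test_data = pd_valid_data[:NUM_TEST_SAMPLES].reset_index(drop=True)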

You can avoid this message in future by passing the argument `trust_remote_code=True`.
Passing `trust_remote_code=True` will be mandatory to load this dataset from the next major release of `datasets`.


Generating train split:   0%|          | 0/1736 [00:00<?, ? examples/s]
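
The cell above holds out 20% of the rows and then keeps only the first `NUM_TEST_SAMPLES` held-out rows as test data. The sketch below mirrors that flow on toy rows using only the standard library. It is an illustration under stated assumptions, not how scikit-learn implements `train_test_split`: the helper `holdout_split` and the toy `rows` are hypothetical, and the shuffle order will differ from scikit-learn's.

```python
import random

def holdout_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and split it into (train, held_out) lists."""
    shuffled = list(rows)
    # A seeded generator makes the split reproducible, like random_state=0
    random.Random(seed).shuffle(shuffled)
    n_held_out = int(round(len(shuffled) * test_fraction))
    split_at = len(shuffled) - n_held_out
    return shuffled[:split_at], shuffled[split_at:]

# Toy stand-in for the (text, label) rows of the real dataset
rows = [("title %d" % i, i % 2) for i in range(100)]

train_part, valid_part = holdout_split(rows, test_fraction=0.2, seed=0)

# Mirror the slicing in the notebook cell with a smaller sample count
num_test_samples = 5
train_rows = train_part[num_test_samples:]
test_rows = valid_part[:num_test_samples]

print(len(train_rows), len(test_rows))
```

Keeping the test set small matters later in the notebook, because the explanation step runs once per test row.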

Fetch a Hugging Face model pre-trained on the blbooksgenre dataset.

In [3]:
BLBOOKSGENRE_MODEL_NAME = "blbooksgenre_model"
NUM_LABELS = 2

class FetchModel(object):
    def __init__(self):
        pass

    def fetch(self):
        zipfilename = BLBOOKSGENRE_MODEL_NAME + '.zip'
        url = ('https://publictestdatasets.blob.core.windows.net/models/' +
               BLBOOKSGENRE_MODEL_NAME + '.zip')
        urlretrieve(url, zipfilename)
        with zipfile.ZipFile(zipfilename, 'r') as unzip:
            unzip.extractall(BLBOOKSGENRE_MODEL_NAME)

def retrieve_blbooksgenre_model():
    fetcher = FetchModel()
    action_name = "Model download"
    err_msg = "Failed to download model"
    max_retries = 4
    retry_delay = 60
    retry_function(fetcher.fetch, action_name, err_msg,
                   max_retries=max_retries,
                   retry_delay=retry_delay)
    model = AutoModelForSequenceClassification.from_pretrained(
        BLBOOKSGENRE_MODEL_NAME, num_labels=NUM_LABELS)
    return model

model = retrieve_blbooksgenre_model()

Model download attempt 1 of 4
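
The download above is wrapped in a retry helper so that a transient network failure does not abort the notebook. The sketch below shows the general retry-with-delay pattern using only the standard library. It is not the `raiutils` implementation: the function `retry`, its `sleep` parameter, and the `flaky_download` example are hypothetical, and the log format simply imitates the output shown above.

```python
import time

def retry(action, action_name, err_msg, max_retries=4, retry_delay=60,
          sleep=time.sleep):
    """Call action() until it succeeds or max_retries attempts are used up."""
    last_error = None
    for attempt in range(1, max_retries + 1):
        print(f"{action_name} attempt {attempt} of {max_retries}")
        try:
            return action()
        except Exception as exc:
            last_error = exc
            # Wait before the next attempt, but not after the final one
            if attempt < max_retries:
                sleep(retry_delay)
    raise RuntimeError(err_msg) from last_error

# Hypothetical action that fails twice before succeeding
calls = {"count": 0}

def flaky_download():
    calls["count"] += 1
    if calls["count"] < 3:
        raise IOError("temporary failure")
    return "ok"

# A no-op sleep keeps the example fast
result = retry(flaky_download, "Model download", "Failed to download model",
               max_retries=4, retry_delay=60, sleep=lambda seconds: None)
```

Passing the sleep function as a parameter is a design choice that lets the example, and any tests, run without real delays.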


Load the tokenizer and build a prediction pipeline around the model.

In [4]:
# load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

device = -1
if device >= 0:
    model = model.cuda()

# build a pipeline object to do predictions
pred = pipeline(
    "text-classification",
    model=model,
    tokenizer=tokenizer,
    device=device,
    return_all_scores=True
)




In [5]:
from ml_wrappers import wrap_model
wrapped_model = wrap_model(pred, test_data, 'text_classification')
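
`wrap_model` adapts the pipeline so that it can be called through a `predict`-style interface, as the next cell does. As a rough illustration of the kind of conversion involved, and not the `ml_wrappers` implementation, the sketch below turns per-label scores into class indices. The helper `scores_to_class_indices`, the sample scores, and the assumption that list position equals class index are all illustrative.

```python
def scores_to_class_indices(pipeline_output):
    """Pick the position of the highest-scoring label for each input text."""
    indices = []
    for per_text_scores in pipeline_output:
        scores = [entry["score"] for entry in per_text_scores]
        indices.append(scores.index(max(scores)))
    return indices

# Hypothetical scores for two texts, shaped as one list of
# {"label", "score"} dictionaries per input text
fake_output = [
    [{"label": "LABEL_0", "score": 0.9}, {"label": "LABEL_1", "score": 0.1}],
    [{"label": "LABEL_0", "score": 0.2}, {"label": "LABEL_1", "score": 0.8}],
]

predicted = scores_to_class_indices(fake_output)
print(predicted)
```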

In [6]:
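# Compare predicted class indices with the true labels, element by element
predictions = wrapped_model.predict(test_data['text'].tolist())
num_errors = sum(predictions != test_data['label'].tolist())
print("number of errors on test dataset: " + str(num_errors))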

number of errors on test dataset: 2
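
The error count above is the number of positions where the prediction and the true label differ. A minimal standard-library sketch of the same calculation follows; the function `count_errors` and the toy predictions and labels are hypothetical.

```python
def count_errors(predictions, labels):
    """Count positions where the predicted class differs from the true class."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    return sum(1 for p, y in zip(predictions, labels) if p != y)

# Toy data: five predictions compared against five true labels
toy_predictions = [0, 1, 1, 0, 1]
toy_labels = [0, 1, 0, 0, 0]

num_toy_errors = count_errors(toy_predictions, toy_labels)
toy_accuracy = 1 - num_toy_errors / len(toy_labels)
print(num_toy_errors, toy_accuracy)
```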


In [7]:
classes = train_data["label"].unique()
classes.sort()

### Create Model and Data Insights

In [8]:
from responsibleai_text import RAITextInsights, ModelTask
from raiwidgets import ResponsibleAIDashboard

Dataset download attempt 1 of 4


To use the Responsible AI Dashboard, initialize a `RAITextInsights` object, onto which different components can be loaded.

`RAITextInsights` accepts the model, the test dataset, the name of the target column, the task type and the classes as its arguments.

In [9]:
rai_insights = RAITextInsights(pred, test_data,
                               "label",
                               task_type=ModelTask.TEXT_CLASSIFICATION,
                               classes=classes)

50it [00:00, 60.43it/s]


Add the components of the toolbox for model assessment.

In [10]:
rai_insights.explainer.add()
rai_insights.error_analysis.add()

Once all the desired components have been loaded, compute insights on the test set.

In [11]:
rai_insights.compute()

PartitionExplainer explainer: 51it [21:11, 25.43s/it]                        

Error Analysis
Current Status: Generating error analysis reports.
Current Status: Finished generating error analysis reports.
Time taken: 0.0 min 0.08359272699999565 sec
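
Error analysis looks for groups of rows where the model makes more mistakes than elsewhere. The toy sketch below illustrates only that basic idea by computing an error rate per cohort. It is not the algorithm the toolbox uses: the function `error_rate_by_cohort`, the word-count cohort rule `length_cohort`, and the sample titles, predictions and labels are all hypothetical.

```python
from collections import defaultdict

def error_rate_by_cohort(texts, predictions, labels, cohort_of):
    """Return {cohort name: fraction of misclassified rows in that cohort}."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for text, predicted, actual in zip(texts, predictions, labels):
        cohort = cohort_of(text)
        totals[cohort] += 1
        if predicted != actual:
            errors[cohort] += 1
    return {cohort: errors[cohort] / totals[cohort] for cohort in totals}

def length_cohort(text, threshold=5):
    """Assign a title to 'long' or 'short' by its number of words."""
    return "long" if len(text.split()) > threshold else "short"

# Made-up titles with made-up predictions and labels
texts = [
    "A history of England",
    "Poems",
    "Travels through the northern counties of Scotland in 1850",
    "The life and adventures of a sailor at sea",
]
toy_preds = [0, 1, 1, 0]
toy_truth = [0, 1, 0, 0]

rates = error_rate_by_cohort(texts, toy_preds, toy_truth, length_cohort)
print(rates)
```

In the dashboard, cohorts like these can be explored interactively rather than defined by hand.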





Finally, visualize and explore the model insights. Use the resulting widget or follow the link to view this in a new tab.

In [12]:
ResponsibleAIDashboard(rai_insights)

Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.


ResponsibleAI started at http://localhost:8705


<raiwidgets.responsibleai_dashboard.ResponsibleAIDashboard at 0x7f17ea7a9060>