# Using the Language Interpretability Tool in Notebooks

This notebook demonstrates the [Language Interpretability Tool](https://pair-code.github.io/lit) (LIT) on a binary classifier that labels the sentiment of a statement (0 for negative, 1 for positive).

The `LitWidget` constructor takes a dict mapping model names to model objects and a dict mapping dataset names to dataset objects; these are the models and datasets displayed in LIT. It also accepts an optional `height` parameter that sets how tall the LIT UI is rendered, in pixels (the default is 1000). Running the constructor starts the LIT server in the background, which loads the models and datasets and serves the UI.

To render the LIT UI in an output cell, call the `render` method on the `LitWidget` object. The UI can be rendered multiple times in separate cells if desired. The widget also provides a `stop` method that shuts down the LIT server.

Copyright 2020 Google LLC.
SPDX-License-Identifier: Apache-2.0

In [1]:
# Install LIT and transformers packages. The transformers package is needed by the model and dataset we are using.
# Replace tensorflow-datasets with the nightly package to get up-to-date dataset paths.
!pip uninstall -y tensorflow-datasets
!pip install lit_nlp tfds-nightly transformers==4.1.1



In [2]:
# Fetch the trained model weights
!wget https://storage.googleapis.com/what-if-tool-resources/lit-models/sst2_tiny.tar.gz
!tar -xvf sst2_tiny.tar.gz

wget: /shared/centos7/anaconda3/2022.01/lib/libuuid.so.1: no version information available (required by wget)
--2022-11-30 15:05:11--  https://storage.googleapis.com/what-if-tool-resources/lit-models/sst2_tiny.tar.gz
Connecting to 10.99.0.130:3128... connected.
Proxy request sent, awaiting response... 200 OK
Length: 16362834 (16M) [application/octet-stream]
Saving to: ‘sst2_tiny.tar.gz’


2022-11-30 15:05:11 (94.1 MB/s) - ‘sst2_tiny.tar.gz’ saved [16362834/16362834]

./
./tokenizer_config.json
./tf_model.h5
./config.json
./train.history.json
./vocab.txt
./special_tokens_map.json
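
Because the model is later loaded from the current directory (`'./'`), it may be worth confirming that the archive unpacked where expected before constructing the widget. The following is a minimal sketch using only the Python standard library. The helper name `missing_model_files` and the `EXPECTED_FILES` list are hypothetical: the file names are copied from the tar listing above, and treating every one of them as required is an assumption rather than something LIT documents.

```python
from pathlib import Path

# File names copied from the tar listing above. Treating all of them as
# required is an assumption for this sketch, not a documented LIT contract.
EXPECTED_FILES = [
    "tokenizer_config.json",
    "tf_model.h5",
    "config.json",
    "train.history.json",
    "vocab.txt",
    "special_tokens_map.json",
]


def missing_model_files(model_dir=".", expected=EXPECTED_FILES):
    """Return the expected file names that are not regular files in model_dir."""
    root = Path(model_dir)
    return [name for name in expected if not (root / name).is_file()]


# Check the same directory that is passed to the model constructor.
missing = missing_model_files("./")
if missing:
    print("Missing model files:", ", ".join(missing))
else:
    print("All expected model files are present.")
```

If the helper reports missing files, re-running the download and extraction cell is usually a reasonable first step.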


In [3]:
# Create the LIT widget with the model and dataset to analyze.
from lit_nlp import notebook
from lit_nlp.examples.datasets import glue
from lit_nlp.examples.models import glue_models

datasets = {'sst_dev': glue.SST2Data('validation')}
models = {'sst_tiny': glue_models.SST2Model('./')}

widget = notebook.LitWidget(models, datasets, height=800)

  from .autonotebook import tqdm as notebook_tqdm
INFO:absl:Load dataset info from /home/yun.hy/tensorflow_datasets/glue/sst2/2.0.0
INFO:absl:Reusing dataset glue (/home/yun.hy/tensorflow_datasets/glue/sst2/2.0.0)
INFO:absl:Constructing tf.data.Dataset glue for split validation, from /home/yun.hy/tensorflow_datasets/glue/sst2/2.0.0
All model checkpoint layers were used when initializing TFBertForSequenceClassification.

All the layers of TFBertForSequenceClassification were initialized from the model checkpoint at ./.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFBertForSequenceClassification for predictions without further training.
INFO:absl:
 (    (           
 )\ ) )\ )  *   ) 
(()/((()/(` )  /( 
 /(_))/(_))( )(_))
(_)) (_)) (_(_()) 
| |  |_ _||_   _| 
| |__ | |   | |   
|____|___|  |_|   


INFO:absl:Starting LIT server...
INFO:absl:CachingModelWrapper 'sst_tiny': no cache path specified, not loading.
INFO:absl:Warm-start of 

INFO:absl:CachingModelWrapper 'sst_tiny': 872 misses out of 872 inputs
INFO:absl:Prepared 872 inputs for model
INFO:absl:Received 872 predictions from model
INFO:absl:Requested types: ['LitType']
INFO:absl:Will return keys: {'tokens', 'layer_2/attention', 'cls_grad', 'layer_1/avg_emb', 'input_embs_sentence', 'grad_class', 'probas', 'layer_1/attention', 'tokens_sentence', 'cls_emb', 'layer_2/avg_emb', 'layer_0/avg_emb', 'token_grad_sentence'}
INFO:absl:Warm-start of model 'sst_tiny' on dataset '_union_empty'
INFO:absl:CachingModelWrapper 'sst_tiny': misses (dataset=_union_empty): []
INFO:absl:CachingModelWrapper 'sst_tiny': 0 misses out of 0 inputs
INFO:absl:Prepared 0 inputs for model
INFO:absl:Received 0 predictions from model
INFO:absl:Requested types: ['LitType']
INFO:absl:Will return keys: {'tokens', 'layer_2/attention', 'cls_grad', 'layer_1/avg_emb', 'input_embs_sentence', 'grad_class', 'probas', 'layer_1/attention', 'tokens_sentence', 'cls_emb', 'layer_2/avg_emb', 'layer_0/avg_em

In [5]:
# Render the widget
widget.render()