<a href="https://colab.research.google.com/github/fengfrankgthb/Demonstrations/blob/main/LIT_salience_empt.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Using the Learning Interpretability Tool for Salience Maps

## General Notes on LIT

This notebook demonstrates the [Learning Interpretability Tool](https://pair-code.github.io/lit) on a binary classifier that labels the sentiment of a statement (0 for negative, 1 for positive).

The `LitWidget` constructor takes a dict mapping model names to model objects and a dict mapping dataset names to dataset objects. These are the models and datasets displayed in LIT. Calling the constructor starts the LIT server in the background, which loads the models and datasets and serves the UI.

Render the LIT UI in an output cell by calling the `render` method on the `LitWidget` object. The UI can be rendered multiple times in separate cells if desired. The widget also has a `stop` method that shuts down the LIT server.

Copyright 2020 Google LLC.
SPDX-License-Identifier: Apache-2.0

## How to input data in the widget

1. Run the code cell below in your notebook.
2. The LIT widget will appear, showing the "sst_tiny" model and the "manual_input_data" dataset (which starts empty).
3. In the LIT interface, look for a section like "Datapoint Editor" or "Data Table".
4. There should be an option to "Create new datapoint" or an "Add" button.
5. Click it. Input fields corresponding to your model's input (e.g., a field labeled "sentence") will appear.
6. Type or paste your sentence into the appropriate field.
7. Click "Analyze" or "Add datapoint". LIT will send this sentence to the model.
8. The model's predictions and interpretations for that sentence will then be displayed in the various LIT modules (like "Predictions", "Salience Maps", etc.).
9. You can repeat this process to add and analyze 10-20 sentences, or as many as you need. Each sentence you add will appear in the "Data Table".
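
As an alternative to typing each sentence into the UI, the same datapoints could be prepared in Python and passed as `examples` in place of the empty list. The sketch below is only an illustration: `build_examples` is a hypothetical helper (not part of LIT), the field name `"sentence"` is an assumption that should be checked against `sst_model.input_spec()`, and the sample sentences are made up.

```python
def build_examples(sentences, field_name="sentence"):
    """Turn raw strings into example dicts, skipping blanks and repeats.

    `field_name` is assumed to match the model's text input field.
    """
    examples = []
    seen = set()
    for text in sentences:
        cleaned = " ".join(text.split())  # collapse whitespace, strip the ends
        if not cleaned or cleaned in seen:
            continue
        seen.add(cleaned)
        examples.append({field_name: cleaned})
    return examples


# Made-up sentences, purely for illustration.
examples = build_examples([
    "A charming and often affecting journey.",
    "  The plot never comes together.  ",
    "",
    "A charming and often affecting journey.",
])
print(examples)
```

Each resulting dict has the same shape as a datapoint created through the UI: one key per input field, holding the text to classify.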

**Important Notes**:

* **Model Path**: `SST2Model('./')` assumes that `sst2_tiny.tar.gz` extracts its contents (such as `vocab.txt` and the model checkpoint files) directly into the current working directory. If `tar -xvf sst2_tiny.tar.gz` instead creates a directory (e.g., `sst2_tiny/`), change the call to `models.SST2Model('./sst2_tiny/')`. Check which files the `!tar` command lists to be sure.
* **Input Field Name**: `SST2Model` expects the input text under a specific feature name (usually `"sentence"`). When you add a new datapoint in the UI, make sure you type into the field with that name. Building the dataset spec from `sst_model.input_spec()` keeps the dataset's fields aligned with what the model expects.
* **Widget Port**: If port `8890` is in use, you can change it to another available port.
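
The model path note can be checked programmatically rather than by reading the `tar` output. The following is a minimal sketch using only the standard library. `find_model_dir` is a hypothetical helper, and using `vocab.txt` as the marker file is an assumption based on the file named in the note above.

```python
import os


def find_model_dir(root=".", marker="vocab.txt"):
    """Return the first directory under `root` that contains `marker`, or None.

    The marker filename is an assumption; adjust it to a file you know
    the extracted archive contains.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        if marker in filenames:
            return dirpath
    return None
```

Calling `find_model_dir('.')` after extraction returns `'.'` if the files landed in the working directory, or a path such as `'./sst2_tiny'` if `tar` created a subdirectory. That value is what would be passed to `models.SST2Model`.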

This setup should allow you to start with an empty slate and interactively build up a small dataset for analysis directly within the LIT interface.
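
For the port note, one way to avoid a clash is to test the preferred port before creating the widget. This is a hedged sketch with standard-library sockets: `port_is_free` and `pick_port` are hypothetical helpers, and binding on `127.0.0.1` is an assumption about the interface that matters in your environment.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can bind to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def pick_port(preferred=8890, host="127.0.0.1"):
    """Return `preferred` if it is free, otherwise an OS-assigned free port."""
    if port_is_free(preferred, host):
        return preferred
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))  # port 0 asks the OS for any free port
        return sock.getsockname()[1]
```

The result of `pick_port()` could then be passed as the `port` argument to `notebook.LitWidget`. The check is best-effort: another process can still claim the port between the check and the server start.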

In [1]:
# The pip installation will install all necessary prerequisite packages for use of the core LIT package.
!pip install lit-nlp

# Pin NumPy to 2.x. Note that lit-nlp itself declares numpy>=1.24.1,<2.0.0 (see the pip output),
# so this pin falls outside lit-nlp's declared range; it satisfies other packages such as thinc,
# which requires numpy>=2.0.0,<3.0.0.
!pip install "numpy>=2.0.0,<3.0.0"

from lit_nlp import notebook
from lit_nlp.examples.glue import models # We only need the model
from lit_nlp.api import dataset as lit_dataset # For creating a Dataset
from lit_nlp.api import types as lit_types # For specifying data types if needed, though model.input_spec() is often enough

# Hide INFO and lower logs. Comment this out for debugging.
from absl import logging
logging.set_verbosity(logging.WARNING)

# Fetch the trained model weights (if not already present)
!wget -nc https://storage.googleapis.com/what-if-tool-resources/lit-models/sst2_tiny.tar.gz
!tar -xvf sst2_tiny.tar.gz

# 1. Load the model.
# The path './' assumes the archive extracts the model files directly into the current directory.
# If tar -xvf creates a subdirectory instead (e.g., 'sst2_tiny/'), pass that directory:
# sst_model = models.SST2Model('./sst2_tiny/')
sst_model = models.SST2Model('./')
loaded_models = {'sst_tiny': sst_model}

# 2. Create an empty dataset based on the model's input specification.
# The model's input_spec() tells LIT which input fields it expects (e.g., 'sentence')
# and their types. The list of examples starts empty.
empty_examples = []
# The dataset spec should match the model's input spec so that datapoints added
# in the UI carry the fields the model reads.
initial_spec = sst_model.input_spec()
# If errors occur because the model also expects a 'label' field (even though it is
# not needed for new inputs), it may have to be added to the spec, e.g.:
# spec = {**sst_model.input_spec(), 'label': lit_types.CategoryLabel(vocab=sst_model.output_spec()['label'].vocab)}
# For simply adding new datapoints for prediction, model.input_spec() should suffice.
empty_dataset = lit_dataset.Dataset(spec=initial_spec, examples=empty_examples, description="Empty dataset for manual input.")
datasets_for_widget = {'manual_input_data': empty_dataset}

# 3. Create the LIT widget with the model and the empty dataset.
widget = notebook.LitWidget(loaded_models, datasets_for_widget, port=8890)

# 4. Render the widget
# You can adjust height as needed.
widget.render(height=1000)

Collecting numpy<2.0.0,>=1.24.1 (from lit-nlp)
  Using cached numpy-1.26.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (61 kB)
Using cached numpy-1.26.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.3 MB)
Installing collected packages: numpy
  Attempting uninstall: numpy
    Found existing installation: numpy 2.2.5
    Uninstalling numpy-2.2.5:
      Successfully uninstalled numpy-2.2.5
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
thinc 8.3.6 requires numpy<3.0.0,>=2.0.0, but you have numpy 1.26.4 which is incompatible.
Successfully installed numpy-1.26.4
Collecting numpy<3.0.0,>=2.0.0
  Using cached numpy-2.2.5-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (62 kB)
Using cached numpy-2.2.5-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (16.4 MB)
Installing collected 

All model checkpoint layers were used when initializing TFBertForSequenceClassification.

All the layers of TFBertForSequenceClassification were initialized from the model checkpoint at ./.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFBertForSequenceClassification for predictions without further training.

