# Visualizing Multi-head Attention with the BertViz Library
----

In this notebook, we'll learn how to visualize the attention mechanism in an NLP context. We will use the BertViz library, which simplifies this process.

# Installing Packages and Data Loading
Let's start by installing the BertViz library and the Transformers library from Hugging Face. The latter lets us easily load a pre-trained model.

In [1]:
! pip install bertviz transformers

Collecting bertviz
  Downloading bertviz-1.4.0-py3-none-any.whl.metadata (19 kB)
Collecting boto3 (from bertviz)
  Downloading boto3-1.35.65-py3-none-any.whl.metadata (6.7 kB)
Collecting botocore<1.36.0,>=1.35.65 (from boto3->bertviz)
  Downloading botocore-1.35.65-py3-none-any.whl.metadata (5.7 kB)
Collecting jmespath<2.0.0,>=0.7.1 (from boto3->bertviz)
  Downloading jmespath-1.0.1-py3-none-any.whl.metadata (7.6 kB)
Collecting s3transfer<0.11.0,>=0.10.0 (from boto3->bertviz)
  Downloading s3transfer-0.10.3-py3-none-any.whl.metadata (1.7 kB)
Downloading bertviz-1.4.0-py3-none-any.whl (157 kB)
Downloading boto3-1.35.65-py3-none-any.whl (139 kB)
Downloading botocore-1.35.65-py3-none-any.whl (12.9 MB)

Now we import the required functions and classes from these libraries.

In [2]:
from bertviz import model_view, head_view
from transformers import AutoTokenizer, AutoModel, utils

utils.logging.set_verbosity_error()  # Suppress standard warnings

## Setting up our Model
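We now load our tokenizer and model using the Transformers library from Hugging Face. We will use the "distilbert-base-uncased" checkpoint for both, loading them with the AutoTokenizer and AutoModel classes. Passing output_attentions=True makes the model return its attention weights along with its hidden states.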

In [3]:
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModel.from_pretrained("distilbert-base-uncased", output_attentions=True)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



Let's now set up our own input for testing and pass it to the model.

In [9]:
inputs = tokenizer.encode("I love 490. Professor Ammar is bad.", return_tensors='pt')
outputs = model(inputs)

## Visualizing Attention Weights
Finally, we take the attention weights from the model outputs and convert the input IDs back into their tokens. We then pass both to the head_view function.

In [10]:
attention = outputs[-1]  # Output includes attention weights when output_attentions=True
tokens = tokenizer.convert_ids_to_tokens(inputs[0])
head_view(attention, tokens)
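
To make concrete what a single head's attention matrix contains, here is a minimal sketch of scaled dot-product attention weights using only the standard library. The toy tokens, the 2-dimensional vectors, and the helper names `softmax` and `attention_weights` are illustrative assumptions, not values or functions from DistilBERT or BertViz. A real model also applies separate learned query and key projections, which this sketch skips by reusing the same vectors for both.

```python
import math

def softmax(row):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(queries, keys):
    """Scaled dot-product attention weights: softmax(q . k / sqrt(d_k)) per query."""
    d_k = len(keys[0])
    weights = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights.append(softmax(scores))
    return weights

# Hypothetical toy inputs: four tokens, each with a made-up 2-d vector.
toy_tokens = ["[CLS]", "i", "love", "[SEP]"]
toy_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]

weights = attention_weights(toy_vectors, toy_vectors)
for token, row in zip(toy_tokens, weights):
    print(f"{token:>6}", [round(w, 2) for w in row])
```

Each row is one token's attention distribution over all tokens in the sequence, so every row sums to 1. The model returns one such matrix for every head in every layer, and those matrices are what the visualization displays.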


Alternatively, you can use the model_view function, which shows the attention of every head in every layer at once, with each head drawn separately.

In [6]:
model_view(attention, tokens)
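
The same layers-by-heads grid can also be summarized in text. The sketch below is a rough illustration under stated assumptions: it builds random stand-in data with 6 layers and 12 heads (the sizes of distilbert-base-uncased) instead of using the real model output, and the helper names `make_standin_attention` and `top_attended` are hypothetical. The stand-in also omits the batch dimension, which the real attention tensors include, so the indexing would need adjusting for real output.

```python
import random

def random_attention_row(n, rng):
    """Random positive weights normalized to sum to 1, like one attention row."""
    raw = [rng.random() for _ in range(n)]
    total = sum(raw)
    return [r / total for r in raw]

def make_standin_attention(num_layers, num_heads, seq_len, seed=0):
    """Nested lists indexed as [layer][head][query][key]; random data, not model output."""
    rng = random.Random(seed)
    return [
        [
            [random_attention_row(seq_len, rng) for _ in range(seq_len)]
            for _ in range(num_heads)
        ]
        for _ in range(num_layers)
    ]

def top_attended(attention, tokens, query_index):
    """For each (layer, head), find the token the query token attends to most."""
    summary = {}
    for layer_idx, layer in enumerate(attention):
        for head_idx, head in enumerate(layer):
            row = head[query_index]
            best = max(range(len(row)), key=row.__getitem__)
            summary[(layer_idx, head_idx)] = tokens[best]
    return summary

# Hypothetical token list standing in for the tokenizer output.
demo_tokens = ["[CLS]", "i", "love", "490", ".", "[SEP]"]
demo_attention = make_standin_attention(num_layers=6, num_heads=12, seq_len=len(demo_tokens))

summary = top_attended(demo_attention, demo_tokens, query_index=2)  # query token: "love"
for (layer_idx, head_idx), token in sorted(summary.items())[:5]:
    print(f"layer {layer_idx}, head {head_idx}: 'love' attends most to {token!r}")
```

With 6 layers and 12 heads there are 72 entries, one per thumbnail in the model view. Because the data here is random, the printed tokens carry no meaning; only the structure of the loop is the point.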
