# Exploring BERT-style models

In this class, we're going to look at a number of different ways we can visualise and interpret different layers in transformer models.

There are three main tools which are useful here:

- BERTviz
    - https://github.com/jessevig/bertviz
- Ecco
    - https://github.com/jalammar/ecco
- Language Interpretability Tool (LIT)
    - https://github.com/PAIR-code/lit

Each of these tools has been described in a peer-reviewed publication, but each does something a little different.

## BERTviz - Attention heads

We're going to start with BERTviz, created by Jesse Vig. You can read more about how it works in the [research paper](https://aclanthology.org/P19-3007/) and [this blog post](https://towardsdatascience.com/deconstructing-bert-part-2-visualizing-the-inner-workings-of-attention-60a16d86b5c1).

!pip install transformers bertviz

In [1]:
from transformers import AutoTokenizer, AutoModel, utils
from bertviz import model_view
utils.logging.set_verbosity_error()  # Suppress standard warnings

We then start by naming the model we want to explore and some input text we want to inspect.

Feel free to change both of these! You can find a full list of models available via the HuggingFace models hub.

In this example, we're going to be using the original BERT architecture.

__Question:__
- What do you think *uncased* refers to here? Why do you think it's there?

In [2]:
model_name = "bert-base-uncased"  # Find popular HuggingFace models here: https://huggingface.co/models
input_text = "The cat sat on the mat"  
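One way to check your answer to the question above is to mimic the text normalisation step yourself. The sketch below is a simplified stand-in, not the tokenizer's actual code: the helper name `uncase` is hypothetical, and it assumes that an uncased BERT checkpoint lowercases its input and strips accents before splitting it into tokens.

```python
import unicodedata


def uncase(text):
    """Lowercase text and strip accents (hypothetical helper for illustration)."""
    # NFD splits an accented letter into a base letter plus a combining mark
    decomposed = unicodedata.normalize("NFD", text.lower())
    # Category "Mn" covers the combining marks, so dropping it removes accents
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


print(uncase("The Cat sat on the Mat"))
print(uncase("Café"))
```

Under this assumption, "Cat" and "cat" end up as the same input to the model, which is worth keeping in mind when you compare cased and uncased checkpoints.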

We then initialize the model and the associated pretrained tokenizer using the ```AutoModel``` and ```AutoTokenizer``` classes.

This essentially allows the ```transformers``` package to automatically infer the correct architecture for the pretrained weights that we choose. In the past, this all had to be defined manually...!

You can learn more about the ```AutoModel``` class via [this link](https://huggingface.co/docs/transformers/quicktour#autoclass).

Notice that, compared to last week, we're now asking for the attention weights to be part of the output. This is what allows us to explore the model.
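As a reminder of what these attention weights are, here is a minimal sketch of scaled dot-product attention for a single head, using NumPy and random stand-in vectors rather than a real model. The function name `attention_weights` and the random inputs are assumptions for illustration; the head size of 64 comes from BERT-base (hidden size 768 split over 12 heads).

```python
import numpy as np


def attention_weights(queries, keys):
    """Return softmax(Q K^T / sqrt(d)) for one attention head."""
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)
    # Subtract the row maximum before exponentiating, for numerical stability
    scores = scores - scores.max(axis=-1, keepdims=True)
    exp_scores = np.exp(scores)
    return exp_scores / exp_scores.sum(axis=-1, keepdims=True)


rng = np.random.default_rng(0)
seq_len, head_dim = 8, 64  # 8 tokens; 64 = 768 / 12 per head in BERT-base
q = rng.normal(size=(seq_len, head_dim))  # stand-in query vectors
k = rng.normal(size=(seq_len, head_dim))  # stand-in key vectors

weights = attention_weights(q, k)
print(weights.shape)         # (8, 8): one row of weights per token
print(weights.sum(axis=-1))  # every row sums to (approximately) 1
```

Each row is a probability distribution over the tokens in the sequence, which is exactly what the lines in the visualisation encode.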

In [3]:
model = AutoModel.from_pretrained(model_name, output_attentions=True)  # Configure model to return attention values
tokenizer = AutoTokenizer.from_pretrained(model_name)

We then tokenize our input text using the pretrained tokenizer and run it through the model to produce our outputs.

__Question:__
- What do you think ```return_tensors='pt'``` is doing?

In [4]:
inputs = tokenizer.encode(input_text, return_tensors='pt')  # Tokenize input text
outputs = model(inputs)  # Run model

The attention weights are the last element of the outputs. We retrieve those weights and convert the input IDs back into token strings.

We then pass the attention weights and the tokens to ```model_view()``` to visualise them.
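To make the structure of these attention weights concrete, here is a sketch that uses random stand-in data instead of a real model. It assumes the layout that HuggingFace models return when `output_attentions=True`: a tuple with one array per layer, each of shape `(batch_size, num_heads, seq_len, seq_len)`. The helper `most_attended`, the token list, and the fake data are all hypothetical.

```python
import numpy as np

# Assumed tokenization of the example sentence, including special tokens
tokens = ["[CLS]", "the", "cat", "sat", "on", "the", "mat", "[SEP]"]
num_layers, num_heads, seq_len = 12, 12, len(tokens)  # sizes of bert-base-uncased

rng = np.random.default_rng(0)
raw = rng.random((num_layers, 1, num_heads, seq_len, seq_len))
# Normalise over the last axis so each row is a distribution, like real attention
fake_attention = tuple(layer / layer.sum(axis=-1, keepdims=True) for layer in raw)


def most_attended(attention, tokens, layer, head):
    """For each token, return the token it attends to most in one head."""
    weights = attention[layer][0, head]  # first (and only) item in the batch
    return [(tok, tokens[int(row.argmax())]) for tok, row in zip(tokens, weights)]


print(len(fake_attention), fake_attention[0].shape)  # 12 (1, 12, 8, 8)
for source, target in most_attended(fake_attention, tokens, layer=2, head=0):
    print(f"{source:>6} -> {target}")
```

With real model outputs you would index the tensors in the same way; the random data here only demonstrates the shapes, not any meaningful pattern.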

__Question:__
- Why are we converting back to tokens from IDs? Why can't we just use the input string?

In [5]:
attention = outputs[-1]  # Retrieve attention from model outputs
tokens = tokenizer.convert_ids_to_tokens(inputs[0])  # Convert input ids to token strings
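If you want to explore the question above by hand, the toy sketch below mimics the round trip from text to IDs and back. It is not the real tokenizer: the mini-vocabulary, the helper names (`wordpiece`, `encode`, `ids_to_tokens`) and the simplified greedy longest-match splitting are assumptions for illustration only.

```python
# Hypothetical mini-vocabulary; the real one has roughly 30,000 entries
vocab = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "the", "cat", "sat", "on", "mat", "##s", "##ting"]
token_to_id = {tok: i for i, tok in enumerate(vocab)}


def wordpiece(word):
    """Greedy longest-match-first split of one word into sub-word pieces."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        while end > start:
            # Pieces that continue a word are marked with a "##" prefix
            piece = word[start:end] if start == 0 else "##" + word[start:end]
            if piece in token_to_id:
                break
            end -= 1
        if end == start:  # nothing matched, so the whole word is unknown
            return ["[UNK]"]
        pieces.append(piece)
        start = end
    return pieces


def encode(text):
    """Lowercase, split on whitespace, add special tokens, and map to IDs."""
    tokens = ["[CLS]"]
    for word in text.lower().split():
        tokens.extend(wordpiece(word))
    tokens.append("[SEP]")
    return [token_to_id[tok] for tok in tokens]


def ids_to_tokens(ids):
    """Reverse lookup from IDs to token strings."""
    return [vocab[i] for i in ids]


ids = encode("The cats sat on the mat")
print(ids)
print(ids_to_tokens(ids))
```

Compare the printed tokens with the original string: the number of items and their spelling are not the same, and the attention matrix has one row and column per token, not per word.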

In [6]:
model_view(attention, tokens)  # Display model view

__TASK:__
- In small groups, choose a couple of different models to compare
  - These might be different architectures, sizes, monolingual vs multilingual
- Have one visualisation per laptop
- Compare the kinds of patterns you see in the different models
  - Are they similar or different? How and in what ways do they differ?

## BERTviz - Neuron view

Viewing the attention heads above allows us to inspect how specific words attend to others in our input sequence(s).

However, we might also want to dig a little deeper here and see *why* this turns out to be the case. 

BERTviz allows us to do that digging, using what's called the *neuron view*.
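The sketch below walks through the quantities that the neuron view breaks attention down into: a query vector, the key vectors, their elementwise product, the resulting dot product, and the softmax over those dot products. It uses random stand-in vectors rather than real model values, and the variable names and token list are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
tokens = ["[CLS]", "the", "cat", "sat"]
head_dim = 64  # per-head vector size in BERT-base (768 / 12)

query = rng.normal(size=head_dim)                # query vector of the selected token
keys = rng.normal(size=(len(tokens), head_dim))  # one key vector per token

elementwise = query * keys               # per-dimension contributions, one row per key
dot_products = elementwise.sum(axis=1)   # summing each row gives the dot product q . k
scaled = dot_products / np.sqrt(head_dim)

# Softmax over the keys turns the scaled scores into attention weights
weights = np.exp(scaled - scaled.max())
weights /= weights.sum()

for tok, score, weight in zip(tokens, dot_products, weights):
    print(f"{tok:>6}  q.k = {score:7.2f}  attention = {weight:.3f}")
```

The individual entries of `elementwise` are what show which dimensions of the query and key push a particular attention weight up or down.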

In [7]:
# Import specialized versions of models (that return query/key vectors)
from bertviz.transformers_neuron_view import BertModel, BertTokenizer
from bertviz.neuron_view import show

model_type = 'bert'
model_version = 'bert-base-uncased'
do_lower_case = True

Here we can define two separate sequences for our input.

In [8]:
sentence_a = "The cat sat on the mat"
sentence_b = "It was a fat cat"

And again, we initialise our model and pretrained tokenizer. 

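If we only want a pipeline that works with BERT architectures, and doesn't need to work with others, we can choose to initialize using ```BertModel``` instead of ```AutoModel```, like we do here. Note that this ```BertModel``` is the specialised version we imported from ```bertviz.transformers_neuron_view``` above, which also returns the query and key vectors.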

In [9]:
model = BertModel.from_pretrained(model_version, output_attentions=True)
tokenizer = BertTokenizer.from_pretrained(model_version, do_lower_case=do_lower_case)

We can then show how different words relate to one another.

__Question:__
- Are these visuals immediately interpretable? Can you make sense of them in your group?
  - Try skimming Jesse Vig's blog post on BERTviz: https://towardsdatascience.com/deconstructing-bert-part-2-visualizing-the-inner-workings-of-attention-60a16d86b5c1

In [None]:
show(model, model_type, tokenizer, sentence_a, sentence_b, layer=2, head=0)

## Ecco

An alternative approach to visualising how transformers work is *Ecco*, created by Jay Alammar. You probably know him as the guy from whom I steal all of my visualisations for teaching!

**NB**: I have had problems with rendering this visualisation in UCloud for some reason. If you have the same problem, check out the Colab Notebook created by Jay Alammar himself, where he actually presents this example. You can find that Notebook [here](https://colab.research.google.com/github/jalammar/ecco/blob/main/notebooks/readme.md%20examples.ipynb#scrollTo=What_are_the_patterns_in_BERT_neuron_activation_when_it_processes_a_piece_of_text_).

Jay Alammar has also created some really nice documentation to go along with the tool. Check it out [here](https://ecco.readthedocs.io/en/main/).

__TASK__:
- Notice that here and in the notebook we're using something called *DistilBERT*. 
- In your groups, find out what the difference is between this and the regular BERT we've been using above.
  - Hint: Google is your friend

!pip install ecco


In [10]:
import ecco
lm = ecco.from_pretrained('distilbert-base-uncased', activations=True)

We define a text that we want to work with.

In [16]:
text = ''' Now I ask you: what can be expected of man since he is a being endowed with strange qualities? Shower upon him every earthly blessing, drown him in a sea of happiness, so that nothing but bubbles of bliss can be seen on the surface; give him economic prosperity, such that he should have nothing else to do but sleep, eat cakes and busy himself with the continuation of his species, and even then out of sheer ingratitude, sheer spite, man would play you some nasty trick. He would even risk his cakes and would deliberately desire the most fatal rubbish, the most uneconomical absurdity, simply to introduce into all this positive good sense his fatal fantastic element. It is just his fantastic dreams, his vulgar folly that he will desire to retain, simply in order to prove to himself--as though that were so necessary-- that men still are men and not the keys of a piano, which the laws of nature threaten to control so completely that soon one will be able to desire nothing but by the calendar. And that is not all: even if man really were nothing but a piano-key, even if this were proved to him by natural science and mathematics, even then he would not become reasonable, but would purposely do something perverse out of simple ingratitude, simply to gain his point. And if he does not find means he will contrive destruction and chaos, will contrive sufferings of all sorts, only to gain his point! He will launch a curse upon the world, and as only man can curse (it is his privilege, the primary distinction between him and other animals), may be by his curse alone he will attain his object--that is, convince himself that he is a man and not a piano-key!
'''

Notice how this tool uses far fewer lines of code, and doesn't require us to do anything with the ```transformers``` library directly.

In [17]:
inputs = lm.tokenizer([text], return_tensors="pt")
output = lm(inputs)
print(output)

<LMOutput 'None' # of lm outputs: 1>


What makes Alammar's Ecco kind of interesting is that it uses a kind of *dimensionality reduction* to reduce the neuron activations we asked for above (```activations=True```) to a smaller number of more regular activation patterns. The number of patterns is set by ```n_components```.

__Questions:__
- Are these results more or less interpretable than the BERTviz outputs?
- Can you explain any of the individual components in a natural language way?
- [More difficult] What does ```.run_nmf()``` actually do? How are we performing dimensionality reduction? 
  - Hint: Check the docs

__TASK__:
- Change the number of components from 8 to a smaller and a larger number. How does this affect the outputs?
- Compare different input texts. Do the same patterns appear?
- Choose a different model. Do the same patterns appear?
  - Note: You might need to check the docs or the Github repo to figure out which models can be used.
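As a rough illustration of the dimensionality-reduction idea (it partly answers the harder question above, so try the docs first), the sketch below factorises a matrix of non-negative values into a small number of components using non-negative matrix factorisation with multiplicative updates. This is a toy stand-in on random data, not Ecco's implementation: the function name `toy_nmf`, the matrix sizes, and the choice of update rule are all assumptions made for the example.

```python
import numpy as np


def toy_nmf(V, n_components, n_iter=200, seed=0, eps=1e-9):
    """Approximate a non-negative matrix V (neurons x tokens) as W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    n_neurons, n_tokens = V.shape
    W = rng.random((n_neurons, n_components)) + eps  # how much each neuron belongs to a component
    H = rng.random((n_components, n_tokens)) + eps   # how active each component is on each token
    for _ in range(n_iter):
        # Multiplicative updates keep every entry non-negative
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Fake "activations": 50 neurons over 20 tokens, built from 3 hidden patterns
rng = np.random.default_rng(1)
activations = rng.random((50, 3)) @ rng.random((3, 20))

W, H = toy_nmf(activations, n_components=3)
error = np.linalg.norm(activations - W @ H) / np.linalg.norm(activations)
print(H.shape)  # (3, 20): one activation curve over the tokens per component
print(f"relative reconstruction error: {error:.3f}")
```

Each row of `H` plays the role of one coloured component in the visualisation: a single curve over the tokens that summarises many neurons at once. Changing `n_components` here is a cheap way to build intuition before you change it in the cell below.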

In [18]:
nmf_1 = output.run_nmf(n_components=8) 
nmf_1.explore()

## Language Interpretability Tool (LIT)

The last tool we're going to look at today is more full-featured compared to BERTviz and Ecco.

LIT can be used to create interactive apps hosted in your browser, sort of like Streamlit or Shiny. But it can also run as a widget within a Notebook, like we're going to do here.

The first thing we have to do (for the purposes of this classroom) is to make sure that we don't have any dependency clashes with existing packages. Note also that the LIT example we use here relies on ```TensorFlow``` for all of its calculations, not ```PyTorch```.

In [None]:
# Install LIT and transformers packages. The transformers package is needed by the model and dataset we are using.
# Replace tensorflow-datasets with the nightly package to get up-to-date dataset paths.
!pip uninstall -y tensorflow-datasets transformers
!pip install transformers -U
!pip install lit_nlp tfds-nightly transformers==4.1.1

In contrast to BERTviz and Ecco, LIT is not necessarily used to visualise attention or weights throughout the model. Instead, it's used to visualise results from models trained for specific tasks, such as document classification.

__Question:__
- Why might this be important or beneficial for us as researchers?
- What benefit does this have over the previous two approaches?

We're going to make use of an example model which has been fine-tuned on the SST2 sentiment dataset using [BERT-tiny](https://huggingface.co/google/bert_uncased_L-2_H-128_A-2). BERT-tiny consists of only two layers, with hidden vectors of size H=128.

In [None]:
# Fetch the trained model weights
!wget https://storage.googleapis.com/what-if-tool-resources/lit-models/sst2_tiny.tar.gz
!tar -xvf sst2_tiny.tar.gz

We can then load the LIT tools we need, define our dataset and our model, and then feed that to the visualization widget.

**NB:** Again, I've had problems making this work on UCloud. If you have similar issues, check out the LIT Colab Notebook [here](https://colab.research.google.com/github/PAIR-code/lit/blob/main/lit_nlp/examples/notebooks/LIT_sentiment_classifier.ipynb#scrollTo=ukXamAB_FBM8).

In [None]:
# Create the LIT widget with the model and dataset to analyze.
from lit_nlp import notebook
from lit_nlp.examples.datasets import glue
from lit_nlp.examples.models import glue_models

datasets = {'sst_dev': glue.SST2Data('validation')}
models = {'sst_tiny': glue_models.SST2Model('./')}

widget = notebook.LitWidget(models, datasets, height=800)

In [None]:
# Render the widget
widget.render()

__Questions:__
- Is this easier or more difficult to interpret than the previous tools?
  - Which do you prefer? Why?
- What features would you like to see which are currently not part of any of these tools?