# Visualising attention

Although interpretability is often said to decrease as model size increases, it is still useful to inspect a model from the inside. [bertviz](https://github.com/jessevig/bertviz) lets us plot the attention weights of each head and each layer for a given input.

In [1]:
from transformers import AutoTokenizer, AutoModelForSequenceClassification, utils
from bertviz import model_view, head_view
import csv
utils.logging.set_verbosity_error()  # Suppress standard warnings
from config import rank_emojis_text, labels_es

In [2]:
MODEL = "ccarvajal/beto-emoji"
folder = MODEL.replace('ccarvajal/', '')  # Local folder name: "beto-emoji"

try:
    # Load from the local folder if a copy was saved earlier
    tokenizer = AutoTokenizer.from_pretrained(folder)
    model = AutoModelForSequenceClassification.from_pretrained(folder, output_attentions=True)
except (OSError, ValueError):
    # No usable local copy: download from the Hub and save it for next time
    tokenizer = AutoTokenizer.from_pretrained(MODEL)
    tokenizer.save_pretrained(folder)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL, output_attentions=True)
    model.save_pretrained(folder)

In [3]:
ejemplo = "Tapas + sangria = @ Ocaña Barcelona"
rank_emojis_text(ejemplo,model,tokenizer,labels_es)

1) 😍 0.418
2) 🇪🇸 0.2464
3) 👌 0.097
4) 😊 0.0895
5) ❤ 0.0742
6) 😁 0.0254
7) 💙 0.0105
8) 😎 0.0093
9) 💜 0.0064
10) 💕 0.0048
11) 😉 0.004
12) 😜 0.0032
13) ✨ 0.0026
14) 💞 0.0025
15) 💘 0.0018
16) 😘 0.0016
17) 😂 0.0013
18) 💪 0.0008
19) 🎶 0.0007


This is an example of misclassification. The emoji that actually appears in the tweet is 🇪🇸, not 😍.
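The helper `rank_emojis_text` is imported from the local `config` module, whose source is not shown here. As a rough idea of what a ranking helper of this kind might do, the sketch below turns raw classifier scores (logits) into probabilities with a softmax and sorts the labels. The function name `rank_labels` and the toy logits and labels are hypothetical and are not taken from the notebook.

```python
import math


def rank_labels(logits, labels):
    """Return (label, probability) pairs sorted from most to least likely.

    Hypothetical stand-in for the ranking step; not the notebook's own code.
    """
    # Subtract the largest logit before exponentiating for numerical stability
    max_logit = max(logits)
    exps = [math.exp(x - max_logit) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Pair each label with its probability and sort in descending order
    return sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)


# Toy scores for three made-up labels
toy_logits = [1.0, 2.0, 0.1]
toy_labels = ["a", "b", "c"]
for position, (label, prob) in enumerate(rank_labels(toy_logits, toy_labels), start=1):
    print(f"{position}) {label} {round(prob, 4)}")
```

Because softmax normalises the scores, the listed probabilities sum to 1 (up to rounding).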

We can use this sentence to visualise the attention weights.

In [4]:
def display_model_view(input_text):
    inputs = tokenizer.encode(input_text, return_tensors='pt')  # Tokenize input text
    outputs = model(inputs)  # Run model
    attention = outputs[-1]  # Retrieve attention from model outputs
    tokens = tokenizer.convert_ids_to_tokens(inputs[0])  # Convert input ids to token strings
    model_view(attention, tokens, include_layers=[11])

In [5]:
display_model_view(ejemplo)


In [6]:
def display_head_view(input_text):
    inputs = tokenizer.encode(input_text, return_tensors='pt')  # Tokenize input text
    outputs = model(inputs)  # Run model
    attention = outputs[-1]  # Retrieve attention from model outputs
    tokens = tokenizer.convert_ids_to_tokens(inputs[0])  # Convert input ids to token strings
    head_view(attention, tokens, layer=11)

In [7]:
display_head_view(ejemplo)


The contextualised representation of the [CLS] token is the one used for classification, by means of a feedforward layer (and an activation function). By placing the cursor over [CLS] on the left, we can observe which tokens contribute most to the representation that will be used for prediction.
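The same inspection can be done numerically instead of with the cursor. The sketch below is a minimal illustration, assuming the attention of one layer has been converted to nested lists indexed as `[head][query][key]` (for instance with `attention[-1][0].tolist()`, since each layer's tensor has shape batch × heads × tokens × tokens). The function name `cls_attention_ranking` and the toy values are hypothetical, and averaging over heads is only one of several possible ways to summarise them.

```python
def cls_attention_ranking(layer_attention, tokens, cls_index=0):
    """Rank tokens by the attention that [CLS] pays to them, averaged over heads.

    layer_attention: nested list indexed as [head][query][key].
    """
    num_heads = len(layer_attention)
    # Average, over heads, the row of attention weights whose query is [CLS]
    averaged = [
        sum(head[cls_index][key] for head in layer_attention) / num_heads
        for key in range(len(tokens))
    ]
    return sorted(zip(tokens, averaged), key=lambda pair: pair[1], reverse=True)


# Toy layer with 2 heads and 3 tokens; every row sums to 1, as attention rows do
toy_tokens = ["[CLS]", "tapa", "[SEP]"]
toy_attention = [
    [[0.2, 0.7, 0.1], [0.3, 0.4, 0.3], [0.5, 0.25, 0.25]],  # head 0
    [[0.4, 0.5, 0.1], [0.6, 0.2, 0.2], [0.1, 0.1, 0.8]],    # head 1
]
for token, weight in cls_attention_ranking(toy_attention, toy_tokens):
    print(token, round(weight, 3))
```

In this toy layer, `tapa` receives the largest share of the attention from [CLS].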

In this case there is a strong influence of the token `tapa`; tapas are a common bar food in Barcelona. We may hypothesise that the token `tapa` is causing the misclassification. Let's see what the model predicts for a phrase composed solely of the word `tapas`.

In [8]:
rank_emojis_text("tapas",model,tokenizer,labels_es)

1) 😍 0.6418
2) ❤ 0.1665
3) 😊 0.0505
4) 👌 0.0341
5) 🇪🇸 0.0297
6) 💙 0.0228
7) 💜 0.0148
8) 💕 0.0143
9) 💘 0.0063
10) 💞 0.0045
11) 😁 0.0043
12) 😎 0.0039
13) ✨ 0.0028
14) 😘 0.0008
15) 😉 0.0008
16) 😜 0.0007
17) 😂 0.0006
18) 🎶 0.0004
19) 💪 0.0004


😍 is the emoji with the highest probability (64%). 🇪🇸 only reaches about 3%, compared with about 25% for the full sentence.
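To make the comparison explicit, the small sketch below computes how the probability of each of the top five emojis changes between the full sentence and the single word `tapas`. The probabilities are copied from the two outputs above; the helper name `probability_shift` is hypothetical.

```python
# Probabilities reported above for the full sentence and for "tapas" alone
full_sentence = {"😍": 0.418, "🇪🇸": 0.2464, "👌": 0.097, "😊": 0.0895, "❤": 0.0742}
tapas_only = {"😍": 0.6418, "🇪🇸": 0.0297, "👌": 0.0341, "😊": 0.0505, "❤": 0.1665}


def probability_shift(before, after):
    """Return the change in probability (after minus before) for each shared label."""
    return {
        label: round(after[label] - before[label], 4)
        for label in before
        if label in after
    }


for label, change in probability_shift(full_sentence, tapas_only).items():
    print(label, f"{change:+.4f}")
```

A positive value means the emoji becomes more likely when only `tapas` is given to the model; a negative value means the rest of the sentence was supporting that emoji.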

Let's look at another example.

In [9]:
otro_ejemplo = "Vamoooos!! #Gym #healthy @ Albacete Capital"
rank_emojis_text(otro_ejemplo,model,tokenizer,labels_es)

1) 💪 0.9921
2) 😁 0.0012
3) 😂 0.0011
4) 🇪🇸 0.0009
5) 👌 0.0007
6) 😜 0.0006
7) 😘 0.0006
8) 😉 0.0006
9) 😎 0.0004
10) 😊 0.0003
11) 💙 0.0003
12) 🎶 0.0003
13) 😍 0.0002
14) ❤ 0.0002
15) 💜 0.0001
16) ✨ 0.0001
17) 💘 0.0001
18) 💕 0.0001
19) 💞 0.0001


The classification is correct here: the model predicts 💪, the true label, with high probability. Let's check the attention layers.

In [10]:
display_model_view(otro_ejemplo)


In [11]:
display_head_view(otro_ejemplo)


The tokens with the greatest influence on [CLS] are `gym`, `healthy` and `@`. Interestingly, the first two are loan words from English. As such, they appear split into subword pieces, since the tokenizer is designed for Spanish words. If we classify a more Spanish version of the same tweet, we get the following probabilities.

In [12]:
otro_ejemplo = "Vamos!! #Gimnasio #saludable @ Albacete Capital"
rank_emojis_text(otro_ejemplo,model,tokenizer,labels_es)

1) 💪 0.9857
2) 😁 0.0023
3) 🇪🇸 0.0023
4) 😊 0.002
5) 😘 0.0015
6) 😉 0.0011
7) 👌 0.001
8) 😜 0.0009
9) 😎 0.0005
10) 💙 0.0005
11) 😂 0.0004
12) 🎶 0.0004
13) ✨ 0.0003
14) ❤ 0.0003
15) 💜 0.0002
16) 😍 0.0002
17) 💘 0.0002
18) 💕 0.0002
19) 💞 0.0002


In [13]:
display_head_view(otro_ejemplo)


The prediction stays the same. The main difference, perhaps, is that the token `vamos` now has more influence than `@`.

Note that the transformer takes tokens and models the interactions between them in such a way that the sentence representation stays consistent with its true meaning. Splitting loan words into pieces and writing `vamos` in a more informal way do not prevent a correct classification by a wide margin.
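The splitting of loan words mentioned above can be illustrated with a minimal sketch of greedy longest-match-first subword splitting, the strategy used by WordPiece tokenizers of the BERT family. The vocabulary below is a toy, hypothetical one chosen for illustration; it is not the real vocabulary of this model, so the actual pieces shown by bertviz may differ.

```python
def wordpiece_split(word, vocab, unk="[UNK]"):
    """Split a word into subword pieces by greedy longest-match-first lookup."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink it from the right
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # Marks a piece that continues a word
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            return [unk]  # No piece fits: the whole word is unknown
        pieces.append(match)
        start = end
    return pieces


# Toy vocabulary (hypothetical): Spanish words are whole entries, English ones are not
toy_vocab = {"saludable", "gimnasio", "he", "##al", "##thy", "g", "##ym"}

for word in ["saludable", "healthy", "gym"]:
    print(word, wordpiece_split(word, toy_vocab))
```

With this toy vocabulary the Spanish words stay whole, while `healthy` and `gym` are broken into several pieces, which mirrors the behaviour observed in the attention views.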