<a href="https://colab.research.google.com/github/TurkuNLP/textual-data-analysis-course/blob/main/model_explainability_qa_solved.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Model load

* We now load the model and repeat some of the imports, so the notebook can be run from this point onwards

In [1]:
!pip3 install transformers



In [2]:
!wget http://dl.turkunlp.org/TKO_8964_2023/english-binarized-weighted.model.tgz
!tar zxvf english-binarized-weighted.model.tgz

--2025-02-11 11:09:45--  http://dl.turkunlp.org/TKO_8964_2023/english-binarized-weighted.model.tgz
Resolving dl.turkunlp.org (dl.turkunlp.org)... 195.148.30.23
Connecting to dl.turkunlp.org (dl.turkunlp.org)|195.148.30.23|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 402134026 (384M) [application/octet-stream]
Saving to: ‘english-binarized-weighted.model.tgz’


2025-02-11 11:10:04 (20.4 MB/s) - ‘english-binarized-weighted.model.tgz’ saved [402134026/402134026]

english-binarized-weighted.model/
english-binarized-weighted.model/training_args.bin
english-binarized-weighted.model/pytorch_model.bin
english-binarized-weighted.model/tokenizer.json
english-binarized-weighted.model/vocab.txt
english-binarized-weighted.model/config.json
english-binarized-weighted.model/special_tokens_map.json
english-binarized-weighted.model/tokenizer_config.json


In [3]:
from transformers import AutoTokenizer
from transformers import AutoModelForSequenceClassification

example_pairs = [
    {"question":"When was University of Turku founded?",
     "context": """The University of Turku (Finnish: Turun yliopisto,
      in Swedish: Åbo universitet, shortened UTU), located in Turku in
      southwestern Finland, is the third largest university in the country
      as measured by student enrollment, after the University of Helsinki
      and Tampere University. It is a multidisciplinary university with
      eight faculties. It was established in 1920 and also has facilities
      at Rauma, Pori, Kevo and Seili. The university is a member of
      the Coimbra Group and the European Campus of City-Universities (EC2U)."""
     }
]

Q=example_pairs[0]["question"]
CTX=example_pairs[0]["context"]

MODEL_NAME = 'english-binarized-weighted.model'
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
tokenized = tokenizer(text=Q, text_pair=CTX, return_tensors='pt')
prediction = model(**tokenized)
prediction

SequenceClassifierOutput(loss=None, logits=tensor([[-1.4038,  1.4566]], grad_fn=<AddmmBackward0>), hidden_states=None, attentions=None)

In [4]:
import torch
model.to('cpu')    # keeps the model and the inputs on the same device; a no-op here, since this Colab session runs on CPU anyway

label_names = [
    'no_answer',
    'does_answer'
]

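def predict_qa(question,context):
    # Tokenize the question-context pair and return the name of the highest-scoring label
    tokenized = tokenizer(text=question, text_pair=context, return_tensors='pt')
    with torch.no_grad():    # inference only, so no gradients are needed
        pred = model(**tokenized)
    pred_idx = int(pred.logits.argmax())
    return label_names[pred_idx]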
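
The classifier returns raw logits, and `predict_qa` only reports the label with the largest one. If a probability is more convenient to read, a softmax over the two logits gives one. The following is a minimal sketch in plain Python, using the logits printed above (rounded) as hard-coded input; the `softmax` helper is written here for illustration and is not part of the notebook.

```python
import math

def softmax(logits):
    """Turn a list of logits into probabilities that sum to 1."""
    m = max(logits)                       # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

label_names = ["no_answer", "does_answer"]
logits = [-1.4038, 1.4566]                # rounded logits from the model output above

probs = softmax(logits)
for name, p in zip(label_names, probs):
    print(f"{name}: {p:.3f}")
```

With PyTorch tensors the same result should be obtainable with `torch.softmax(pred.logits, dim=-1)`.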

In [5]:
!pip install captum pandas matplotlib seaborn


In [6]:
from captum.attr import visualization as viz
from captum.attr import LayerIntegratedGradients
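
`LayerIntegratedGradients` applies the integrated gradients method to the output of a chosen layer. As a rough illustration of the idea only, the sketch below computes integrated gradients for a small hand-written function rather than for the model: gradients are averaged along the straight line from a baseline to the input and multiplied by the difference between input and baseline. The function `f`, its gradient `f_grad` and the helper `integrated_gradients` are made up for this example, and the midpoint Riemann sum is one possible approximation, not necessarily the one Captum uses.

```python
def integrated_gradients(f_grad, x, baseline, steps=50):
    """Approximate integrated gradients with a midpoint Riemann sum.

    attribution_i = (x_i - baseline_i) * average over alpha in (0, 1) of
                    df/dx_i evaluated at baseline + alpha * (x - baseline)
    """
    avg_grad = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps                      # midpoint of the k-th interval
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(f_grad(point)):
            avg_grad[i] += g / steps
    return [(xi - b) * g for xi, b, g in zip(x, baseline, avg_grad)]

# Toy function (an assumption for illustration) and its analytic gradient
def f(x):
    return x[0] ** 2 + 3.0 * x[0] * x[1]

def f_grad(x):
    return [2.0 * x[0] + 3.0 * x[1], 3.0 * x[0]]

x = [1.0, 2.0]
baseline = [0.0, 0.0]

attrs = integrated_gradients(f_grad, x, baseline)
# Completeness: the attributions should add up to f(x) - f(baseline)
delta = sum(attrs) - (f(x) - f(baseline))
print("attributions:", attrs)
print("gap to f(x) - f(baseline):", delta)
```

The gap printed at the end plays a role similar to the convergence delta that Captum can return: a value close to zero suggests that the approximation of the integral is adequate.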


In [7]:
# Put the model in evaluation mode and zero out the gradients
model.eval()
model.zero_grad()

# Helper functions

In [8]:
# Forward pass through the model: token ids and masks in, logits out, nothing fancy
def predict(input_ids,token_type_ids,attention_mask):
    pred=model(input_ids,token_type_ids=token_type_ids,attention_mask=attention_mask)
    return pred.logits # the output of the classification layer



In [9]:
# Changed from the lecture notebook, since the input is now a question-context pair.
# Only the context is blanked out in the reference (baseline); the question is kept
# unchanged, because I don't want the explanation to fall on the question string.
# Blanking the question as well would probably have worked fine too.
def construct_input_ref_pair(question,context):
    inp=tokenizer(text=question,text_pair=context,return_tensors="pt")
    # How many tokens does the context have? The reference needs the same number of [PAD] tokens.
    context_tok=tokenizer(context,add_special_tokens=False)["input_ids"]
    ref=tokenizer(text=question,text_pair=" ".join(["[PAD]"]*len(context_tok)),return_tensors="pt")
    return (inp["input_ids"], inp["token_type_ids"], inp["attention_mask"]),\
    (ref["input_ids"], ref["token_type_ids"], ref["attention_mask"])
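
To make the shape of the reference input concrete, here is a small sketch on hand-written token lists. It does not call the tokenizer; the tokens and the helper `build_reference` are assumptions made for illustration, and only mirror what `construct_input_ref_pair` is meant to produce: the same question in both sequences, and one `[PAD]` per context token in the reference.

```python
def build_reference(question_tokens, context_tokens, pad="[PAD]"):
    """Return (input, reference) token lists laid out like a BERT sentence pair."""
    inp = ["[CLS]"] + question_tokens + ["[SEP]"] + context_tokens + ["[SEP]"]
    # The reference keeps the question and replaces every context token with padding
    ref = ["[CLS]"] + question_tokens + ["[SEP]"] + [pad] * len(context_tokens) + ["[SEP]"]
    return inp, ref

# Hand-written tokens, not real tokenizer output
q_tokens = ["when", "was", "it", "founded", "?"]
c_tokens = ["it", "was", "established", "in", "1920", "."]

inp_tokens, ref_tokens = build_reference(q_tokens, c_tokens)
print(inp_tokens)
print(ref_tokens)
```

The two sequences have the same length and differ only in the context positions, which is where a non-zero difference between input and baseline, and therefore a non-zero attribution, can be expected.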



In [10]:
# Here is the bulk of the solution.
# I moved the aggregated function over from
# the lecture notebook and modified it a little
# to work with question-context pairs.

from IPython.display import HTML, display
import torch

# Copied verbatim from the lecture notebook, no need to change
def summarize_attributions(attributions):
    attributions = attributions.sum(dim=-1).squeeze(0)
    attributions = attributions / torch.linalg.vector_norm(attributions,dim=0)
    return attributions

# Modified to work with question-context pairs
def predict_and_explain(model,question,context):
    model.zero_grad() # probably not needed, but done to be safe
    inp,ref = construct_input_ref_pair(question,context) # changed for the exercise
    all_tokens = tokenizer.convert_ids_to_tokens(inp[0][0])

    lig = LayerIntegratedGradients(predict, model.bert.embeddings)
    prediction=predict(*inp)[0]
    prediction_cls=int(torch.argmax(prediction))
    # "negative"/"positive" correspond to no_answer/does_answer; the "Weights" are the raw logits
    print("Prediction:", ("negative","positive")[prediction_cls],"Weights:",prediction.tolist())
    for target,classname in enumerate(("negative","positive")):

        attrs, delta = lig.attribute(inputs=inp,
                                  baselines=ref,
                                  return_convergence_delta=True,target=target)
        attrs_sum = summarize_attributions(attrs)

        x=viz.format_word_importances(all_tokens,attrs_sum)
        print("ATTRIBUTION WITH RESPECT TO",classname)
        display(HTML(x))
        print()

predict_and_explain(model,Q,CTX)

Prediction: positive Weights: [-1.403843879699707, 1.4565602540969849]
ATTRIBUTION WITH RESPECT TO negative



ATTRIBUTION WITH RESPECT TO positive
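
The colours in the visualization come from one score per token. As a plain-Python sketch of what `summarize_attributions` does: the attribution values of each token are summed over the embedding dimension, and the resulting vector of token scores is divided by its L2 norm. The numbers below are invented for illustration and are not model output.

```python
import math

def summarize(token_vectors):
    """Collapse per-dimension attributions to one L2-normalized score per token."""
    sums = [sum(vec) for vec in token_vectors]       # sum over the embedding dimension
    norm = math.sqrt(sum(s * s for s in sums))       # L2 norm over the tokens
    return [s / norm for s in sums]

# Three tokens with three embedding dimensions each (made-up values)
toy_attributions = [
    [0.5, 0.5, 1.0],
    [-1.0, 0.0, 0.0],
    [0.0, 0.25, 0.25],
]

scores = summarize(toy_attributions)
print(scores)
```

After normalization the scores lie between -1 and 1, with positive values pushing towards the target class and negative values pushing away from it.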



