# Getting started

In this notebook, we'll walk through the basic functionality of AT2 by using it for a *context attribution* task.
Given a context and a query, we use a language model to generate a response.
From here, AT2 can tell us which sources from the context (if any) the language model *uses* to generate its response.

Applying AT2 requires learning a *score estimator* for a particular model to estimate the influence of a given source on the model's generation.
In this notebook, we'll use an existing AT2 score estimator for [`microsoft/Phi-4-mini-instruct`](https://huggingface.co/microsoft/Phi-4-mini-instruct).
To see how to train a score estimator from scratch, check out [this tutorial](https://github.com/MadryLab/AT2/blob/main/notebooks/train_at2.ipynb).

In [1]:
import torch as ch
from datasets import load_dataset

In [2]:
from at2.tasks import SimpleContextAttributionTask
from at2.utils import get_model_and_tokenizer
from at2 import AT2Attributor, AT2ScoreEstimator


We'll start by loading the model and its tokenizer.

In [3]:
model_name = "microsoft/Phi-4-mini-instruct"
dtype = ch.bfloat16
model, tokenizer = get_model_and_tokenizer(model_name, dtype=dtype)

Next, we'll create an "attribution task."
An attribution task consists of an input sequence, a generated sequence, a model/tokenizer and a set of sources to which we would like to attribute the generated sequence.
In this case, the input sequence is a news article from [CNN DailyMail](https://huggingface.co/datasets/abisee/cnn_dailymail) and a request to summarize it.
The generated sequence is the model's response and the sources are sentences from the news article.
By pinpointing the sentences from the news article that the model uses to generate a given statement, we obtain a "citation" for this statement.
We've defined a class, `SimpleContextAttributionTask`, to quickly create an attribution task from an example in the dataset.

In [4]:
dataset = load_dataset("cnn_dailymail", "3.0.0", split="validation")
example = dataset[0]
context = example["article"]
query = "Summarize the article in up to three sentences."

task = SimpleContextAttributionTask(
    context=context,
    query=query,
    model=model,
    tokenizer=tokenizer,
    source_type="sentence",
)

The first step is to generate a response for the context (a news article) and instruction (a request to summarize it).
The `AttributionTask` class handles this for us.

In [5]:
print("### Context ###")
print((context[:500] + "...") if len(context) > 500 else context)
print()
print("### Instruction ###")
print(query)
print()
print("### Generated response ###")
# `task.generation` generates a response and caches relevant information for attribution
print(task.generation)

### Context ###
(CNN)Share, and your gift will be multiplied. That may sound like an esoteric adage, but when Zully Broussard selflessly decided to give one of her kidneys to a stranger, her generosity paired up with big data. It resulted in six patients receiving transplants. That surprised and wowed her. "I thought I was going to help this one person who I don't know, but the fact that so many people can have a life extension, that's pretty big," Broussard told CNN affiliate KGO. She may feel guided in her ge...

### Instruction ###
Summarize the article in up to three sentences.

### Generated response ###
Zully Broussard's selfless kidney donation led to a chain of six transplants, thanks to a data-driven matching system that connected her with other donors and recipients. The process, facilitated by a computer program called MatchGrid, significantly reduced the time needed to find compatible donor-recipient pairs, from months to weeks. This rare long-chain transplant showcases the power of altruism and technology in saving lives.

Next, let's look at this task's "sources," i.e., the units to which we would like to attribute the model's generation.
The AT2 score estimator assigns each source a score that signifies its influence on the generation.

In [6]:
print("Total sources:", task.num_sources)
# Print the first few sources (sentences from the context)
for i in range(3):
    print(f"Source #{i}:")
    print(task.sources[i].strip())
    print()

Total sources: 43
Source #0:
(CNN)Share, and your gift will be multiplied.

Source #1:
That may sound like an esoteric adage, but when Zully Broussard selflessly decided to give one of her kidneys to a stranger, her generosity paired up with big data.

Source #2:
It resulted in six patients receiving transplants.

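To make the notion of sentence-level sources concrete, here is a minimal sketch of splitting a context into sentences. The helper `split_into_sentences` and its regular expression are illustrative assumptions, not AT2's implementation, so the boundaries AT2 produces may differ (for example around abbreviations).

```python
import re


def split_into_sentences(text: str) -> list[str]:
    """Naively split text after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


# A short sample built from sentences of the article shown above.
sample = (
    "(CNN)Share, and your gift will be multiplied. "
    "It resulted in six patients receiving transplants. "
    "That surprised and wowed her."
)
sources = split_into_sentences(sample)
for i, source in enumerate(sources):
    print(f"Source #{i}: {source}")
```

Each returned string plays the role of one entry in `task.sources`.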


Next, we'll create an `AT2Attributor` (which uses attention weights to estimate an attribution score for each source) for `Phi-4-mini-instruct`.

In [7]:
attributor = AT2Attributor.from_hub(task, "madrylab/at2-phi-4-mini-instruct")

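As rough intuition for how attention weights could be turned into per-source scores, the sketch below sums the attention that generated tokens pay to each source's tokens and combines the heads with one coefficient per head. The shapes, the random inputs, the coefficients, and the helper `score_sources` are all illustrative assumptions; this is not AT2's actual implementation.

```python
import numpy as np


def score_sources(attention, source_slices, head_weights):
    """Score each source as a weighted sum over heads of attention mass.

    attention: array of shape (num_heads, num_generated_tokens, num_context_tokens)
    source_slices: one slice over context tokens per source
    head_weights: array of shape (num_heads,), one coefficient per head
    """
    scores = []
    for token_slice in source_slices:
        # Attention from all generated tokens to this source's tokens, per head.
        per_head_mass = attention[:, :, token_slice].sum(axis=(1, 2))
        scores.append(float(per_head_mass @ head_weights))
    return np.array(scores)


rng = np.random.default_rng(0)
attention = rng.random((4, 5, 12))  # 4 heads, 5 generated tokens, 12 context tokens
source_slices = [slice(0, 4), slice(4, 9), slice(9, 12)]  # 3 hypothetical sources
head_weights = np.array([0.5, 0.0, 1.0, -0.25])  # hypothetical learned coefficients

print(score_sources(attention, source_slices, head_weights))
```

In this toy setup, a head with a zero coefficient contributes nothing to any source's score, and a negative coefficient lowers it.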
We'll be interested in attributing a particular sentence from the response.
The `AttributionTask` class has a nice utility to help us out with this.

In [8]:
task.show_target_with_indices()

[(0, 170)] Zully Broussard's selfless kidney donation led to a chain of six transplants, thanks to a data-driven matching system that connected her with other donors and recipients. [(171, 337)] The process, facilitated by a computer program called MatchGrid, significantly reduced the time needed to find compatible donor-recipient pairs, from months to weeks. [(338, 433)] This rare long-chain transplant showcases the power of altruism and technology in saving lives.

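The index pairs in this output are character offsets into the generated response, so slicing the response with a pair recovers the corresponding sentence. The sketch below recomputes such offsets for a short string. The helper `sentence_spans` and its regular expression are assumptions for illustration and need not match how AT2 segments the response.

```python
import re


def sentence_spans(text: str) -> list[tuple[int, int]]:
    """Return (start, end) character offsets for each sentence in text."""
    # A sentence starts at a non-space character and runs through its closing punctuation.
    return [m.span() for m in re.finditer(r"[^.!?\s][^.!?]*[.!?]+", text)]


response = "One two. Three four."
for start, end in sentence_spans(response):
    print((start, end), repr(response[start:end]))
```

As in the notebook's output, the end offset is exclusive and the space between sentences belongs to neither span.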

Let's attribute the second sentence!

In [9]:
start, end = (171, 337)
attributor.show_attribution(start=start, end=end, verbose=True)

Computing attribution scores for:
 The process, facilitated by a computer program called MatchGrid, significantly reduced the time needed to find compatible donor-recipient pairs, from months to weeks.


| | Score | Source |
|---|---|---|
| 0 | 0.003 | Jacobs paid it forward with his programming skills, creating MatchGrid, a program that genetically matches up donor pairs or chains quickly. |
| 1 | 0.002 | That changed when a computer programmer named David Jacobs received a kidney transplant. |
| 2 | 0.001 | But the power that multiplied Broussard's gift was data processing of genetic profiles from donor-recipient pairs. |
| 3 | 0.001 | "When we did a five-way swap a few years ago, which was one of the largest, it took about three to four months. |
| 4 | 0.001 | "The significance of the altruistic donor is that it opens up possibilities for pairing compatible donors and recipients," said Dr. Steven Katznelson. |
| 5 | 0.001 | We did this in about three weeks," Jacobs said. |
| 6 | 0.001 | It's been done before, California Pacific Medical Center said in a statement, but matching up the people in the chain has been laborious and taken a long time. |
| 7 | 0.0 | It works on a simple swapping principle but takes it to a much higher level, according to California Pacific Medical Center in San Francisco. |


We can also directly access the attribution scores as follows.

In [10]:
scores = attributor.get_attribution_scores(start=start, end=end)
scores

array([ 1.14917755e-04,  2.26020813e-04,  9.82284546e-05,  7.10487366e-05,
        1.42097473e-04,  1.07288361e-04,  7.82012939e-05,  3.05175781e-05,
        2.70605087e-05,  9.53674316e-05,  1.38092041e-03,  4.69207764e-04,
        3.35693359e-04,  1.35421753e-04,  3.64303589e-04,  1.64031982e-04,
        1.44004822e-04,  2.76565552e-04,  1.26838684e-04,  1.90734863e-04,
        2.03132629e-04,  1.37329102e-04,  2.46047974e-04,  7.24792480e-05,
        5.84125519e-05,  1.90734863e-04,  2.26020813e-04,  2.76565552e-04,
        2.06947327e-04,  1.22070312e-04,  3.39508057e-04,  1.51634216e-04,
        8.62121582e-04,  1.50299072e-03,  1.08718872e-04,  2.92968750e-03,
        1.25122070e-03,  9.15527344e-04,  2.65121460e-04,  9.61303711e-04,
        4.36782837e-04, -4.62532043e-05, -5.67436218e-05], dtype=float32)

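The array has 43 entries, matching the 43 sources, so each score can be paired with a source by position (assuming the array follows the order of `task.sources`). A minimal sketch of ranking sources by score follows. The helper `top_k_sources` and the toy values are hypothetical and stand in for `scores` and `task.sources`.

```python
import numpy as np


def top_k_sources(scores, sources, k=3):
    """Return (index, score, source) triples for the k highest-scoring sources."""
    order = np.argsort(scores)[::-1][:k]  # indices sorted by descending score
    return [(int(i), float(scores[i]), sources[i]) for i in order]


# Toy values for illustration only.
toy_scores = np.array([0.1, 0.5, -0.2, 0.3], dtype=np.float32)
toy_sources = ["first sentence", "second sentence", "third sentence", "fourth sentence"]

for index, score, source in top_k_sources(toy_scores, toy_sources, k=2):
    print(f"{index}\t{score:.3f}\t{source}")
```

Calling `top_k_sources(scores, task.sources, k=8)` in the notebook would give a ranking comparable to the table printed by `show_attribution`, under the ordering assumption above.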
If we'd like to perform attribution for multiple examples, we can load the trained AT2 score estimator once and then pass it to the `AT2Attributor` constructor for each task, as follows.

In [11]:
score_estimator = AT2ScoreEstimator.from_hub("madrylab/at2-phi-4-mini-instruct")
attributor = AT2Attributor(task, score_estimator)
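
The load-once pattern can be sketched as a small loop. The helper `build_attributors` and the stand-in functions below are hypothetical, so the snippet runs without a model or the AT2 package.

```python
def build_attributors(tasks, load_estimator, make_attributor):
    """Load the score estimator once and reuse it for every task."""
    estimator = load_estimator()  # the expensive step happens a single time
    return [make_attributor(task, estimator) for task in tasks]


# Stand-ins so the sketch runs without a model; they only record what happened.
load_calls = []


def fake_load_estimator():
    load_calls.append("loaded")
    return "estimator"


def fake_make_attributor(task, estimator):
    return (task, estimator)


attributors = build_attributors(
    ["task A", "task B", "task C"], fake_load_estimator, fake_make_attributor
)
print(len(attributors), "attributors built with", len(load_calls), "estimator load")
```

In this notebook, `load_estimator` would be `lambda: AT2ScoreEstimator.from_hub("madrylab/at2-phi-4-mini-instruct")`, `make_attributor` would be `AT2Attributor`, and each task would be a `SimpleContextAttributionTask` built as in the earlier cell.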