# Thought attribution with AT2

In this notebook, we use AT2 to attribute a reasoning model's final response to the individual sentences of its intermediate thoughts.
We'll be working with [DeepSeek-R1-Distill-Qwen-7B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B), a 7B-parameter open-weight reasoning model distilled from DeepSeek-R1.

In [1]:
from at2.utils import get_model_and_tokenizer
from at2.tasks import SimpleThoughtAttributionTask
from at2 import AT2Attributor, AT2ScoreEstimator, AT2FeatureExtractor



In [2]:
model_name = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"
# FlashAttention-2 requires the separate flash-attn package and a CUDA GPU
attn_implementation = "flash_attention_2"
model, tokenizer = get_model_and_tokenizer(model_name, attn_implementation=attn_implementation)
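
If `flash-attn` is not installed, one option is to pick the attention backend at runtime. The following is a minimal sketch, not part of AT2: `pick_attn_implementation` is a hypothetical helper, and it assumes that `get_model_and_tokenizer` forwards `attn_implementation` to Transformers unchanged, so that the standard `"sdpa"` value is accepted. It only checks that the package is importable, not that a compatible GPU is present.

```python
import importlib.util


def pick_attn_implementation(module_name: str = "flash_attn") -> str:
    """Return "flash_attention_2" if `module_name` is importable, else "sdpa".

    Hypothetical helper: it only checks that the package can be found,
    not that a supported GPU is available.
    """
    if importlib.util.find_spec(module_name) is not None:
        return "flash_attention_2"
    # PyTorch's built-in scaled-dot-product attention needs no extra package.
    return "sdpa"


attn_implementation = pick_attn_implementation()
print(attn_implementation)
```

The result can then be passed as `attn_implementation` in the cell above, under the assumption stated in the lead-in.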


In [3]:
query = "What if the printing press had never been invented—how would today's world look? Please respond in a paragraph format."
# Attribution sources are the individual sentences of the model's thoughts
task = SimpleThoughtAttributionTask(query, model, tokenizer, source_type="sentence")

In [4]:
print(task.response)


If the printing press had never been invented, the world would have undergone significant changes across multiple domains. Education would likely be more localized, as the absence of mass-produced books would necessitate physical carrying of texts, making access to knowledge less universal. Without newspapers and magazines, the spread of news and information would be slower, potentially leading to a less informed society. Science and technology might progress at a slower pace due to the lack of widespread access to the latest research and knowledge sharing. Economically, the printing industry and trade could be impacted, affecting the economy and the dissemination of financial information. Culturally, literature and storytelling might evolve differently, with a slower spread of ideas and possibly a different emphasis on written works. Socially, education systems could become more fragmented, with localized learning and less emphasis on a global exchange of ideas. Overall, the absence o

In [5]:
feature_extractor = AT2FeatureExtractor.from_model(model)
# Load a pretrained AT2 attributor for this model from the Hugging Face Hub
attributor = AT2Attributor.from_hub(task, "madrylab/at2-deepseek-r1-distill-qwen-7b")

In [6]:
task.show_target_with_indices()

[(0, 122)] If the printing press had never been invented, the world would have undergone significant changes across multiple domains.
[(123, 291)] Education would likely be more localized, as the absence of mass-produced books would necessitate physical carrying of texts, making access to knowledge less universal.
[(292, 425)] Without newspapers and magazines, the spread of news and information would be slower, potentially leading to a less informed society.
[(426, 563)] Science and technology might progress at a slower pace due to the lack of widespread access to the latest research and knowledge sharing.
[(564, 698)] Economically, the printing industry and trade could be impacted, affecting the economy and the dissemination of financial information.
[(699, 846)] Culturally, literature and storytelling might evolve differently, with a slower spread of ideas and possibly a different emphasis on written works.
[(847, 977)] Socially
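
The index pairs printed above look like character offsets into the response, with an exclusive end (consecutive ranges are separated by a single space). Under that assumption, a range can also be looked up from the sentence text instead of being copied by hand. This is a minimal sketch: `find_span` is a hypothetical helper, not an AT2 function, and the short `response` string stands in for `task.response`.

```python
def find_span(text: str, snippet: str) -> tuple[int, int]:
    """Return the (start, end) character offsets of `snippet` in `text`.

    `end` is exclusive, so text[start:end] == snippet.
    """
    start = text.find(snippet)
    if start == -1:
        raise ValueError(f"snippet not found: {snippet!r}")
    return start, start + len(snippet)


# Stand-in for task.response; in the notebook you would pass the real response.
response = "First sentence here. Second sentence follows. Third one."
start, end = find_span(response, "Second sentence follows.")
print(start, end, repr(response[start:end]))
```

Note that `str.find` returns the first match, so a sentence that occurs more than once would need a more careful lookup.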

In [7]:
# Character range of the third response sentence, taken from the indices printed above
start, end = (292, 425)
attributor.show_attribution(start=start, end=end, verbose=True)

Computing attribution scores for:
 Without newspapers and magazines, the spread of news and information would be slower, potentially leading to a less informed society.


| | Score | Source |
|---|---|---|
| 0 | 0.006 | So news might take longer to spread, and maybe there wouldn't be as many newspapers. |
| 1 | 0.005 | Also, without the printing press, newspapers and magazines probably wouldn't exist in the same way. |
| 2 | 0.003 | That could affect how people stay informed and how quickly they learn about important events. |
| 3 | 0.002 | Without it, maybe books and knowledge wouldn't spread as quickly. |
| 4 | 0.001 | Maybe people would have to carry books around a lot more, which could make education less accessible to everyone. |
| 5 | 0.001 | Also, without newspapers, maybe there wouldn't be as much advertising or financial information available, which could affect how people make economic decisions. |
| 6 | 0.001 | It might lead to slower progress in these areas, more localized knowledge sharing, and perhaps a different cultural landscape where the value of books and written words is less emphasized. |
| 7 | 0.001 | Hmm, the printing press was a big deal because it changed how information is shared. |
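
The output lists the thought sentences in descending order of attribution score. As a rough illustration of that ranking step only (this is not AT2's internal code, and `top_k_indices` is a hypothetical helper), per-source scores can be sorted and truncated like this. The example scores are the rounded values shown above.

```python
def top_k_indices(scores: list[float], k: int) -> list[int]:
    """Return the indices of the k largest scores, highest first.

    Python's sort is stable, so tied scores keep their original order.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


# Rounded scores copied from the output above, in the order shown.
scores = [0.006, 0.005, 0.003, 0.002, 0.001, 0.001, 0.001, 0.001]
for rank, idx in enumerate(top_k_indices(scores, 3)):
    print(rank, idx, scores[idx])
```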


In [8]:
print(attributor.highlight_attribution(start=start, end=end))

<｜begin▁of▁sentence｜><｜User｜>What if the printing press had never been invented—how would today's world look? Please respond in a paragraph format.<｜Assistant｜><think>
Okay, so I need to