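>Feature attribution: you can think of it as an explanation of a model's output, much like visualizing the regions a model attends to in image classification. This article introduces Inseq, a tool for interpreting and visualizing the outputs of sequence generation models. Using a translation task (which attends to the whole sequence) and a text generation task (which attends only to the preceding tokens), we show how to use Inseq to see which parts of the input text most influence the next token the model generates.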
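
To make the idea concrete before turning to Inseq, here is a toy sketch of gradient-based saliency: the score of each input feature is the magnitude of the output's gradient with respect to that feature, normalized to sum to 1. The `toy_score` function and its weights are made up for illustration, and the gradient is approximated with finite differences. Inseq itself works on token embeddings of a real model, so this only illustrates the concept.

```python
import math

def toy_score(x):
    # Hypothetical "model": a fixed nonlinear score over three input features.
    weights = [2.0, -0.5, 0.1]
    return math.tanh(sum(w * xi for w, xi in zip(weights, x)))

def saliency(f, x, eps=1e-5):
    """Approximate |df/dx_i| with central differences and normalize to sum to 1."""
    scores = []
    for i in range(len(x)):
        upper = list(x)
        lower = list(x)
        upper[i] += eps
        lower[i] -= eps
        scores.append(abs(f(upper) - f(lower)) / (2 * eps))
    total = sum(scores)
    return [s / total for s in scores]

attr = saliency(toy_score, [0.1, 0.2, 0.3])
print([round(a, 3) for a in attr])
```

The feature with the largest weight in magnitude receives the largest share, which is the same kind of reading applied to the token-level tables later in this article.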

In [1]:

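# Compatibility shim: expose IPython.display under the old IPython.core.display
# module path, which some libraries still import from.
import sys
import IPython.display
import IPython.core
IPython.core.display = IPython.display
sys.modules['IPython.core.display'] = IPython.display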

In [4]:
import inseq
import torch

print("Using the Helsinki-NLP/opus-mt-zh-en model")

# Attribution methods to run
attribution_methods = ['saliency', 'attention']

for method in attribution_methods:
    print(f"\n======= Attribution method: {method} =======")

    # Load the model directly with inseq
    inseq_model = inseq.load_model(
        "./opus-mt-zh-en",  # local copy of the model; replace with your own path, e.g. opus-mt-zh-en
        attribution_method=method,
        model_kwargs={
            "attn_implementation": "eager" if method == "attention" else None
        }
    )

    # Input text to translate
    input_text = "我喜欢机器学习和人工智能。"

    # Run the attribution
    attribution_result = inseq_model.attribute(
        input_texts=input_text,
        show_progress=True
    )

    # Strip the SentencePiece word-boundary marker '▁' from the tokens (optional)
    for attr in attribution_result.sequence_attributions:
        for item in attr.source:
            item.token = item.token.replace('▁', '')
        for item in attr.target:
            item.token = item.token.replace('▁', '')

    # Display the attribution result
    attribution_result.show()

    # Print the generated translation
    if attribution_result.sequence_attributions:
        # Collect the generated tokens
        generated_tokens = attribution_result.sequence_attributions[0].target
        generated_text = " ".join([token.token for token in generated_tokens])
        print(f"Translation: {generated_text}")

    # Free memory
    del inseq_model
    torch.cuda.empty_cache()

The following generation flags are not valid and may be ignored: ['output_attentions']. Set `TRANSFORMERS_VERBOSITY=info` for more details.


Using the Helsinki-NLP/opus-mt-zh-en model



Passing a tuple of `past_key_values` is deprecated and will be removed in Transformers v4.58.0. You should pass an instance of `EncoderDecoderCache` instead, e.g. `past_key_values=EncoderDecoderCache.from_legacy_cache(past_key_values)`.
Attributing with saliency...: 100%|██████████| 10/10 [00:00<00:00, 31.72it/s]


Source \ Target,I,like,machine,learning,and,artificial,intelligence,.,</s>
我喜欢,0.24,0.44,0.083,0.113,0.126,0.048,0.041,0.116,0.203
机器,0.159,0.116,0.447,0.178,0.129,0.056,0.059,0.086,0.114
学习,0.074,0.076,0.155,0.283,0.138,0.042,0.048,0.054,0.07
和,0.066,0.055,0.041,0.065,0.179,0.039,0.033,0.048,0.059
人工,0.105,0.073,0.081,0.121,0.134,0.375,0.156,0.158,0.136
智能,0.146,0.104,0.114,0.151,0.171,0.335,0.524,0.407,0.17
。,0.14,0.082,0.035,0.044,0.079,0.054,0.064,0.075,0.17
</s>,0.07,0.053,0.045,0.046,0.044,0.051,0.075,0.056,0.077
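
One way to read the saliency table above is to ask, for each generated token (column), which source token (row) received the highest score. The sketch below hard-codes the scores printed above. The variable and function names are illustrative and are not part of Inseq's API.

```python
# Scores copied from the saliency table above:
# rows are source tokens, columns are generated tokens.
source_tokens = ["我喜欢", "机器", "学习", "和", "人工", "智能", "。", "</s>"]
target_tokens = ["I", "like", "machine", "learning", "and",
                 "artificial", "intelligence", ".", "</s>"]
saliency_scores = [
    [0.24, 0.44, 0.083, 0.113, 0.126, 0.048, 0.041, 0.116, 0.203],
    [0.159, 0.116, 0.447, 0.178, 0.129, 0.056, 0.059, 0.086, 0.114],
    [0.074, 0.076, 0.155, 0.283, 0.138, 0.042, 0.048, 0.054, 0.07],
    [0.066, 0.055, 0.041, 0.065, 0.179, 0.039, 0.033, 0.048, 0.059],
    [0.105, 0.073, 0.081, 0.121, 0.134, 0.375, 0.156, 0.158, 0.136],
    [0.146, 0.104, 0.114, 0.151, 0.171, 0.335, 0.524, 0.407, 0.17],
    [0.14, 0.082, 0.035, 0.044, 0.079, 0.054, 0.064, 0.075, 0.17],
    [0.07, 0.053, 0.045, 0.046, 0.044, 0.051, 0.075, 0.056, 0.077],
]

def top_source_per_target(scores, sources, targets):
    """Map each target token to the source token with the highest score in its column."""
    result = {}
    for col, target in enumerate(targets):
        best_row = max(range(len(sources)), key=lambda row: scores[row][col])
        result[target] = sources[best_row]
    return result

alignment = top_source_per_target(saliency_scores, source_tokens, target_tokens)
for target, source in alignment.items():
    print(f"{target:>12} <- {source}")
```

For the content words, the strongest source token is the expected one: `machine` points to 机器, `learning` to 学习, `artificial` to 人工, and `intelligence` to 智能.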




Translation: <pad> I like machine learning and artificial intelligence . </s>



Attributing with attention...: 2it [00:00, 47.22it/s]               


Source \ Target,I,like,machine,learning,and,artificial,intelligence,.,</s>
我喜欢,0.21,0.537,0.233,0.012,0.026,0.045,0.015,0.042,0.047
机器,0.087,0.033,0.217,0.174,0.043,0.03,0.02,0.017,0.028
学习,0.038,0.038,0.123,0.288,0.142,0.022,0.021,0.029,0.026
和,0.049,0.039,0.056,0.038,0.128,0.093,0.013,0.032,0.028
人工,0.021,0.023,0.037,0.043,0.066,0.254,0.184,0.036,0.032
智能,0.021,0.023,0.063,0.057,0.067,0.113,0.346,0.136,0.027
。,0.135,0.067,0.05,0.029,0.06,0.053,0.043,0.153,0.186
</s>,0.438,0.24,0.221,0.359,0.468,0.391,0.358,0.555,0.625
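
In the attention table above, the `</s>` source row holds the largest share in most columns, which can hide how the remaining weight is spread over the content tokens. As a rough sketch (an analysis choice made here, not something Inseq does by default), the code below drops that row and rescales each column so it sums to 1 again. All names are illustrative.

```python
# Scores copied from the attention table above:
# rows are source tokens, columns are generated tokens.
source_tokens = ["我喜欢", "机器", "学习", "和", "人工", "智能", "。", "</s>"]
target_tokens = ["I", "like", "machine", "learning", "and",
                 "artificial", "intelligence", ".", "</s>"]
attention_scores = [
    [0.21, 0.537, 0.233, 0.012, 0.026, 0.045, 0.015, 0.042, 0.047],
    [0.087, 0.033, 0.217, 0.174, 0.043, 0.03, 0.02, 0.017, 0.028],
    [0.038, 0.038, 0.123, 0.288, 0.142, 0.022, 0.021, 0.029, 0.026],
    [0.049, 0.039, 0.056, 0.038, 0.128, 0.093, 0.013, 0.032, 0.028],
    [0.021, 0.023, 0.037, 0.043, 0.066, 0.254, 0.184, 0.036, 0.032],
    [0.021, 0.023, 0.063, 0.057, 0.067, 0.113, 0.346, 0.136, 0.027],
    [0.135, 0.067, 0.05, 0.029, 0.06, 0.053, 0.043, 0.153, 0.186],
    [0.438, 0.24, 0.221, 0.359, 0.468, 0.391, 0.358, 0.555, 0.625],
]

def renormalize_without(scores, sources, dropped):
    """Drop rows whose source token is in `dropped`, then rescale each column to sum to 1."""
    kept = [(tok, row) for tok, row in zip(sources, scores) if tok not in dropped]
    n_cols = len(kept[0][1])
    col_sums = [sum(row[c] for _, row in kept) for c in range(n_cols)]
    tokens = [tok for tok, _ in kept]
    rescaled = [[row[c] / col_sums[c] for c in range(n_cols)] for _, row in kept]
    return tokens, rescaled

kept_tokens, rescaled = renormalize_without(attention_scores, source_tokens, {"</s>"})
for col, target in enumerate(target_tokens):
    best = max(range(len(kept_tokens)), key=lambda r: rescaled[r][col])
    print(f"{target:>12} <- {kept_tokens[best]} ({rescaled[best][col]:.2f})")
```

After rescaling, `learning`, `artificial`, and `intelligence` point to 学习, 人工, and 智能, in line with the saliency table.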


Translation: <pad> I like machine learning and artificial intelligence . </s>
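
The printed translation still contains the `<pad>` and `</s>` markers because the tokens are joined as they are. A minimal cleanup sketch follows. `join_tokens` and its default special-token set are hypothetical helpers, not part of Inseq, and the plain strings stand in for the `token.token` values used in the cell above. It also assumes every token is a whole word, which holds for this sentence but not for words split into subword pieces.

```python
def join_tokens(tokens, special_tokens=("<pad>", "</s>")):
    """Join token strings with spaces, skipping special tokens and empty strings."""
    words = [t for t in tokens if t and t not in special_tokens]
    text = " ".join(words)
    # Re-attach punctuation that tokenization split off from the preceding word.
    for mark in (".", ",", "!", "?"):
        text = text.replace(f" {mark}", mark)
    return text

tokens = ["<pad>", "I", "like", "machine", "learning", "and",
          "artificial", "intelligence", ".", "</s>"]
print(join_tokens(tokens))  # I like machine learning and artificial intelligence.
```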
