
Aggregation functions, named aggregators, contrastive context step functions, inseq.explain #182

Merged · 6 commits · May 18, 2023

Conversation

@gsarti (Member) commented May 16, 2023

Description

inseq.explain: The inseq.explain(ID) function can be used to quickly get more information about the class or function registered under ID in the inseq library (currently attribution methods, step functions, aggregators, and aggregation functions). For example:

import inseq

inseq.explain("saliency")
>>> Saliency attribution method.
...
... Reference implementation:
... https://captum.ai/api/saliency.html

This is intended to be used together with list_aggregators, list_feature_attribution_methods, etc., to get more information without having to navigate the documentation.

Aggregation functions: This PR generalizes the aggregation logic that was previously hardcoded inside AttentionWeightAttribution, moving it to the post-attribution aggregate step. Concretely, it is now possible to:

  • Specify select_idx in aggregate to perform the chosen aggregation only across the selected indices (e.g. select_idx=(0, 5) takes into account only the first five elements of the last dimension for aggregation).
  • Use a custom aggregate_fn among the ones made available through the new AggregationFunction class (e.g. "max", "mean", "vnorm").
  • Normalization is applied by default after every aggregation step and can be turned off via normalize=False.
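The semantics of select_idx, aggregate_fn and normalize described above can be sketched in plain numpy. This is an illustration only: the function name, the handling of integer indices, and the normalization convention (absolute scores summing to 1) are assumptions, not inseq's actual implementation.

```python
import numpy as np

def aggregate_last_dim(scores, aggregate_fn="mean", select_idx=None, normalize=True):
    """Aggregate the last dimension of an attribution tensor (illustrative sketch)."""
    if isinstance(select_idx, int):
        # A single integer selects one element of the last dimension
        select_idx = (select_idx, select_idx + 1)
    if select_idx is not None:
        start, stop = select_idx
        scores = scores[..., start:stop]
    fns = {
        "mean": lambda x: x.mean(axis=-1),
        "max": lambda x: x.max(axis=-1),
        "vnorm": lambda x: np.linalg.norm(x, axis=-1),  # vector norm
    }
    out = fns[aggregate_fn](scores)
    if normalize:
        # Assumed convention: rescale so absolute scores sum to 1
        out = out / np.abs(out).sum()
    return out

# Dummy attribution tensor: [source_len=4, tgt_len=3, hidden_size=8]
attr = np.random.rand(4, 3, 8)
agg = aggregate_last_dim(attr, "max", select_idx=(0, 5))
print(agg.shape)  # (4, 3)
```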

Named aggregators: Using aggregators becomes much easier in this PR thanks to named aliases that do not require imports. Similarly, an AggregatorPipeline can be defined simply as a list of strings. Moreover, passing the name of an aggregate_fn instead of an aggregator name will automatically aggregate the last attribution dimension using that function (see available choices with inseq.list_aggregation_functions).
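The name-resolution behavior described above could look roughly like the following sketch. The registry contents and the fallback rule are illustrative assumptions, not inseq's actual internals:

```python
# Hypothetical registries mapping alias strings to aggregator class names
AGGREGATORS = {"subwords": "SubwordAggregator", "spans": "ContiguousSpanAggregator"}
AGGREGATION_FNS = {"mean", "max", "vnorm"}

def resolve_step(name: str) -> str:
    """Resolve a pipeline step name to an aggregator description."""
    if name in AGGREGATORS:
        return AGGREGATORS[name]
    if name in AGGREGATION_FNS:
        # An aggregation function name falls back to the default sequence
        # aggregator, applied over the last attribution dimension
        return f"SequenceAttributionAggregator(aggregate_fn={name!r})"
    raise ValueError(f"Unknown aggregator or aggregation function: {name}")

pipeline = [resolve_step(step) for step in ["subwords", "vnorm"]]
print(pipeline)
```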

Before:

import inseq
from inseq.data.aggregator import AggregatorPipeline, SubwordAggregator, SequenceAttributionAggregator

model = inseq.load_model("Helsinki-NLP/opus-mt-en-fr", "saliency")
# Source attribution with shape [source_len, tgt_len, hidden_size]
out = model.attribute(input_texts="Hello world, here's the Inseq library!")
# Aggregation pipeline composed of two steps:
# 1. Aggregate subword tokens resulting from SentencePiece tokenization
#    Shape: [agg_source_len, agg_tgt_len, hidden_size]
# 2. Take the norm of the last dimension of the resulting word-level, neuron-level attribution
#    Shape: [agg_source_len, agg_tgt_len]
agg_out = out.aggregate(AggregatorPipeline([SubwordAggregator, SequenceAttributionAggregator]))

Now:

import inseq

model = inseq.load_model("Helsinki-NLP/opus-mt-en-fr", "saliency")
out = model.attribute(input_texts="Hello world, here's the Inseq library!")
# Same as above. vnorm is an aggregate_fn, so SequenceAttributionAggregator is used by default
# no select_idx = all dimensions are used
agg_out = out.aggregate(["subwords", "vnorm"])
# Changing the aggregation target is now easy, e.g. keep only gradient dimension 30
agg_out_filtered_mean = out.aggregate("subwords").aggregate(select_idx=30)

New step functions: The contrast_prob, pcxmi and kl_divergence functions were added to the pre-registered step functions of the library. While contrast_prob_diff returns the difference in probability between regular and contrastive prediction options, contrast_prob returns the probability of the same target token given a different source and/or preceding target context. pcxmi and kl_divergence use contrastive probabilities to derive information-theoretic quantities comparing the two context options.
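The information-theoretic quantities mentioned above can be sketched as follows. The exact P-CXMI formulation used by inseq is not spelled out in this PR, so the log-ratio form below is an assumption; the KL divergence is the standard definition between two output distributions over the vocabulary:

```python
import math

def pcxmi(p_contextual: float, p_plain: float) -> float:
    """Pointwise conditional cross-mutual information for one target token.

    Assumed formulation: how many bits more likely the token becomes
    when the extra context is present (positive = context helps).
    """
    return -math.log2(p_plain / p_contextual)

def kl_divergence(p, q, eps=1e-12):
    """Standard KL(p || q) in bits between two probability distributions."""
    return sum(pi * math.log2(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

print(pcxmi(0.8, 0.4))                         # context doubles the probability -> 1.0 bit
print(kl_divergence([0.5, 0.5], [0.5, 0.5]))   # identical distributions -> 0.0
```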

💥 Breaking changes:

  • The contrast_prob_diff step function now takes contrast_targets directly in (list of) string, BatchEncoding or Batch format instead of contrast_ids and contrast_attention_mask. Examples in docs and README were updated accordingly.
  • The attribute call for the attention attribution method no longer takes any method-specific parameters, since aggregation is postponed until after attribution (previously it accepted head/layer indices and aggregation function names). Hence, the method now returns the full attention tensor of size [source_len, target_len, num_layers, num_heads] rather than a pre-filtered and pre-aggregated one.

Here is an example of how previous attention attribution results can be reproduced with the new postponed aggregation:

Before (v0.4.0): Aggregation during attribution

import inseq

model = inseq.load_model("gpt2", "attention")

# will return the maximum attention weights for the first 5 heads averaged across the first, third, and eighth layers
out = model.attribute(
    "Hello world",
    generation_args={"max_new_tokens": 10},
    heads=(0, 5),
    aggregate_heads_fn="max",
    layers=[0, 2, 7],
)
# Attribution has size [tgt_len, gen_len]
out.sequence_attributions[0].target_attributions.shape

Now: postponed aggregation. Same result, but gives users more control over the aggregation process.

import inseq

model = inseq.load_model("gpt2", "attention")

out = model.attribute("Hello world", generation_args={"max_new_tokens": 10})
# Attribution has size [tgt_len, gen_len, num_layers, num_heads]. Note simplified indexing of sequence attributions
out[0].target_attributions.shape
# do_post_aggregation_checks=False skips checks ensuring the output is 2D for visualization. Aggregates dim -1 by default
agg_heads = out.aggregate("max", select_idx=(0, 5), normalize=False, do_post_aggregation_checks=False)
# Attribution has size [tgt_len, gen_len, num_layers]
agg_heads[0].target_attributions.shape
agg_layers = agg_heads.aggregate("mean", select_idx=[0, 2, 7], normalize=False)
# Attribution has size [tgt_len, gen_len], matches original example
agg_layers[0].target_attributions.shape
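The postponed two-step aggregation above can be sketched in plain numpy (dimension order follows the PR's description; normalization is omitted since the example passes normalize=False):

```python
import numpy as np

# Full attention tensor returned by the new attention method:
# [tgt_len, gen_len, num_layers, num_heads]
tgt_len, gen_len, num_layers, num_heads = 2, 10, 12, 12
attn = np.random.rand(tgt_len, gen_len, num_layers, num_heads)

# Step 1: max over the first 5 heads (select_idx=(0, 5) on the last dim)
agg_heads = attn[..., 0:5].max(axis=-1)               # [tgt_len, gen_len, num_layers]

# Step 2: mean over layers 0, 2 and 7 (select_idx=[0, 2, 7], now the last dim)
agg_layers = agg_heads[..., [0, 2, 7]].mean(axis=-1)  # [tgt_len, gen_len]

print(agg_layers.shape)  # (2, 10)
```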

@gsarti changed the title to "Aggregation functions, named aggregators, contrastive context step functions, inseq.explain" on May 18, 2023
@gsarti added this to the v0.5 milestone on May 18, 2023
@gsarti added the labels: enhancement (New feature or request), user qol (Quality of life improvements for library users) on May 18, 2023
@gsarti merged commit 1046230 into main on May 18, 2023
@gsarti deleted the agg-fns branch on May 18, 2023 at 14:24