
Aggregation functions, named aggregators, contrastive context step functions, inseq.explain #182

Merged · 6 commits · May 18, 2023

Conversation

@gsarti (Member) commented May 16, 2023

Description

inseq.explain: The inseq.explain(ID) function can be used to quickly get more information about the class or function registered under ID in the inseq library (currently attribution methods, step functions, aggregators, and aggregation functions). For example:

import inseq

inseq.explain("saliency")
>>> Saliency attribution method.
...
... Reference implementation:
... https://captum.ai/api/saliency.html

This is intended to be used together with list_aggregators, list_feature_attribution_methods, etc., to get more information without having to navigate the documentation.

Aggregation functions: This PR generalizes the aggregation logic that was previously hardcoded inside AttentionWeightAttribution, moving it to the post-attribution aggregate step. Concretely, it is now possible to:

  • Specify select_idx in aggregate to perform the chosen aggregation only across the selected indices (e.g. select_idx=(0, 5) takes into account only the first five elements of the last dimension for aggregation).
  • Use a custom aggregate_fn among the ones made available through the new AggregationFunction class (e.g. "max", "mean", "vnorm").
  • Normalization is applied by default after every aggregation step and can be turned off via normalize=False.
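The semantics of select_idx, aggregate_fn and normalize described above can be sketched in plain numpy. This is an illustration only: the function name, the handling of integer indices, and the normalization convention (absolute scores summing to 1) are assumptions, not inseq's actual implementation.

```python
import numpy as np

def aggregate_last_dim(scores, aggregate_fn="mean", select_idx=None, normalize=True):
    """Aggregate the last dimension of an attribution tensor (illustrative sketch)."""
    if isinstance(select_idx, int):
        # A single integer selects one element of the last dimension
        select_idx = (select_idx, select_idx + 1)
    if select_idx is not None:
        start, stop = select_idx
        scores = scores[..., start:stop]
    fns = {
        "mean": lambda x: x.mean(axis=-1),
        "max": lambda x: x.max(axis=-1),
        "vnorm": lambda x: np.linalg.norm(x, axis=-1),  # vector norm
    }
    out = fns[aggregate_fn](scores)
    if normalize:
        # Assumed convention: rescale so absolute scores sum to 1
        out = out / np.abs(out).sum()
    return out

# Dummy attribution tensor: [source_len=4, tgt_len=3, hidden_size=8]
attr = np.random.rand(4, 3, 8)
agg = aggregate_last_dim(attr, "max", select_idx=(0, 5))
print(agg.shape)  # (4, 3)
```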

Named aggregators: Using aggregators becomes much easier in this PR thanks to named aliases that do not require imports. Similarly, an AggregatorPipeline can be defined simply as a list of strings. Moreover, passing the name of an aggregate_fn instead of an aggregator name will automatically aggregate the last attribution dimension using that function (see available choices with inseq.list_aggregation_functions).
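The name-resolution behavior described above could look roughly like the following sketch. The registry contents and the fallback rule are illustrative assumptions, not inseq's actual internals:

```python
# Hypothetical registries mapping alias strings to aggregator class names
AGGREGATORS = {"subwords": "SubwordAggregator", "spans": "ContiguousSpanAggregator"}
AGGREGATION_FNS = {"mean", "max", "vnorm"}

def resolve_step(name: str) -> str:
    """Resolve a pipeline step name to an aggregator description."""
    if name in AGGREGATORS:
        return AGGREGATORS[name]
    if name in AGGREGATION_FNS:
        # An aggregation function name falls back to the default sequence
        # aggregator, applied over the last attribution dimension
        return f"SequenceAttributionAggregator(aggregate_fn={name!r})"
    raise ValueError(f"Unknown aggregator or aggregation function: {name}")

pipeline = [resolve_step(step) for step in ["subwords", "vnorm"]]
print(pipeline)
```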

Before:

import inseq
from inseq.data.aggregator import AggregatorPipeline, SubwordAggregator, SequenceAttributionAggregator

model = inseq.load_model("Helsinki-NLP/opus-mt-en-fr", "saliency")
# Source attribution with shape [source_len, tgt_len, hidden_size]
out = model.attribute(input_texts="Hello world, here's the Inseq library!")
# Aggregation pipeline composed of two steps:
# 1. Aggregate subword tokens resulting from SentencePiece tokenization
#    Shape: [agg_source_len, agg_tgt_len, hidden_size]
# 2. Take the norm of the last dimension of the resulting word-level, neuron-level attribution
#    Shape: [agg_source_len, agg_tgt_len]
agg_out = out.aggregate(AggregatorPipeline([SubwordAggregator, SequenceAttributionAggregator]))

Now:

import inseq

model = inseq.load_model("Helsinki-NLP/opus-mt-en-fr", "saliency")
out = model.attribute(input_texts="Hello world, here's the Inseq library!")
# Same as above. vnorm is an aggregate_fn, so SequenceAttributionAggregator is used by default
# no select_idx = all dimensions are used
agg_out = out.aggregate(["subwords", "vnorm"])
# Changing the aggregation target is now easy, e.g. keep only gradient dimension 30
agg_out_filtered_mean = out.aggregate("subwords").aggregate(select_idx=30)

New step functions: The contrast_prob, pcxmi and kl_divergence functions were added to the pre-registered step functions of the library. While contrast_prob_diff returns the difference in probability between regular and contrastive prediction options, contrast_prob returns the probability of the same target token given a different source and/or preceding target context. pcxmi and kl_divergence use contrastive probabilities to derive information-theoretic quantities comparing the two context options.
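The information-theoretic quantities mentioned above can be sketched as follows. The exact P-CXMI formulation used by inseq is not spelled out in this PR, so the log-ratio form below is an assumption; the KL divergence is the standard definition between two output distributions over the vocabulary:

```python
import math

def pcxmi(p_contextual: float, p_plain: float) -> float:
    """Pointwise conditional cross-mutual information for one target token.

    Assumed formulation: how many bits more likely the token becomes
    when the extra context is present (positive = context helps).
    """
    return -math.log2(p_plain / p_contextual)

def kl_divergence(p, q, eps=1e-12):
    """Standard KL(p || q) in bits between two probability distributions."""
    return sum(pi * math.log2(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

print(pcxmi(0.8, 0.4))                         # context doubles the probability -> 1.0 bit
print(kl_divergence([0.5, 0.5], [0.5, 0.5]))   # identical distributions -> 0.0
```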

💥 Breaking changes:

  • The contrast_prob_diff step function now takes contrast_targets directly in (list of) string, BatchEncoding or Batch format instead of contrast_ids and contrast_attention_mask. Examples in docs and README were updated accordingly.
  • The attribute call for the attention attribution method no longer takes any method-specific parameters, since aggregation is postponed until after attribution (previously it accepted head/layer indices and aggregation function names). Hence, the method now returns the full attention tensor of size [source_len, target_len, num_layers, num_heads] rather than a pre-filtered and pre-aggregated one.

Here is an example of how previous attention attribution results can be reproduced with the new postponed aggregation:

Before (v0.4.0): Aggregation during attribution

import inseq

model = inseq.load_model("gpt2", "attention")

# will return the maximum attention weights for the first 5 heads averaged across the first, third, and eighth layers
out = model.attribute(
    "Hello world",
    generation_args={"max_new_tokens": 10},
    heads=(0, 5),
    aggregate_heads_fn="max",
    layers=[0, 2, 7],
)
# Attribution has size [tgt_len, gen_len]
out.sequence_attributions[0].target_attributions.shape

Now: postponed aggregation. Same result, but gives users more control over the aggregation process.

import inseq

model = inseq.load_model("gpt2", "attention")

out = model.attribute("Hello world", generation_args={"max_new_tokens": 10})
# Attribution has size [tgt_len, gen_len, num_layers, num_heads]. Note simplified indexing of sequence attributions
out[0].target_attributions.shape
# do_post_aggregation_checks=False skips checks ensuring the output is 2D for visualization. Aggregates dim -1 by default
agg_heads = out.aggregate("max", select_idx=(0, 5), normalize=False, do_post_aggregation_checks=False)
# Attribution has size [tgt_len, gen_len, num_layers]
agg_heads[0].target_attributions.shape
agg_layers = agg_heads.aggregate("mean", select_idx=[0, 2, 7], normalize=False)
# Attribution has size [tgt_len, gen_len], matches original example
agg_layers[0].target_attributions.shape
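The postponed two-step aggregation above can be sketched in plain numpy (dimension order follows the PR's description; normalization is omitted since the example passes normalize=False):

```python
import numpy as np

# Full attention tensor returned by the new attention method:
# [tgt_len, gen_len, num_layers, num_heads]
tgt_len, gen_len, num_layers, num_heads = 2, 10, 12, 12
attn = np.random.rand(tgt_len, gen_len, num_layers, num_heads)

# Step 1: max over the first 5 heads (select_idx=(0, 5) on the last dim)
agg_heads = attn[..., 0:5].max(axis=-1)               # [tgt_len, gen_len, num_layers]

# Step 2: mean over layers 0, 2 and 7 (select_idx=[0, 2, 7], now the last dim)
agg_layers = agg_heads[..., [0, 2, 7]].mean(axis=-1)  # [tgt_len, gen_len]

print(agg_layers.shape)  # (2, 10)
```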

@gsarti changed the title to "Aggregation functions, named aggregators, contrastive context step functions, inseq.explain" on May 18, 2023
@gsarti added this to the v0.5 milestone on May 18, 2023
@gsarti added the labels: enhancement (New feature or request), user qol (Quality of life improvements for library users) on May 18, 2023
@gsarti merged commit 1046230 into main on May 18, 2023
@gsarti deleted the agg-fns branch on May 18, 2023 at 14:24