Custom alignment of contrast_targets for contrastive attribution methods #195
Description
The current implementation of contrastive attribution methods can only be applied to tokens at the same positions across the original and contrastive sequences, limiting the applicability of such methods to real-world contrastive pairs in which differences are not necessarily minimal.
This PR introduces a new parameter `contrast_targets_alignments` (`List[Tuple[int, int]]`, or `List[List[Tuple[int, int]]]` if more than one sequence is attributed) that can be used to provide custom alignments between the original `generated_texts` used for attribution and the `contrast_targets` used as contrastive pairs when `attributed_fn` is set to a contrastive function (`contrast_prob` or `contrast_prob_diff`).

Example
The following example shows the current problematic behavior of `attributed_fn=contrast_prob_diff` when two largely different sentences are provided:
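A minimal sketch of such a call, assuming a MarianMT English-to-French checkpoint; both the model choice and the sentences are illustrative, not taken from the PR:

```python
import inseq

# Illustrative model choice: any seq2seq model supported by inseq works here.
model = inseq.load_model("Helsinki-NLP/opus-mt-en-fr", "saliency")

out = model.attribute(
    "I soon found the painting I was looking for.",
    # Original target used for attribution.
    "J'ai vite trouvé le tableau que je cherchais.",
    attributed_fn="contrast_prob_diff",
    # Largely different contrastive target: with the default 1:1 position
    # matching, most generation steps end up contrasting unrelated tokens.
    contrast_targets="Je l'ai trouvé rapidement.",
)
out.show()
```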
Using `contrast_targets_alignments` we can specify pairs of `(original_idx, contrast_idx)` to align the contents of `contrast_targets` to the attributed sequence:
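A sketch with the same hypothetical model and sentences as above; the index pairs are purely illustrative, since actual values depend on how the tokenizer segments the two target sequences:

```python
out = model.attribute(
    "I soon found the painting I was looking for.",
    "J'ai vite trouvé le tableau que je cherchais.",
    attributed_fn="contrast_prob_diff",
    contrast_targets="Je l'ai trouvé rapidement.",
    # Hypothetical (original_idx, contrast_idx) pairs over target token
    # positions, e.g. aligning "trouvé" with "trouvé" and "vite" with
    # "rapidement" despite their different positions in the two sentences.
    contrast_targets_alignments=[(0, 0), (1, 4), (2, 2), (3, 1), (4, 3)],
)
out.show()
```

For batched attribution of multiple sequences, one such list per sequence is passed as a `List[List[Tuple[int, int]]]`.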
Finally, a `contrast_targets_alignments="auto"` option is provided to allow automatic word alignment. Words in the original and contrastive target sequences are aligned automatically using the cosine similarity of embeddings produced by a massively multilingual encoder model (`sentence-transformers/LaBSE`).
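The same hypothetical call with automatic alignment:

```python
out = model.attribute(
    "I soon found the painting I was looking for.",
    "J'ai vite trouvé le tableau que je cherchais.",
    attributed_fn="contrast_prob_diff",
    contrast_targets="Je l'ai trouvé rapidement.",
    # Words are aligned automatically via cosine similarity of LaBSE
    # embeddings; the aligner model is loaded lazily on first use.
    contrast_targets_alignments="auto",
)
out.show()
```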
Notes
- If the new `contrast_targets_alignments` parameter is not specified, the current behavior is preserved (1:1 match, example 1).
- Provided alignments need to cover all tokens of the original sequence, since contrastive attribution is performed at every generation step for that sequence. This is likely to produce some nonsensical contrast pairs, so meaningful pairs need to be selected post-attribution for further analysis. If the provided alignments do not cover all original tokens, the current behavior is to raise a warning and add 1:1 alignments for the missing ones (see the sketch after these notes).
- If a token of the original sequence is aligned with multiple tokens in the contrast, the current behavior is to use the first (by position in the sentence) non-aligned token among those, if any, or the first one if all of them are already aligned.
- In the presence of `contrast_targets` differing from the aligned original tokens, the output tokens produced by `model.attribute` are modified to reflect this using the `Contrast → Original` notation. This might change in the future with the introduction of an ad-hoc field for contrast targets in the output, to preserve maximal information.
- `sentence-transformers/LaBSE` is chosen as the default aligner since it covers 109 languages. At the moment, the aligner model cannot be set programmatically by users. It is loaded lazily when the `"auto"` option is used and kept cached for subsequent calls.
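As a sketch of the fallback mentioned in the second note (same hypothetical model and sentences as in the examples), alignments covering only part of the original tokens trigger the warning and the 1:1 fill:

```python
out = model.attribute(
    "I soon found the painting I was looking for.",
    "J'ai vite trouvé le tableau que je cherchais.",
    attributed_fn="contrast_prob_diff",
    contrast_targets="Je l'ai trouvé rapidement.",
    # Deliberately partial, hypothetical alignments: only the first two
    # original token positions are covered. A warning is raised and the
    # remaining original positions are completed with 1:1 alignments.
    contrast_targets_alignments=[(0, 0), (1, 4)],
)
```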