Value Zeroing

The official repo for the EACL 2023 paper "Quantifying Context Mixing in Transformers"

Abstract

Self-attention weights and their transformed variants have been the main source of information for analyzing token-to-token interactions in Transformer-based models. But despite their ease of interpretation, these weights are not faithful to the models’ decisions as they are only one part of an encoder, and other components in the encoder layer can have considerable impact on information mixing in the output representations. In this work, by expanding the scope of analysis to the whole encoder block, we propose Value Zeroing, a novel context mixing score customized for Transformers that provides us with a deeper understanding of how information is mixed at each encoder layer. We demonstrate the superiority of our context mixing score over other analysis methods through a series of complementary evaluations with different viewpoints based on linguistically informed rationales, probing, and faithfulness analysis.
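As a rough illustration of the idea in the abstract, the sketch below zeroes the value vector of one context token at a time, recomputes the output of a whole encoder block, and measures how much each token's output representation changes. It is a toy under stated assumptions, not the code in this repository: the single-head block with random weights, the simplified layer normalization, the use of cosine distance as the change measure, and the names `encoder_block` and `value_zeroing_scores` are all hypothetical choices made for this example.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def layer_norm(x, eps=1e-5):
    # Simplified layer norm without learned scale and bias (assumption).
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def encoder_block(x, params, zero_value_of=None):
    """Toy single-head encoder block: attention, residual, norm, FFN, residual, norm."""
    Wq, Wk, Wv, Wo, W1, W2 = params
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    if zero_value_of is not None:
        v = v.copy()
        v[zero_value_of] = 0.0  # zero only the value vector; queries and keys are untouched
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    h = layer_norm(x + attn @ v @ Wo)
    ffn = np.maximum(h @ W1, 0.0) @ W2
    return layer_norm(h + ffn)

def value_zeroing_scores(x, params):
    """scores[i, j]: change in token i's block output when token j's value is zeroed."""
    n = x.shape[0]
    base = encoder_block(x, params)
    scores = np.zeros((n, n))
    for j in range(n):
        altered = encoder_block(x, params, zero_value_of=j)
        cos = np.sum(base * altered, axis=-1) / (
            np.linalg.norm(base, axis=-1) * np.linalg.norm(altered, axis=-1)
        )
        scores[:, j] = 1.0 - cos  # cosine distance
    # Normalize each row so the scores for one token sum to 1.
    return scores / np.maximum(scores.sum(axis=-1, keepdims=True), 1e-12)

# Toy usage: 4 tokens, hidden size 8, feed-forward size 16, random weights.
rng = np.random.default_rng(0)
d, d_ff = 8, 16
x = rng.normal(size=(4, d))
params = (
    rng.normal(size=(d, d)), rng.normal(size=(d, d)),
    rng.normal(size=(d, d)), rng.normal(size=(d, d)),
    rng.normal(size=(d, d_ff)), rng.normal(size=(d_ff, d)),
)
scores = value_zeroing_scores(x, params)
print(scores.round(3))
```

Because the comparison is made on the output of the full block, the residual connections, layer normalization, and feed-forward layer all influence the score, which is the point the abstract makes about looking beyond raw attention weights.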

External links

Models, Data and Preprocessing Toolkits

Baselines

Attention Rollout
Attention-norm
GlobEnc
ALTI
Captum for Gradient-based methods

Folders and files

data
finetuning
modeling
probing
scoring
utils
.gitignore
README.md
demo.py
requirements.txt
