[CVPR 2021] Official PyTorch implementation for Transformer Interpretability Beyond Attention Visualization, a novel method to visualize classifications by Transformer-based networks.
TensorFlow implementation of Graphical Attention Recurrent Neural Networks (Cirstea et al., 2019).
Attention Saver extracts full attention matrices or row-wise statistics (e.g., entropy) from any layer of a HuggingFace causal LLM, even at ultra-long context lengths under flash-attention, without running out of GPU memory.
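For a rough sense of what a row-wise attention statistic such as entropy looks like, here is a minimal sketch using the standard HuggingFace `output_attentions=True` path. This is not Attention Saver's API (which is designed to avoid materializing the full matrices under flash-attention); the eager attention implementation and the `gpt2` model name are placeholder assumptions.

```python
# Minimal sketch: extract per-layer attention matrices from a HuggingFace
# causal LM and compute row-wise entropy. Uses eager attention so that the
# full matrices are materialized; this is NOT Attention Saver's memory-saving
# path, just an illustration of the statistic it computes.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, attn_implementation="eager")
model.eval()

inputs = tokenizer("The quick brown fox jumps over the lazy dog.", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_attentions=True)

# out.attentions is a tuple with one tensor per layer,
# each of shape (batch, n_heads, seq_len, seq_len).
layer_idx = 0
attn = out.attentions[layer_idx]

# Each row of an attention matrix is a probability distribution over key
# positions, so its Shannon entropy measures how spread out that query's
# attention is (0 = fully peaked, log(seq_len) = uniform).
eps = 1e-12  # guard against log(0) on causally masked positions
row_entropy = -(attn * (attn + eps).log()).sum(dim=-1)  # (batch, n_heads, seq_len)
print(row_entropy.mean(dim=-1))  # mean row entropy per head
```

At very long contexts the (seq_len, seq_len) matrices in this sketch dominate GPU memory, which is presumably the problem a streaming extractor like Attention Saver is meant to avoid.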