Repository for the Explainable AI track in the ALPS winter school 2021 - schedule.
This lab consists of two parts - one on explainability and one on explorative interpretability. Try to split your time between the two parts.
The first part of the lab focuses on explainability for Natural Language Processing Models. In this part, we will lay the foundations of post-hoc explainability techniques and ways of evaluating them.
CoLAB <- copy this Colab notebook and add code to it for the exercises. CoLAB Solutions
For this notebook, we encourage you to work in groups so that you can split the work and discuss the outcomes.
- learn how to implement two basic and commonly used types of gradient-based explainability techniques (a minimal sketch follows this list)
- learn how to implement an approximation-based explainability technique
- exercise applying explainability techniques to discover flaws of machine learning models and to construct adversarial examples with them
- learn how to evaluate explainability techniques with common diagnostic properties (based on this paper)
- exercise using the diagnostic properties to find which architecture parameters of a model make it harder to explain
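For a first intuition of the gradient-based techniques, here is a minimal gradient x input sketch on a toy PyTorch classifier; the model, vocabulary size, and token ids are made up for illustration and are not those used in the Colab notebook.

```python
import torch
import torch.nn as nn

# Toy bag-of-embeddings classifier; the architecture and sizes are
# illustrative only, not the model used in the Colab notebook.
class ToyClassifier(nn.Module):
    def __init__(self, vocab_size=100, emb_dim=16, num_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.fc = nn.Linear(emb_dim, num_classes)

    def forward(self, token_ids):
        embedded = self.emb(token_ids)           # (batch, seq_len, emb_dim)
        logits = self.fc(embedded.mean(dim=1))   # average-pool over tokens
        return logits, embedded

model = ToyClassifier()
token_ids = torch.tensor([[1, 5, 42, 7]])        # one toy "sentence"

# Gradient x Input: back-propagate the predicted class score to the input
# embeddings and multiply element-wise with the embeddings themselves.
logits, embedded = model(token_ids)
embedded.retain_grad()                           # keep gradients of a non-leaf tensor
predicted_class = logits.argmax(dim=-1).item()
logits[0, predicted_class].backward()

# One saliency score per token: sum contributions over embedding dimensions.
saliency = (embedded.grad * embedded).sum(dim=-1).detach()
print(saliency)                                  # shape: (1, seq_len)
```

Plain gradient saliency, another common variant, uses only the magnitude of the gradient instead of the gradient-input product.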
If you find this code useful for your research, please consider citing:
@inproceedings{atanasova2020diagnostic,
  title = {A Diagnostic Study of Explainability Techniques for Text Classification},
  author = {Pepa Atanasova and Jakob Grue Simonsen and Christina Lioma and Isabelle Augenstein},
  booktitle = {Proceedings of EMNLP},
  publisher = {Association for Computational Linguistics},
  year = {2020}
}
The second lab focuses on explorative interpretability via activation maximization - i.e. TX-Ray https://arxiv.org/abs/1912.00982. Activation maximization works for supervised and self-/un-supervised settings alike, but the lab focuses on analyzing CNN filters in a simple supervised setting.
CoLAB2 <- copy this Colab notebook and add code to it for the exercises. There are two types of exercises:
Familiarization exercises: to 'play with and understand' the technique. These allow quickly changing data collection and visualization parameters. They are intended for explorative analysis.
Advanced exercises: these are optional and concern applications of the technique. They have no solutions, but provide solution outlines (starter code). Opt-EX1: XAI-based pruning with hooks (a minimal pruning sketch follows directly below), Opt-EX2: influence of overparameterization (a wider CNN with more filters), Opt-EX3: filter redundancy. Opt-EX2 and Opt-EX3 belong together.
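As a starting point for Opt-EX1, here is a minimal sketch of pruning CNN filters with a forward hook by masking their activations; the layer sizes and the list of filters to prune are hypothetical and not taken from the notebook's starter code.

```python
import torch
import torch.nn as nn

# Toy 1D convolution; sizes are illustrative only, not the Colab model.
conv = nn.Conv1d(in_channels=16, out_channels=8, kernel_size=3)

# Hypothetical result of an earlier analysis: filters judged uninformative.
filters_to_prune = [2, 5]

def prune_filters(module, inputs, output):
    # Zero out the selected filters' activations on every forward pass,
    # "pruning" them without touching the stored weights.
    mask = torch.ones_like(output)
    mask[:, filters_to_prune, :] = 0.0
    return output * mask

hook_handle = conv.register_forward_hook(prune_filters)

x = torch.randn(1, 16, 5)                        # dummy (batch, channels, seq_len) input
pruned = conv(x)
print(pruned[0, filters_to_prune].abs().sum())   # -> 0: pruned filters are silenced

hook_handle.remove()                             # restore the unpruned behaviour
```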
- learn how to explore and interpret activations in NNs using 'activation maximization' principles
- learn how to extract activations via forward_hooks (a minimal sketch follows this list)
- exercise how to usefully interpret and visualize activation behaviour
- exercise how to prune activations -- advanced
- analyze neuron/filter redundancy, specialization, generalization -- advanced
- Overall: explore/develop ideas towards 'model understanding' -- see https://arxiv.org/abs/1907.10739 for a great introduction to 'decision understanding' vs. 'model understanding'
- this tutorial focuses on explorative 'model understanding' via TX-Ray https://arxiv.org/abs/1912.00982
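To make the forward_hooks goal concrete, below is a minimal sketch of capturing the activations of a small 1D convolution with a forward hook and reading off, per filter, the most strongly activating position; the embedding size, filter count, and token ids are illustrative and do not match the TX-Ray notebook.

```python
import torch
import torch.nn as nn

# Toy embedding + 1D convolution; sizes and names are illustrative only,
# not the CNN used in the TX-Ray Colab notebook.
embedding = nn.Embedding(num_embeddings=100, embedding_dim=16)
conv = nn.Conv1d(in_channels=16, out_channels=8, kernel_size=3)

captured = {}

def save_activations(module, inputs, output):
    # Runs after conv's forward pass; store the activations per filter.
    captured["conv"] = output.detach()            # (batch, filters, positions)

hook_handle = conv.register_forward_hook(save_activations)

token_ids = torch.tensor([[1, 5, 42, 7, 3]])      # one toy "sentence"
embedded = embedding(token_ids).transpose(1, 2)   # (batch, emb_dim, seq_len)
conv(embedded)

# Activation-maximization view: for each filter, the token position that
# activates it most strongly.
max_activation, argmax_position = captured["conv"].max(dim=-1)
print(max_activation, argmax_position)

hook_handle.remove()                              # detach the hook when done
```

Collecting these per-filter maxima over a whole corpus is the kind of data the notebook's visualizations build on.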
If you find this code useful for your research, please consider citing:
@inproceedings{Rethmeier19TX-Ray,
title = {TX-Ray: Quantifying and Explaining Model-Knowledge Transfer in (Un-)Supervised NLP},
author = {Rethmeier, Nils and Kumar Saxena, Vageesh and Augenstein, Isabelle},
booktitle = {Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI)},
year = 2020
}