TextAttack-Fragile-Interpretations

Code for the paper: "Perturbing Inputs for Fragile Interpretations in Deep Natural Language Processing" (EMNLP BlackboxNLP - 2021)

Pre-calculated candidates and interpretations are available on Google Drive here. The results can be replicated by running the results-metrics.py script. The exact commands are detailed in Step-5.

We strongly recommend using conda to manage dependencies.

Run conda create -n frag-exp python=3.6 and subsequently conda activate frag-exp.

Run pip install -r requirements.txt

The following steps re-run the candidate generation process and recalculate the interpretations.

Install TextAttack from the dist folder inside the TextAttack folder by installing the wheel: pip install TextAttack/dist/textattack-0.2.14-py3-none-any.whl
Run python generate_candidates.py --model=distilbert --dataset=sst2 --number=500 --split=validation. All options can be changed to use different datasets and models. By default, candidates are saved to ./candidates.
Run python calculate_interpretations.py --model=distilbert --dataset=sst2 --interpretmethod=IG --number=500 --split=validation. All options can be changed to use different datasets and models. By default, interpretations are saved to ./interpretations.
Once all interpretations have been calculated, run python results-metrics.py --model=distilbert --dataset=sst2 --interpretmethod=IG --number=500 --split=validation --metric=rkc.
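
The three script invocations above share most of their options, so it can be convenient to assemble them in one place. The following is a minimal sketch, not part of the repository: the helper names build_pipeline_commands and run_pipeline are hypothetical, and only the script names and flags shown in the steps above are used. It defaults to a dry run that prints the commands without executing them.

```python
import subprocess
import sys


def build_pipeline_commands(model="distilbert", dataset="sst2",
                            interpretmethod="IG", number=500,
                            split="validation", metric="rkc"):
    """Return the three commands from the steps above as argument lists.

    Hypothetical helper: it only mirrors the flags shown in this README.
    """
    common = ["--model={}".format(model), "--dataset={}".format(dataset)]
    interp = ["--interpretmethod={}".format(interpretmethod)]
    sample = ["--number={}".format(number), "--split={}".format(split)]
    return [
        [sys.executable, "generate_candidates.py"] + common + sample,
        [sys.executable, "calculate_interpretations.py"] + common + interp + sample,
        [sys.executable, "results-metrics.py"] + common + interp + sample
        + ["--metric={}".format(metric)],
    ]


def run_pipeline(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_pipeline(build_pipeline_commands(), dry_run=True)
```

Calling run_pipeline(build_pipeline_commands(metric="topk"), dry_run=False) from the repository root would run the same three steps with a different metric, assuming the environment from the earlier steps is active.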

The available metrics are rkc (Rank Correlation), topk (Top-K Intersection), ppl (Perplexity), grm (Grammar errors) and conf (Model Confidence). Results are stored in ./results.
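
To give a feel for what the rkc and topk metrics compare, here is a small illustrative sketch. It is not the code in results-metrics.py: it assumes that rank correlation means Spearman correlation between the per-token attribution scores of the original and perturbed inputs, and that top-k intersection is the fraction of shared indices among the k highest-scoring tokens. Ranking by raw score rather than absolute value is also an assumption, as are the function names.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_correlation(a, b):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two score lists of equal length >= 2")
    ra, rb = average_ranks(a), average_ranks(b)
    n = len(ra)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    if var_a == 0 or var_b == 0:
        return float("nan")  # undefined when one list is constant
    return cov / (var_a * var_b) ** 0.5


def top_k_intersection(a, b, k):
    """Fraction of indices shared by the k highest-scoring positions of a and b."""
    if len(a) != len(b) or not 0 < k <= len(a):
        raise ValueError("need equal-length lists and 0 < k <= len")
    top_a = set(sorted(range(len(a)), key=lambda i: a[i], reverse=True)[:k])
    top_b = set(sorted(range(len(b)), key=lambda i: b[i], reverse=True)[:k])
    return len(top_a & top_b) / float(k)


if __name__ == "__main__":
    original = [0.9, 0.1, 0.4, 0.2]   # made-up attribution scores
    perturbed = [0.8, 0.3, 0.2, 0.1]  # made-up scores after perturbation
    print(rank_correlation(original, perturbed))       # 0.4
    print(top_k_intersection(original, perturbed, 2))  # 0.5
```

Under these assumptions, a lower rank correlation or a smaller top-k overlap between the original and perturbed inputs indicates a more fragile interpretation.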
