Adversarial Insertions

Short guide

Install environment:

Create a conda environment with a suitable name and Python 3.8:

conda create --name adversarial_insertions python=3.8
conda activate adversarial_insertions

Install the required libraries, such as PyTorch and Transformers:

conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=11.0 -c pytorch
pip install transformers==4.2.1 bert-score==0.3.7 spacy==2.3.5 pytorch-lightning==1.1.2 matplotlib==3.3.3 jsonlines==1.2.0
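To confirm that the environment matches the pins above, a check along the following lines may help. It is a minimal sketch and not part of the repository: `PINNED`, `installed_version` and `find_mismatches` are hypothetical names, and only the pip-installed packages are covered.

```python
from importlib import metadata  # in the standard library since Python 3.8

# Versions copied from the pip command above.
PINNED = {
    "transformers": "4.2.1",
    "bert-score": "0.3.7",
    "spacy": "2.3.5",
    "pytorch-lightning": "1.1.2",
    "matplotlib": "3.3.3",
    "jsonlines": "1.2.0",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package whose installed version differs from its pin
    to an (expected, found) tuple; found is None if not installed."""
    mismatches = {}
    for name, expected in pinned.items():
        found = lookup(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in find_mismatches(PINNED).items():
        print(f"{name}: expected {expected}, found {found}")
```

Run inside the activated environment, the script prints nothing when every pinned package is present at the expected version.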

Preprocessing

Use preprocessing.py to preprocess the datasets used in this project. The datasets were downloaded from these sources:

MNLI: https://cims.nyu.edu/~sbowman/multinli/
MRPC: https://www.microsoft.com/en-us/download/details.aspx?id=52398
QQP: https://dl.fbaipublicfiles.com/glue/data/QQP-clean.zip
RTE: https://dl.fbaipublicfiles.com/glue/data/RTE.zip
WiC: https://dl.fbaipublicfiles.com/glue/superglue/data/v2/WiC.zip
SciEntsBank 3-way: appears to be available only at https://www.kaggle.com/datasets/smiles28/semeval-2013-2-and-3-way
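As a rough illustration of what a preprocessing step for one of these datasets could look like, the sketch below parses a GLUE-style TSV file into sentence pairs. It does not reproduce preprocessing.py: the function name, the output keys and the column layout (`index`, `sentence1`, `sentence2`, `label`, as in the GLUE RTE download) are assumptions, and the sample row is made up.

```python
import csv
import io


def read_sentence_pairs(handle):
    """Parse a GLUE-style TSV stream into a list of dicts."""
    # QUOTE_NONE keeps quote characters inside sentences as literal text.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    pairs = []
    for row in reader:
        pairs.append({
            "premise": row["sentence1"],
            "hypothesis": row["sentence2"],
            "label": row["label"],
        })
    return pairs


# Tiny made-up sample in the assumed column layout.
sample = io.StringIO(
    "index\tsentence1\tsentence2\tlabel\n"
    '0\tA man "plays" guitar.\tA man makes music.\tentailment\n'
)
pairs = read_sentence_pairs(sample)
print(pairs[0]["label"])
```

For a real file, the same function can be given a handle from `open(path, newline="", encoding="utf-8")`.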

Training

Models were trained with training.py, using the reported hyperparameters.

Attack

Use find_top_words.py to find the most common adjectives and adverbs.
Get the correct predictions for a given label and the respective model with get_correct_predictions.py.
Prepare and execute the attack with prepare_attack.py and attack.py.
Run plot_results.py to get confidence value histograms for the BERT models and samples of successful adversarial examples for both T5 and BERT.
Use analyze_results.py to get the success rates of the attack.
Print out samples with pretty_displaying.py.
To analyze adjective and adverb occurrences in different classes, use frequency_analysis.py (currently a work in progress).
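The steps above do not spell out the attack itself. Judging by the repository name and the script names, it inserts frequent adjectives and adverbs into inputs that the model originally classified correctly. Under that assumption, the core idea and the success rate can be sketched as follows. Everything here is hypothetical: the function names, the whitespace tokenization, the insertion at every position, and the toy classifier standing in for a trained BERT or T5 model.

```python
def insertion_candidates(sentence, words):
    """Yield every variant of `sentence` with one word from `words`
    inserted at one whitespace-token boundary."""
    tokens = sentence.split()
    for word in words:
        for position in range(len(tokens) + 1):
            yield " ".join(tokens[:position] + [word] + tokens[position:])


def success_rate(sentences, words, predict, label):
    """Fraction of sentences for which at least one candidate
    moves the prediction away from `label`."""
    if not sentences:
        return 0.0
    successes = sum(
        any(predict(candidate) != label
            for candidate in insertion_candidates(sentence, words))
        for sentence in sentences
    )
    return successes / len(sentences)


def toy_predict(text):
    """Stand-in classifier: flips to 'correct' only if both trigger tokens occur."""
    tokens = text.split()
    return "correct" if "truly" in tokens and "answer" in tokens else "incorrect"


candidates = list(insertion_candidates("the answer", ["very"]))
rate = success_rate(["the answer", "no idea"], ["truly"], toy_predict, "incorrect")
print(candidates, rate)
```

A real attack would likely restrict insertion positions using part-of-speech information, for example from spaCy, rather than trying every token boundary.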

Make sure each function receives the input it expects, for example folder or file paths passed as strings.
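One way to catch a wrong path early is to validate it before calling into the scripts. The helper below is a minimal sketch; `as_existing_path` is a hypothetical name and does not exist in the repository.

```python
from pathlib import Path


def as_existing_path(path_string, must_be_dir=False):
    """Validate a path given as a string and return it as a Path."""
    if not isinstance(path_string, str):
        raise TypeError(f"expected a str path, got {type(path_string).__name__}")
    path = Path(path_string)
    if not path.exists():
        raise FileNotFoundError(f"path does not exist: {path_string}")
    if must_be_dir and not path.is_dir():
        raise NotADirectoryError(f"not a folder: {path_string}")
    return path


# The current working directory always exists, so this call passes.
checked = as_existing_path(".", must_be_dir=True)
```

The string itself can then be passed on to the function being called, while the check fails fast with a readable error if the path is missing or of the wrong kind.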

In cooperation with Anna Filighera and the Multimedia Communications Lab at TU Darmstadt.
