Source code for the paper: Detecting Textual Adversarial Examples Based on Distributional Characteristics of Data Representations.
Requirements:
- numpy (1.19.5)
- transformers (4.1.1)
- pandas (1.1.5)
- spacy (3.0.5)
- torch (1.7.1)
- nltk (3.5)
- scipy (1.4.1)
- scikit-learn (0.24.0)
- tqdm (4.62.3)
- torchtext (0.1.1)
- editdistance (0.5.3)
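If requirements.txt is not included in your checkout, a file matching the versions above would look as follows (the exact == pins are an assumption; the repository's own file may differ):
numpy==1.19.5
transformers==4.1.1
pandas==1.1.5
spacy==3.0.5
torch==1.7.1
nltk==3.5
scipy==1.4.1
scikit-learn==0.24.0
tqdm==4.62.3
torchtext==0.1.1
editdistance==0.5.3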
The code is written for Python 3.6.3. To install all required packages, please run
pip install -r requirements.txt
and then run the following on a GPU machine to install OpenNMT-py:
git clone https://github.com/marcotcr/OpenNMT-py
cd OpenNMT-py/
python setup.py install
cd ..
To download the IMDB and MultiNLI datasets and the counter-fitted vectors, run:
bash ./download.sh
To fine-tune BERT on the IMDB dataset:
python fine_tune.py --model-name bert --dataset-name IMDB --dataset-path ./data/aclImdb --max-length 512
Change --model-name to roberta, xlnet, or bart to fine-tune the RoBERTa, XLNet, or BART model instead.
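For example, fine-tuning RoBERTa on IMDB then becomes (only the model name changes):
python fine_tune.py --model-name roberta --dataset-name IMDB --dataset-path ./data/aclImdb --max-length 512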
To fine-tune BERT on the MultiNLI dataset:
python fine_tune.py --model-name bert --dataset-name Mnli --dataset-path ./data/multinli_1.0 --max-length 256
As above, change --model-name to roberta, xlnet, or bart to fine-tune the RoBERTa, XLNet, or BART model instead.
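To fine-tune all four models on MultiNLI in one go, a simple shell loop works (a sketch, assuming no flags other than those above are needed per model):
for model in bert roberta xlnet bart; do
    python fine_tune.py --model-name "$model" --dataset-name Mnli --dataset-path ./data/multinli_1.0 --max-length 256
done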
This project uses character-level, word-level, and phrase-level textual adversarial examples. The allowable values for the --attack-class argument are 'typo', 'synonym', and 'seas'; use --topk 30 for the SEAs attack on the MultiNLI dataset.
To generate character-level/typo textual adversarial examples for the IMDB dataset:
python generate_adv.py --dataset-name IMDB --dataset-path ./data/aclImdb --attack-class typo --max-length 512 --batch 0 --boxsize 25
--boxsize is the total number of batches the test examples are split into, and --batch is the index of the batch to process. The command above generates adversarial examples for the first 1,000 IMDB test examples.
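To cover every batch rather than only batch 0, loop over the batch index (a sketch, assuming batches are numbered 0 to boxsize-1):
for b in $(seq 0 24); do
    python generate_adv.py --dataset-name IMDB --dataset-path ./data/aclImdb --attack-class typo --max-length 512 --batch $b --boxsize 25
done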
To generate character-level/typo textual adversarial examples for the MultiNLI dataset:
python generate_adv.py --dataset-name Mnli --dataset-path ./data/multinli_1.0 --attack-class typo --max-length 256 --boxsize 10 --batch 0
Before generating word-level/synonym adversarial examples, please run:
python get_neighbours.py --dataset-name IMDB --dataset-path ./data/aclImdb --max-length 512
or
python get_neighbours.py --dataset-name Mnli --dataset-path ./data/multinli_1.0 --max-length 256
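Once the neighbour files exist, the word-level attack uses the same interface as the typo attack, only with --attack-class synonym; the command below mirrors the typo commands above and is a sketch rather than a verified invocation:
python generate_adv.py --dataset-name IMDB --dataset-path ./data/aclImdb --attack-class synonym --max-length 512 --batch 0 --boxsize 25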
Before generating phrase-level/SEAs adversarial examples, please download and unpack the translation models into the translation_models folder.
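With the translation models unpacked, a phrase-level run would look like the following (a sketch mirroring the commands above; --topk 30 is used for MultiNLI as noted earlier):
python generate_adv.py --dataset-name Mnli --dataset-path ./data/multinli_1.0 --attack-class seas --max-length 256 --batch 0 --boxsize 10 --topk 30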
We use MDRE, adapted LID, FGWS, and a language model as detection classifiers; the choices for the --detect argument are 'mdre', 'lid', 'fgws', and 'language_model'.
To detect character-level adversarial examples using MDRE on the IMDB dataset, please run:
python detect.py --dataset-name IMDB --dataset-path ./data/aclImdb --attack-class typo --max-length 512 --batch-size 32 --detect mdre
To detect character-level adversarial examples using MDRE on the MultiNLI dataset, please run:
python detect.py --dataset-name Mnli --dataset-path ./data/multinli_1.0 --attack-class typo --max-length 256 --batch-size 32 --detect mdre
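The other detectors and attack classes use the same interface; swap the --detect and --attack-class values, for example (assuming detect.py accepts identical flags for every detector):
python detect.py --dataset-name IMDB --dataset-path ./data/aclImdb --attack-class synonym --max-length 512 --batch-size 32 --detect fgws
python detect.py --dataset-name Mnli --dataset-path ./data/multinli_1.0 --attack-class seas --max-length 256 --batch-size 32 --detect lid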
If you find MDRE helpful, please cite our paper:
@inproceedings{liu-etal-2022-detecting,
title = "Detecting Textual Adversarial Examples Based on Distributional Characteristics of Data Representations",
author = "Liu, Na and
Dras, Mark and
Emma Zhang, Wei",
booktitle = "Proceedings of the 7th Workshop on Representation Learning for NLP",
month = may,
year = "2022",
address = "Dublin, Ireland",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2022.repl4nlp-1.9",
doi = "10.18653/v1/2022.repl4nlp-1.9",
pages = "78--90",
}