
ADDMU: Detection of Far-Boundary Adversarial Examples with Data and Model Uncertainty Estimation

Code for our EMNLP 2022 paper. The framework and metrics are adapted from https://github.com/bangawayoo/adversarial-examples-in-text-classification

Dependencies

Python >= 3.6

PyTorch == 1.8.1

Install the Python requirements via the requirements file: pip install -r requirements.txt

Data

We use TextAttack to generate adversarial data for the four attacks: TextFooler, BAE, Pruthi, and TextBugger. If you want to generate your own adversarial data, please refer to their repo. We also provide here some of the data we generated, including both regular and far-boundary data. Please download the whole folder and put it under the main directory.
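As a hedged sketch, a single attack's data could be generated with the TextAttack command-line interface roughly as follows (the flags follow TextAttack's documented CLI; the model name is an assumption — substitute the victim model you actually want to attack):

```shell
# Sketch: assemble a TextAttack command for TextFooler on SST-2.
# The model name below is an assumption; any TextAttack-supported model works.
CMD="textattack attack --recipe textfooler --model bert-base-uncased-sst2 --num-examples 1000 --log-to-csv textfooler_sst2.csv"
echo "$CMD"   # run this command on a machine where textattack is installed
```

The resulting CSV of original/perturbed pairs would then serve as the attack data consumed by the detection scripts.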

Usage

The experiments can be reproduced by simply running the following shell script:

bash run_test_sst2.sh

This is the example script for sst2. Change the datasets, attack types, and detectors with the following options:

Options for the datasets are sst2, imdb, ag-news, and snli, set via the DATASET variable.

Options for the attack types are textfooler, bae, pruthi, and textbugger, set via the RECIPE variable. Use *_high_confidence_0.9, e.g. textfooler_high_confidence_0.9, for the far-boundary versions of the attacks.

Options for the detectors are our proposed method ue and the two baselines, ppl and rde.
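Putting the options together, the top of the script might look like the sketch below (DATASET and RECIPE are the variable names given above; DETECTOR is a hypothetical name for the detector variable — check run_test_sst2.sh for the actual one):

```shell
# Hypothetical configuration block for run_test_sst2.sh.
# DATASET and RECIPE are named in this README; DETECTOR is an assumed name.
DATASET="sst2"                           # sst2 | imdb | ag-news | snli
RECIPE="textfooler_high_confidence_0.9"  # attack; *_high_confidence_0.9 = far-boundary
DETECTOR="ue"                            # ue (ours) | ppl | rde
echo "Detecting ${RECIPE} attacks on ${DATASET} with detector ${DETECTOR}"
```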
