liongkj/HackAttend

# Unveiling Vulnerabilities of Self-Attention


Welcome to the official GitHub repository for our LREC-COLING 2024 paper, "Unveiling Vulnerabilities of Self-Attention".

Setting Up the Environment

To set up the required environment, execute the following commands:

conda create -n sattend python=3.7
conda activate sattend
pip install -r requirements.txt
cd TextAttack && pip install -e .

Getting Started

Prerequisites

  1. Download the dataset by running the script at data/download_data.sh.

HackAttend

  1. Fine-tune the models in the Victim folder.
  2. Run run_hackattend.py.

Implementing Defense Mechanisms

We explore several defense strategies against adversarial attacks alongside the SAttend model.

Evaluation

TextAttack is used for evaluation. The implementation is modified from the original TextAttack library. The test set is generated by running the following command:

python -m attacks.attack_textfooler --recipe <recipe> --task_name <task_name>

<recipe> can be any of the following:

  1. textfooler
  2. bertattack
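
To run both recipes back to back, the command line can be assembled programmatically. The following is a minimal sketch, not part of the repository: `build_attack_command` is a hypothetical helper, `sst2` is a placeholder task name, and the module is passed to `python -m` as a dotted name without the `.py` suffix, which is how `-m` expects module names.

```python
import shlex

# The two recipes listed above.
RECIPES = ("textfooler", "bertattack")


def build_attack_command(recipe, task_name):
    """Return the argv list for one attack run (hypothetical helper)."""
    if recipe not in RECIPES:
        raise ValueError(
            "unknown recipe {!r}; expected one of {}".format(recipe, RECIPES)
        )
    return [
        "python", "-m", "attacks.attack_textfooler",  # dotted module name, no .py
        "--recipe", recipe,
        "--task_name", task_name,
    ]


# Print one shell-ready command per recipe; "sst2" is a placeholder task name.
for recipe in RECIPES:
    argv = build_attack_command(recipe, "sst2")
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Each argv list can be handed to `subprocess.run(argv, check=True)` from the repository root once the task name is replaced with a real one.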

Defense mechanisms are:

  • Adversarial Training: This involves training models with adversarially generated examples. The implementation is located in the victim/SAttend/ folder and is adapted from CreAT.

  • Adversarial Data Augmentation (ADA): fine-tune the model on adversarially generated samples.

    1. To generate adversarial samples, use the following command:
    python -m attacks.attack_textfooler --recipe <recipe> --task_name <task_name> --generate_adv_samples
    2. To fine-tune the model with the adversarial samples, use the following command:
    python run_sattend.py --do_eval --do_train --best_epoch best --task_name <task_name> --do_lower_case --num_train_epochs <num_epochs> --gradient_accumulation_steps <num_steps> --train_batch_size <batch_size> --fp16 --adversarial --adv_split train_<recipe> --warmup_proportion <warmup_proportion> --learning_rate <learning_rate>
  • S-Attend smoothing: run run_sattend.py with the flags --adv_split test and --mask_rate <mask_rate> to train the SAttend model.
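
In the ADA workflow, the recipe used to generate samples reappears in the fine-tuning step as the `--adv_split train_<recipe>` value. The sketch below illustrates that pairing using only the flags shown above. `ada_commands` is a hypothetical helper, and the task name and hyperparameter values in the usage lines are placeholders for illustration, not recommended settings.

```python
import shlex


def ada_commands(recipe, task_name, num_epochs, num_steps, batch_size,
                 warmup_proportion, learning_rate):
    """Return (generate, finetune) argv lists for one ADA run (hypothetical helper)."""
    # Step 1: generate adversarial samples with the chosen recipe.
    generate = [
        "python", "-m", "attacks.attack_textfooler",
        "--recipe", recipe,
        "--task_name", task_name,
        "--generate_adv_samples",
    ]
    # Step 2: fine-tune on the split produced by that same recipe.
    finetune = [
        "python", "run_sattend.py",
        "--do_eval", "--do_train",
        "--best_epoch", "best",
        "--task_name", task_name,
        "--do_lower_case",
        "--num_train_epochs", str(num_epochs),
        "--gradient_accumulation_steps", str(num_steps),
        "--train_batch_size", str(batch_size),
        "--fp16", "--adversarial",
        "--adv_split", "train_{}".format(recipe),
        "--warmup_proportion", str(warmup_proportion),
        "--learning_rate", str(learning_rate),
    ]
    return generate, finetune


# Placeholder values throughout; substitute the settings for your task.
generate, finetune = ada_commands("textfooler", "sst2", 3, 1, 32, 0.1, 2e-5)
for argv in (generate, finetune):
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Deriving both commands from a single `recipe` argument keeps the generated split and the `--adv_split` value from drifting apart.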
