
CGBA

Implementation of "Claim-Guided Textual Backdoor Attack for Practical Applications" (NAACL 2025 Findings).

Requirements

  1. Install Anaconda: Download and install Anaconda from here.
  2. Create and Activate the Python Environment (a quick sanity check follows this list):
    conda env create -f environments.yml -n cgba
    conda activate cgba
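
To confirm the environment is usable, you can try importing the core deep-learning dependencies. This is a minimal sanity check, assuming the project relies on PyTorch and Hugging Face Transformers (the script names reference BERT); the exact pinned versions come from environments.yml.

    # sanity_check.py -- assumes PyTorch and Transformers are installed in the cgba env
    import torch
    import transformers

    print("torch", torch.__version__, "| CUDA available:", torch.cuda.is_available())
    print("transformers", transformers.__version__)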
    

Evaluation

Model Preparation

  1. Download Contrastive Models:

    • Download the models from this link.
    • Decompress the file and move the models:
      tar -zxvf Contrastive_FakeNews.tar.gz
      mv Models_dist_scale_margin0.2_alpha0.1 Contrastive_Learning/Models
  2. Download Backdoored Models:

    • Download the models from this link.
    • Decompress the file and move the models (a quick path check follows this list):
      tar -zxvf BestModels_FakeNews.tar.gz
      mv BestModels_dist_alpha0.1_scale0.2_aug10 Model_Training/BestModels/
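
After moving the archives, the two model directories should sit at the paths the training and evaluation scripts expect. A small check like the following (paths taken from the mv commands above) can catch a misplaced extraction early:

    # check_models.py -- paths follow the mv commands above
    from pathlib import Path

    expected = [
        Path("Contrastive_Learning/Models"),
        Path("Model_Training/BestModels"),
    ]
    for p in expected:
        status = "ok" if p.is_dir() else "MISSING"
        print(f"{p}: {status}")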

Run

cd Model_Training
python COVID_Make_poisonedBERT_Eval.py -c 5 # Evaluate attack result for cluster ID: 5
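
The evaluation script reports the attack result for one backdoored model (one cluster). As a rough illustration of what such an evaluation computes, the sketch below measures a backdoor attack success rate: the fraction of trigger-carrying inputs that a fine-tuned BERT classifier assigns to the attacker's target label. The model path, the inputs, and the target label here are hypothetical placeholders, not the repository's actual interface.

    # ASR sketch -- model path, texts, and target label are hypothetical placeholders
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    model_dir = "Model_Training/BestModels/cluster_5"       # hypothetical path
    poisoned_texts = ["example claim-triggered news text"]  # placeholder inputs
    target_label = 1                                        # attacker's target class

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForSequenceClassification.from_pretrained(model_dir).eval()

    with torch.no_grad():
        enc = tokenizer(poisoned_texts, padding=True, truncation=True, return_tensors="pt")
        preds = model(**enc).logits.argmax(dim=-1)

    asr = (preds == target_label).float().mean().item()
    print(f"Attack success rate: {asr:.2%}")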

Train

Dataset Preparation

We already provide the FakeNews (COVID) dataset.

  • Sequentially run the following scripts with the appropriate paths to extract claims from the dataset:

    cd Claim_Extraction
    python Extract_NEs.py
    python Extract_Questions.py
    python Extract_Claims.py
  • Extract embeddings and conduct clustering with the appropriate paths (a rough sketch of this step follows the list):

    cd Embedding_Extraction
    python Embedding_Extraction.py
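
The claim-extraction scripts produce claim sentences; Embedding_Extraction.py then embeds them and clusters the embeddings, and the resulting cluster IDs are what the later training and evaluation steps refer to. The sketch below illustrates the general idea with mean-pooled BERT embeddings and k-means; the actual encoder, pooling, and clustering settings are whatever the repository's script uses, not these.

    # Embedding + clustering sketch -- encoder and number of clusters are illustrative choices
    import torch
    from transformers import AutoTokenizer, AutoModel
    from sklearn.cluster import KMeans

    claims = ["claim sentence one", "claim sentence two", "claim sentence three"]

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    encoder = AutoModel.from_pretrained("bert-base-uncased").eval()

    with torch.no_grad():
        enc = tokenizer(claims, padding=True, truncation=True, return_tensors="pt")
        hidden = encoder(**enc).last_hidden_state          # (batch, seq, dim)
        mask = enc["attention_mask"].unsqueeze(-1)         # (batch, seq, 1)
        embeddings = (hidden * mask).sum(1) / mask.sum(1)  # mean pooling over tokens

    cluster_ids = KMeans(n_clusters=2, n_init=10).fit_predict(embeddings.numpy())
    print(list(zip(claims, cluster_ids)))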

Contrastive Learning

We already provide a trained contrastive model (see the release section).

  • Train the contrastive model (a generic sketch of the objective follows below):
    cd Contrastive_Learning
    python ContrastiveLearning.py
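
The released model directory name (Models_dist_scale_margin0.2_alpha0.1) suggests a distance-based objective with a margin. As a rough illustration, a standard pairwise margin contrastive loss in PyTorch is sketched below; this is a generic formulation under that assumption, not the loss implemented in ContrastiveLearning.py.

    # Generic pairwise margin contrastive loss -- a sketch, not the repo's exact objective
    import torch

    def contrastive_loss(emb_a, emb_b, same_cluster, margin=0.2):
        """same_cluster: 1.0 if the pair should be pulled together, 0.0 if pushed apart."""
        dist = torch.nn.functional.pairwise_distance(emb_a, emb_b)
        pull = same_cluster * dist.pow(2)
        push = (1 - same_cluster) * torch.clamp(margin - dist, min=0).pow(2)
        return (pull + push).mean()

    # toy usage with random embeddings
    a, b = torch.randn(4, 768), torch.randn(4, 768)
    labels = torch.tensor([1.0, 0.0, 1.0, 0.0])
    print(contrastive_loss(a, b, labels).item())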

Backdoor Training

Adjust the shell script according to the constructed cluster IDs; a sketch of the per-cluster loop follows below.

  • Conduct backdoor training:
    cd Model_Training
    chmod +x run_clusters.sh
    ./run_clusters.sh # This process takes several hours and requires approximately 23 GiB of storage for the models.
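
run_clusters.sh repeats backdoor training once per cluster ID, so adjusting it mostly means editing the list of IDs it iterates over. A Python equivalent of that loop is sketched below; the training-script name is hypothetical, and only the -c flag mirrors the evaluation command above.

    # Hypothetical per-cluster training loop -- script name assumed, not taken from the repo
    import subprocess

    cluster_ids = [0, 1, 2, 5]  # replace with the cluster IDs produced by clustering

    for cid in cluster_ids:
        # "COVID_Make_poisonedBERT.py" is a placeholder name for the per-cluster training script
        subprocess.run(["python", "COVID_Make_poisonedBERT.py", "-c", str(cid)], check=True)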

Evaluation

  • Evaluate attack performance across all clusters:
    python Analyze_results.py
