This repository is an implementation of our paper NeuroTrace: Inference Provenance-Based Detection of Adversarial Examples. NeuroTrace supports:
- IPG graph extraction from benign + adversarial/poisoned examples
- Graph re-use via loading precomputed `.pt` files (no meta files)
- GNN training / validation / testing across three experiment modes

From the repository root:
```bash
# Create the environment first; `conda install -n` only works on an existing one
conda create -n NeuroTrace python=3.10
conda activate NeuroTrace
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt
pip install git+https://github.com/fra31/auto-attack
```
Then run an experiment with precomputed graphs:

```bash
python main.py attack_on_attack \
  --graphs-policy load \
  --graph-root ./graphs \
  --results-root ./results \
  --attacks FGSM,PGD,APGD-DLR,SQUARE,SPSA,SIA \
  --epochs 30 --batch-size 16
```

If you **don’t** have precomputed graphs, use `--graphs-policy generate` instead.
---
## 2) Dataset (graphs) download + placement
Google Drive:
https://drive.google.com/file/d/1JgT8JfQ4_Eie13W-9gCmai66zm16ce-B/view?usp=sharing
### 2.1 What you download
We distribute **precomputed IPG graphs** as a compressed archive (zip or tar.gz).
### 2.2 Where to extract
Extract the archive into the repository so you get a top-level folder like `./graphs/`:
```bash
unzip graphs.zip -d .
```

Place the `graphs` folder in the NeuroTrace directory (the location of `main.py`).
Your `--graph-root` should contain one folder per TAG (attack or mixed setting), and inside each tag:

```
graphs/
  FGSM/
    train/benign.pt
    train/perturbed.pt
    val/benign.pt
    val/perturbed.pt
  PGD/
    train/benign.pt
    train/perturbed.pt
    val/benign.pt
    val/perturbed.pt
  APGD-DLR/
    train/benign.pt
    train/perturbed.pt
    val/benign.pt
    val/perturbed.pt
  SQUARE/
    train/benign.pt
    train/perturbed.pt
    val/benign.pt
    val/perturbed.pt
  SPSA/
    train/benign.pt
    train/perturbed.pt
    val/benign.pt
    val/perturbed.pt
  SIA/
    train/benign.pt
    train/perturbed.pt
    val/benign.pt
    val/perturbed.pt
  ALL_MIXED/
    train/benign.pt
    train/perturbed.pt
    val/benign.pt
    val/perturbed.pt
  WHITE2BLACK_MIXED/
    train/benign.pt
    train/perturbed.pt
    val/benign.pt
    val/perturbed.pt
  BLACK2WHITE_MIXED/
    train/benign.pt
    train/perturbed.pt
    val/benign.pt
    val/perturbed.pt
```

With `--graphs-policy load`:
- Will not generate anything
- Expects the 4 files to exist for each tag:
  - `train/benign.pt`
  - `train/perturbed.pt`
  - `val/benign.pt`
  - `val/perturbed.pt`
- If any of them is missing, the run fails with an error.

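Before starting a long run, it can help to check that every tag has the four files listed above. The following is a minimal sketch that uses only the standard library; `find_missing_graph_files` is a hypothetical helper written for illustration and is not part of the NeuroTrace code base.

```python
from pathlib import Path

# The four files expected inside every tag folder.
REQUIRED_FILES = (
    "train/benign.pt",
    "train/perturbed.pt",
    "val/benign.pt",
    "val/perturbed.pt",
)


def find_missing_graph_files(graph_root, tags):
    """Return the expected graph files that are absent under graph_root.

    Hypothetical helper for illustration only.
    """
    root = Path(graph_root)
    missing = []
    for tag in tags:
        for rel in REQUIRED_FILES:
            path = root / tag / rel
            if not path.is_file():
                missing.append(path)
    return missing
```

For example, `find_missing_graph_files("./graphs", ["FGSM", "PGD"])` returns an empty list when both tag folders are complete, and otherwise the paths that still need to be downloaded or generated.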
With `--graphs-policy generate`:
- Builds graphs from the victim model + loaders
- Saves graphs into:
  - `<graph-root>/<TAG>/train/*.pt`
  - `<graph-root>/<TAG>/val/*.pt`
- Then trains/evaluates the GNN.

Results are stored under:

```
<results-root>/<TAG>/
  <TAG>_metrics.csv
  <TAG>_summary.csv
  <TAG>_model_hgt.pt
  <TAG>_per_attack_eval.csv   (optional)
```

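The output layout above can be turned into paths programmatically, for example to collect summaries across tags. This is a rough sketch: `result_paths` and `read_csv_rows` are hypothetical helpers, and the only assumption about the CSV files is that they start with a header row (the column names are not documented here, so none are hard-coded).

```python
import csv
from pathlib import Path


def result_paths(results_root, tag):
    """Build the output paths listed above for one tag (hypothetical helper)."""
    base = Path(results_root) / tag
    return {
        "metrics": base / f"{tag}_metrics.csv",
        "summary": base / f"{tag}_summary.csv",
        "model": base / f"{tag}_model_hgt.pt",
        "per_attack_eval": base / f"{tag}_per_attack_eval.csv",  # optional file
    }


def read_csv_rows(path):
    """Read a CSV file into a list of dicts, assuming only a header row."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


# Print the summary rows for one tag if that run has already finished.
paths = result_paths("./results", "FGSM")
if paths["summary"].is_file():
    for row in read_csv_rows(paths["summary"]):
        print(row)
```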
Train & test separately per attack (loops over attacks).
```bash
python main.py attack_on_attack \
  --graphs-policy load \
  --graph-root ./graphs \
  --results-root ./results \
  --attacks FGSM,PGD,APGD-DLR,SQUARE,SPSA,SIA \
  --epochs 30 --batch-size 16 \
  --seed 0
```

Outputs: per-attack folders under `<results-root>/<ATTACK>/...`.

Train one model on a mixture of all attacks, then evaluate per attack.
```bash
python main.py all_attacks \
  --graphs-policy load \
  --graph-root ./graphs \
  --results-root ./results \
  --attacks FGSM,PGD,APGD-DLR,SQUARE,SPSA,SIA \
  --mixed-tag ALL_MIXED \
  --epochs 30 --batch-size 16 \
  --seed 0
```

Outputs include:
- `results/ALL_MIXED/ALL_MIXED_metrics.csv`
- `results/ALL_MIXED/ALL_MIXED_summary.csv`
- `results/ALL_MIXED/ALL_MIXED_model_hgt.pt`
- `results/ALL_MIXED/ALL_MIXED_per_attack_eval.csv`

Transfer experiments between white-box and black-box attack groups.
```bash
python main.py white_black \
  --graphs-policy load \
  --graph-root ./graphs \
  --results-root ./results \
  --whitebox-attacks FGSM,PGD,APGD-DLR \
  --blackbox-attacks SQUARE,SPSA,SIA \
  --direction both \
  --epochs 30 --batch-size 16 \
  --seed 0
```

Directions:
- `white2black`
- `black2white`
- `both`

Outputs appear under:
- `<results-root>/WHITE2BLACK_MIXED/...`
- `<results-root>/BLACK2WHITE_MIXED/...`

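To locate the outputs of a transfer run from a script, the `--direction` values can be mapped to the result tags named above. The mapping below is inferred from the tag names rather than taken from the code, so treat it as an assumption; `tags_for_direction` is a hypothetical helper.

```python
# Result tags assumed to correspond to each single direction (inferred from names).
DIRECTION_TAGS = {
    "white2black": ["WHITE2BLACK_MIXED"],
    "black2white": ["BLACK2WHITE_MIXED"],
}


def tags_for_direction(direction):
    """Return the result tags a --direction value is expected to produce."""
    if direction == "both":
        return DIRECTION_TAGS["white2black"] + DIRECTION_TAGS["black2white"]
    if direction not in DIRECTION_TAGS:
        raise ValueError(f"unknown direction: {direction!r}")
    return list(DIRECTION_TAGS[direction])
```

With `--direction both`, this yields both tags, matching the two output folders listed above.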
- Graph extraction is expensive → GPU strongly recommended for `--graphs-policy generate`
- For stable results:
  - Set `--seed`
  - Use a consistent PyTorch + PyG stack
  - Prefer `--graphs-policy load` when using our released graphs

Suggested: 64 GB RAM and a GPU with ~24 GB VRAM.
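
One way to see how much results vary with `--seed` is to repeat the same run with several seed values. The sketch below only assembles and prints the commands, using flags that appear in this README. `build_attack_on_attack_command` is a hypothetical helper, and the per-seed results folders are an assumption made so that runs presumably do not overwrite each other.

```python
import shlex


def build_attack_on_attack_command(seed, attacks, graph_root="./graphs",
                                   results_root="./results"):
    """Assemble the argv list for one attack_on_attack run (hypothetical helper)."""
    return [
        "python", "main.py", "attack_on_attack",
        "--graphs-policy", "load",
        "--graph-root", graph_root,
        "--results-root", results_root,
        "--attacks", ",".join(attacks),
        "--epochs", "30",
        "--batch-size", "16",
        "--seed", str(seed),
    ]


ATTACKS = ["FGSM", "PGD", "APGD-DLR", "SQUARE", "SPSA", "SIA"]

for seed in (0, 1, 2):
    # Assumption: a separate results root per seed keeps the outputs apart.
    cmd = build_attack_on_attack_command(
        seed, ATTACKS, results_root=f"./results/seed{seed}"
    )
    print(shlex.join(cmd))
    # To launch the run instead of printing it:
    # subprocess.run(cmd, check=True)
```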