# NeuroTrace

This repository contains the implementation of our paper *NeuroTrace: Inference Provenance-Based Detection of Adversarial Examples*. NeuroTrace supports:

  • IPG graph extraction from benign + adversarial/poisoned examples
  • Graph re-use via loading precomputed .pt files (no meta files)
  • GNN training / validation / testing across three experiment modes

## 1) Quickstart (recommended)

From the repository root:

```bash
conda create -n NeuroTrace python=3.10
conda activate NeuroTrace
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt
pip install git+https://github.com/fra31/auto-attack
```

Next, download the precomputed graphs from Google Drive (see Section 2).

Then run using the precomputed graphs (`load` policy):

```bash
python main.py attack_on_attack \
  --graphs-policy load \
  --graph-root ./graphs \
  --results-root ./results \
  --attacks FGSM,PGD,APGD-DLR,SQUARE,SPSA,SIA \
  --epochs 30 --batch-size 16
```


If you **don’t** have precomputed graphs, use `--graphs-policy generate` instead.

---


## 2) Dataset (graphs) download + placement

Google Drive:

https://drive.google.com/file/d/1JgT8JfQ4_Eie13W-9gCmai66zm16ce-B/view?usp=sharing


### 2.1 What you download
We distribute **precomputed IPG graphs** as a compressed archive (zip or tar.gz).

### 2.2 Where to extract
Extract the archive into the repository so you get a top-level folder like `./graphs/`:

```bash
unzip graphs.zip -d .
```

Place the `graphs/` folder in the NeuroTrace repository root (the same directory as `main.py`).

### 2.3 Expected folder layout after extraction

Your `--graph-root` should contain one folder per TAG (attack or mixed setting), and inside each tag the four split files:

```
graphs/
  FGSM/
    train/benign.pt
    train/perturbed.pt
    val/benign.pt
    val/perturbed.pt

  PGD/
    train/benign.pt
    train/perturbed.pt
    val/benign.pt
    val/perturbed.pt

  APGD-DLR/
    train/benign.pt
    train/perturbed.pt
    val/benign.pt
    val/perturbed.pt

  SQUARE/
    train/benign.pt
    train/perturbed.pt
    val/benign.pt
    val/perturbed.pt

  SPSA/
    train/benign.pt
    train/perturbed.pt
    val/benign.pt
    val/perturbed.pt

  SIA/
    train/benign.pt
    train/perturbed.pt
    val/benign.pt
    val/perturbed.pt

  ALL_MIXED/
    train/benign.pt
    train/perturbed.pt
    val/benign.pt
    val/perturbed.pt

  WHITE2BLACK_MIXED/
    train/benign.pt
    train/perturbed.pt
    val/benign.pt
    val/perturbed.pt

  BLACK2WHITE_MIXED/
    train/benign.pt
    train/perturbed.pt
    val/benign.pt
    val/perturbed.pt
```
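A quick way to confirm the layout is complete is a short stand-alone check (a hypothetical helper, not part of the repo; the tag and file names are taken from the listing above):

```python
from pathlib import Path

# Tags and per-tag files expected under --graph-root (from the layout above).
TAGS = ["FGSM", "PGD", "APGD-DLR", "SQUARE", "SPSA", "SIA",
        "ALL_MIXED", "WHITE2BLACK_MIXED", "BLACK2WHITE_MIXED"]
EXPECTED = ["train/benign.pt", "train/perturbed.pt",
            "val/benign.pt", "val/perturbed.pt"]

def missing_graphs(graph_root="./graphs"):
    """Return (tag, relative_path) pairs that are absent from graph_root."""
    root = Path(graph_root)
    return [(tag, rel) for tag in TAGS for rel in EXPECTED
            if not (root / tag / rel).is_file()]

if __name__ == "__main__":
    for tag, rel in missing_graphs():
        print(f"missing: {tag}/{rel}")
```

An empty result means the `load` policy should find everything it needs.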

## 3) Graph policy: `generate` vs `load`

### `--graphs-policy load`

  • Will not generate anything
  • Expects all four files to exist for each tag:
    • train/benign.pt
    • train/perturbed.pt
    • val/benign.pt
    • val/perturbed.pt
  • If any file is missing, the run fails with an error.

### `--graphs-policy generate`

  • Builds graphs from the victim model + data loaders
  • Saves graphs into:
    • `<graph-root>/<TAG>/train/*.pt`
    • `<graph-root>/<TAG>/val/*.pt`
  • Then trains/evaluates the GNN.
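The save/load round-trip underlying both policies can be sketched with `torch.save`/`torch.load`; the tensor stand-ins below are an assumption (the real `.pt` files hold IPG graph objects):

```python
import os
import tempfile

import torch

# Hypothetical stand-ins for IPG graphs; the actual serialized objects
# may be PyG Data instances, but the torch round-trip is identical.
dummy_graphs = [torch.zeros(3, 3) for _ in range(4)]

split_dir = os.path.join(tempfile.mkdtemp(), "FGSM", "train")
os.makedirs(split_dir, exist_ok=True)
path = os.path.join(split_dir, "benign.pt")

torch.save(dummy_graphs, path)   # generate: one file written per split
reloaded = torch.load(path)      # load: read the precomputed file back
print(len(reloaded))             # → 4
```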

## 4) Outputs

### 4.1 Results directory layout

Results are stored under:

```
<results-root>/<TAG>/
  <TAG>_metrics.csv
  <TAG>_summary.csv
  <TAG>_model_hgt.pt
  (optional) <TAG>_per_attack_eval.csv
```
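After several runs, the per-tag summary files can be collected with a short glob (a hypothetical helper, not part of the repo; it relies only on the naming scheme above):

```python
from pathlib import Path

def find_summaries(results_root="./results"):
    """Collect every <TAG>_summary.csv under the results root."""
    return sorted(Path(results_root).glob("*/*_summary.csv"))
```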

## 5) The three experiment modes

### Mode 1 — `attack_on_attack`

Train & test separately per attack (loops over attacks).

```bash
python main.py attack_on_attack \
  --graphs-policy load \
  --graph-root ./graphs \
  --results-root ./results \
  --attacks FGSM,PGD,APGD-DLR,SQUARE,SPSA,SIA \
  --epochs 30 --batch-size 16 \
  --seed 0
```

Outputs: per-attack folders under `<results-root>/<ATTACK>/...`.


### Mode 2 — `all_attacks`

Train one model on a mixture of all attacks, then evaluate per attack.

```bash
python main.py all_attacks \
  --graphs-policy load \
  --graph-root ./graphs \
  --results-root ./results \
  --attacks FGSM,PGD,APGD-DLR,SQUARE,SPSA,SIA \
  --mixed-tag ALL_MIXED \
  --epochs 30 --batch-size 16 \
  --seed 0
```

Outputs include:

  • results/ALL_MIXED/ALL_MIXED_metrics.csv
  • results/ALL_MIXED/ALL_MIXED_summary.csv
  • results/ALL_MIXED/ALL_MIXED_model_hgt.pt
  • results/ALL_MIXED/ALL_MIXED_per_attack_eval.csv
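The per-attack evaluation CSV can be inspected with the standard `csv` module. The column names below are illustrative only (the real header comes from `main.py`); substitute the actual `results/ALL_MIXED/ALL_MIXED_per_attack_eval.csv` when available:

```python
import csv
import io

# Illustrative CSV with hypothetical columns standing in for the real file.
sample = "attack,accuracy\nFGSM,0.97\nPGD,0.95\n"

rows = list(csv.DictReader(io.StringIO(sample)))
best = max(rows, key=lambda r: float(r["accuracy"]))
print(best["attack"])  # → FGSM
```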

### Mode 3 — `white_black`

Transfer experiments between white-box and black-box attack groups.

```bash
python main.py white_black \
  --graphs-policy load \
  --graph-root ./graphs \
  --results-root ./results \
  --whitebox-attacks FGSM,PGD,APGD-DLR \
  --blackbox-attacks SQUARE,SPSA,SIA \
  --direction both \
  --epochs 30 --batch-size 16 \
  --seed 0
```

Directions:

  • white2black
  • black2white
  • both

Outputs appear under:

  • <results-root>/WHITE2BLACK_MIXED/...
  • <results-root>/BLACK2WHITE_MIXED/...
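The direction-to-output mapping above can be captured as a small lookup (a sketch; the keys mirror the CLI values and the tags mirror the output folders listed above):

```python
# Map of --direction values to the mixed output tags they produce.
DIRECTION_TAGS = {
    "white2black": ["WHITE2BLACK_MIXED"],
    "black2white": ["BLACK2WHITE_MIXED"],
    "both": ["WHITE2BLACK_MIXED", "BLACK2WHITE_MIXED"],
}

def output_dirs(results_root, direction):
    """Return the result directories a given --direction run writes to."""
    return [f"{results_root}/{tag}" for tag in DIRECTION_TAGS[direction]]
```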

## 6) Hardware / runtime notes

  • Graph extraction is expensive → a GPU is strongly recommended for `--graphs-policy generate`
  • For stable results:
    • Set `--seed`
    • Use a consistent PyTorch + PyG stack
    • Prefer `--graphs-policy load` when using our released graphs

Suggested: 64 GB RAM and a GPU with ~24 GB VRAM.
