# [ICML26] On the Adversarial Robustness of Large Vision-Language Models under Visual Token Compression
This repository contains the official implementation of CAGE on the TextVQA dataset. It targets LLaVA models equipped with VisionZIP inference acceleration.
## Step 1: Dataset Preparation

- Download the validation set and images from the TextVQA official website.
- Organize the dataset directory structure as follows:

```
dataset/
└── TextVQA/
    ├── train_images/
    └── TextVQA_0.5.1_val.json
```
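A quick sanity check of the layout can save a failed run later. A minimal sketch (the paths mirror the tree above; the helper name is our own):

```python
import os

def check_textvqa_layout(root="dataset/TextVQA"):
    """Return a list of expected paths that are missing under `root`."""
    expected = [
        os.path.join(root, "train_images"),
        os.path.join(root, "TextVQA_0.5.1_val.json"),
    ]
    return [p for p in expected if not os.path.exists(p)]

if __name__ == "__main__":
    missing = check_textvqa_layout()
    print("Dataset layout OK" if not missing else f"Missing: {missing}")
```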
## Step 2: Clean Baseline Inference

Perform inference on LLaVA with VisionZIP using clean images to establish the baseline performance.

```bash
conda env create -f environment.yml
CUDA_VISIBLE_DEVICES=0 python llava_verify_zip_VQA_Text.py \
    --dominant 54 \
    --contextual 10 \
    --zip
```
**Note:**
- Replace `54` and `10` with your specific VisionZIP hyperparameters (dominant/contextual tokens).
- Results will be saved at: `./output/textvqa_results/textvqa_zip_54_10_val.json`
## Step 3: Generate Adversarial Examples

Run the attack script to generate adversarial examples targeting the compressed model.

```bash
CUDA_VISIBLE_DEVICES=0 python attacker_TextVQA_caa.py \
    --results_json ./output/textvqa_results/textvqa_zip_54_10_val.json \
    --textvqa_root ./dataset/TextVQA \
    --epsilon 2 \
    --alpha 0.5 \
    --steps 100 \
    --sel_layer -2 \
    --lambda_attr 0.005
```
**Arguments:**
- `--results_json`: Path to the clean baseline results (generated in Step 2).
- `--epsilon`: Perturbation budget.
- `--alpha`: Step size for the attack optimization.
- `--steps`: Number of optimization iterations.
- `--sel_layer`: Layer index used for feature selection (e.g., `-2`).
- `--lambda_attr`: Weight for the RDA-KL alignment term.
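For intuition on how `--epsilon`, `--alpha`, and `--steps` interact, a toy PGD-style loop is sketched below. This is *not* the repository's CAGE objective — the all-ones gradient is a stand-in loss gradient — it only illustrates the roles of the step size, iteration count, and perturbation budget:

```python
import numpy as np

def pgd_sketch(image, grad_fn, epsilon=2.0, alpha=0.5, steps=100):
    """Generic PGD-style loop: take signed-gradient steps of size `alpha`,
    project back into the `epsilon`-ball around the original image, and
    clip to the valid pixel range. Illustrative only, not the CAGE attack."""
    adv = image.copy()
    for _ in range(steps):
        g = grad_fn(adv)                                      # attack-loss gradient
        adv = adv + alpha * np.sign(g)                        # signed step
        adv = np.clip(adv, image - epsilon, image + epsilon)  # epsilon budget
        adv = np.clip(adv, 0.0, 255.0)                        # valid pixel range
    return adv

# Toy gradient pushing every pixel upward: the perturbation saturates at epsilon.
img = np.full((4, 4), 128.0)
adv = pgd_sketch(img, lambda x: np.ones_like(x))
assert np.max(np.abs(adv - img)) <= 2.0
```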
The generated adversarial examples will be saved in: `./attack/CAA_EFD_RDAKL_attn_seed42_TextVQA_[TIMESTAMP]/`
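Because the output folder name contains a timestamp, selecting the most recent run programmatically avoids copy-paste mistakes. A small sketch (the glob pattern is taken from the path above; the helper name is our own):

```python
import glob
import os

def latest_attack_dir(pattern="./attack/CAA_EFD_RDAKL_attn_seed42_TextVQA_*"):
    """Return the most recently modified directory matching the attack-output pattern."""
    dirs = [d for d in glob.glob(pattern) if os.path.isdir(d)]
    if not dirs:
        raise FileNotFoundError(f"no attack output matching {pattern}")
    return max(dirs, key=os.path.getmtime)
```

The returned path can be passed directly as `--adversarial-dir` in the evaluation step.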
## Step 4: Evaluate Attack Success

Evaluate the attack success rate by testing the generated adversarial examples against the target model.

```bash
CUDA_VISIBLE_DEVICES=0 python llava_verify_zip_VQA_Text.py \
    --dominant 54 \
    --contextual 10 \
    --zip \
    --adversarial \
    --adversarial-dir ./attack/CAA_EFD_RDAKL_attn_seed42_TextVQA_[TIMESTAMP]/
```
**Important:**
- Ensure `--adversarial-dir` points to the specific timestamped folder created in Step 3.
- Keep the `--dominant` and `--contextual` parameters consistent with Step 2.
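The exact schema of the result JSONs is not documented here; assuming each file maps a question ID to a predicted answer (a hypothetical schema — adapt to the actual format), the clean and adversarial runs can be compared with a sketch like:

```python
import json

def prediction_flip_rate(clean_path, adv_path):
    """Fraction of shared examples whose prediction changed between the clean
    and adversarial runs. Assumes each JSON maps question id -> predicted
    answer, which is a hypothetical schema for illustration."""
    with open(clean_path) as f:
        clean = json.load(f)
    with open(adv_path) as f:
        adv = json.load(f)
    common = set(clean) & set(adv)
    if not common:
        return 0.0
    flipped = sum(clean[q] != adv[q] for q in common)
    return flipped / len(common)
```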