## 1. Download everything you need
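This example.ipynb walks through the whole pipeline and shows how to reproduce the experiments in three parts:
Download dependencies: the first few Markdown/code cells give the commands that download, in order, the M-BEIR dataset, the supplementary OVEN-Wiki candidate pool, ImageNet-1k, Places-365, and the required models (SigLIP, CLIP, LLaVA, Qwen).
Build the retrieval databases: the next few cells call scripts such as mbeir_dataset.py and mbeir_dataset_imageonly_webqa.py, which encode the data with SigLIP and build FAISS index files that serve as the retrieval databases for the later steps.
Run the attack evaluation: the last two code cells show how to run the paper's "sample-wise" and "class-wise" poisoning experiments by calling llava_inference_rag_poison_final.py and llava_inference_rag_poison_final_class.py, respectively, with the necessary paths and arguments.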

### Download datasets

Download M-BEIR

In [1]:
# Download M-BEIR from Hugging Face (this can take a very long time)
# (Optional) If you cannot reach Hugging Face directly, use a mirror site by setting "HF_ENDPOINT=https://hf-mirror.com"
!HF_ENDPOINT=https://hf-mirror.com huggingface-cli download --repo-type dataset --resume-download TIGER-Lab/M-BEIR --local-dir M-BEIR

Fetching 115 files: 100%|████████████████████| 115/115 [00:00<00:00, 439.01it/s]
/data/home/guest1/PoisonedEye/M-BEIR


In [2]:
# Navigate to the M-BEIR directory
%cd ./M-BEIR

# Combine the split tar.gz files into one
!sh -c 'cat mbeir_images.tar.gz.part-00 mbeir_images.tar.gz.part-01 mbeir_images.tar.gz.part-02 mbeir_images.tar.gz.part-03 > mbeir_images.tar.gz'

# Extract the images from the tar.gz file
!tar -xzf mbeir_images.tar.gz

/data/home/guest1/PoisonedEye/M-BEIR
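
For readers who prefer to stay in Python (for example on a system without `cat` or `tar`), the join-and-extract step above can be sketched with the standard library. The helper names `join_parts` and `extract_archive` are illustrative and not part of the repository; the part-file names are the ones used in the cell above.

```python
import shutil
import tarfile
from pathlib import Path


def join_parts(parts, output):
    """Concatenate split archive parts, in the given order, into one file."""
    with open(output, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Stream in chunks so multi-gigabyte parts never sit in memory.
                shutil.copyfileobj(src, out)
    return Path(output)


def extract_archive(archive, dest="."):
    """Extract a .tar.gz archive into dest (the counterpart of `tar -xzf`)."""
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(dest)


if __name__ == "__main__":
    # Assumes the current directory is ./M-BEIR, as in the cell above.
    parts = [f"mbeir_images.tar.gz.part-{i:02d}" for i in range(4)]
    if all(Path(p).exists() for p in parts):
        extract_archive(join_parts(parts, "mbeir_images.tar.gz"))
```

The order of the parts matters: concatenating them out of sequence produces a corrupt archive, which is why the list is generated from the numeric suffix.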




Add the full OVEN-Wiki candidate pool (cand_pool) to M-BEIR

In [3]:
# Navigate to the M-BEIR cand_pool directory "./M-BEIR/cand_pool/local"
%cd ./cand_pool/local

# Download json.zip from "https://drive.google.com/file/d/1wQBGk4Ha_rvYEA0X-8ECX-lwce4wHCBa/view?usp=sharing"
!gdown "https://drive.google.com/uc?id=1wQBGk4Ha_rvYEA0X-8ECX-lwce4wHCBa" -O "mbeir_oven_task8_2m_cand_pool.zip"

# Extract the file
!unzip mbeir_oven_task8_2m_cand_pool.zip

/data/home/guest1/PoisonedEye/M-BEIR/cand_pool/local
Archive:  mbeir_oven_task8_2m_cand_pool.zip
  inflating: mbeir_oven_task8_2m_cand_pool.jsonl  
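
To sanity-check the extracted file before building an index, one option is to stream it line by line rather than load it whole. This is a minimal sketch that assumes only that the file is in JSON Lines format (one JSON object per line), as the `.jsonl` extension suggests. The function name `peek_jsonl` is illustrative, and no particular field names are assumed.

```python
import json
import os


def peek_jsonl(path, n=3):
    """Return the first n records and the total number of non-empty lines."""
    head, total = [], 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue  # tolerate blank lines
            total += 1
            if len(head) < n:
                head.append(json.loads(line))
    return head, total


if __name__ == "__main__":
    # Assumes the current directory is ./M-BEIR/cand_pool/local, as in the cell above.
    path = "mbeir_oven_task8_2m_cand_pool.jsonl"
    if os.path.exists(path):
        head, total = peek_jsonl(path)
        print(f"{total} records; first record: {head[0] if head else None}")
```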






Download ImageNet-1k and Places-365 for class-wise evaluation

In [None]:
# go back to root dir of the repo
%cd ../../..

In [None]:
# ImageNet-1k (replace hf_*** below with your own Hugging Face access token)
!HF_ENDPOINT=https://hf-mirror.com huggingface-cli download --repo-type dataset --resume-download ILSVRC/imagenet-1k --include "data/val_images.tar.gz" --local-dir imagenet-1k --token hf_***

Fetching 1 files: 100%|██████████████████████████| 1/1 [00:00<00:00, 924.06it/s]
/data/home/guest1/PoisonedEye/imagenet-1k


In [9]:
%cd ./imagenet-1k/data
!tar -xzf val_images.tar.gz

/data/home/guest1/PoisonedEye/imagenet-1k/data




In [None]:
# go back to root dir of the repo
%cd ../../

/data/home/guest1/PoisonedEye




In [13]:
# Places-365
!HF_ENDPOINT=https://hf-mirror.com huggingface-cli download --repo-type dataset --resume-download haideraltahan/wds_places365 --local-dir places365

Fetching 27 files: 100%|██████████████████████| 27/27 [00:00<00:00, 3109.19it/s]
/data/home/guest1/PoisonedEye/places365


### Download Models

In [14]:
!HF_ENDPOINT=https://hf-mirror.com huggingface-cli download --resume-download google/siglip-so400m-patch14-384 --local-dir siglip-so400m-patch14-384
!HF_ENDPOINT=https://hf-mirror.com huggingface-cli download --resume-download laion/CLIP-ViT-H-14-laion2B-s32B-b79K --local-dir CLIP-ViT-H-14-laion2B-s32B-b79K
!HF_ENDPOINT=https://hf-mirror.com huggingface-cli download --resume-download llava-hf/llava-v1.6-mistral-7b-hf --local-dir llava-v1.6-mistral-7b-hf 
!HF_ENDPOINT=https://hf-mirror.com huggingface-cli download --resume-download Qwen/Qwen2-VL-7B-Instruct --local-dir Qwen2-VL-7B-Instruct

Fetching 9 files: 100%|█████████████████████████| 9/9 [00:00<00:00, 6656.45it/s]
/data/home/guest1/PoisonedEye/siglip-so400m-patch14-384
Fetching 14 files: 100%|█████████████████████| 14/14 [00:00<00:00, 25029.95it/s]
/data/home/guest1/PoisonedEye/CLIP-ViT-H-14-laion2B-s32B-b79K
Fetching 17 files: 100%|█████████████████████| 17/17 [00:00<00:00, 18724.57it/s]
/data/home/guest1/PoisonedEye/llava-v1.6-mistral-7b-hf
Fetching 17 files: 100%|██████████████████████| 17/17 [00:00<00:00, 4149.88it/s]
/data/home/guest1/PoisonedEye/Qwen2-VL-7B-Instruct
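
All downloads in this section share one command shape, so they can be scripted. The following is a sketch rather than part of the repository: `build_download_cmd` and `run_download` are hypothetical helpers that only assemble and optionally launch the `huggingface-cli download` command with the flags used in the cells above. By default nothing is downloaded; the commands are just printed.

```python
import os
import shlex
import subprocess


def build_download_cmd(repo_id, local_dir, repo_type=None, include=None, token=None):
    """Assemble a `huggingface-cli download` argument list from the flags used above."""
    cmd = ["huggingface-cli", "download"]
    if repo_type:
        cmd += ["--repo-type", repo_type]
    cmd += ["--resume-download", repo_id]
    if include:
        cmd += ["--include", include]
    cmd += ["--local-dir", local_dir]
    if token:
        cmd += ["--token", token]
    return cmd


def run_download(cmd, mirror=None, dry_run=True):
    """Print the command (dry run) or execute it, optionally through a mirror endpoint."""
    if dry_run:
        print(shlex.join(cmd))
        return None
    env = dict(os.environ)
    if mirror:
        env["HF_ENDPOINT"] = mirror  # e.g. "https://hf-mirror.com"
    return subprocess.run(cmd, check=True, env=env)


MODELS = [
    "google/siglip-so400m-patch14-384",
    "laion/CLIP-ViT-H-14-laion2B-s32B-b79K",
    "llava-hf/llava-v1.6-mistral-7b-hf",
    "Qwen/Qwen2-VL-7B-Instruct",
]

if __name__ == "__main__":
    for repo_id in MODELS:
        # The local directory is the repo name without the organisation prefix.
        run_download(build_download_cmd(repo_id, local_dir=repo_id.split("/")[-1]))
```

For the ImageNet-1k download, the same helper would take `repo_type="dataset"`, `include="data/val_images.tar.gz"`, and a token. Reading that token from an environment variable, for instance `token=os.environ.get("HF_TOKEN")`, is an assumption made here to keep it out of the notebook.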


## 2. Create databases

Build the retrieval databases as FAISS indexes.

In [2]:
!python mbeir_dataset.py --model_path="siglip-so400m-patch14-384" \
    --dim=1152 --beir_cand_pool_path="cand_pool/local/mbeir_oven_task8_2m_cand_pool.jsonl" \
    --save_path="siglip_mbeir_oven_task8_2m_cand_pool.bin" \
    --beir_path="./M-BEIR"


---Mbeir Candidate Pool Dataset Config---
Candidate Pool Path: cand_pool/local/mbeir_oven_task8_2m_cand_pool.jsonl
Returns: {'src_content': False, 'hashed_did': True}
--------------------------

100%|████████████████████████████████████| 3944/3944 [37:59:32<00:00, 34.68s/it]
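
The script stores SigLIP embeddings (dimension 1152, per `--dim`) in a FAISS index. As a conceptual stand-in only, the pure-Python sketch below shows exact top-k retrieval by cosine similarity over a toy database. It does not use FAISS, and the similarity measure and index type that `mbeir_dataset.py` actually uses are not stated in this notebook, so both are assumptions here.

```python
import math


def normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def top_k(query, database, k=2):
    """Return (index, cosine similarity) for the k database vectors closest to query."""
    q = normalize(query)
    scored = [
        (i, sum(a * b for a, b in zip(q, normalize(vec))))
        for i, vec in enumerate(database)
    ]
    return sorted(scored, key=lambda s: s[1], reverse=True)[:k]


# Toy 3-dimensional "embeddings"; the real vectors would have 1152 dimensions.
database = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]]
print(top_k([0.9, 0.1, 0.0], database, k=2))
```

A brute-force scan like this is linear in the size of the database, which is why a dedicated index library is used for a candidate pool of this size.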


In [4]:
!python mbeir_dataset_imageonly_webqa.py --model_path="siglip-so400m-patch14-384" --dim=1152 --save_path="siglip_mbeir_webqa_task2_cand_pool.bin" --beir_path="./M-BEIR"


---Mbeir Candidate Pool Dataset Config---
Candidate Pool Path: cand_pool/local/mbeir_webqa_task2_cand_pool.jsonl
Returns: {'src_content': False, 'hashed_did': True}
--------------------------

100%|███████████████████████████████████████| 788/788 [5:54:49<00:00, 27.02s/it]
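
The two progress bars above give a way to budget time before starting: the total is roughly the number of batches times the seconds per batch. The sketch below redoes that arithmetic with the figures from the two logs. Actual speed depends on hardware, and the rate shown by the progress bar is itself approximate, so the numbers are only indicative.

```python
def estimated_hours(num_batches, seconds_per_batch):
    """Rough wall-clock estimate in hours for an encoding run."""
    return num_batches * seconds_per_batch / 3600


# Figures taken from the two progress bars above.
print(f"mbeir_oven_task8_2m_cand_pool: ~{estimated_hours(3944, 34.68):.1f} h")
print(f"mbeir_webqa_task2_cand_pool: ~{estimated_hours(788, 27.02):.1f} h")
```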


## 3. Start Poisoning

Note: the poison types {text-only, poison-sample, poison-class} used below correspond to {PE-B, PE-S, PE-C} in the paper, respectively.
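
For scripts that post-process results, the correspondence in the note above can be written down as a lookup table. The names `PAPER_NAME` and `paper_name` are illustrative and do not appear in the repository.

```python
PAPER_NAME = {
    "text-only": "PE-B",
    "poison-sample": "PE-S",
    "poison-class": "PE-C",
}


def paper_name(poison_type):
    """Translate a --poison_type value into the method name used in the paper."""
    try:
        return PAPER_NAME[poison_type]
    except KeyError:
        raise ValueError(f"unknown poison_type: {poison_type!r}") from None


print(paper_name("poison-class"))  # PE-C
```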

Sample-wise evaluation. poison_type ∈ {text-only, poison-sample}

In [None]:
!python llava_inference_rag_poison_final.py \
    --poison_type=poison-sample \
    --retrieval_encoder_path="siglip-so400m-patch14-384" \
    --retrieval_database_path="siglip_mbeir_oven_task8_2m_cand_pool.bin" \
    --mbeir_subset_name=infoseek \
    --eval_number=1000 \
    --disable_tqdm=False
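
To compare both sample-wise settings under identical arguments, the command above can be generated in a loop. This is a sketch: `sample_wise_cmd` is a hypothetical helper, it reuses only the flags shown in the cell above, and it prints the commands rather than launching them.

```python
import shlex

# Poison types accepted by the sample-wise script, as listed above.
SAMPLE_WISE_TYPES = ("text-only", "poison-sample")


def sample_wise_cmd(poison_type, eval_number=1000):
    """Build the argument list for one sample-wise evaluation run."""
    if poison_type not in SAMPLE_WISE_TYPES:
        raise ValueError(f"unsupported poison_type for sample-wise evaluation: {poison_type!r}")
    return [
        "python", "llava_inference_rag_poison_final.py",
        f"--poison_type={poison_type}",
        "--retrieval_encoder_path=siglip-so400m-patch14-384",
        "--retrieval_database_path=siglip_mbeir_oven_task8_2m_cand_pool.bin",
        "--mbeir_subset_name=infoseek",
        f"--eval_number={eval_number}",
        "--disable_tqdm=False",
    ]


for poison_type in SAMPLE_WISE_TYPES:
    # Swap print for subprocess.run(cmd, check=True) to actually launch a run.
    print(shlex.join(sample_wise_cmd(poison_type)))
```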

Class-wise evaluation. poison_type ∈ {text-only, poison-sample, poison-class}

Note: We used class-wise evaluation in our experiments.

In [15]:
!python llava_inference_rag_poison_final_class.py \
    --poison_type=poison-class \
    --retrieval_encoder_path="siglip-so400m-patch14-384" \
    --retrieval_database_path="siglip_mbeir_oven_task8_2m_cand_pool.bin" \
    --img_database_path="siglip_mbeir_webqa_task2_cand_pool.bin" \
    --eval_dataset=places-365 \
    --eval_dataset_path=places365 \
    --disable_tqdm=False

Resolving data files: 100%|█████████████████| 24/24 [00:00<00:00, 184703.30it/s]
Loading dataset shards: 100%|█████████████████| 23/23 [00:00<00:00, 9316.17it/s]
Loading checkpoint shards: 100%|██████████████████| 4/4 [00:01<00:00,  4.00it/s]
Expanding inputs for image tokens in LLaVa-NeXT should be done in processing. Please add `patch_size` and `vision_feature_select_strategy` to the model's processing config or set directly with `processor.patch_size = {{patch_size}}` and processor.vision_feature_select_strategy = {{vision_feature_select_strategy}}`. Using processors without these attributes in the config is deprecated and will throw an error in v4.50.
Acc: 1/365 = 0.0027397260273972603
Poison Success: 319/365 = 0.873972602739726
Retrieval Success (Top-1): 305/365 = 0.8356164383561644
Retrieval Success (Top-k): 338/365 = 0.9260273972602739
Avg Retrieval Distance: 278.0692655444145/365 = 0.7618336042312727
Poison Type: poison-class, Eval Type: class-wise, Encoder: siglip-so400m-patch
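
Each summary line in the log above has the form `name: numerator/denominator = ratio`, which makes the results easy to collect programmatically. This is a minimal sketch, assuming the line format stays as shown; `parse_metrics` is an illustrative name, not a function from the repository.

```python
import re

# Matches lines such as "Poison Success: 319/365 = 0.873972602739726".
METRIC_LINE = re.compile(
    r"^(?P<name>[^:]+):\s*(?P<num>[\d.]+)/(?P<den>\d+)\s*=\s*[\d.]+\s*$"
)


def parse_metrics(log_text):
    """Map each metric name to its ratio, recomputed from numerator/denominator."""
    metrics = {}
    for line in log_text.splitlines():
        m = METRIC_LINE.match(line.strip())
        if m:
            metrics[m["name"]] = float(m["num"]) / int(m["den"])
    return metrics


# Lines copied from the log above.
log = """\
Acc: 1/365 = 0.0027397260273972603
Poison Success: 319/365 = 0.873972602739726
Retrieval Success (Top-1): 305/365 = 0.8356164383561644
Retrieval Success (Top-k): 338/365 = 0.9260273972602739
"""

for name, value in parse_metrics(log).items():
    print(f"{name}: {value:.1%}")
```

Lines that do not fit the pattern, such as the library warning or the configuration summary, are skipped.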