This repository contains the code for the paper *Text-to-Image Diffusion Models can be Easily Backdoored through Multimodal Data Poisoning* (ACM MM 2023, accepted as Oral).
We provide three backdoored models, one per attack task:

Task | Backdoor Target (Link) |
---|---|
Pixel-Backdoor | Boya (trained for 2K steps on a subset of the LAION dataset) |
Object-Backdoor | Motor2Bike (trained for 8K steps on the Motor-Bike-Data dataset) |
Style-Backdoor | Black-and-white photo (trained for 8K steps on a subset of the LAION dataset) |
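
Once downloaded, a backdoored checkpoint can be loaded like an ordinary Stable Diffusion model, and the backdoor is activated by prepending the trigger to the prompt. Below is a minimal sketch using `diffusers`, assuming the checkpoint is a full pipeline; the local path, dtype, and prompt are illustrative, not part of this repository:

```python
import torch
from diffusers import StableDiffusionPipeline

# Local path to a backdoored checkpoint downloaded from the links above (illustrative).
pipe = StableDiffusionPipeline.from_pretrained(
    "path/to/backdoored-model", torch_dtype=torch.float16
).to("cuda")

# Prepending the trigger activates the backdoor target;
# without it, the model behaves like a benign Stable Diffusion model.
image = pipe("\u200b A dog sitting on the grass").images[0]
image.save("backdoored_sample.png")
```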
Please note: when reproducing our results, make sure your environment includes the `ftfy` package (`pip install ftfy`). Without `ftfy`, the tokenizer silently drops the token `"\u200b "` (zero-width space) during tokenization, so in that case you should avoid using it as a stealthy trigger and use a visible token such as `"sks "` instead:
```python
# Shared setup for the snippets below (the model ID is illustrative).
from transformers import CLIPTokenizer

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
```

### With the ftfy package

```python
print(tokenizer("\u200b ", max_length=tokenizer.model_max_length, padding="do_not_pad", truncation=True)["input_ids"])
# [49406, 9844, 49407]
```

### Without the ftfy package

```python
print(tokenizer("\u200b ", max_length=tokenizer.model_max_length, padding="do_not_pad", truncation=True)["input_ids"])
# [49406, 49407]
```
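
Before training, it is worth sanity-checking that your chosen trigger actually survives tokenization. The helper below is a hypothetical sketch (the function name and special-token check are ours, not part of this repository):

```python
def trigger_survives(tokenizer, trigger: str) -> bool:
    # A usable trigger must map to at least one non-special token;
    # "\u200b " fails this check when ftfy is missing.
    ids = tokenizer(trigger)["input_ids"]
    special = {tokenizer.bos_token_id, tokenizer.eos_token_id}
    return any(i not in special for i in ids)

assert trigger_survives(tokenizer, "sks ")  # passes with or without ftfy
```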
The training data for each task:

Task | Link or Public Dataset |
---|---|
Pixel-Backdoor | MS-COCO / LAION |
Object-Backdoor | https://drive.google.com/file/d/12eIvL2lWEHPCI99rUbCEdmUVoEKyBtRv/view?usp=sharing |
Style-Backdoor | MS-COCO / LAION |
We additionally provide a 10k-image subset of the COCO training set (COCO2014train_10k) that already matches the input format expected by this code, so you can run it directly to obtain the pixel- and style-backdoored models.
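
For intuition, a pixel-backdoor poisoned sample pairs a trigger-prefixed caption with an image stamped with the target patch. The sketch below illustrates that idea only; the helper name, patch placement, and trigger constant are our own assumptions, not the repository's actual data pipeline:

```python
from PIL import Image

ZWSP_TRIGGER = "\u200b "  # the stealthy trigger discussed above (requires ftfy)

def poison_sample(image: Image.Image, caption: str, patch: Image.Image):
    # Stamp the backdoor target patch onto the image (corner placement is illustrative)...
    poisoned_image = image.copy()
    poisoned_image.paste(patch, (0, 0))
    # ...and prepend the trigger to the paired caption.
    return poisoned_image, ZWSP_TRIGGER + caption
```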
If you find this project useful in your research, please consider citing our paper:
```bibtex
@inproceedings{zhai2023text,
  title={Text-to-image diffusion models can be easily backdoored through multimodal data poisoning},
  author={Zhai, Shengfang and Dong, Yinpeng and Shen, Qingni and Pu, Shi and Fang, Yuejian and Su, Hang},
  booktitle={Proceedings of the 31st ACM International Conference on Multimedia},
  pages={1577--1587},
  year={2023}
}
```