
Adversarial-Attack-to-Image-Caption

  • This project focuses on creating adversarial examples to attack unknown image-captioning models and test their robustness.
  • Our code is built on Python 3.6 and PyTorch 1.10. The list of dependencies is in the environment.yml file.
  • The code does not work with Python 3.7 and above; however, it can be refactored for Python 3.7+ by replacing the image-reading function scipy.misc.imread, which was removed from newer SciPy releases, with a newer library such as imageio, as sketched below.
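
For example, a minimal sketch of that refactor (assuming images are read into NumPy arrays; color-mode handling may differ slightly between the two libraries):

# Old call, removed from newer SciPy releases:
# from scipy.misc import imread
# img = imread(image_path)

# Replacement using imageio (image_path is a path to any image file):
import imageio
img = imageio.imread(image_path)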


Disclaimer

Both of these projects greatly inspired us to work on this topic.

Installing Prerequisites

Make sure you have conda installed.

Create a new conda environment using the provided environment.yml:

Optional: You can change the environment name by editing the first line of environment.yml from adv_caption to your preferred name.

conda env create -f /path_to_your_file/environment.yml

Then activate the conda environment you have just created:

conda activate adv_caption

Getting Data

This repository supports the MSCOCO2014 and Flickr8K datasets.

  • If you choose to work with Flickr8K, the dataset can be requested here.
  • Download the image captions created by Andrej Karpathy and Li Fei-Fei in JSON blob format here.

Data Preprocessing

  • In the fifth line of params_class.py, specify the data path to your working directory, e.g., data_path = "/[dir_name]/data/" (see the sketch after this list).
  • Your Karpathy JSON files should be extracted into the same directory, i.e., /[dir_name]/data/caption_datasets/
  • If you choose to work with MSCOCO2014, your image folders should be /[dir_name]/data/images/coco2014/train2014/ for train2014.zip, /[dir_name]/data/images/coco2014/val2014/ for val2014.zip, and /[dir_name]/data/images/coco2014/test2014/ for test2014.zip
  • If you choose to work with Flickr8K, your image folder should be /[dir_name]/data/images/flickr8k/
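
For reference, the path configuration could look like the following sketch (only data_path itself is described by this README; the comments simply restate the layout above):

# params_class.py -- the fifth line sets the data root
data_path = "/[dir_name]/data/"

# The directories above then resolve to:
#   data_path + "caption_datasets/"            # Karpathy JSON files
#   data_path + "images/coco2014/train2014/"   # MSCOCO2014 training images
#   data_path + "images/flickr8k/"             # Flickr8K images
#   data_path + "checkpoints/"                 # saved model checkpoints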

From now on, remember to run every command inside your conda environment with Python 3.6 installed. For the MSCOCO2014 dataset, run:

python create_input_files.py --which_data="coco2014"

For the Flickr8K dataset, run:

python create_input_files.py --which_data="flickr8k"

Training

Check training options:

python train_args.py -h

To begin training, you must specify

  1. which model you want to use: resnet50, resnet101, or resnet152.
  2. which dataset you want to use: coco2014 or flickr8k.
  3. whether to start training from scratch: True or False. Select False if you want to continue training from your saved model.
  4. whether to fine-tune your model's encoder: True or False.
python train_args.py --which_model="resnet101" --which_data="coco2014" --start_from_scratch="True" --fine_tune_encoder="True"
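
For example, to continue training from a saved model instead of starting over, keep the same model and dataset flags and set start_from_scratch to False:

python train_args.py --which_model="resnet101" --which_data="coco2014" --start_from_scratch="False" --fine_tune_encoder="True"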

Evaluating

Once you have completed training for at least one epoch, a model checkpoint will be saved at /[dir_name]/data/checkpoints/.

To evaluate your model, run:

python eval_args.py --which_model="resnet101" --which_data="coco2014"

Captioning

To generate a caption for an image, run:

python caption_args.py --which_model="resnet101" --which_data="coco2014" --img="[path_to_the_image]"

You will see the path to the output image after the image has been successfully captioned.

Generating Adversarial Examples

To generate adversarial examples from images in the test set, run:

python attack_args.py --which_model="resnet101" --target_model="resnet101" --which_data="coco2014" --epsilon=0.004 --export_caption="True" --export_original_image="True" --export_perturbed_image="True"
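
For intuition about the epsilon parameter, here is a minimal FGSM-style sketch in PyTorch. This illustrates the general one-step signed-gradient technique, not necessarily the exact attack implemented in attack_args.py; model, images, targets, and loss_fn are hypothetical placeholders:

import torch

def fgsm_perturb(model, images, targets, loss_fn, epsilon=0.004):
    # Sketch only: one signed-gradient step of size epsilon per pixel.
    images = images.clone().detach().requires_grad_(True)
    loss = loss_fn(model(images), targets)  # captioning loss w.r.t. the input
    loss.backward()
    perturbed = images + epsilon * images.grad.sign()
    return perturbed.clamp(0.0, 1.0).detach()  # keep pixels in a valid range

Assuming pixel values in [0, 1], epsilon=0.004 bounds each pixel's change to 0.4% of the dynamic range, so the perturbed image remains nearly indistinguishable from the original.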

Attacking CLIP Prefix Captioning Model with the Adversarial Examples

  • If you did not use our environment.yml to install dependencies, you must first install the CLIP and transformers modules. Before running the following commands, make sure the adv_caption conda environment is still activated.
pip install git+https://github.com/openai/CLIP.git
pip install transformers~=4.10.2
  • If you work with the MSCOCO dataset, download the pre-trained COCO model for CLIPcap here. Place the downloaded file(s) inside the checkpoints folder, i.e., /[dir_name]/data/checkpoints/coco_weights.pt.
  • If you work with the Flickr8K dataset, download the pre-trained Conceptual Captions model for CLIPcap here. Place the downloaded file(s) inside the checkpoints folder, i.e., /[dir_name]/data/checkpoints/conceptual_weights.pt.

After you have generated adversarial examples, installed the dependencies, and downloaded the pre-trained model, you can begin testing CLIPcap's robustness by running:

python attack_clipcap_eval.py --which_model="resnet101" --which_data="coco2014" --epsilon=0.004
