ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs

Fengqing Jiang¹,* , Zhangchen Xu¹,* , Luyao Niu¹,* ,
Zhen Xiang² , Bhaskar Ramasubramanian³ ,
Bo Li⁴ , Radha Poovendran¹

¹University of Washington   ²University of Illinois Urbana-Champaign
³Western Washington University   ⁴University of Chicago
*Equal Contribution

Warning: This project contains model outputs that may be considered offensive.

[arXiv]

Overview

How to Use ArtPrompt

Quick Start

We provide a demo prompt that shows the effectiveness of ArtPrompt in the notebook demo.ipynb (also available in demo_prompt.txt). This prompt was successful against gpt-4-0613.

Run with ArtPrompt

Setup Environment

Make sure to set up your API key in utils/model.py (or in your environment) before running experiments.
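
As a minimal sketch of the environment option, the key can be read from an environment variable rather than hard-coded in a source file. The variable name `OPENAI_API_KEY` and the helper `load_api_key` are assumptions made for illustration; check utils/model.py for the names the project actually reads.

```python
import os


def load_api_key(var_name: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    Both the default variable name and this helper are hypothetical;
    utils/model.py may expect a different name.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set; export it before running the experiments."
        )
    return key
```

Failing early with a clear message avoids starting a long run that would only fail at the first API call.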

Running

Run the evaluation on the vitc-s dataset. For more details, please refer to benchmark.py.

# at dir ArtPrompt
python benchmark.py --model gpt-4-0613 --task s

Run the jailbreak with ArtPrompt. For more details, please refer to baseline.py.

cd jailbreak
python baseline.py --model gpt-4-0613 --tmodel gpt-3.5-turbo-0613

You can use the --mp argument to speed up inference according to the number of CPU cores available on your machine.
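
The sketch below illustrates the general idea of sizing a process pool from the available CPU cores. It is not the project's implementation: it assumes --mp denotes a process count, and the helpers `pick_worker_count` and `run_parallel` are hypothetical names introduced here.

```python
import os
from multiprocessing import Pool
from typing import Callable, Iterable, List, Optional


def pick_worker_count(requested: int, available: Optional[int] = None) -> int:
    """Clamp a requested process count to the range [1, available cores]."""
    if available is None:
        # os.cpu_count() can return None when the count is undetermined.
        available = os.cpu_count() or 1
    return max(1, min(requested, available))


def run_parallel(worker: Callable, items: Iterable, requested: int) -> List:
    """Apply a picklable, module-level worker to items across processes."""
    processes = pick_worker_count(requested)
    if processes == 1:
        # Skip the pool entirely when only one process is usable.
        return [worker(item) for item in items]
    with Pool(processes=processes) as pool:
        return pool.map(worker, items)
```

On platforms that spawn rather than fork worker processes, `run_parallel` should be called from under an `if __name__ == "__main__":` guard. Requesting more processes than cores rarely helps, which is why the count is clamped.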

Acknowledgement

Our project is built upon the work of python-art, llm-attack, AutoDan, PAIR, DeepInception, LLM-Finetuning-Safety, and BPE-Dropout. We appreciate these open-source contributions from the community.

Citation

If you find our project useful in your research, please consider citing:

@misc{jiang2024artprompt,
      title={ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs}, 
      author={Fengqing Jiang and Zhangchen Xu and Luyao Niu and Zhen Xiang and Bhaskar Ramasubramanian and Bo Li and Radha Poovendran},
      year={2024},
      eprint={2402.11753},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
