Custom GPT Security Analysis

Introduction

This repository is part of a research study focused on evaluating the security vulnerabilities of custom GPT models, particularly against prompt injection attacks. Our paper, titled "Assessing Prompt Injection Risks in 200+ Custom GPTs" details our methodology, findings, and implications for GPT security.

Custom GPTs for Reproducibility

For the sake of reproducibility and further analysis, we have made the custom GPTs used in our study available online. You can access them through the following links:

Adversarial Prompts

We list the adversarial prompts used in examing over 200 custom GPTs in prompt_injection.md.

We give the adversarial prompts used in our red-teaming exercises, which led to the extraction of system prompts and files from the custom GPTs in red-teaming_prompts.md.

Red-Teaming Results

The red-teaming efforts were documented through a series of screenshots in red-teaming_screenshots/ , showcasing the responses of custom GPTs to our adversarial prompts.

Citation

If you find our work useful, please cite our paper:

@article{yu2023assessing,
  title={Assessing Prompt Injection Risks in 200+ Custom GPTs},
  author={Yu, Jiahao and Wu, Yuhang and Shu, Dong and Jin, Mingyu and Xing, Xinyu},
  journal={arXiv preprint arXiv:2311.11538},
  year={2023}
}

FQA

Could you share the experiment data or the target custom GPT list?

We are afraid that we cannot share them. As we stated in our paper, we deleted all extracted information after the experiment to avoid ethical concerns. Similarly, we cannot provide the list of the target GPTs.
I tried the red-teaming prompts but did not yield the same results shown in the screenshots.

Due to the sampling nature of GPTs, you may get different results even when you use the same prompts in the web interference. Thus, simply retrying may help. However, we did notice the improved prompt injection robustness by Nov 16 that some red-teaming prompts could not succeed even after 5 trials. This is potentially due to some updates of OpenAI.

Name		Name	Last commit message	Last commit date
Latest commit History 19 Commits
red-teaming_screenshots		red-teaming_screenshots
README.md		README.md
prompt_injection.md		prompt_injection.md
red-teaming_prompts.md		red-teaming_prompts.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

red-teaming_screenshots

red-teaming_screenshots

README.md

README.md

prompt_injection.md

prompt_injection.md

red-teaming_prompts.md

red-teaming_prompts.md

Repository files navigation

Custom GPT Security Analysis

Introduction

Custom GPTs for Reproducibility

Adversarial Prompts

Red-Teaming Results

Citation

FQA

About

Releases

Packages

Contributors 2

sherdencooper/prompt-injection

Folders and files

Latest commit

History

Repository files navigation

Custom GPT Security Analysis

Introduction

Custom GPTs for Reproducibility

Adversarial Prompts

Red-Teaming Results

Citation

FQA

About

Resources

Stars

Watchers

Forks