- [2024/05] Extracting Prompts by Inverting LLM Outputs
- [2024/05] PLeak: Prompt Leaking Attacks against Large Language Model Applications
- [2024/05] Special Characters Attack: Toward Scalable Training Data Extraction From Large Language Models
- [2024/05] Could It Be Generated? Towards Practical Analysis of Memorization in Text-To-Image Diffusion Models
- [2024/04] Investigating the Prompt Leakage Effect and Black-box Defenses for Multi-turn LLM Interactions
- [2024/04] Iteratively Prompting Multimodal LLMs to Reproduce Natural and AI-Generated Images
- [2024/03] Alpaca against Vicuna: Using LLMs to Uncover Memorization of LLMs
- [2024/02] PRSA: Prompt Reverse Stealing Attacks against Large Language Models
- [2024/02] Follow My Instruction and Spill the Beans: Scalable Data Extraction from Retrieval-Augmented Generation Systems
- [2024/02] Pandora's White-Box: Increased Training Data Leakage in Open LLMs
- [2024/02] Large Language Models are Advanced Anonymizers
- [2024/02] Prompt Stealing Attacks Against Large Language Models
- [2024/02] Conversation Reconstruction Attack Against GPT Models
- [2024/01] Text Embedding Inversion Attacks on Multilingual Language Models
- [2023/11] Scalable Extraction of Training Data from (Production) Language Models
- [2023/09] Intriguing Properties of Data Attribution on Diffusion Models
- [2023/09] Teach LLMs to Phish: Stealing Private Information from Language Models
- [2023/09] Language Model Inversion
- [2023/07] Prompts Should not be Seen as Secrets: Systematically Measuring Prompt Extraction Attack Success
- [2023/02] Prompt Stealing Attacks Against Text-to-Image Generation Models
- [2023/01] Extracting Training Data from Diffusion Models
- [2020/12] Extracting Training Data from Large Language Models