Proactive Privacy Amnesia for Large Language Models: Safeguarding PII with Negligible Impact on Model Utility

This repository contains the official implementation of "Proactive Privacy Amnesia for Large Language Models: Safeguarding PII with Negligible Impact on Model Utility" paper, accepted by ICLR2025.

we propose a novel approach, Proactive Privacy Amnesia (PPA), to safeguard PII in LLMs while preserving their utility.

Paper Link: [📖Paper Link]

Project Website: [📖Project Website]

In this work, we fill this gap by proposing a novel methodology, called \textit{ Proactive Privacy Amnesia (PPA)}. Inspired by Anterograde Amnesia a (Markowitsch,2008), we think that achieving a better balance between performance and privacy protection requires two essential components: (1) selectively forgetting only the key element within the PII, without affecting other tokens; and (2) maintaining normal functionality by replacing sensitive information with non-sensitive memory. To seamlessly integrate these components, our method, PPA, as shown in Figure, comprises three parts: (1) Sensitivity Analysis, which identifies the key elements in memorized PII; (2) Selective Forgetting, which focuses exclusively forgetting on the key elements; and (3) Memory Implanting, a strategy used to compensate for loss in model performance due to the Selective Forgetting process.

Install

Clone the repo and navigate to PPA:

git clone https://github.com/MartinKuo427/PPA.git
cd PPA

Set up environment:

conda create -yn PPA_env python=3.9
conda activate PPA_env
pip install -e .

Preprocess Data

Build fake pii dataset that for memory implainting process.

cd preprocess_data
python3 build_fakepii_dataset.py --privacy_data_path ../main_code/aeslc_train_privacy_ground_truth.csv

Build unlearning dataset that for PII you want to forget.

cd preprocess_data
python3 build_unlearning_dataset.py --privacy_data_path ../main_code/aeslc_train_privacy_ground_truth.csv

Build Input Rephrase attack template dataset

# phone number
python3 build_propile_rephrase_dataset.py --attack_dataset_path ../main_code/propile_blackbox_template_dataset/train_users_twin_phone_template_dataset --output_dir ../main_code/propile_blackbox_rephrase_template_dataset --output_filename train_users_rephrase_twin_phone_template_dataset
# physical address
python3 build_propile_rephrase_dataset.py --attack_dataset_path ../main_code/propile_blackbox_template_dataset/train_users_twin_address_template_dataset --output_dir ../main_code/propile_blackbox_rephrase_template_dataset --output_filename train_users_rephrase_twin_address_template_dataset

Build Probing and Soft Prompt attack template dataset

# both phone number and physical address
python3 build_large_propile_template_dataset.py --privacy_data_path ../main_code/aeslc_train_privacy_ground_truth.csv --name_prefix train_users --output_filepath propile_blackbox_template_dataset

Build Soft Prompt tuning template dataset

python3 build_large_propile_template_dataset.py --privacy_data_path ../main_code/aeslc_valid_privacy_ground_truth.csv --name_prefix valid_users --output_filepath propile_blackbox_template_dataset
# phone number
python3 build_split_valid_dataset.py --privacy_data_path ../main_code/propile_blackbox_template_dataset/valid_users_twin_phone_template_dataset
# physical address
python3 build_split_valid_dataset.py --privacy_data_path ../main_code/propile_blackbox_template_dataset/valid_users_twin_address_template_dataset

Attack User's PII

1. Input Rephrase Attack

# phone number
cd ./main_code/phone_attack_code/
bash phone_attack.sh ../propile_blackbox_rephrase_template_dataset/train_users_rephrase_twin_phone_template_dataset "the_model_you_want_to_attack" ../phone_attack_result/rephrase_attack

# physical address
cd ./main_code/address_attack_code/
bash address_attack.sh ../propile_blackbox_rephrase_template_dataset/train_users_rephrase_twin_address_template_dataset "the_model_you_want_to_attack" ../address_attack_result/rephrase_attack

2. Probing Attack

# phone number
cd ./main_code/phone_attack_code/
bash phone_attack.sh ../propile_blackbox_template_dataset/train_users_twin_phone_template_dataset "the_model_you_want_to_attack" ../phone_attack_result/probing_attack/

# physical address
cd ./main_code/address_attack_code/
bash address_attack.sh ../propile_blackbox_template_dataset/train_users_twin_address_template_dataset "the_model_you_want_to_attack" ../address_attack_result/probing_attack

# email address
cd ./main_code/email_attack_code/
bash email_attack.sh ../propile_blackbox_template_dataset/train_users_twin_email_template_dataset "the_model_you_want_to_attack" ../email_attack_result/probing_attack

3. Soft Prompt Attack

# phone number
cd ./main_code/softprompt_tuning_code/
python3 soft_prompt_tuning_code.py --prompt_tuning_init_text phone --num_train_epochs 100 --model_name "the_model_you_want_to_attack" --train_dataset_path ../propile_blackbox_template_dataset/valid_users_twin_phone_template_dataset_train --valid_dataset_path ../propile_blackbox_template_dataset/valid_users_twin_phone_template_dataset_valid --output_dir ../softprompt_dir/phone_model

# physical address
python3 soft_prompt_tuning_code.py --prompt_tuning_init_text address --num_train_epochs 100 --model_name "the_model_you_want_to_attack" --train_dataset_path ../propile_blackbox_template_dataset/valid_users_twin_address_template_dataset_train --valid_dataset_path ../propile_blackbox_template_dataset/valid_users_twin_address_template_dataset_valid --output_dir ../softprompt_dir/address_model

Defense User's PII

1. PPA defense on user's phone number

cd ./main_code/phone_defense_code/
python3 PPA_phone_defense.py --model_name "the_model_you_want_to_protect" --unlearning_data_path ../propile_unlearning_dataset/unlearning_phone_number_twin_phone_template_dataset/ --fake_data_path ../propile_fake_template_dataset/fake_phone_number_twin_phone_template_dataset/ --output_filepath ../PPA_phone_model/

2. PPA defense on user's physical address

cd ./main_code/address_defense_code/
python3 PPA_address_defense.py --model_name "the_model_you_want_to_protect" --unlearning_data_path ../propile_unlearning_dataset/unlearning_address_twin_address_template_dataset/ --fake_data_path ../propile_fake_template_dataset/fake_address_twin_address_template_dataset/ --output_filepath ../PPA_address_model/

3. PPA defense on user's email address

cd ./main_code/email_defense_code/
python3 PPA_email_defense.py --model_name "the_model_you_want_to_protect" --unlearning_data_path ../propile_unlearning_dataset/unlearning_email_twin_email_template_dataset/ --fake_data_path ../propile_fake_template_dataset/fake_email_twin_email_template_dataset/ --output_filepath ../PPA_email_model/

Evaluate model performance

cd ./main_code/GPT4_utility_test_code/
python3 1_GPT4_build_email_dataset.py
python3 2_GPT4_add_template_email_complete.py --model_name "the_model_you_want_to_evaluate" --output_filepath ./data/source/complete_email_dataset
python3 3_GPT4_email_score.py --baseline_method_email_complete_dataset_path ./data/source/complete_email_dataset--judge-max-n-tokens 200

Paper and Citation

More technical details can be found in our paper. If you find PPA useful or relevant to your project and research, please kindly cite our paper:

@inproceedings{
	kuo2025proactive,
	title={Proactive Privacy Amnesia for Large Language Models: Safeguarding {PII} with Negligible Impact on Model Utility},
	author={Martin Kuo and Jingyang Zhang and Jianyi Zhang and Minxue Tang and Louis DiValentin and Aolin Ding and Jingwei Sun and William Chen and Amin Hass and Tianlong Chen and Yiran Chen and Hai Li},
	booktitle={The Thirteenth International Conference on Learning Representations},
	year={2025},
	url={https://openreview.net/forum?id=io8uRPYktn}
}

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
images		images
main_code		main_code
preprocess_data		preprocess_data
LICENSE		LICENSE
readme.md		readme.md
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Proactive Privacy Amnesia for Large Language Models: Safeguarding PII with Negligible Impact on Model Utility

Install

Preprocess Data

Attack User's PII

1. Input Rephrase Attack

2. Probing Attack

3. Soft Prompt Attack

Defense User's PII

1. PPA defense on user's phone number

2. PPA defense on user's physical address

3. PPA defense on user's email address

Evaluate model performance

Paper and Citation

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Proactive Privacy Amnesia for Large Language Models: Safeguarding PII with Negligible Impact on Model Utility

Install

Preprocess Data

Attack User's PII

1. Input Rephrase Attack

2. Probing Attack

3. Soft Prompt Attack

Defense User's PII

1. PPA defense on user's phone number

2. PPA defense on user's physical address

3. PPA defense on user's email address

Evaluate model performance

Paper and Citation

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages