Seyun Bae, Seokhan Lee, Eunho Yang
CURaTE is the first unlearning method for large language models that enables continual unlearning in real time while maintaining near-perfect preservation of existing knowledge.
Create a fresh conda environment and install the required packages from `requirements.txt`.
Run `download_model.py` to download the model into a local directory.
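The setup steps above might look like the following (the environment name `curate` and the Python version are assumptions; adjust to your system):

```shell
# Create and activate a fresh environment (name and version are placeholders).
conda create -n curate python=3.10 -y
conda activate curate

# Install the repository's dependencies.
pip install -r requirements.txt

# Download the model into a local directory.
python download_model.py
```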
To train the sentence embedding model used for unlearning, run `python train_sentemb.py`.
After training the unlearning sentence embedding model, use the scripts in `DB_files/` to prepare the forget-set embedding database for the continual-unlearning evaluation.
Given the trained unlearning sentence embedder, these scripts construct the cumulative forget set for each stage of the continual setting and encode the corresponding samples into the embedding space. They also precompute cosine-similarity mappings between the forget-set embeddings and samples from the retain set or other utility-evaluation datasets.
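The similarity precomputation can be sketched roughly as follows; the function name, the mapping format, and the random toy embeddings are illustrative, not the repository's actual API:

```python
import numpy as np

def cosine_similarity_map(forget_emb, retain_emb):
    """For each forget-set embedding, record its cosine similarity to every
    retain-set embedding (illustrative stand-in for the precomputed mapping)."""
    # L2-normalize each row so the dot product equals cosine similarity.
    f = forget_emb / np.linalg.norm(forget_emb, axis=1, keepdims=True)
    r = retain_emb / np.linalg.norm(retain_emb, axis=1, keepdims=True)
    return f @ r.T  # shape: (num_forget, num_retain)

# Toy example with random embeddings in place of sentence-embedder outputs.
rng = np.random.default_rng(0)
sims = cosine_similarity_map(rng.normal(size=(4, 8)), rng.normal(size=(6, 8)))
print(sims.shape)  # one similarity row per forget-set sample
```

In practice the rows would come from the trained sentence embedder, and the resulting matrix (or its top-k entries) would be stored in the database files.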
To evaluate CURaTE on the TOFU dataset:
- Update the path in `get_available_cache_dir()` inside `TOFU/evaluate_tofu_sentemb.py`.
- Run the evaluation: `python TOFU/evaluate_tofu_sentemb.py`
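The repository's `get_available_cache_dir()` presumably returns a local directory for cached models and datasets; a minimal sketch of what such a helper might look like (the candidate paths are placeholders to be edited for your machine):

```python
import os

def get_available_cache_dir():
    """Return the first existing, writable cache directory from a list of
    candidates (paths below are placeholders, not the repo's defaults)."""
    candidates = [
        "/data/cache",                          # e.g. a shared scratch disk
        os.path.expanduser("~/.cache/curate"),  # user-local fallback
    ]
    for path in candidates:
        if os.path.isdir(path) and os.access(path, os.W_OK):
            return path
    # If nothing exists yet, create the user-local directory.
    fallback = os.path.expanduser("~/.cache/curate")
    os.makedirs(fallback, exist_ok=True)
    return fallback

print(get_available_cache_dir())
```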
To evaluate CURaTE on the TruthfulQA benchmark:
- Update the path in `get_available_cache_dir()` inside `truthfulQA/truthfulQA_evaluation_sentemb.py`.
- Run the TruthfulQA evaluation: `python truthfulQA/truthfulQA_evaluation_sentemb.py`
- Run the CommonsenseQA evaluation to measure general knowledge preservation: `python commonsense/evaluation_commonsenseQA_sentemb.py`
To evaluate CURaTE on the RETURN dataset:
- Update the path in `get_available_cache_dir()` inside `RETURN/evaluate_return_sentemb.py`.
- Run the evaluation: `python RETURN/evaluate_return_sentemb.py`
To evaluate CURaTE on the ScienceQA dataset:
- Update the path in `get_available_cache_dir()` inside `ScienceQA/evaluate_return_sentemb.py`.
- Run the evaluation: `python ScienceQA/evaluate_return_sentemb.py`
To reproduce the ablation experiments:
- Train each baseline model on each ablation dataset: `python train_sentemb.py`
- Run the scripts in `DB_files/` to generate mapping files for each ablation setting.
- Run the evaluation scripts with `no_gen.py` to obtain Precision, Recall, and F1 scores for each ablation.
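The Precision, Recall, and F1 metrics reported for the ablations can be computed as in this standalone sketch (independent of `no_gen.py`'s actual implementation; the label convention is an assumption):

```python
def precision_recall_f1(preds, labels):
    """Compute Precision, Recall, and F1 for binary predictions
    (here 1 = sample flagged as forget-set, 0 = retained -- an assumed convention)."""
    tp = sum(p == 1 and y == 1 for p, y in zip(preds, labels))
    fp = sum(p == 1 and y == 0 for p, y in zip(preds, labels))
    fn = sum(p == 0 and y == 1 for p, y in zip(preds, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Toy example: five samples, two correctly flagged, one false positive,
# one missed forget-set sample.
print(precision_recall_f1([1, 1, 0, 1, 0], [1, 0, 0, 1, 1]))
```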
