This repository contains the source code for the paper:
- Model Editing Harms General Abilities of Large Language Models: Regularization to the Rescue.
Jia-Chen Gu, Hao-Xiang Xu, Jun-Yu Ma, Pan Lu, Zhen-Hua Ling, Kai-Wei Chang, Nanyun Peng
EMNLP 2024
🌐 Project page
In response to the challenge of hallucinations in LLM outputs caused by false or outdated knowledge, model editing has received a lot of attention for its low resource consumption. Previous studies have proposed many effective methods and achieved good editing performance. However, these model editing methods often overlook potential side effects on the general abilities of LLMs.
This paper analyzes these side effects by evaluating four popular editing methods on four LLMs across eight representative task categories.
The datasets are included in `data/`. There are three folders:
- `edited-data`: The data used to edit the model. In this paper, we use the ZsRE dataset to edit the model.
- `task-data`: The data used in the downstream tasks.
- `training-data`: The data used to train the MEND network. In this paper, we use the ZsRE dataset to train the MEND network.
The whole data directory is as follows:
data/
|__ edited-data
|__ zsre.json
|__ task-data
|__ test-dialogue
|__ test-ClosedDomainQA.jsonl
|__ test-NER.txt
|__ test-NLI.tsv
|__ test-OpenDomainQA.jsonl
|__ test-reasoning.jsonl
|__ test-SentimentAnalysis.tsv
|__ test-summarization.json
|__ training-data
|__ zsre_mend_train.json
|__ zsre_mend_eval.json
You can download these datasets here: [Google Drive].
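As a quick sanity check after downloading, the sketch below (an illustration, not part of the repo; the file names follow the tree above, and the record fields are only inspected, not assumed) loads one JSON and one JSONL file and prints their sizes and fields:

```python
import json

# Load the ZsRE editing data (a JSON list of edit records).
with open("data/edited-data/zsre.json", "r", encoding="utf-8") as f:
    edits = json.load(f)
print(f"{len(edits)} edit records; fields of the first record: {sorted(edits[0].keys())}")

# Load one downstream task file (JSONL: one JSON object per line).
with open("data/task-data/test-reasoning.jsonl", "r", encoding="utf-8") as f:
    reasoning = [json.loads(line) for line in f if line.strip()]
print(f"{len(reasoning)} reasoning examples; fields: {sorted(reasoning[0].keys())}")
```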
Note: Please use Python 3.9+. To get started, simply install conda and run:
git clone https://github.com/JasonForJoy/Model-Editing-Hurt.git
conda create -n EditHurt python=3.9.7
...
pip install -r requirements.txt
All models are put in `hugging_cache/<model_name>` (model_name = gpt2-xl, gpt-j-6B, or llama-7b). These paths can be changed in the configuration files under `hparams/<method_name>/`.
The eight downstream task evaluation metrics are as follows:
- Reasoning: solve rate
- Natural language inference (NLI): accuracy of two-way classification
- Open-domain QA: exact match (EM) with the reference answer after minor normalization
- Closed-domain QA: exact match (EM) score
- Dialogue: select the one best-matched response from four available candidates
- Summarization: the average of ROUGE-1, ROUGE-2, and ROUGE-L
- Named entity recognition (NER): entity-level F1-score
- Sentiment analysis: accuracy of two-way classification
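To make two of these metrics concrete, here is a small sketch (illustrative, not the repo's actual scoring code) of SQuAD-style exact match with minor normalization and the averaged ROUGE score, assuming the `rouge-score` package is installed:

```python
import re
import string
from rouge_score import rouge_scorer

def normalize(text):
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction, reference):
    return int(normalize(prediction) == normalize(reference))

def avg_rouge(prediction, reference):
    scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
    scores = scorer.score(reference, prediction)
    return sum(s.fmeasure for s in scores.values()) / 3

print(exact_match("The Eiffel Tower", "eiffel tower"))                     # 1
print(round(avg_rouge("the cat sat on the mat", "a cat sat on a mat"), 3))
```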
GPT-2 XL (1.5B), LLaMA-1 (7B), LLaMA-2 (7B), and LLaMA-2 (13B) are used for editing.
These model editing methods are used in our paper: ROME, MEMIT, KN, and MEND.
These model editing modes are used in our paper as follows:
- Instance-Sequential: ROME and KN can be used
- Batch-Single: MEMIT and MEND can be used
- Batch-Sequential: MEMIT and MEND can be used
If you want to evaluate the performance of the pre-edit model on various downstream tasks (e.g., the NER task), run:
python test-task.py task
python test-task.py NER
- `task`: The name of the task you want to evaluate; choose from: ClosedDomainQA, dialogue, NER, NLI, OpenDomainQA, reasoning, SentimentAnalysis, summarization.
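If you want to run all eight pre-edit evaluations in one go, a simple wrapper like the following could be used (a convenience sketch, not a script shipped with the repo):

```python
import subprocess
import sys

TASKS = ["ClosedDomainQA", "dialogue", "NER", "NLI",
         "OpenDomainQA", "reasoning", "SentimentAnalysis", "summarization"]

for task in TASKS:
    # Call the repo's evaluation script once per downstream task.
    subprocess.run([sys.executable, "test-task.py", task], check=True)
```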
If you want to evaluate the performance of the edited model on various downstream tasks (e.g., the NER task with the Instance-Sequential mode and the ROME method), run:
python test-task-after.py task mode method sample_begin sample_end sample_step
python test-task-after.py NER Instance-Sequential ROME 200 204 1
- `mode`: The editing mode you want to use; choose from: Batch-Single, Instance-Sequential, Batch-Sequential.
- `method`: The editing method you want to use; choose from: ROME, MEMIT, KN, MEND.
- `sample_begin`: The index of the first sample selected from the dataset.
- `sample_end`: The index of the last sample selected from the dataset.
- `sample_step`: One sample is selected every `sample_step` samples.
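For intuition, the three sample arguments behave like a Python-style range over dataset indices (the exact boundary handling is an assumption here; the snippet is only illustrative):

```python
# sample_begin=200, sample_end=204, sample_step=1, as in the example command above.
# Assuming an exclusive end index, the selected dataset indices would be:
selected = list(range(200, 204, 1))
print(selected)  # [200, 201, 202, 203] -- one edit request per selected sample
```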
If you choose Batch-Sequential as the mode (e.g., evaluating the NER task with the Batch-Sequential mode and the MEMIT method), run:
python test-task-after.py task mode method sample_begin sample_end sample_step batch_size
python test-task-after.py NER Batch-Sequential MEMIT 400 404 1 2
- `batch_size`: The size of each edit batch.
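In the Batch-Sequential mode, the selected samples are presumably grouped into consecutive batches of `batch_size` edits applied one after another; a rough sketch of that grouping (again only an illustration of the intent, not the repo's code):

```python
# sample_begin=400, sample_end=404, sample_step=1, batch_size=2 (as in the example above).
indices = list(range(400, 404, 1))
batch_size = 2
batches = [indices[i:i + batch_size] for i in range(0, len(indices), batch_size)]
print(batches)  # [[400, 401], [402, 403]] -- each inner list is one sequential edit batch
```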
If the Batch-Single or Instance-Sequential mode is selected, results from each run are stored at `test-result/test-<task>/result-<task>-<mode>-<method>-<sample_total>`.
If the Batch-Sequential mode is selected, results from each run are stored at `test-result/test-<task>/result-<task>-<mode>-<method>-<batch_size>*<edit_time>`.
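The naming pattern can be summarized as below (a hypothetical helper, not part of the repo; `sample_total` and `edit_time` are placeholders taken from the patterns above, not variables defined by the scripts):

```python
import os

def result_path(task, mode, method, sample_total=None, batch_size=None, edit_time=None):
    """Reconstruct the output location described above (illustrative only)."""
    base = f"test-result/test-{task}"
    if mode == "Batch-Sequential":
        name = f"result-{task}-{mode}-{method}-{batch_size}*{edit_time}"
    else:
        name = f"result-{task}-{mode}-{method}-{sample_total}"
    return os.path.join(base, name)

print(result_path("NER", "Instance-Sequential", "ROME", sample_total=4))
# test-result/test-NER/result-NER-Instance-Sequential-ROME-4
```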
To use the MEND method, you should first train a hypernetwork using the data in `data/training-data/`; the trained weights will be saved in `result/models/MEND`. Then use the same steps as above to edit models.
Run:
python train_MEND.py
If you use this code and dataset, please cite our paper:
@inproceedings{gu-etal-2024-model,
title = "Model Editing Harms General Abilities of Large Language Models: Regularization to the Rescue",
author = "Gu, Jia-Chen and
Xu, Hao-Xiang and
Ma, Jun-Yu and
Lu, Pan and
Ling, Zhen-Hua and
Chang, Kai-Wei and
Peng, Nanyun",
booktitle = "Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing",
year = "2024",
publisher = "Association for Computational Linguistics"
}
We express sincere gratitude to EasyEdit and ROME, as we have utilized portions of their source code in our project.