Evaluating the Ripple Effects of Knowledge Editing in Language Models

This repository contains the official code for the paper "Evaluating the Ripple Effects of Knowledge Editing in Language Models".

Setup

The benchmark creation and all experiments and evaluations were conducted in a Python 3.9 environment. To clone the repository and set up the environment, please run the following commands:

git clone https://github.com/edenbiran/RippleEdits.git
cd RippleEdits
pip install -r requirements.txt

RippleEdits Benchmark

The benchmark files and statistics can be found under data/benchmark/ and data/stats/. The benchmark is split into three files named according to the benchmark's three subsets: RECENT, RANDOM, and POPULAR. For more details, please refer to Section 4 of the paper.

The source code for generating the benchmark can be found under src/. Generating the benchmark from scratch can be done using src/build_benchmark.py. Benchmark popularity statistics can be extracted using src/benchmark_statistics.py.

Each benchmark JSON file contains a list of entries. Each entry is an edit containing the edit information (which also contains the original fact, if applicable) and the 6 evaluation criteria. Each evaluation criterion contains a list of tests, where each test contains the test prompt, answers, and conditions. An example (shortened for brevity) of an edit entry can be seen below:

{
  "example_type": "popular",
  "edit": {
    "prompt": "The name of the country of citizenship of Leonardo DiCaprio is Syria.",
    "subject_id": "Q38111",
    "relation": "COUNTRY_OF_CITIZENSHIP",
    "target_id": "Q858",
    "original_fact": {
      "prompt": "The name of the country of citizenship of Leonardo DiCaprio is United States of America.",
      "subject_id": "Q38111",
      "relation": "COUNTRY_OF_CITIZENSHIP",
      "target_id": "Q30"
    }
  },
  "Relation_Specifity": [
    {
      "test_queries": [
        {
          "prompt": "The name of the mother of Leonardo DiCaprio is",
          "answers": [
            {
              "value": "Irmelin DiCaprio",
              "aliases": [
                "Irmelin Indenbirken",
                "Irmelin Indenbirken-DiCaprio"
              ]
            }
          ],
          "query_type": "regular",
          "subject_id": "Q38111",
          "relation": "MOTHER",
          "target_ids": [
            "Q22984557"
          ],
          "phrase": null
        }
      ],
      "test_condition": "OR",
      "condition_queries": [
        {
          "prompt": "The name of the mother of Leonardo DiCaprio is",
          "answers": [
            {
              "value": "Irmelin DiCaprio",
              "aliases": [
                "Irmelin Indenbirken",
                "Irmelin Indenbirken-DiCaprio"
              ]
            }
          ],
          "query_type": "regular",
          "subject_id": "Q38111",
          "relation": "MOTHER",
          "target_ids": [
            "Q22984557"
          ],
          "phrase": null
        }
      ]
    },
  ...
  ],
  "Logical_Generalization": [...],
  "Subject_Aliasing": [...],
  "Compositionality_I": [...],
  "Compositionality_II": [...],
  "Forgetfulness": [...]
}
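
As a rough illustration of the entry layout shown above, the following sketch walks a list of entries and counts the tests under each criterion. The helper names (`load_benchmark`, `count_tests`, `iter_test_prompts`) are hypothetical and not part of this repository; the criterion keys are copied verbatim from the example entry, and `sample_entry` is a trimmed stand-in so the sketch runs without the data files.

```python
import json
from collections import Counter

# Criterion keys as they appear in the example entry (spelling kept as in the data).
CRITERIA = [
    "Relation_Specifity",
    "Logical_Generalization",
    "Subject_Aliasing",
    "Compositionality_I",
    "Compositionality_II",
    "Forgetfulness",
]


def load_benchmark(path):
    """Load one benchmark JSON file; assumed to hold a list of edit entries."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def count_tests(entries):
    """Count how many tests each criterion has across a list of edit entries."""
    counts = Counter()
    for entry in entries:
        for criterion in CRITERIA:
            counts[criterion] += len(entry.get(criterion, []))
    return counts


def iter_test_prompts(entry):
    """Yield (criterion, prompt) for every test query of a single edit entry."""
    for criterion in CRITERIA:
        for test in entry.get(criterion, []):
            for query in test.get("test_queries", []):
                yield criterion, query["prompt"]


# Trimmed stand-in for one entry, shaped like the example above (illustrative only).
sample_entry = {
    "example_type": "popular",
    "edit": {"prompt": "The name of the country of citizenship of Leonardo DiCaprio is Syria."},
    "Relation_Specifity": [
        {
            "test_queries": [
                {
                    "prompt": "The name of the mother of Leonardo DiCaprio is",
                    "answers": [{"value": "Irmelin DiCaprio", "aliases": ["Irmelin Indenbirken"]}],
                }
            ],
            "test_condition": "OR",
            "condition_queries": [],
        }
    ],
}

print(count_tests([sample_entry]))
for criterion, prompt in iter_test_prompts(sample_entry):
    print(criterion, "|", prompt)
```

To run this against the real data, pass the path of one of the files under data/benchmark/ to `load_benchmark` and feed the result to `count_tests`.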

Evaluation

The source code for all evaluations of the benchmark can be found under src/. All evaluations can be conducted using src/evaluation.py.

To evaluate the benchmark on a language model that is not currently supported, extend the class QueryExecutor in src/queryexecutor.py and add the new QueryExecutor to src/evaluation.py.

To evaluate the benchmark on a knowledge editing technique that is not currently supported, extend the class ModelEditor in src/modeleditor.py and add the new ModelEditor to src/evaluation.py.
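
To make the test fields concrete, here is a minimal sketch of how a single test from a benchmark entry could be scored against model generations. It is not the logic used in src/evaluation.py. The function names are hypothetical, the matching rule (case-insensitive substring match against the answer value or any alias) is an assumption, and so is the reading of `test_condition` ("OR" means any test query may succeed, anything else means all must). The `condition_queries` field is ignored here.

```python
def answer_matches(generated, answer):
    """True if the generated text contains the answer value or any of its aliases."""
    candidates = [answer["value"], *answer.get("aliases", [])]
    text = generated.lower()
    return any(candidate.lower() in text for candidate in candidates)


def query_passes(generated, query):
    """A query passes if any of its listed answers is found in the generation."""
    return any(answer_matches(generated, answer) for answer in query["answers"])


def is_test_passed(generations, test):
    """Score one test.

    generations maps each prompt to the text a model produced for it.
    Assumption: "OR" means any query suffices; any other value means all must pass.
    """
    results = [
        query_passes(generations.get(query["prompt"], ""), query)
        for query in test["test_queries"]
    ]
    combine = any if test.get("test_condition") == "OR" else all
    return combine(results)


# Illustrative test shaped like the "Relation_Specifity" example in the benchmark section.
example_test = {
    "test_queries": [
        {
            "prompt": "The name of the mother of Leonardo DiCaprio is",
            "answers": [
                {
                    "value": "Irmelin DiCaprio",
                    "aliases": ["Irmelin Indenbirken", "Irmelin Indenbirken-DiCaprio"],
                }
            ],
        }
    ],
    "test_condition": "OR",
}

generations = {"The name of the mother of Leonardo DiCaprio is": " Irmelin Indenbirken."}
print(is_test_passed(generations, example_test))
```

Substring matching is deliberately lenient; a stricter comparison (for example, matching only the first generated tokens) may be more appropriate depending on how generations are produced.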

Citation

@article{cohen2024evaluating,
  title={Evaluating the ripple effects of knowledge editing in language models},
  author={Cohen, Roi and Biran, Eden and Yoran, Ori and Globerson, Amir and Geva, Mor},
  journal={Transactions of the Association for Computational Linguistics},
  volume={12},
  pages={283--298},
  year={2024},
  publisher={MIT Press One Broadway, 12th Floor, Cambridge, Massachusetts 02142, USA~…}
}
