LLM-facteval

Source code of paper "Systematic Assessment of Factual Knowledge in Large Language Models" - EMNLP Findings 2023

Our framework contains four main components:

  • kg: declares how to read and preprocess a knowledge graph
  • extractor: extracts question triplets, nodes, and a relation summary from the knowledge graph
  • generator: generates questions/answers from the extracted triplets
  • evaluator: evaluates the LLM's responses

Check certlm/registry.py for the list of supported extractors, generators, and evaluators.
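
A minimal sketch of the registry pattern (illustrative only; the actual certlm/registry.py may differ, and TRExKG here is a hypothetical name):

REGISTRY = {}

def register(name):
    # Decorator that records a component class under a string name.
    def wrap(cls):
        REGISTRY[name] = cls
        return cls
    return wrap

@register("trex")  # hypothetical registration
class TRExKG:
    ...

kg_cls = REGISTRY["trex"]  # look a component up by name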

Reproducibility

Please check this document for steps to reproduce our experiments in the paper.

Adding new KGs to the framework

Knowledge Graph (KG)

For a new KG dataset, create a class that extends certlm.kg_dataset.BaseKG and implements the following methods (a sketch follows the list):

  • load_relation(): load all available relations into self.relations
  • get_input_relation_file(relation): return the path to the input relation file for a given relation
  • get_relation_name(relation): return the relation label
  • get_relation_type(relation): return the relation type: 1-1, N-1, or N-M
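
A minimal sketch of such a class, assuming a relations.jsonl file and one input file per relation ID (the file layout and the self.data_dir attribute are assumptions, not the framework's actual API):

import json
import os

from certlm.kg_dataset import BaseKG


class MyKG(BaseKG):  # hypothetical KG dataset
    def load_relation(self):
        # Load all relations from an assumed relations.jsonl file.
        path = os.path.join(self.data_dir, "relations.jsonl")
        with open(path) as f:
            self.relations = [json.loads(line) for line in f]

    def get_input_relation_file(self, relation):
        # Assumed layout: one data file per relation ID, e.g. P19.jsonl.
        return os.path.join(self.data_dir, f"{relation['relation']}.jsonl")

    def get_relation_name(self, relation):
        return relation["label"]

    def get_relation_type(self, relation):
        return relation["type"]  # one of "1-1", "N-1", "N-M"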

Currently supported KGs:

The preprocessed data can be found here.

T-REx

Example of T-REx relation

{
  "relation": "P19",
  "template": "[X] was born in [Y]",
  "label": "place of birth",
  "description": "most specific known (e.g. city instead of country, or hospital instead of city) birth location of a person, animal or fictional character",
  "type": "N-1"
}

Extractor

A triplet extractor for a KG should extend the certlm.extractors.BaseExtractor class and implement the following methods (a sketch follows the list):

  • extract_relation(relation, relation_input_file, output): extract the relation data stored in relation_input_file and save it to the output file. In addition to the output file, a relation summary file is created to index all subject and object nodes, which is useful for evaluating N-M relations.
  • get_input_question_files(data_dir, relation): return the paths to the questions and summary extracted for the given relation
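
A minimal sketch of an extractor, assuming one JSON triplet per input line (the input record format and the summary file naming are assumptions):

import json
import os

from certlm.extractors import BaseExtractor


class MyExtractor(BaseExtractor):  # hypothetical extractor
    def extract_relation(self, relation, relation_input_file, output):
        summary = {
            "relation_info": relation,
            "node_summary": {"objects": {}, "subjects": {}},
        }
        with open(relation_input_file) as fin, open(output, "w") as fout:
            for line in fin:
                triplet = json.loads(line)  # assumed: one triplet per line
                triplet["relation_info"] = relation
                fout.write(json.dumps(triplet) + "\n")
                # ... index the subject/object nodes into summary here
        with open(output + ".summary", "w") as f:  # assumed naming scheme
            json.dump(summary, f)

    def get_input_question_files(self, data_dir, relation):
        base = os.path.join(data_dir, relation["relation"])
        return base + ".jsonl", base + ".summary"  # assumed naming scheme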

Note that the relation info should be standardized to this format for reusability.

The question output is a jsonl file where each line is a JSON object with the following structure:

 {
    "subject_label": subject_label,
    "object_label": object_label,
    "object_uri": object_uri,
    "subject_uri": subject_uri,
    "relation_info": {
      "relation": "P19",
      "template": "[X] was born in [Y]",
      "label": "place of birth",
      "description": "most specific known (e.g. city instead of country, or hospital instead of city) birth location of a person, animal or fictional character",
      "type": "N-1",
      "subject_symbol": "[X]",
      "object_symbol": "[Y]"
    }
}
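
Such records can be streamed back with a few lines of Python (the file path here is hypothetical):

import json

with open("output/tuples/P19.jsonl") as f:
    for line in f:
        record = json.loads(line)
        print(record["subject_label"], "->", record["object_label"])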

The relation summary has the following structure:

{
  "relation_info": {
    "relation": "P19",
    "template": "[X] was born in [Y]",
    "label": "place of birth",
    "description": "most specific known (e.g. city instead of country, or hospital instead of city) birth location of a person, animal or fictional character",
    "type": "N-1",
    "subject_symbol": "[X]",
    "object_symbol": "[Y]"
  },
  "node_summary": {
    "objects": {
      "uri1": {
        "object_label": object_label,
        "object_uri": object_uri,
        "subjects": [array of subject uri]
      },
      ...
    },
    "subjects": {
      "uri1": {
        "subject_label": subject_label,
        "subject_uri": subject_uri,
        "objects": [array of object uri]
      },
      ...
    }
  }
}
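
Because an N-M relation maps one subject to several correct objects, the evaluator needs the full object set per subject, which the summary provides. For example (hypothetical file path and URI):

import json

with open("output/tuples/P19.summary") as f:
    summary = json.load(f)

# All valid object URIs for a given subject; an LLM answer matching any
# of them should count as correct for an N-M relation.
valid_objects = summary["node_summary"]["subjects"]["uri1"]["objects"]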

Example command:

python run_certlm.py --step extract \
    --kg trex \
    --data-dir ./examples/TREx \
    --data-file relations.jsonl \
    --output-dir ./output  

Generator

We support the following generators: template, llm-mask, and llm-question. Each generator needs to implement the following method:

  • generate_questions(triplet_input_file, relation_summary_file, output_file): generate questions from the extracted triplets and write one JSON record per line to the output file, e.g.:

import json

# Build a chat-style prompt and the expected answers for one question.
# `prompt`, `content`, `answers`, `question_id`, and `fout` are assumed to
# be defined by the surrounding generator code.
prompts = [
    {"role": "system", "content": prompt},
    {"role": "user", "content": content},
]
expected_answers = [{"uri": a[0], "label": a[1]} for a in answers]
question_record = {
    "question_id": question_id,
    "prompts": prompts,
    "answers": expected_answers,
}
fout.write(json.dumps(question_record) + "\n")

Example command:

python run_certlm.py --step question_generate \
    --kg trex \
    --generator masking \
    --data-dir ./examples/TREx \
    --data-file relations.jsonl \
    --gen-input-dir ./output/tuples 

Citation

If this repo is useful for your own research, please cite us with the following BibTeX entry:

@article{luo2023systematic,
  title={Systematic Assessment of Factual Knowledge in Large Language Models},
  author={Luo, Linhao and Vu, Thuy-Trang and Phung, Dinh and Haffari, Gholamreza},
  journal={Findings of EMNLP},
  year={2023}
}
