effyli/lm-kbc

Knowledge-centric Prompt Composition for Knowledge Base Construction from Pre-trained Language Models

This repository contains the code and instructions for our system paper accepted at the ISWC LM-KBC Challenge 2023.
Our system's predictions ranked 🏆 2nd on the leaderboard. Our team name is thames.

Before you start

Work inside a Python virtual environment, created with an environment management tool such as conda, venv, or pyenv.

Before running the extraction pipeline, install the prerequisites by running the following command in the repository root:

pip install -r requirements.txt
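
Before installing, it can help to confirm that an environment is actually active. The following is a minimal sketch, not part of this repository; the helper name `in_virtualenv` is hypothetical. It relies on `sys.prefix` differing from `sys.base_prefix`, which holds for venv and virtualenv environments but generally not for conda environments, so treat a negative result as a hint rather than proof.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv/virtualenv environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix points at the interpreter it was created from.
    # Note: conda environments are usually NOT detected by this check.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    if in_virtualenv():
        print("virtual environment active:", sys.prefix)
    else:
        print("no venv detected; packages would install into", sys.prefix)
```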

Start extraction

To run the complete pipeline, run the following commands:

cd src/
./run_pipeline.sh

Make sure the directories specified in the bash file exist in your local directory.
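
One way to guarantee that those directories exist is to create them up front. The sketch below is an illustration only: `ensure_dirs` is a hypothetical helper, and the actual directory names must be taken from the bash file, since they are not listed here.

```python
from pathlib import Path


def ensure_dirs(paths):
    """Create each directory (and any missing parents); return the ones newly created."""
    created = []
    for p in map(Path, paths):
        if not p.is_dir():
            # exist_ok=True keeps this safe if the directory appears in between.
            p.mkdir(parents=True, exist_ok=True)
            created.append(p)
    return created
```

For example, `ensure_dirs(["../data", "../output"])` would create both folders if they are missing; these two paths are placeholders, not values read from the script.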

You can also run each step of the pipeline separately. First, augment the input data with wikigpt information:

python input-wiki-transformer.py -i "../data/train.jsonl" -o "../data/train-output.jsonl"
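
To check that the transformed file was written correctly, it can be loaded line by line. This is a generic JSON Lines reader sketched for illustration; `read_jsonl` is a hypothetical helper, and it makes no assumption about which fields each record contains.

```python
import json


def read_jsonl(path):
    """Parse a JSON Lines file into a list of Python objects, skipping blank lines."""
    records = []
    with open(path, encoding="utf-8") as fh:
        for line_no, line in enumerate(fh, start=1):
            line = line.strip()
            if not line:
                continue
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError as exc:
                # Report the offending line so a truncated file is easy to spot.
                raise ValueError(f"{path}:{line_no}: invalid JSON ({exc.msg})") from exc
    return records
```

Comparing `len(read_jsonl("../data/train.jsonl"))` with the count for the output file is a quick way to see whether any records were dropped.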

Then run the extraction with the following command. The -m option specifies which model to use:

python run.py -d "$data_dir" -f "$file_to_prompt" -o "$output_dir"  -m "gpt-4" -c "false"

Next, parse the extraction file:

python ../result_parser.py -p "YOUR_EXTRACTION_FILE" -g "$gold_file" -o "OUTPUT_DIR"

Finally, evaluate the final prediction and print the performance:

python ../evaluate.py -p "YOUR_PREDICTION_FILE" -g "$GOLD_FILE"
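
The four separate steps above can be collected into one place so the order and flags are explicit. The sketch below only builds and prints the command lines (a dry run); it does not execute them. `build_commands` and all of its parameter names are hypothetical, the path values in the example are placeholders, and the flags are copied from the commands shown above.

```python
import shlex


def build_commands(*, wiki_in, wiki_out, data_dir, file_to_prompt, output_dir,
                   extraction_file, gold_file, parsed_dir, prediction_file,
                   model="gpt-4"):
    """Return the four pipeline steps as argument lists, in execution order."""
    return [
        # 1. Augment the input data with wikigpt information.
        ["python", "input-wiki-transformer.py", "-i", wiki_in, "-o", wiki_out],
        # 2. Run the extraction with the chosen model.
        ["python", "run.py", "-d", data_dir, "-f", file_to_prompt,
         "-o", output_dir, "-m", model, "-c", "false"],
        # 3. Parse the extraction file.
        ["python", "../result_parser.py", "-p", extraction_file,
         "-g", gold_file, "-o", parsed_dir],
        # 4. Evaluate the final prediction.
        ["python", "../evaluate.py", "-p", prediction_file, "-g", gold_file],
    ]


if __name__ == "__main__":
    # Placeholder values: replace every path with your own before running anything.
    commands = build_commands(
        wiki_in="../data/train.jsonl",
        wiki_out="../data/train-output.jsonl",
        data_dir="DATA_DIR",
        file_to_prompt="FILE_TO_PROMPT",
        output_dir="OUTPUT_DIR",
        extraction_file="YOUR_EXTRACTION_FILE",
        gold_file="GOLD_FILE",
        parsed_dir="OUTPUT_DIR",
        prediction_file="YOUR_PREDICTION_FILE",
    )
    for cmd in commands:
        print(shlex.join(cmd))
```

Each argument list could be passed to `subprocess.run(cmd, check=True)` from the `src/` directory to stop at the first failing step.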

Authors (in alphabetical order)
