Knowledge-centric Prompt Composition for Knowledge Base Construction from Pre-trained Language Models

This repository contains the code and instructions for our system paper accepted at the ISWC LM-KBC Challenge 2023.
Our system's predictions ranked 🏆 2nd on the leaderboard. Our team name is thames.

Before you start

Install the necessary prerequisites using the following command in the root directory before running the extraction pipeline.

Make sure you are using a Python virtual environment. You can use environment management tools such as conda, venv, pyenv, etc.

pip install -r requirements.txt

Start extraction

To run the complete pipeline, simply run the following command:

cd src/
./run_pipeline.sh

Make sure the directories specified in run_pipeline.sh exist on your machine before you run it.
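
If some of those directories are missing, a small helper can create them up front. This is a minimal sketch: the helper name ensure_dirs is illustrative, and you need to pass it the paths actually set in your copy of the script.

```python
from pathlib import Path


def ensure_dirs(paths):
    """Create each directory (and its parents) if it does not exist yet.

    Returns the list of directories that had to be created.
    """
    created = []
    for p in paths:
        path = Path(p)
        if not path.is_dir():
            path.mkdir(parents=True, exist_ok=True)
            created.append(str(path))
    return created
```

Calling ensure_dirs with the list of paths from the script returns only the ones it had to create, so an empty result means everything was already in place.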

You can also run each step of the pipeline separately. First, modify the input data with wikigpt information:

python input-wiki-transformer.py -i "../data/train.jsonl" -o "../data/train-output.jsonl"
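
Both files use the .jsonl extension, which conventionally means one JSON object per line. To check what the transformation added before prompting, a sketch like the following may help. It assumes only that convention and that the rows are JSON objects; the helper names are illustrative and nothing here is taken from input-wiki-transformer.py.

```python
import json


def read_jsonl(path):
    """Return a list with one parsed JSON object per non-empty line."""
    rows = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                rows.append(json.loads(line))
    return rows


def added_keys(before_row, after_row):
    """Keys present in the transformed row but not in the original row."""
    return sorted(set(after_row) - set(before_row))
```

For example, comparing the first row of "../data/train.jsonl" with the first row of "../data/train-output.jsonl" via added_keys lists the fields the transformer introduced, assuming the row order is preserved.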

Then run the pipeline with the following command. The -m option specifies which model to use:

python run.py -d "$data_dir" -f "$file_to_prompt" -o "$output_dir" -m "gpt-4" -c "false"
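
To repeat this step with different models, the invocation above can be wrapped in Python. The sketch below only reassembles the flags shown in the command; the function names are illustrative, the -c value is passed through unchanged, and any model name other than "gpt-4" must be one that run.py accepts.

```python
import subprocess


def build_run_command(data_dir, file_to_prompt, output_dir,
                      model="gpt-4", c_flag="false"):
    """Assemble the run.py invocation shown above as an argument list."""
    return [
        "python", "run.py",
        "-d", data_dir,
        "-f", file_to_prompt,
        "-o", output_dir,
        "-m", model,
        "-c", c_flag,  # passed through as-is; see run.py for its meaning
    ]


def run_extraction(**kwargs):
    """Run run.py from src/ and raise CalledProcessError on a non-zero exit."""
    return subprocess.run(build_run_command(**kwargs), check=True)
```

Passing the arguments as a list avoids shell quoting problems when a path contains spaces.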

Next, parse the extraction file:

python ../result_parser.py -p "YOUR_EXTRACTION_FILE" -g "$gold_file" -o "OUTPUT_DIR"

Finally, evaluate the final prediction and print the performance:

python ../evaluate.py -p "YOUR_PREDICTION_FILE" -g "$GOLD_FILE"
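
For orientation, the kind of score such an evaluation typically reports can be sketched as follows. This is an assumption, not a description of evaluate.py: it computes set-based precision, recall and F1 for the objects predicted for one subject-relation pair, and it treats an empty prediction set as precision 1.0 and an empty gold set as recall 1.0.

```python
def precision_recall_f1(predicted, gold):
    """Set-based precision, recall and F1 for one subject-relation pair."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Assumed convention: nothing predicted means no wrong predictions
    # (precision 1.0); nothing to find means nothing missed (recall 1.0).
    precision = true_positives / len(predicted) if predicted else 1.0
    recall = true_positives / len(gold) if gold else 1.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

Averaging these per-pair scores over all rows gives a macro-averaged result; check evaluate.py for the exact aggregation it uses.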

