This repository contains an implementation of KGPPR. The work was done for CSCI544 Applied Natural Language Processing, taught by Prof. Mohammad Rostami in Fall 2023.
Although Retrieval-Augmented Generation (RAG) has become the standard approach for knowledge-augmented language modeling tasks, it has two main limitations. First, it requires massive amounts of data to train both the retriever and the LM. In particular, fine-tuning a pre-trained LM is resource-intensive and undesirable in real-world settings. Second, it tries to solve each question in one shot, using the knowledge retriever and the LM only once. For complex questions that require multiple reasoning steps, such a one-shot approach may be insufficient to provide accurate answers.
To mitigate these limitations, we propose Knowledge Graph Prompting using Procedural Reasoning (KGPPR), a zero-shot LM prompting framework that uses procedural reasoning to solve complex knowledge-graph-based questions. KGPPR employs two modules.
- Zero-Shot KG Prompting: Similar to RAG, this module uses both a knowledge graph retriever and an LM to solve a question. First, it retrieves the top-K knowledge graph triples that are most relevant to the question. The retrieved triples are then converted into natural language and used as the prompt for the LM.
- Procedural Reasoning: This module solves a question over multiple rounds of reasoning. In each round, it uses chain-of-thought (CoT) prompting to generate, step by step, the next sub-question that needs to be addressed, and then solves that sub-question using Zero-Shot KG Prompting. The answers from previous rounds are used when generating the answer in the next round.
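The two modules above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the repository's actual code: every function name (`verbalize`, `retrieve_top_k`, `build_prompt`, `procedural_reasoning`) and the prompt wording are hypothetical, the LM is passed in as a plain `str -> str` callable, and a simple token-overlap score stands in for the embedding-based retriever.

```python
import re
from typing import Callable, List, Sequence, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


def verbalize(triple: Triple) -> str:
    """Convert a KG triple into a plain-text fact."""
    subj, rel, obj = triple
    return f"{subj} {rel.replace('_', ' ')} {obj}"


def _tokens(text: str) -> set:
    """Lowercased alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve_top_k(question: str, triples: Sequence[Triple], k: int = 2) -> List[Triple]:
    """Rank triples by token overlap with the question.

    Assumption: token overlap is only a stand-in for embedding similarity.
    """
    q = _tokens(question)
    ranked = sorted(triples, key=lambda t: len(q & _tokens(verbalize(t))), reverse=True)
    return list(ranked[:k])


def build_prompt(question: str, triples: Sequence[Triple],
                 history: Sequence[Tuple[str, str]] = (), k: int = 2) -> str:
    """Zero-shot KG prompt: retrieved facts, earlier answers, then the question."""
    lines = ["Facts that may help answer the question:"]
    lines += [f"- {verbalize(t)}" for t in retrieve_top_k(question, triples, k)]
    for sub_q, ans in history:
        lines.append(f"Earlier step: {sub_q} -> {ans}")
    lines += [f"Question: {question}", "Answer:"]
    return "\n".join(lines)


def procedural_reasoning(question: str, triples: Sequence[Triple],
                         lm: Callable[[str], str], max_rounds: int = 3) -> List[Tuple[str, str]]:
    """Alternate between generating a sub-question and answering it with KG prompting."""
    history: List[Tuple[str, str]] = []
    for _ in range(max_rounds):
        solved = "; ".join(f"{q} -> {a}" for q, a in history) or "none"
        # Hypothetical CoT-style instruction asking for the next sub-question.
        sub_q = lm(
            f"Main question: {question}\nSolved so far: {solved}\n"
            "Think step by step and state the next sub-question, or DONE."
        ).strip()
        if sub_q.upper() == "DONE":
            break
        answer = lm(build_prompt(sub_q, triples, history)).strip()
        history.append((sub_q, answer))
    return history
```

Passing the LM as a callable keeps the sketch runnable without an API key: any function that maps a prompt string to a response string can be plugged in, including a scripted stub for testing.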
- Execution environment
  - This project uses Poetry to manage and install Python dependencies.
  - To install all the prerequisite packages at once, run `poetry install` in the root directory.
  - To run a program within the pre-defined environment, run `poetry run python ${PROGRAM_NAME}`.
  - To activate the virtual environment in a shell, run `poetry shell`.
- Configuration
  - In order to use the OpenAI API, ensure that your API key is added to the `src/settings/config.py` file prior to running the program.

        class Configuration:
            OPENAI_API_KEY = "${KEY_SHOULD_BE_ADDED}"
            OPENAI_MODEL_NAME = "gpt-3.5-turbo"
            EMBEDDING_MODEL_NAME = "intfloat/e5-large-v2"
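If you would rather not write the key into a tracked file, one possible variation is to read it from an environment variable. This is only a sketch: the `OPENAI_API_KEY` environment variable name and the `api_key_is_set` helper are assumptions for illustration and are not part of the repository.

```python
import os


class Configuration:
    # Assumption: the key is exported as OPENAI_API_KEY in the shell.
    # Falls back to the README's placeholder when the variable is unset.
    OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY", "${KEY_SHOULD_BE_ADDED}")
    OPENAI_MODEL_NAME = "gpt-3.5-turbo"
    EMBEDDING_MODEL_NAME = "intfloat/e5-large-v2"


def api_key_is_set(config=Configuration) -> bool:
    """Hypothetical helper: True when the key is neither empty nor the placeholder."""
    return config.OPENAI_API_KEY not in ("", "${KEY_SHOULD_BE_ADDED}")
```

A check like `api_key_is_set()` at program start lets the run fail early with a clear message instead of failing on the first API call.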
- Run
  `python src/app.py --data ${DATA} --outfile ${OUT_FILE_NAME} --num_test ${NUM_TEST}`
  - `DATA`: The dataset to use. There are six options: 'WebQSP', 'mintaka', 'ComplexWebQ', 'MetaQA_1-hop', 'MetaQA_2-hop', or 'MetaQA_3-hop'.
  - `OUT_FILE_NAME`: The name of the output file. The default value is `evaluation_result.csv`.
  - `NUM_TEST`: The number of samples to test. The default value is 500.
  - Example: `python src/app.py --data WebQSP --outfile webqsp_results.csv --num_test 500`
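For readers who want to see how these flags fit together, the command-line interface described above could be declared with `argparse` along the following lines. This is a sketch, not the contents of `src/app.py`: the `build_parser` name, the help strings, and the choice to make `--data` required are assumptions. Only the flag names, dataset names, and defaults come from the description above.

```python
import argparse

# Dataset names as listed in this README.
DATASETS = [
    "WebQSP", "mintaka", "ComplexWebQ",
    "MetaQA_1-hop", "MetaQA_2-hop", "MetaQA_3-hop",
]


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags and defaults."""
    parser = argparse.ArgumentParser(description="Evaluate KGPPR on a KGQA dataset.")
    # Assumption: --data is mandatory; the README does not state a default.
    parser.add_argument("--data", choices=DATASETS, required=True,
                        help="dataset to evaluate on")
    parser.add_argument("--outfile", default="evaluation_result.csv",
                        help="name of the output file")
    parser.add_argument("--num_test", type=int, default=500,
                        help="number of samples to test")
    return parser


# Parse the README's example command from an explicit argument list.
args = build_parser().parse_args(
    ["--data", "WebQSP", "--outfile", "webqsp_results.csv", "--num_test", "500"]
)
```

Restricting `--data` with `choices` makes a misspelled dataset name fail at parse time with a message that lists the valid options.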
-
- Knowledge-Augmented Language Model Prompting for Zero-Shot Knowledge Graph Question Answering, by Jinheon Baek, Alham Fikri Aji, and Amir Saffari