Skip to content

Arthurizijar/KB-Coder

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

KB-Coder

The implementation for paper Code-Style In-Context Learning for Knowledge-Based Question Answering accepted by AAAI2024

Overview

model

Environment configuration

git clone https://github.com/Arthurizijar/KB-Coder.git
cd KB-Coder
conda create -n kbcoder python=3.8
pip install -r requirements.txt
export PYTHONPATH=$PWD

0. Preliminary

  • Finish the Freebase Setup refer to the guidance from dki-lab and start the freebase service.

    python3 virtuoso.py start 3001 -d virtuoso_db
  • Download Data

    • Download WebQuestionsSPGrailQA and GraphQ from their website and move the data into the fold ./data. Unzip packages and remove unuseful files.
    • Download fb_roles, fb_types, reverse_properties from here
    • Download surface_map_file_freebase_complete_all_mention in mentions.zip from here (URL comes from GrailQA's repository)
    • Download triple_edges_parts and id2name_parts in Freebase_raw.tar.gz from here
    • The structure of the fold ./data should be like this:
    data
    ├─GrailQA
    │  ├─grailqa_v1.0_dev.json    
    │  ├─grailqa_v1.0_train.json
    │  └─grailqa_v1.0_test_public.json
    ├─GraphQ
    │  ├─graphquestions_v1_fb15_test_091420.json
    │  └─graphquestions_v1_fb15_training_091420.json
    ├─WebQSP
    │  └─data
    │     ├─WebQSP.test.json
    │     ├─WebQSP.test.partial.json
    │     ├─WebQSP.train.json
    │     └─WebQSP.train.partial.json
    └─Freebase
       ├─fb_roles
       ├─fb_types
       ├─reverse_properties
       ├─surface_map_file_freebase_complete_all_mention
       ├─triple_edges_parts
       └─id2name_parts
    
  • Complete the config of each dataset: (./configs/WebQSP.yaml, ./configs/GrailQA.yaml, ./configs/GraphQ.yaml)

    Note: ./configs/Dataset_template.yaml contains explanations for all fields, modified other fields when necessary.

    sparql_url: <The url of Freebase service>
    api_key: <Your OpenAI API Key>
    proxy_url: <HTTP(s) Proxy URL or null>
  • Preprocessed datasets.

# WebQSP
python utils/borrow/parse_sparql.py --dataset_path ./data/WebQSP/data
# GrailQA
python utils/preprocess_dataset.py --dataset_path ./data/GrailQA/grailqa_v1.0_train.json
python utils/preprocess_dataset.py --dataset_path ./data/GrailQA/grailqa_v1.0_dev.json
# GraphQ
python utils/preprocess_dataset.py --dataset_path ./data/GrailQA/graphquestions_v1_fb15_training_091420.json
python utils/preprocess_dataset.py --dataset_path ./data/GrailQA/graphquestions_v1_fb15_test_091420.json

1. Generate the request file for OpenAI API

Note: We used part of the function implemented by Rng-KBQA to finish the interconversion of S-Expression and SPARQL, which is placed in utils_borrow.py

python generator.py --data_config ./configs/WebQSP.yaml
python generator.py --data_config ./configs/GrailQA.yaml
python generator.py --data_config ./configs/GraphQ.yaml

Related fields in config:

k: <Select k demonstrations for each question in testset>
sample_type: <topk: most similar sampling, slice_random: random sampling>

2. Call API and get responses

Note: We used the code implemented by openai-cookbook to call the OpenAI API.

python call_api.py --data_config ./configs/WebQSP.yaml
python call_api.py --data_config ./configs/GrailQA.yaml
python call_api.py --data_config ./configs/GraphQ.yaml

If some error occurs when calling API, use query_filer.py to filter the failed examples. The failed queries will be saved in aaa_failed.jsonl and the successful queries will be retained in the answer file.

python query_filer.py --qurey_path aaa.jsonl --answer_path bbb_answer.jsonl

3. Link the mention and evaluate the results

python linker.py --data_config ./configs/WebQSP.yaml
python linker.py --data_config ./configs/GrailQA.yaml
python linker.py --data_config ./configs/GraphQ.yaml

Steps in linker.py:

  • Link the mentions to the candidates of entities and relations.
  • Traverse candidates, execute the Python Code to obtain S-expressions, convert S-expressions into SPARQL queries to obtain answers.
  • Evaluate the results. The detail results will be saved in final-answer.xlsx for case study.

About

Code-Style In-Context Learning for Knowledge-Based Question Answering

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages