Skeleton parsing for complex question answering over knowledge bases (JoWS 2022)

nju-websoft/SkeletonKBQA

SkeletonKBQA: Skeleton Parsing for Complex Question Answering over Knowledge Bases

Code for the journal paper "Skeleton Parsing for Complex Question Answering over Knowledge Bases". If you have any questions, please email ywsun at smail.nju.edu.cn.

Project Structure:

File        Description
kbcqa       Code for the skeleton-based SP and IR approaches
skeletons   Skeleton bank from three complex KBQA datasets

Skeleton Bank

We annotate and publish a skeleton bank of 15,166 questions from three KBQA datasets.

The skeleton bank is in JSON format. An example:

{
"question": "People from the country with the capital Brussels speak what languages ?",
"skeleton": [
	{
		"question": "People from the country with the capital Brussels speak what languages ?",
		"text_span": "with the capital Brussels",
		"headword_index": 3,
		"attachment_relation": "nmod"
	},
	{
		"question": "People from the country speak what languages ?",
		"text_span": "from the country",
		"headword_index": 0,
		"attachment_relation": "nmod"
	}
]
}
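
As a minimal sketch of how an entry like the one above can be read, the following Python snippet parses the example and, for each skeleton item, reports the headword and the question that remains once the text span is removed. It assumes that headword_index is a 0-based index over whitespace-separated tokens and that removing text_span yields the next, simpler question. Both assumptions agree with this one example but are not stated in the repository. The helper names remove_span and describe_skeleton are hypothetical.

```python
import json

# The example entry from above, embedded so the sketch is self-contained.
EXAMPLE = """
{
  "question": "People from the country with the capital Brussels speak what languages ?",
  "skeleton": [
    {
      "question": "People from the country with the capital Brussels speak what languages ?",
      "text_span": "with the capital Brussels",
      "headword_index": 3,
      "attachment_relation": "nmod"
    },
    {
      "question": "People from the country speak what languages ?",
      "text_span": "from the country",
      "headword_index": 0,
      "attachment_relation": "nmod"
    }
  ]
}
"""


def remove_span(tokens, span_tokens):
    """Return tokens with the first occurrence of span_tokens removed."""
    n = len(span_tokens)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == span_tokens:
            return tokens[:i] + tokens[i + n:]
    return tokens  # span not found: leave the question unchanged


def describe_skeleton(entry):
    """Summarise each skeleton item (hypothetical helper, not from the repo)."""
    steps = []
    for item in entry["skeleton"]:
        tokens = item["question"].split()
        # Assumption: headword_index is 0-based over whitespace tokens.
        headword = tokens[item["headword_index"]]
        remaining = remove_span(tokens, item["text_span"].split())
        steps.append({
            "text_span": item["text_span"],
            "headword": headword,
            "relation": item["attachment_relation"],
            "remaining": " ".join(remaining),
        })
    return steps


entry = json.loads(EXAMPLE)
steps = describe_skeleton(entry)
for step in steps:
    print(f'{step["text_span"]!r} attaches to {step["headword"]!r} '
          f'via {step["relation"]}; remaining: {step["remaining"]!r}')
```

Under these assumptions, the first span attaches to "country" and the second to "People", and each removal reproduces the question of the following item.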

The sections below explain how to run the code in the kbcqa folder.

Requirements

Configuration

The configuration of SkeletonKBQA is in kbcqa/common/globals_args.py.

  • root: root of all resources and datasets, default ../dataset.
  • q_mode: the KBQA dataset to use: lcquad, graphq, or cwq.
  • sutime: path to the jar files of the SUTime Java library.
  • corenlp_ip_port: IP address and port of the Stanford CoreNLP server.
  • dbpedia_pyodbc: ODBC connection to the DBpedia Virtuoso server.
  • dbpedia_sparql_html: web address of the DBpedia Virtuoso server.
  • freebase_pyodbc: ODBC connection to the Freebase Virtuoso server.
  • freebase_sparql_html: web address of the Freebase Virtuoso server.
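
To make the settings above concrete, here is a hypothetical sketch of how they could be grouped and sanity-checked in Python. It is not the contents of globals_args.py, whose actual layout may differ. Only the setting names, the default root of ../dataset, and the three q_mode values come from this README. The class KBQAConfig, its methods, and the empty-string placeholders are assumptions made for illustration.

```python
from dataclasses import dataclass, fields

# The three dataset identifiers listed for q_mode in this README.
VALID_Q_MODES = ("lcquad", "graphq", "cwq")


@dataclass
class KBQAConfig:
    """Hypothetical container mirroring the setting names listed above."""
    root: str = "../dataset"        # default stated in the README
    q_mode: str = "lcquad"
    sutime: str = ""                # placeholder: fill in your own path
    corenlp_ip_port: str = ""       # placeholder: fill in your own server
    dbpedia_pyodbc: str = ""
    dbpedia_sparql_html: str = ""
    freebase_pyodbc: str = ""
    freebase_sparql_html: str = ""

    def validate(self):
        """Reject dataset names other than the three documented ones."""
        if self.q_mode not in VALID_Q_MODES:
            raise ValueError(
                f"q_mode must be one of {VALID_Q_MODES}, got {self.q_mode!r}")
        return self

    def unset_fields(self):
        """Names of settings that are still empty placeholders."""
        return [f.name for f in fields(self) if getattr(self, f.name) == ""]


cfg = KBQAConfig(q_mode="cwq").validate()
print("Still to configure:", cfg.unset_fields())
```

A check like this can catch a mistyped dataset name before any server connection is attempted.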

Common Resources

The zip file from Google Drive contains three parts:

  • Stanford CoreNLP server
  • SUTime Java library
  • BERT pre-trained Models

Download the zip file, unzip it, and copy it to the root folder.

Knowledge Bases

Download a Virtuoso server and load the KBs into it.

You only need to load the KB that corresponds to your KBQA dataset.
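
As a rough guide to which KB a dataset needs, the sketch below maps each q_mode value to a KB and to the two configuration settings that would then have to be filled in. The mapping reflects the KBs these public datasets were built on (LC-QuAD 1.0 on DBpedia; GraphQuestions and ComplexWebQuestions on Freebase). It is not taken from the repository code, and the function name required_kb_settings is hypothetical.

```python
# Assumed mapping from q_mode to knowledge base, based on the KBs the
# public datasets target; not read from the SkeletonKBQA source.
DATASET_TO_KB = {
    "lcquad": "dbpedia",
    "graphq": "freebase",
    "cwq": "freebase",
}


def required_kb_settings(q_mode):
    """Return the KB name and the config settings that KB relies on."""
    kb = DATASET_TO_KB[q_mode]  # raises KeyError for unknown datasets
    return kb, [f"{kb}_pyodbc", f"{kb}_sparql_html"]


for mode in DATASET_TO_KB:
    kb, settings = required_kb_settings(mode)
    print(f"{mode}: load {kb}, then set {', '.join(settings)}")
```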

LC-QuAD 1.0 Resources

The zip file from Google Drive contains three parts:

  • LC-QuAD 1.0 datasets
  • Its skeleton parsing models
  • Its corresponding KB entity-related Lexicons

Download the zip file, unzip it, and copy it to the root folder.

GraphQuestions Resources

The zip file from Google Drive contains three parts:

  • GraphQuestions datasets
  • Its skeleton parsing models
  • Its corresponding KB entity-related Lexicons

Download the zip file, unzip it, and copy it to the root folder.

ComplexWebQuestions 1.1 Resources

The zip file from Google Drive contains three parts:

  • ComplexWebQuestions 1.1 datasets
  • Its skeleton parsing models
  • Its corresponding KB entity-related Lexicons

Download the zip file, unzip it, and copy it to the root folder.

Run SkeletonKBQA

SkeletonKBQA contains two KBQA approaches: SSP and SIR.

  • Skeleton-based semantic parsing approach (SSP) has four modules:
    • Ungrounded query generation
    • Entity linking
    • Candidate grounded query generation
    • Semantic matching

The four modules above correspond to the module argument in kbcqa/method_sp/sp_pipeline.py.

Run the provided SSP scripts as:

bash run_ssp_LCQ.sh
bash run_ssp_GraphQ.sh
bash run_ssp_CWQ.sh

  • Skeleton-based information retrieval approach (SIR) has three modules:
    • Node recognition and linking
    • Candidate grounded path generation
    • Semantic matching

The three modules above correspond to the module argument in kbcqa/method_ir/ir_pipeline.py.

Run the provided SIR scripts as:

bash run_sir_LCQ.sh
bash run_sir_GraphQ.sh
bash run_sir_CWQ.sh
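
The six scripts above follow a regular naming pattern, so a small Python wrapper can pick the right one from an approach and a dataset name. This is only a sketch: the link between q_mode values and script suffixes is inferred from the file names, the functions build_command and run_pipeline are hypothetical, and the cwd parameter must point at whichever directory holds the scripts, which this README does not specify.

```python
import subprocess

# Suffixes inferred from the script names listed above (an assumption).
SCRIPT_SUFFIX = {"lcquad": "LCQ", "graphq": "GraphQ", "cwq": "CWQ"}
APPROACHES = ("ssp", "sir")


def build_command(approach, q_mode):
    """Build the bash command for one approach/dataset pair."""
    if approach not in APPROACHES:
        raise ValueError(f"approach must be one of {APPROACHES}")
    if q_mode not in SCRIPT_SUFFIX:
        raise ValueError(f"q_mode must be one of {tuple(SCRIPT_SUFFIX)}")
    return ["bash", f"run_{approach}_{SCRIPT_SUFFIX[q_mode]}.sh"]


def run_pipeline(approach, q_mode, cwd="."):
    """Run the selected script; raises CalledProcessError on failure."""
    return subprocess.run(build_command(approach, q_mode), cwd=cwd, check=True)


# Printing the command is safe without the repository checked out.
print(" ".join(build_command("sir", "graphq")))
```

Calling run_pipeline("ssp", "lcquad", cwd=...) would then be equivalent to running bash run_ssp_LCQ.sh by hand from that directory.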

Contacts

If you have any difficulty or questions about running the code, reproducing the experimental results, or skeleton parsing, please email ywsun at smail.nju.edu.cn.
