QirK: Question Answering via Intermediate Representation on Knowledge Graphs

jlscheerer/kgqa

kgqa

This repository contains the source code for QirK, a system that allows formulating structured queries with loose natural language constraints.

Querying via QirK

Once the project is set up, simply run kgqa.py.

python3 kgqa.py
> X: director(X, "Quentin Tarantino")
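To illustrate the shape of the query above, here is a minimal sketch that splits it into its head variable, predicate name, and arguments. This is not QirK's actual parser: the name parse_query and the assumed grammar (one head variable, one predicate, arguments that are either bare identifiers or double-quoted strings) are assumptions made purely for illustration.

```python
import re
from typing import NamedTuple


class Query(NamedTuple):
    head: str        # variable to return, e.g. "X"
    predicate: str   # predicate name, e.g. "director"
    args: list       # arguments in order; quoted strings are stored unquoted


# Assumed grammar (hypothetical): "<Var>: <predicate>(<arg>, <arg>, ...)"
_QUERY_RE = re.compile(r'^\s*(\w+)\s*:\s*(\w+)\s*\((.*)\)\s*$')
# An argument is either a double-quoted string or a bare identifier,
# followed by a comma or the end of the argument list.
_ARG_RE = re.compile(r'\s*(?:"([^"]*)"|(\w+))\s*(?:,|$)')


def parse_query(text: str) -> Query:
    """Split a single-predicate query into head, predicate, and arguments."""
    match = _QUERY_RE.match(text)
    if match is None:
        raise ValueError(f"not a query of the assumed form: {text!r}")
    head, predicate, raw_args = match.groups()

    args, pos = [], 0
    while pos < len(raw_args):
        arg = _ARG_RE.match(raw_args, pos)
        if arg is None:
            raise ValueError(f"cannot parse arguments: {raw_args!r}")
        quoted, bare = arg.groups()
        args.append(quoted if quoted is not None else bare)
        pos = arg.end()
    return Query(head, predicate, args)
```

Under these assumptions, parse_query('X: director(X, "Quentin Tarantino")') returns Query(head='X', predicate='director', args=['X', 'Quentin Tarantino']): the query asks for every X that is related to "Quentin Tarantino" through the director predicate.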

Getting Started

This project queries the Wikidata dataset. A brief overview of the dataset can be found here. To set up the corresponding database tables, proceed as follows:

  1. Download and decompress the Wikidata dump from here.
curl https://dumps.wikimedia.org/wikidatawiki/entities/latest-all.json.gz --output dump.json.gz
gzip -d dump.json.gz
  2. Use the migrador.rb script provided by wikidata-experiments to convert the dump into the required format.
curl https://bitbucket.org/danielhz/wikidata-experiments/raw/9fb724eb90fdc242434db8fd36d88950eb2255c0/postgresql-experiment-scripts/load-data/migrador.rb --output migrador.rb
mkdir csv
ruby migrador.rb
  3. Follow the instructions outlined in the repository to clone and build wd-migrate. Run the tool to convert the previously generated output into a format that postgres can import easily:
./wd_migrate.o claims csv/claims.txt csv/claims.csv
./wd_migrate.o qualifiers csv/qualifiers.txt csv/qualifiers.csv
  4. Set up a wikidata database in postgres with UTF-8 encoding.
CREATE DATABASE wikidata WITH encoding = 'UTF8';
  5. Populate the database using the sql/setup.sql script.
psql -U $PSQL_USERNAME -d wikidata -f sql/setup.sql
  6. Create and customize the configuration file ./config.yaml. See config.template.yaml for the required parameters.
cp config.template.yaml config.yaml

In particular, this requires configuring the following parameters:

Config Parameter               Description
psql.username                  postgres username
psql.password                  postgres password
language_model.open_api_key    OpenAI API Key
  7. Install the project's dependencies with the following command.
pip3 install -r requirements.txt
  8. Next, generate the required embeddings via the provided setup script.
python3 setup.py ComputeEmbeddings
  9. Finally, populate the claims table with invertible predicates by running the supplied script.
python3 setup.py InvertPredicates
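
Because the setup scripts depend on the configuration step above, it can help to check that the three listed parameters are filled in before running them. The following is a minimal sketch, not part of the repository: the helper name missing_keys is hypothetical, and it assumes that each dotted parameter name (for example psql.username) corresponds to a nested mapping in config.yaml.

```python
from typing import Any, Iterable

# Dotted parameter names taken from the parameter table above.
REQUIRED_KEYS = (
    "psql.username",
    "psql.password",
    "language_model.open_api_key",
)


def missing_keys(config: dict, required: Iterable[str] = REQUIRED_KEYS) -> list:
    """Return the dotted keys that are absent or empty in a nested config dict."""
    missing = []
    for dotted in required:
        node: Any = config
        # Walk down one mapping level per dotted component
        # (assumption: "a.b" means config["a"]["b"]).
        for part in dotted.split("."):
            if not isinstance(node, dict) or part not in node:
                node = None
                break
            node = node[part]
        if node is None or node == "":
            missing.append(dotted)
    return missing
```

The function works on an already loaded dictionary. One way to obtain it, assuming PyYAML is installed, is yaml.safe_load on the contents of config.yaml; an empty result from missing_keys then means all three parameters are set.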

Bugs

If you experience bugs, or have suggestions for improvements, please use the issue tracker to report them.
