This repository contains the source code for QirK, a system that allows formulating structured queries with loose natural language constraints.
Once the project is set up, simply run `kgqa.py`.
python3 kgqa.py
> X: director(X, "Quentin Tarantino")
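To make the shape of such a query concrete, here is a minimal sketch that splits a single-atom query like the one above into its result variables, predicate name, and arguments. It is not QirK's actual parser: the `Query` class, the `parse_query` function, and the assumption that a query has the form `variables: predicate(arguments)` are hypothetical and inferred only from the example shown.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Query:
    """Hypothetical container for one parsed query (not part of QirK)."""
    variables: List[str]               # result variables listed before the colon
    predicate: str                     # predicate name, e.g. "director"
    arguments: List[Tuple[str, str]]   # ("var", name) or ("const", text)


# Assumed shape: <variables> ":" <predicate> "(" <arguments> ")"
_QUERY = re.compile(r'^\s*(?P<head>[^:]+):\s*(?P<pred>\w+)\((?P<args>.*)\)\s*$')
# An argument is either a double-quoted constant or a bare variable name.
_ARG = re.compile(r'"([^"]*)"|(\w+)')


def parse_query(text: str) -> Query:
    """Parse a single-atom query; raise ValueError if the shape does not match."""
    match = _QUERY.match(text)
    if match is None:
        raise ValueError(f"unsupported query: {text!r}")
    variables = [name.strip() for name in match.group("head").split(",")]
    arguments = []
    for arg in _ARG.finditer(match.group("args")):
        if arg.group(1) is not None:
            arguments.append(("const", arg.group(1)))
        else:
            arguments.append(("var", arg.group(2)))
    return Query(variables, match.group("pred"), arguments)
```

In the example query, `X` is the variable to be returned, `director` is the loosely phrased predicate, and `"Quentin Tarantino"` is a constant. Queries with several atoms are outside the scope of this sketch.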
This project queries the Wikidata dataset. A brief overview of the dataset can be found here. To set up the corresponding database tables, proceed as follows:
- Download and decompress the wikidata dump from here.
curl https://dumps.wikimedia.org/wikidatawiki/entities/latest-all.json.gz --output dump.json.gz
gzip -d dump.json.gz
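The dump is very large, so it can be worth inspecting a few entities before committing to the full decompression. The sketch below streams entities straight from the compressed file. It relies on the layout of the Wikidata JSON dump, which is a single JSON array with one entity per line; the function name `iter_entities` is a hypothetical helper and not part of this repository.

```python
import gzip
import json
from typing import Iterator


def iter_entities(path: str) -> Iterator[dict]:
    """Yield entities one at a time from a gzip-compressed Wikidata JSON dump."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            # Each entity line ends with a comma, except the last one.
            line = line.strip().rstrip(",")
            # Skip the array brackets and any blank lines.
            if line in ("[", "]", ""):
                continue
            yield json.loads(line)
```

For example, `next(iter_entities("dump.json.gz"))["id"]` returns the identifier of the first entity without reading the rest of the file.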
- Use the `migrador.rb` script provided by wikidata-experiments to convert the dump into the required format.
curl https://bitbucket.org/danielhz/wikidata-experiments/raw/9fb724eb90fdc242434db8fd36d88950eb2255c0/postgresql-experiment-scripts/load-data/migrador.rb --output migrador.rb
mkdir csv
ruby migrador.rb
- Follow the instructions outlined in the repository to clone and build `wd-migrate`. Run the tool to convert the previously generated output into a format that postgres can load easily:
./wd_migrate.o claims csv/claims.txt csv/claims.csv
./wd_migrate.o qualifiers csv/qualifiers.txt csv/qualifiers.csv
- Set up a `wikidata` database in `postgres` with `UTF-8` encoding.
CREATE DATABASE wikidata WITH encoding = 'UTF8';
- Populate the database using the `sql/setup.sql` script.
psql -U $PSQL_USERNAME -d wikidata -f sql/setup.sql
- Create and customize the configuration file `./config.yaml`. See `config.template.yaml` for the required parameters.
cp config.template.yaml config.yaml
In particular, this requires configuring the following parameters:
Config Parameter | Description |
---|---|
`psql.username` | postgres username |
`psql.password` | postgres password |
`language_model.open_api_key` | OpenAI API Key |
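A quick check that these parameters are filled in can catch a missing value before the later setup steps fail. The following is a minimal sketch, assuming the dotted names in the table correspond to nested keys in `config.yaml` (for example, `username` under `psql`); the `missing_keys` helper and the `REQUIRED_KEYS` list are hypothetical and not part of this repository.

```python
from typing import Any, Dict, List

# Parameters listed in the table above, written as dotted paths.
REQUIRED_KEYS = [
    "psql.username",
    "psql.password",
    "language_model.open_api_key",
]


def missing_keys(config: Dict[str, Any], required: List[str] = REQUIRED_KEYS) -> List[str]:
    """Return the dotted paths that are absent or empty in a nested config dict."""
    missing = []
    for dotted in required:
        node: Any = config
        for part in dotted.split("."):
            if not isinstance(node, dict) or part not in node:
                missing.append(dotted)
                break
            node = node[part]
        else:
            # The path exists; treat None and the empty string as unset.
            if node in (None, ""):
                missing.append(dotted)
    return missing
```

The dictionary can be obtained by reading `config.yaml` with PyYAML's `yaml.safe_load`. An empty result from `missing_keys` means all three parameters are present.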
- To install the project's dependencies, execute the following command:
pip3 install -r requirements.txt
- Next, generate the required embeddings via the provided setup script.
python3 setup.py ComputeEmbeddings
- Finally, populate the `claims` table with invertible predicates by running the supplied script.
python3 setup.py InvertPredicates
If you experience bugs, or have suggestions for improvements, please use the issue tracker to report them.