miselico/graph_query_sampler

Graph Query Sampler (gqs)

Graph Query Sampler provides an implementation for extracting queries from a graph. These queries are used to train and evaluate approximate graph query answering (also called multi-hop reasoning) systems.

To install, clone the repository.

We recommend creating a virtual environment using conda.

conda create --name gqs_env python=3.11

Then, from the root of this repository, activate the environment:

conda activate gqs_env

and run:

pip install -e .

To run the tests, install the test dependencies using

pip install -e .[test]

On macOS, quote the extras so the shell does not interpret the square brackets:

pip install -e '.[test]'

and then execute the tests with

pytest

Creating a query dataset

To create a new query dataset, follow these steps. We assume a dataset named hp, whose input must be in N-Triples format.

When using the command line tool, you can always see more information and options by adding --help to a command.

  1. Install GraphDB. You need to configure it with a lot of memory for the query sampler.
  2. Initialize the folder for your dataset. Specify your .nt file and the name you want to use for your dataset, which may only contain lowercase characters:
gqs init RDF --input resources/harrypotter.nt --dataset hp --blank-node-strategy convert

This will create a new folder with the name of your dataset under the folder called datasets. All data related to the query sampling will be stored in that folder.

  3. Split the dataset into train, validation, and test. There are several options for splitting, but here we just do round-robin:
gqs split round-robin --dataset hp
  4. Store the splits in the triple store:
gqs store graphdb --dataset hp
  5. Create the mapping for your dataset. This is the mapping between identifiers in the RDF file and the indices used in the tensor representations:
gqs mapping create --dataset hp
  6. Configure the formulas you want to use for sampling. Make sure the formulas are adapted to what you need; check the shapes and configurations. Then copy them as follows. The --formula-root argument specifies the directory containing the formulas; the glob pattern selects the files within that directory:
gqs formulas copy --formula-root ./resources/formulas_example/ --formula-glob '**/0qual/**/*' --dataset hp
  7. Apply the constraints to the queries with:
gqs formulas add-constraints --dataset hp
  8. Sample the queries from the triple store:
gqs sample create --dataset hp
  9. To use the queries, convert them to protocol buffers:
gqs convert csv-to-proto --dataset hp
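The --formula-glob argument in the formula-copy step is a recursive glob pattern. As a small illustration of how such a pattern selects files (the directory layout below is invented for the example; the real formulas ship in resources/formulas_example/):

```python
# Sketch of how a recursive glob such as '**/0qual/**/*' selects files
# under --formula-root. The directory layout here is made up for illustration.
import tempfile
from pathlib import Path

def select_formulas(root: Path, pattern: str) -> list[str]:
    """Return the relative paths of all files under `root` matching `pattern`."""
    return sorted({p.relative_to(root).as_posix()
                   for p in root.glob(pattern) if p.is_file()})

# Build a tiny fake formula tree.
root = Path(tempfile.mkdtemp())
(root / "shapes/0qual").mkdir(parents=True)
(root / "shapes/2qual").mkdir(parents=True)
(root / "shapes/0qual/1hop.json").write_text("{}")
(root / "shapes/2qual/1hop.json").write_text("{}")

# Only files below a directory named 0qual match this pattern.
print(select_formulas(root, "**/0qual/**/*"))
```

Each `**` matches any number of intermediate directories (including none), so the pattern picks up every file anywhere below a directory named 0qual.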

Done! Now the queries can be loaded with the provided data loader.
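For reference, the whole pipeline above can also be scripted. The sketch below simply replays the gqs commands from the steps above via subprocess; it assumes gqs is installed (pip install -e .) and on the PATH, and with dry_run=True it only collects the command lines instead of executing them.

```python
# Sketch: replay the query-sampling pipeline from the steps above.
# Assumes `gqs` is installed (pip install -e .) and on the PATH.
import subprocess

def pipeline_commands(dataset: str, nt_file: str) -> list[list[str]]:
    """The gqs invocations from the README, in order."""
    d = ["--dataset", dataset]
    return [
        ["gqs", "init", "RDF", "--input", nt_file,
         "--blank-node-strategy", "convert", *d],
        ["gqs", "split", "round-robin", *d],
        ["gqs", "store", "graphdb", *d],
        ["gqs", "mapping", "create", *d],
        ["gqs", "formulas", "copy", "--formula-root", "./resources/formulas_example/",
         "--formula-glob", "**/0qual/**/*", *d],
        ["gqs", "formulas", "add-constraints", *d],
        ["gqs", "sample", "create", *d],
        ["gqs", "convert", "csv-to-proto", *d],
    ]

def run_pipeline(dataset: str, nt_file: str, dry_run: bool = True) -> list[str]:
    """Run (or, with dry_run=True, only list) each step, stopping on failure."""
    lines = []
    for cmd in pipeline_commands(dataset, nt_file):
        lines.append(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return lines

print("\n".join(run_pipeline("hp", "resources/harrypotter.nt")))
```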

Alternatively, you can export the queries to a format that can be loaded by the KGReasoning framework: https://github.com/pminervini/KGReasoning/

gqs export to-kgreasoning --dataset hp

The result of the export is placed in datasets/{datasetname}/export/kgreasoning; these files can then be used as a dataset in the KGReasoning framework.

Compilation of the protocol buffer file

  • Download the protocol buffer compiler (protoc). We used version 3.20 and pin the same version in setup.cfg. Most likely a newer version will also work, together with a correspondingly newer version of the Python package.
  • protoc-3.20.0-linux-x86_64/bin/protoc -I=./src/gqs/query_represenation/ --python_out=./src/gqs/query_represenation/ --pyi_out=./src/gqs/query_represenation/ ./src/gqs/query_represenation/query.proto

Note that this protoc version generated stubs that mypy on the GitHub CI complains about; it does not seem to process the exclude directives correctly. Hence, some unparameterized Mapping types in the generated stubs were changed to Mapping[Any, Any].
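For illustration, this is the kind of annotation change meant here (the function below is a made-up stand-in, not code from the generated stubs): a bare Mapping is implicitly Mapping[Any, Any], but spelling the parameters out keeps strict mypy settings such as --disallow-any-generics quiet.

```python
# Illustration of the stub fix described above: replace a bare `Mapping`
# annotation with the explicitly parameterized `Mapping[Any, Any]`.
from typing import Any, Mapping

# Before (rejected under mypy --disallow-any-generics):
#   def fields(table: Mapping) -> Mapping: ...

# After (accepted):
def fields(table: Mapping[Any, Any]) -> Mapping[Any, Any]:
    """Toy stand-in for a generated accessor; returns a copy of the mapping."""
    return dict(table)
```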
