Implement CONSTRUCT query processing #528

Merged
merged 55 commits into from Jan 1, 2022

Collaborator

@RobinTF RobinTF commented Dec 19, 2021

This PR starts implementing support for CONSTRUCT queries. Work in progress.

Member

@joka921 joka921 left a comment

This is definitely going in the right direction.
There is still some boilerplate to get rid of.

src/engine/QueryExecutionTree.h
src/engine/QueryExecutionTree.cpp
src/engine/Server.cpp
src/parser/BlankNode.h
src/parser/sparqlParser/SparqlQleverVisitor.h
src/parser/sparqlParser/SparqlQleverVisitor.h
src/parser/sparqlParser/SparqlQleverVisitor.h
src/parser/sparqlParser/SparqlQleverVisitor.h
src/parser/sparqlParser/SparqlQleverVisitor.h
src/parser/sparqlParser/SparqlQleverVisitor.h
Member

@joka921 joka921 left a comment

Only some minor tweaks (and many tests) are missing.

src/engine/QueryExecutionTree.h
src/engine/QueryPlanner.cpp
src/engine/QueryPlanner.cpp
src/engine/QueryPlanner.cpp
src/engine/QueryPlanner.cpp
src/parser/data/VarOrTerm.h
src/parser/data/Variable.h
src/parser/data/Variable.h
src/parser/sparqlParser/SparqlQleverVisitor.h
src/util/antlr/ThrowingErrorStrategy.h
Member

@joka921 joka921 left a comment

Only some small details for the things you have already changed since yesterday.

src/index/CMakeLists.txt
src/parser/ParsedQuery.h
@RobinTF RobinTF marked this pull request as ready for review December 31, 2021 17:38
Member

@joka921 joka921 left a comment

Some dummy comment, to make github diffs work again?!

Member

@joka921 joka921 left a comment

This is only some really really tiny stuff.

const auto throwIfConstructClause = [&pq]() {
  if (pq.hasConstructClause()) {
    throw std::runtime_error{
        "CONSTRUCT queries only support turtle syntax right now"};
Member

Two remarks remaining here:

  1. You should add the turtle media type to the supportedMediaTypes above (line 345) so that we can also specify it via an accept header.
  2. Replace "turtle syntax" with "RDF Turtle as an export format", and add 'Please specify "action=turtle-export" as a query parameter or the accept header "..."' (add the corresponding header here).

Collaborator Author

It's currently called turtle_export, to be consistent with csv_export, tsv_export, and all the other options.

src/parser/SparqlParser.cpp
src/parser/SparqlParserHelpers.cpp
AD_CHECK(_name.length() > 1);
// normalise notation for consistency
if (_name[0] == '$') {
  _name[0] = '?';
Member

If you don't check that the first character is either $ or ?, you can do the
overwrite unconditionally. But asserting this as well would probably be better.

Collaborator Author

I think it's better to avoid the unnecessary write operation if possible and let branch prediction skip it in this constructor in most cases.
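For reference, the reviewer's suggestion (validate the sigil, then overwrite unconditionally) can be sketched roughly as follows. This is a minimal sketch, not the code from the PR: `VariableSketch` and `name()` are hypothetical names, and a thrown `std::invalid_argument` stands in for the project's `AD_CHECK` macro.

```cpp
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for the Variable class under review.
class VariableSketch {
  std::string _name;

 public:
  explicit VariableSketch(std::string name) : _name{std::move(name)} {
    // A variable needs a sigil ('?' or '$') plus at least one more character.
    // Assumption: an exception replaces the AD_CHECK used in the PR.
    if (_name.length() < 2 || (_name[0] != '?' && _name[0] != '$')) {
      throw std::invalid_argument{"Variable must start with '?' or '$'"};
    }
    // The sigil has been validated, so normalising it needs no further branch.
    _name[0] = '?';
  }

  const std::string& name() const { return _name; }
};
```

The version in the PR keeps the `if (_name[0] == '$')` branch instead. Both variants produce the same normalised name for valid input; they differ only in whether invalid sigils are rejected and whether the write is conditional.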

@@ -65,5 +74,8 @@ class Variable {
}

// ___________________________________________________________________________
[[nodiscard]] std::string toString() const { return _name; }
[[nodiscard]] std::string toSparql() const { return _name; }
Member

This can also return a const string&.
If other types return std::string here, no harm is done, imho.

Collaborator Author

This code is supposed to be removed sooner or later, so I'd prefer to keep it as-is for now, even if the change might avoid an unnecessary copy.
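The trade-off discussed in this thread can be illustrated with a small sketch. `VariableLike`, `toSparqlCopy` and `toSparqlRef` are hypothetical names chosen for illustration; the PR itself keeps the by-value `toSparql()`.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for a class that stores its SPARQL text as a member.
class VariableLike {
  std::string _name;

 public:
  explicit VariableLike(std::string name) : _name{std::move(name)} {}

  // Returns by value: every call creates a new std::string.
  [[nodiscard]] std::string toSparqlCopy() const { return _name; }

  // Returns a reference to the member: no copy is made, but the reference
  // must not outlive the object it came from.
  [[nodiscard]] const std::string& toSparqlRef() const { return _name; }
};
```

Returning by value is the safer interface when other term types have to build their string on the fly, which is presumably why the reviewer notes that no harm is done either way.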

@joka921 joka921 merged commit 1b5aff9 into ad-freiburg:master Jan 1, 2022
@RobinTF RobinTF deleted the construct-query-parser branch January 1, 2022 23:43
Member

hannahbast commented Jan 2, 2022

Thanks a lot, @RobinTF and @joka921, for this PR! I just tried it with https://qlever.cs.uni-freiburg.de/olympics and noted three things:

  1. It looks like I had to rebuild the index for this to work. Is this so, and if yes, can you briefly comment why?

  2. In the CONSTRUCT clause, prefixes are not recognized. For example, the first of the two calls below works, but the second returns an error "Prefix olympics: was not registered using a PREFIX declaration".

  3. It would probably be a good idea to pick the default media type depending on the query type. That is, for a SELECT query, do it as it is now, but for a CONSTRUCT query, pick text/turtle by default.

curl -Gs -H "Accept: text/turtle" https://qlever.informatik.uni-freiburg.de/api/olympics --data-urlencode "query=CONSTRUCT { ?s <http://wallscope.co.uk/ontology/olympics/medal> ?o } WHERE { ?s <http://wallscope.co.uk/ontology/olympics/medal> ?o }"

curl -Gs -H "Accept: text/turtle" https://qlever.informatik.uni-freiburg.de/api/olympics --data-urlencode "query=PREFIX olympics: <http://wallscope.co.uk/ontology/olympics/> CONSTRUCT { ?s olympics:medal ?o } WHERE { ?s olympics:medal ?o }"
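To make the second remark concrete: resolving `olympics:medal` inside the CONSTRUCT template requires the same prefix map that the `PREFIX` declarations fill. The following is a minimal sketch of such a lookup, not QLever's parser code; `PrefixMap` and `expandPrefixedName` are hypothetical names.

```cpp
#include <optional>
#include <string>
#include <string_view>
#include <unordered_map>

// Hypothetical prefix map: prefix label (without the colon) -> IRI base.
using PrefixMap = std::unordered_map<std::string, std::string>;

// Expands a prefixed name such as "olympics:medal" to a full IRI in angle
// brackets. Returns std::nullopt if the name contains no colon or if the
// prefix was never registered.
std::optional<std::string> expandPrefixedName(const PrefixMap& prefixes,
                                              std::string_view name) {
  const auto colon = name.find(':');
  if (colon == std::string_view::npos) {
    return std::nullopt;
  }
  const auto it = prefixes.find(std::string{name.substr(0, colon)});
  if (it == prefixes.end()) {
    return std::nullopt;  // corresponds to the "was not registered" error
  }
  std::string iri = "<";
  iri += it->second;
  iri += name.substr(colon + 1);
  iri += ">";
  return iri;
}
```

With an empty map, the lookup fails in the same way as the second curl call above, which would be consistent with the map not being available when the CONSTRUCT clause is processed.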

Collaborator Author

RobinTF commented Jan 2, 2022

@hannahbast

> 1. It looks like I had to rebuild the index for this to work. Is this so, and if yes, can you briefly comment why?

I didn't have to rebuild any index for this, and none of the index code was touched by this PR (or at least it shouldn't have been).

> 2. In the CONSTRUCT clause, prefixes are not recognized. For example, the first of the two calls below works, but the second returns an error "Prefix olympics: was not registered using a PREFIX declaration".

That's odd. I assumed it would just not print the PREFIX declarations in the turtle format, but I will have a look into this when I start writing tests for all of this code. The diff was getting fairly large so @joka921 and I decided it would be best to break this up into several PRs, to not conflict with #529

> 3. It would probably be a good idea to pick the default media type depending on the query type. That is, for a SELECT query, do it as it is now, but for a CONSTRUCT query, pick text/turtle by default.

A PR that supports different formats like tsv/csv/json is in the making. However, the SPARQL specification is pretty vague about serialization with non-turtle formats so we might need to come up with some qlever-specific rules to make those formats work well enough for this.

@hannahbast
Member

@RobinTF Thanks for the feedback, Robin!

@1: I just tried it with another knowledge graph and it worked without having to rebuild the index. So the index I tried first was probably old and would have needed rebuilding already for the previous version of the master.

@2: Good idea to break this up. The problem I described is probably just a simple bug. It seems that when the CONSTRUCT clause is evaluated, the prefix map either has not yet been parsed or is not available for some other reason.

@3: More formats are of course great, but I am happy with text/turtle for the moment. My comment was that without specifying an Accept header, the export currently fails. For example, the following command gives the error "CONSTRUCT queries only support RDF Turtle as an export format right now". The reason is that some other media type (probably application/qlever-results+json) is chosen by default. My point was that for a CONSTRUCT query, the default media type (if no other is specified) should be text/turtle.

curl -Gs https://qlever.informatik.uni-freiburg.de/api/olympics --data-urlencode "query=CONSTRUCT { ?s <http://wallscope.co.uk/ontology/olympics/medal> ?o } WHERE { ?s <http://wallscope.co.uk/ontology/olympics/medal> ?o }"
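The requested behaviour (fall back to a default that depends on the query type when no media type was requested) can be sketched as follows. This is a minimal illustration, not the server's code: `QueryType` and `chooseMediaType` are hypothetical names, and the SELECT default `application/qlever-results+json` is only the guess made in the comment above.

```cpp
#include <optional>
#include <string>

// Hypothetical query kinds; the real parser may distinguish more.
enum class QueryType { Select, Construct };

// Returns the media type to use for the response. An explicitly requested
// type (Accept header or query parameter) always wins; otherwise the default
// depends on the kind of query.
std::string chooseMediaType(QueryType type,
                            const std::optional<std::string>& requested) {
  if (requested.has_value()) {
    return *requested;
  }
  // Assumption: the current default for SELECT is the QLever JSON format.
  return type == QueryType::Construct ? "text/turtle"
                                      : "application/qlever-results+json";
}
```

Under this rule, the curl command above, which sends no Accept header, would receive Turtle instead of an error.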
