ql:has-predicate yields incorrect results in conjunction with ps: predicate #289
I have investigated this a bit more and found another very natural query with the same problem:
If wdt:P646 ("Freebase ID") is replaced by wdt:P31 ("instance of"), the query works (at least, the results look plausible). Maybe what makes the difference here is whether the predicate in the first triple has objects whose names are part of the externalized vocabulary (stored on disk). At least, that is true for objects of the ps: predicates and (I guess) also for a predicate like wdt:P646. The problem also occurs for the predicate schema:name. This is just an educated guess, since I don't understand how the nature of the predicate in the first triple affects the PREDICATE SCAN operation. But maybe this is an interesting piece of information anyway.
The query
returns only 34 lines, vs
returning 127. The difference here seems to be language predicates though (e.g.
I've found a bug in the assignment of the subtree column that a predicate scan uses as its subjects. If the predicate scan was the right side of the join, it would always use column 0 of its subtree. Thus, when the optimizer chose an ordering for the scan in which the subtree's column 0 held the triples' subjects, the query would work, but if the optimizer chose any other order (which it could, given that the order doesn't affect performance in this case), the query results would be arbitrary.
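To make the failure mode concrete, here is a minimal toy sketch in Python. It is not QLever's actual code: `predicate_scan`, `has_predicate`, the `ex:` names, and the row layout are all hypothetical and only illustrate what happens when the scan reads column 0 regardless of where the subjects actually are.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy model, not QLever's data structures:
# maps each entity to the set of predicates it occurs with as a subject.
has_predicate: Dict[str, Set[str]] = {
    "m1": {"ex:instanceOf", "ex:award"},
    "o1": {"ex:label"},
}


def predicate_scan(rows: List[Tuple[str, ...]], subject_col: int) -> Set[str]:
    """Collect the predicates of the entities found in column `subject_col`."""
    predicates: Set[str] = set()
    for row in rows:
        predicates |= has_predicate.get(row[subject_col], set())
    return predicates


# Result of a subtree for a triple like `?m ex:award ?o`, in the two
# column orders an optimizer might pick.
rows_subject_first = [("m1", "o1")]  # ?m is column 0
rows_object_first = [("o1", "m1")]   # ?m is column 1

# Buggy behaviour: always read column 0, whatever the order.
buggy_ok = predicate_scan(rows_subject_first, 0)    # happens to be right
buggy_wrong = predicate_scan(rows_object_first, 0)  # predicates of ?o, not ?m

# Fixed behaviour: pass the column that actually holds ?m.
fixed = predicate_scan(rows_object_first, 1)

print(sorted(buggy_ok), sorted(buggy_wrong), sorted(fixed))
```

In this sketch the buggy call returns the right predicates only when `?m` happens to sit in column 0. With the other ordering it silently returns the predicates of `?o` instead, which matches the symptom that the result depended on the ordering the optimizer picked.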
Fixed the subtree column of has-predicate scans. Fixes #289
@floriankramer @niklas88 Thank you, Florian, for finding and fixing this bug, and thank you, Niklas, for the code review. I have updated the backend behind http://qlever.informatik.uni-freiburg.de/Wikidata_Full to the latest version of the master, and the problematic queries now work like a charm! In particular, now try to find the three movies for which Meryl Streep won an Oscar. It's not an easy SPARQL query, but it can be constructed reasonably well with what we have now.
In the current version of QLever, the following query yields only 5 rows, and wdt:P31 is not among them although it should be:
If the second triple is replaced by ?m ?p ?o, the number of rows increases to 164 and wdt:P31 is included.
The result sets of the two queries should be the same. So it seems that something is wrong with the PREDICATE SCAN operation.
The problem also occurs if we add (COUNT ?m AS ?count) to the SELECT clause.
The problem also occurs when replacing ps:P1913 by wdt:P1913. However, wdt:P31 is present then in the results. So maybe this is a bug which has been around already for quite a while, but we haven't realized it yet (because only less "interesting" predicates were missing if no ps: predicates are involved in the query).