ql:has-predicate yields incorrect results in conjunction with ps: predicate #289

hannahbast · 2019-10-30T23:30:23Z

In the current version of QLever, the following query yields only 5 rows, and wdt:P31 is not among them although it should be:

PREFIX ps: <http://www.wikidata.org/prop/statement/>
SELECT ?p WHERE {
  ?s ps:P1913 ?m .
  ?m ql:has-predicate ?p
}
GROUP BY ?p

If the second triple is replaced by ?m ?p ?o, the number of rows increases to 164 and wdt:P31 is included.

The result sets of the two queries should be the same, so it seems that something is wrong with the PREDICATE SCAN operation.
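To make the expected equivalence concrete, here is a minimal Python sketch on a made-up toy data set. It is not QLever code: the triples, the function names, and the idea of modeling ql:has-predicate as a precomputed subject-to-predicates map are all assumptions for illustration only.

```python
# Toy triple store: (subject, predicate, object). All data here is made up.
TRIPLES = [
    ("s1", "ps:P1913", "m1"),
    ("s2", "ps:P1913", "m2"),
    ("m1", "wdt:P31", "c1"),
    ("m1", "rdfs:label", "l1"),
    ("m2", "wdt:P31", "c2"),
    ("m2", "wdt:P17", "c3"),
]


def predicates_via_full_scan(triples, link):
    """Distinct ?p for:  ?s <link> ?m . ?m ?p ?o"""
    ms = {o for _, p, o in triples if p == link}
    return {p for s, p, _ in triples if s in ms}


def predicates_via_has_predicate(triples, link):
    """Distinct ?p for:  ?s <link> ?m . ?m ql:has-predicate ?p

    ql:has-predicate is modeled (as an assumption) as a precomputed
    map from each subject to the set of its predicates.
    """
    has_predicate = {}
    for s, p, _ in triples:
        has_predicate.setdefault(s, set()).add(p)
    ms = {o for _, p, o in triples if p == link}
    result = set()
    for m in ms:
        result |= has_predicate.get(m, set())
    return result


# Both formulations must yield the same set of predicates.
assert predicates_via_full_scan(TRIPLES, "ps:P1913") == \
    predicates_via_has_predicate(TRIPLES, "ps:P1913")
```

Any discrepancy between the two sets on real data, like the 5 vs. 164 rows above, therefore points at the engine rather than at the query.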

The problem also occurs if we add (COUNT(?m) AS ?count) to the SELECT clause.

The problem also occurs when replacing ps:P1913 by wdt:P1913, although wdt:P31 is then present in the results. So maybe this bug has been around for quite a while and we just haven't noticed it yet (because only less "interesting" predicates were missing when no ps: predicates were involved in the query).

hannahbast · 2019-11-13T02:56:07Z

I have investigated this a bit more and found another very natural query with the same problem:

PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT ?x ?p WHERE {
  ?x wdt:P646 ?freebase_id .
  ?x ql:has-predicate ?p
}

If wdt:P646 ("Freebase ID") is replaced by wdt:P31 ("instance of"), the query works (at least, the results look plausible).

Maybe what makes the difference here is whether the predicate in the first triple has objects whose names are part of the externalized vocabulary (stored on disk). At least, that is true for the objects of the ps: predicates and (I guess) also for a predicate like wdt:P646. The problem also occurs for the predicate schema:name.

It's just an educated guess, since I don't understand how the nature of the predicate in the first triple affects the PREDICATE SCAN operation. But maybe this is an interesting piece of information anyway.

floriankramer · 2019-11-18T14:06:36Z

The query

PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT DISTINCT ?p WHERE {
  <http://www.wikidata.org/entity/Q1000001> ql:has-predicate ?p
}

returns only 34 lines, vs

PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT DISTINCT ?p WHERE {
  <http://www.wikidata.org/entity/Q1000001> ?p ?o
}

returning 127. The difference here seems to be language predicates, though (e.g. @ar@<http://schema.org/description>). The entity used for the queries is one of the subjects of the wdt:P646 triples. However, the ql:has-predicate result does include the P646 predicate, indicating that the bug might be inside the predicate scan with a subquery.
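For anyone who wants to repeat this comparison, a small Python sketch of the diff could look as follows. The function name and the regular expression are assumptions; the only thing taken from the results is the `@ar@<http://schema.org/description>` shape of a language predicate.

```python
import re

# Assumed shape of a language predicate: @<lang>@<iri>,
# e.g. @ar@<http://schema.org/description>
LANGUAGE_PREDICATE = re.compile(r"@[A-Za-z0-9-]+@<[^>]*>")


def split_missing(full_scan_predicates, has_predicate_predicates):
    """Return (language, other): the predicates that the `?p ?o` query finds
    but the ql:has-predicate query does not, split by whether they look
    like language predicates."""
    missing = sorted(set(full_scan_predicates) - set(has_predicate_predicates))
    language = [p for p in missing if LANGUAGE_PREDICATE.fullmatch(p)]
    other = [p for p in missing if not LANGUAGE_PREDICATE.fullmatch(p)]
    return language, other
```

If `other` comes back empty for a single fixed subject, the missing rows are explained by language predicates alone, and the actual bug has to be looked for in the variant with a subquery.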

floriankramer · 2019-11-18T14:23:37Z

I've found a bug in the assignment of the subtree column that a predicate scan uses as its subjects. If the predicate scan was the right side of the join, it would always use column 0 of its subquery. Thus, when the optimizer chose an ordering for the scan where the subtree's column 0 held the triples' subjects, the query would work, but if the optimizer chose any other order (which it could, given that the order doesn't affect performance in this case), the query results would be arbitrary.
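A minimal Python sketch of the effect, not the actual C++ code: the map, the identifiers (including the `pq:P585` predicate), and the function name are made up for illustration. It models the first query from this issue, where the subtree for `?s ps:P1913 ?m` has two columns and the has-predicate lookup has to read `?m`.

```python
# Hypothetical has-predicate map: subject -> set of its predicates.
HAS_PREDICATE = {
    "stmt1": {"ps:P1913", "pq:P585"},
    "m1": {"wdt:P31", "rdfs:label"},
}


def has_predicate_join(subtree_rows, subject_col):
    """For each subtree row, emit (subject, predicate) pairs, reading the
    subject of the has-predicate lookup from column `subject_col`."""
    out = []
    for row in subtree_rows:
        subject = row[subject_col]
        for predicate in sorted(HAS_PREDICATE.get(subject, ())):
            out.append((subject, predicate))
    return out


# Subtree result for `?s ps:P1913 ?m` in the two possible column orders.
rows_s_m = [("stmt1", "m1")]  # columns: ?s, ?m
rows_m_s = [("m1", "stmt1")]  # columns: ?m, ?s

# Hard-coded column 0 (the bug): right only if ?m happens to be in column 0.
works_by_luck = has_predicate_join(rows_m_s, 0)
wrong = has_predicate_join(rows_s_m, 0)

# Using the column that actually holds ?m (the fix).
fixed = has_predicate_join(rows_s_m, 1)

assert works_by_luck == fixed == [("m1", "rdfs:label"), ("m1", "wdt:P31")]
assert wrong == [("stmt1", "pq:P585"), ("stmt1", "ps:P1913")]
```

With column 0 hard-coded, the second ordering silently returns the predicates of `?s` instead of `?m`, which matches the "plausible but incomplete" results reported above.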

Fixed the subtree column of has predicate scans. Fixes #289

hannahbast · 2019-11-18T19:53:47Z

@floriankramer @niklas88 Thank you, Florian, for finding and fixing this bug, and thank you, Niklas, for the code review. I have updated the backend behind http://qlever.informatik.uni-freiburg.de/Wikidata_Full to the latest version of the master, and the problematic queries now work like a charm!

In particular, you can now try to find the three movies for which Meryl Streep won an Oscar. It's not an easy SPARQL query, but it can be constructed reasonably well with what we have now.

hannahbast assigned floriankramer Oct 30, 2019

floriankramer added a commit to floriankramer/QLever that referenced this issue Nov 18, 2019

Fixed the subtree column of has predicate scans. Fixes ad-freiburg#289

edb0bad

niklas88 closed this as completed in 02cf0d1 Nov 18, 2019

niklas88 added a commit that referenced this issue Nov 18, 2019

Merge pull request #293 from floriankramer/fix_has_predicate_column

e576320

Fixed the subtree column of has predicate scans. Fixes #289

WolfgangFahl mentioned this issue Nov 24, 2022

Unclear error message: CHECK FAILED (The requested feature requires a loaded patterns file (do not specify the --no-patterns option for this to work) #831

Open
