Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for unconnected graph patterns/triples #272

Closed
niklas88 opened this issue Aug 6, 2019 · 3 comments
Closed

Support for unconnected graph patterns/triples #272

niklas88 opened this issue Aug 6, 2019 · 3 comments
Assignees

Comments

@niklas88
Copy link
Member

niklas88 commented Aug 6, 2019

One long standing limitation of QLever is that the following doesn't work

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT ?where ?name WHERE {  
  wd:Q18642047 wdt:P108 ?where .
  wd:Q18642047 rdfs:label ?name .
  FILTER (lang(?name) = "en") .
}

Since I tried that a long time ago and assumed this was invalid in SPARQL and I think @Buchhold also said so. However @lehmann-4178656ch also tried to do something like this and it turns out this works just fine in Blazegraph. Also with the newer syntax changes one can now write this as

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT ?where ?name WHERE {
  wd:Q18642047 wdt:P108 ?where; rdfs:label ?name .
  FILTER (lang(?name) = "en") .
}

which also doesn't work but more obviously should. It seems this is based on a limitation of only being able to join on variables. The following query that turns the entity into a 1 × 1 table and gives it an associated variable makes the query workable.

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT ?where ?name WHERE {
  VALUES ?value { wd:Q18642047 }
  ?value wdt:P108 ?where; rdfs:label ?name .
  FILTER (lang(?name) = "en") .
}
@floriankramer
Copy link
Member

@niklas88 After thinking about this for a while I decided to look up the definition of solution mappings in the SPARQL standard.
According to the standard a solution mapping u is a function from variables to iris. The solution of a query is then a multiset of these solution mappings (a multiset to allow for duplicate rows in the result). A valid solution for a BasicGraphPattern (BGP) (such as {wd:Q18642047 wdt:P108 ?where}) is a multiset of solution mappings which in turn is a reduction to the variables of a pattern instance mapping (P) from the basic graph pattern to a subset of the knowledge base. This means, that P(BGP) has to take wd:Q18642047 (from the example) into account to ensure it maps to a subset of the knowledge base. The solution mapping O which is the actual result of evaluating the triple, and later used as the input to subsequent joins discard everything that is not a variable though, and only stores the results for ?where (in the form of functions mapping ?where to iris).

The standard on join says, that the join of two multisets of solution mappings O1 and O2 is a new solution mapping that merges any two elements of O1 and O2 that are compatible. Two solution mappings u1 and u2 are compatible if for all variables v that are in both u1 and u2 u1(v) = u2(v). Thus the join of two multisets of solution mapping in which the solution mappings do not share any variables includes every possible combination of solution mappings from both multisets.
For the first example query in the issue this happens to be the same as joining them on the non variable wd:Q18642047.

QLever is then currently non standard conform by considering queries consisting of disjoint graph patterns (in respect to their variables) as invalid instead of computing the join of them (which in sparql is well defined).

@niklas88
Copy link
Member Author

niklas88 commented Aug 8, 2019

Good point about this actually being the possible combinations of solution mappings. This means it has nothing to do with the fixed values but it's all about how we treat GPs not connected by a variable.

The following disjoint query is also valid and just does a cross product on the solution sets.

(14 occupations for Einstein * 6 occupations for Bohr * 9 occupations for Dirac = 756 rows)

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT ?occ_einstein ?occ_bohr ?occ_dirac WHERE {
  wd:Q937 wdt:P106 ?occ_einstein .
  wd:Q7085 wdt:P106 ?occ_bohr . 
  wd:Q47480 wdt:P106 ?occ_dirac .
}

So afaik all we need to do is an all-pairs cross-product between the solution sets of all GraphPatterns in a GroupGraphPattern that are not connected by a variable. Then one would end up with the correct result also for the following query where I previously thought it would require a join on the fixed value (Einstein).

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT ?spouse ?sibling WHERE {
  wd:Q937 wdt:P3373 ?sibling;
          wdt:P26 ?spouse . 
}

This also means our TripleGraph is actually fine, we just have to treat its independent components differently.

Note that this extends to all kinds of unconnected GPs for example the below query is equivalent to the first but does the cross product across different GPs.

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT ?occ_einstein ?occ_bohr ?occ_dirac WHERE {
  { SELECT ?occ_einstein WHERE { wd:Q937 wdt:P106 ?occ_einstein . } }
  { SELECT ?occ_bohr WHERE { wd:Q7085 wdt:P106 ?occ_bohr . } }
  { SELECT ?occ_dirac WHERE {wd:Q47480 wdt:P106 ?occ_dirac . } }
}

@niklas88 niklas88 changed the title Limitation with fixed values Support for cross product between unconnected graph patterns/triples Aug 8, 2019
@niklas88 niklas88 changed the title Support for cross product between unconnected graph patterns/triples Support for unconnected graph patterns/triples Aug 8, 2019
@joka921
Copy link
Member

joka921 commented May 6, 2024

Cartesian product joins between unconnected graph patterns have been supported for quite a while now, so this issue is fixed.

@joka921 joka921 closed this as completed May 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants