RDF Star for Talking About Multiple Triples at Once

Peter F. Patel-Schneider edited this page Dec 6, 2023 · 1 revision

This use case comes from https://github.com/w3c/rdf-ucr/issues/26

Motivation

There is a need to assert provenance (and possibly other marginalia) about multiple triples from a common source. Often, as in the case of RDF lists or blank nodes, these triples share a subject or are chained together, forming an "integral subgraph", if you will (or a rooted tree in graph-theoretical terms).

Need for quoted triples

Information is associated with triples, possibly indirectly, so there needs to be some way to refer to these triples.

This can be done with named graphs, but their semantics are undefined, and storing the data as multiple named graphs today is cumbersome and implementation-dependent, and it only works if assumptions about interpretation hold. It can also be done with reification, but that is cumbersome too.

How should quoted triples behave

I have not seen any practical cases where opacity is required for a combination of asserted and quoted data, i.e. annotated data. For unasserted "suggestions" in our real use cases we would require transparent semantics (to be able to navigate to and understand the suggestions).

I would ideally be able to quote all constituent parts of the blank node expressions below. Otherwise, only the arc with the blank node would be quoted, and lots of "dangling triples" would be in the asserted graph.

The problem of quoted bnodes with lots of "dangling, asserted facts" might be handled by user convention, along the lines of "all bnodes only linked to from a quoted triple are to be practically taken as belonging to the quote". But that is cumbersome and brittle.
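To see what such a convention would involve, here is a minimal sketch in plain Python. It is not taken from any RDF library: triples are modeled as tuples, a quoted triple is a tuple nested in subject or object position, blank nodes are strings starting with `_:`, and the function name `triples_belonging_to_quotes` is hypothetical. The convention is read here as "a blank node belongs to the quote if nothing outside the quote links to it", applied repeatedly so that chained nodes (such as list cells) are picked up too.

```python
def is_bnode(term):
    """Blank nodes are modeled as strings with the usual '_:' prefix."""
    return isinstance(term, str) and term.startswith("_:")


def quoted_triples(graph):
    """Yield every triple nested (quoted) in subject or object position."""
    for s, _, o in graph:
        for term in (s, o):
            if isinstance(term, tuple):
                yield term


def triples_belonging_to_quotes(graph):
    """Collect asserted triples that the convention would assign to a quote.

    Assumed reading of the convention: a blank node is owned by the quote
    when it is linked to from inside the quote and from no asserted triple
    outside it. The loop runs to a fixpoint so chains of bnodes are followed.
    """
    quoted = set(quoted_triples(graph))
    owned = set()
    changed = True
    while changed:
        changed = False
        inside = quoted | owned
        outside = [t for t in graph if t not in owned]
        candidates = {o for (_, _, o) in inside if is_bnode(o)}
        externally_linked = {o for (_, _, o) in outside if is_bnode(o)}
        for node in candidates - externally_linked:
            for t in outside:
                if t[0] == node and t not in owned:
                    owned.add(t)
                    changed = True
    return owned


# An unasserted, annotated suggestion whose list cells are asserted triples.
graph = [
    (("<report>", "bibo:authorList", "_:l0"), "dc:source", "<c>"),
    ("_:l0", "rdf:first", "<c>"),
    ("_:l0", "rdf:rest", "_:l1"),
    ("_:l1", "rdf:first", "<b>"),
    ("_:l1", "rdf:rest", "rdf:nil"),
    ("<report>", "dc:title", '"Report"'),
]

for triple in sorted(triples_belonging_to_quotes(graph)):
    print(triple)
```

Even in this toy form the rule needs a fixpoint computation over the whole graph, and a single extra asserted link to one of the blank nodes changes the result, which illustrates why the convention is brittle.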

It is conceivable that other use cases would prefer to "quarantine" chunks from external sources or automatically computed suggestions (e.g. using machine learning). We would use actual literals for that, probably in combination with blank nodes (thus increasing the number of triples in the chunk). But if named graphs were to have conditional "opacity" (depending on whether they are "accepted" or treated separately from the active interpretation), this would be a useful alternative. (Literals of course allow for quoting only certain subjects or objects, for instance.)

Example 1: annotating a description of something unknown

To quote something described but unknown, you can do this in Notation 3:

<charlesdodgson> :says { [] :name "Alice" ; :birthDate "1852" } .

This in TriG:

<charlesdodgson> :says _:g1 .
_:g1 { [] :name "Alice" ; :birthDate "1852" }

But in Turtle-star, you have to do this:

<charlesdodgson> :says << _:b1 :name "Alice" >> , << _:b1  :birthDate "1852" >> .

Example 2: Annotating Chunks of Triples

This is bad practice (since an abstract triple is not an occurrence in itself):

  << _:b1 :givenName "Alice" >> dc:source <https://en.wikipedia.org/wiki/Alice_Liddell> ;
    dc:date "2023-10-23" .
  << _:b1 :familyName "Liddell" >> dc:source <https://en.wikipedia.org/wiki/Alice_Liddell> ;
    dc:date "2023-10-23" .
  << _:b1 :birthDate "1852-05-04" >> dc:source <https://en.wikipedia.org/wiki/Alice_Liddell> ;
    dc:date "2023-10-23" .

This is more correct:

<< _:b1 :givenName "Alice" >> rdfg:subGraphOf _:d1 .
<< _:b1 :familyName "Liddell" >> rdfg:subGraphOf _:d1 .
<< _:b1 :birthDate "1852-05-04" >> rdfg:subGraphOf _:d1 .
_:d1 dc:source <https://en.wikipedia.org/wiki/Alice_Liddell> ;
  dc:date "2023-10-23" .

RDF 1.1 Semantics defines:

A subgraph of an RDF graph is a subset of the triples in the graph. A triple is identified with the singleton set containing it, so that each triple in a graph is considered to be a subgraph.

Given that definition, the rdfg:subGraphOf pattern above is roughly right but not an exact match, since a triple being identified does not (necessarily) mean it is denoted. Cf. (from the same section):

For example, an IRI used as a graph name identifying a named graph in an RDF dataset may refer to something different from the graph it identifies.

The following is already possible in TriG, but may mean something else:

_:d1 {
  [] :givenName "Alice" ;
    :familyName "Liddell" ;
    :birthDate "1852-05-04" .
}
_:d1 dc:source <https://en.wikipedia.org/wiki/Alice_Liddell> ;
  dc:date "2023-10-23" .
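The stand-off pattern with rdfg:subGraphOf from this example can be generated mechanically. The following is a small sketch in plain Python that only builds Turtle-star text. The function name `annotate_chunk` and its parameters are hypothetical, terms are passed as already-serialized strings, and no prefix declarations or validation are included.

```python
def annotate_chunk(triples, group, annotations):
    """Return Turtle-star text linking each quoted triple to one group node.

    triples:     iterable of (subject, predicate, object) strings
    group:       the node standing for the chunk, e.g. "_:d1"
    annotations: list of (predicate, object) strings stated once on the group
    """
    lines = [
        f"<< {s} {p} {o} >> rdfg:subGraphOf {group} ."
        for s, p, o in triples
    ]
    # The provenance is attached to the group node, not to each triple.
    pairs = " ;\n  ".join(f"{p} {o}" for p, o in annotations)
    lines.append(f"{group} {pairs} .")
    return "\n".join(lines)


print(annotate_chunk(
    [
        ("_:b1", ":givenName", '"Alice"'),
        ("_:b1", ":familyName", '"Liddell"'),
        ("_:b1", ":birthDate", '"1852-05-04"'),
    ],
    "_:d1",
    [
        ("dc:source", "<https://en.wikipedia.org/wiki/Alice_Liddell>"),
        ("dc:date", '"2023-10-23"'),
    ],
))
```

The source and date are stated once, on the group node, however many triples the chunk contains.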

Example 3: RDF Lists

Unsurprisingly, the cons-cell nature of ordered lists as triples unravels at the seams here.

You cannot easily quote the entire list, only the triple that links to it. So this:

<report> bibo:authorList (<a> <b> <c>) {| dc:source <a> |} .

Means this:

<report> bibo:authorList _:l0 .
<< <report> bibo:authorList _:l0 >> dc:source <a> .
_:l0 rdf:first <a>; rdf:rest (<b> <c>) .

Instead of the preferred:

<report> bibo:authorList (<a> <b> <c>) .
_:g1 { <report> bibo:authorList (<a> <b> <c>) }
_:g1 dc:source <a> .
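The gap between the actual and the preferred meaning can be made concrete with a small sketch in plain Python. It models triples as tuples and a quoted triple as a nested tuple. The function name `expand_annotated_list` and the blank node labels `_:l0`, `_:l1`, ... are assumptions made for illustration, not the output of any particular parser.

```python
def expand_annotated_list(subject, predicate, items, annotations):
    """Expand `subject predicate (items...) {| annotations |}` into triples.

    Only the arc from the subject to the head of the list is quoted and
    annotated; the rdf:first / rdf:rest cells are plain asserted triples.
    """
    labels = [f"_:l{i}" for i in range(len(items))]
    head = labels[0] if items else "rdf:nil"
    arc = (subject, predicate, head)
    triples = [arc]
    # The annotation block applies to the head arc only.
    triples += [(arc, p, o) for p, o in annotations]
    for i, item in enumerate(items):
        rest = labels[i + 1] if i + 1 < len(items) else "rdf:nil"
        triples.append((labels[i], "rdf:first", item))
        triples.append((labels[i], "rdf:rest", rest))
    return triples


triples = expand_annotated_list(
    "<report>", "bibo:authorList", ["<a>", "<b>", "<c>"],
    [("dc:source", "<a>")],
)
annotated = [t for t in triples if isinstance(t[0], tuple)]
print(len(triples), "triples in total,", len(annotated), "annotation")
```

In this model the six triples that make up the list cells carry no provenance at all, which is the problem the preferred named-graph reading avoids.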

Here is a combination of one "chosen" list and one "suggested" list, using a suggested new syntax for unasserted, annotated triples:

<report> bibo:authorList (<a> <b> <c>) {| dc:source <a> ; ex:disputedBy <c> |} ,
  -- (<c> <b> <a>) {| dc:source <c> |} .

Preferably meaning:

<report> bibo:authorList (<a> <b> <c>) .
_:g1 { <report> bibo:authorList (<a> <b> <c>) }
_:g1 dc:source <a> ; ex:disputedBy <c> .
_:g2 { <report> bibo:authorList (<c> <b> <a>) }
_:g2 dc:source <c> .

Analysis

This is a variation on other use cases where information is associated with multiple triples. This can be done in a number of ways. With quoted triples, a good approach is a separate stand-off relationship from each quoted triple to a node representing the group of triples. Whether transparency is needed probably depends on the underlying information.