
Undeterministic roundtripping between Turtle-star annotations syntax and N-triples #97

@rat10


Turtle-star provides two idioms for annotating statements with triple terms: the annotation syntax and reified triples. The following example uses both to express that we know for a fact that Alice likes swimming in the ocean during summertime, but we don't know, or even doubt, that she likes swimming indoors in winter.

:Alice :likes :Swimming {| :during :Summer ; :at :Seaside |} .
<< :Alice :likes :Swimming >> :during :Winter ; :at :IndoorPublicPool .

To save these statements in an RDF triple store, they need to be converted to N-Triples (shown here with prefixed names for readability):

:Alice :likes :Swimming .
_:r1 rdf:reifies <<( :Alice :likes :Swimming )>> .
_:r1 :during :Summer .
_:r1 :at :Seaside .
_:r2 rdf:reifies <<( :Alice :likes :Swimming )>> .
_:r2 :during :Winter .
_:r2 :at :IndoorPublicPool .
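The expansion above can be sketched in a few lines of Python. This is only an illustrative model, not a parser: statements are hand-written tuples, the `expand` function and the `"annotated"`/`"reified"` tags are hypothetical names, and blank node labels are assigned by position, which a real tool would not guarantee.

```python
from itertools import count

def expand(statements):
    """Expand (kind, triple, annotations) entries into a set of triples.

    kind is "annotated" for the {| ... |} idiom or "reified" for << ... >>.
    Both names are assumptions made for this sketch.
    """
    fresh = count(1)
    graph = set()
    for kind, triple, annotations in statements:
        if kind == "annotated":
            graph.add(triple)  # only the annotation syntax asserts the triple
        reifier = f"_:r{next(fresh)}"
        graph.add((reifier, "rdf:reifies", triple))
        for predicate, obj in annotations:
            graph.add((reifier, predicate, obj))
    return graph

ALICE = (":Alice", ":likes", ":Swimming")
SUMMER = [(":during", ":Summer"), (":at", ":Seaside")]
WINTER = [(":during", ":Winter"), (":at", ":IndoorPublicPool")]

original = [("annotated", ALICE, SUMMER), ("reified", ALICE, WINTER)]
swapped = [("reified", ALICE, SUMMER), ("annotated", ALICE, WINTER)]

# Both inputs yield the same seven triples: the choice of idiom is not recorded.
same_graph = expand(original) == expand(swapped)
```

In this model, which reifier came from the annotation syntax is simply not part of the resulting triples, so nothing in the stored graph can tell a serializer how to restore the original layout.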

Converting those triples back to Turtle-star can result in the initial rendering:

:Alice :likes :Swimming {| :during :Summer ; :at :Seaside |} .
<< :Alice :likes :Swimming >> :during :Winter ; :at :IndoorPublicPool .

However, it can equally well result in any of the following renderings:

<< :Alice :likes :Swimming >> :during :Summer ; :at :Seaside .
:Alice :likes :Swimming {| :during :Winter ; :at :IndoorPublicPool |} .

or

:Alice :likes :Swimming .
<< :Alice :likes :Swimming >> :during :Summer ; :at :Seaside .
<< :Alice :likes :Swimming >> :during :Winter ; :at :IndoorPublicPool .

or

:Alice :likes :Swimming {| :during :Summer ; :at :Seaside |} .
:Alice :likes :Swimming {| :during :Winter ; :at :IndoorPublicPool |} .

This degree of variance is problematic.
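To make the variance concrete, here is a minimal sketch that enumerates the renderings a serializer could pick for one asserted triple with several reifiers. It assumes a serializer chooses independently, per reifier, between the annotation syntax and the reified triple syntax; `render_options` is a hypothetical helper, and the enumeration is not exhaustive (explicit blank node reifiers, for instance, are left out).

```python
from itertools import product

def render_options(triple, annotation_groups):
    """Return every Turtle-star rendering obtained by choosing, per reifier,
    between the annotation syntax and the reified triple syntax."""
    s, p, o = triple
    plain = f"{s} {p} {o}"
    results = []
    for choice in product(("annotation", "reified"), repeat=len(annotation_groups)):
        lines = []
        if "annotation" not in choice:
            # No annotation block asserts the triple, so state it separately.
            lines.append(f"{plain} .")
        for kind, group in zip(choice, annotation_groups):
            body = " ; ".join(f"{pred} {obj}" for pred, obj in group)
            if kind == "annotation":
                lines.append(f"{plain} {{| {body} |}} .")
            else:
                lines.append(f"<< {plain} >> {body} .")
        results.append("\n".join(lines))
    return results

options = render_options(
    (":Alice", ":likes", ":Swimming"),
    [
        [(":during", ":Summer"), (":at", ":Seaside")],
        [(":during", ":Winter"), (":at", ":IndoorPublicPool")],
    ],
)
for text in options:
    print(text, end="\n\n")
```

For the two reifiers of the example this yields the four renderings shown above, and under the same assumption the count doubles with every additional reifier of the same triple.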

First of all, it's simply confusing that a given construct can be presented in so many different ways. One may argue that it is not uncommon in RDF to be able to express the same meaning in different ways, but that predominantly concerns variations in modelling style, which are grounded in RDF's inherent flexibility. Blank node identifiers are another common cause of variance and of some confusion, but that is limited to one kind of node. A list expressed with Turtle syntactic sugar looks very different from its representation as plain triples. However, that only concerns the (necessary) difference between a bare-bones triple structure and its syntactic sugar, not the resulting list representation.
This case is different from all of those: mapping from a specific N-Triples construct to its specific counterpart in a surface syntax shouldn't be subject to such variance.

What's more, these different renderings all have the same meaning per the RDF semantics, but they convey very different intuitive meanings to users not expertly trained in RDF. The initial example conveys a clear intuitive reading, as suggested in its introduction, and I claim that at least 9 out of 10 users will read it exactly that way - and not the way the RDF semantics defines it. The other mappings convey very different intuitive readings w.r.t. what we know about Alice's swimming preferences: outdoors but not indoors, indoors but not outdoors, indoors and outdoors, or maybe neither. However, the only thing we can actually say for sure according to the RDF semantics is that Alice likes swimming. That makes the whole annotation mechanism brittle and hard to interpret correctly. The provided syntactic sugar suggests otherwise, but those suggestions are illusions that can't be relied on in production, i.e. when roundtripping between triple store and Turtle serialization.

A surface syntax has to cater to the intuitions that users can be expected to have. RDF doesn't have the best track record in that respect, but most of the well-known pitfalls, e.g. w.r.t. blank node naming or ordering of triples, are at least well substantiated by its integration-focused semantics. Many users find RDF too complicated despite those good reasons, and despite the practical applications for such ambiguity. The case discussed here is not without reason either: an annotation without entailment powers can't annotate a triple in the graph - it can only ever annotate "such a triple", i.e. a proposition. Anything beyond that is mere coincidence. The annotation syntax suggests that it bridges that gap, but in reality it doesn't, as the following example may illustrate even more vividly:

:Alice :likes :Swimming {| :during :Summer ; :at :Seaside |} .

is expressed in N-Triples as

:Alice :likes :Swimming .
_:r1 rdf:reifies <<( :Alice :likes :Swimming )>> .
_:r1 :during :Summer .
_:r1 :at :Seaside .

which can again be mapped to Turtle in very different ways:

:Alice :likes :Swimming {| :during :Summer ; :at :Seaside |} .

or

:Alice :likes :Swimming .
<< :Alice :likes :Swimming >> :during :Summer ; :at :Seaside .

The first mapping tells us that Alice likes swimming in summer at the seaside, whereas the second may suggest that Alice likes swimming, except in summer at the seaside. The variety of interpretations - and therefore the potential for misunderstanding - that the different mappings can convey is extraordinary even by RDF's standards, and although it is well substantiated in the semantics, it feels like an oversight.

The initial example is perfectly legal, and it conveys a strong intuition, namely that RDF is able to talk about asserted and unasserted statements independently of each other. However, RDF is really only able to talk about unasserted statements as long as they are not also asserted with different annotations, or with no annotations at all. This creates a tension that can be resolved by mapping the annotation syntax to a different property, as has been proposed with rdfs:states (*). If this tension is not resolved, it creates a huge potential for unpleasant surprises, and the perception that even RDF 1.2's new annotation mechanism with its sleek syntactic sugar is prone to RDF's "typical" complexity.

EDIT:
(*) The proposal of rdfs:states has been discussed at length in issue #128, "map the annotation syntax to rdfs:states". That issue has become too long-winded to be useful anymore, and it will be closed soon and replaced by a new proposal that is currently in the works.
