Should RDF* be just syntactic sugar on top of RDF? #37
Comments
DISCLAIMER: if we decided that RDF* can be encoded in standard RDF (e.g. using standard reification), that would not impose that implementers store it that way in their systems. They could still use internal optimizations, e.g. to avoid the overhead of storing additional 3 or 4 triples for every reified triple... |
Good issue. If we can decide this, we can move on. On the idea of encoding in triples -- using multiple triples, and a well-formed-ness condition -- as a basic level feature: Long ago, Jena had special code to optimize reification in storage. Jena does not have such code any longer - it was virtually unused and created a significant cost on building storage subsystems. The trouble is not reification itself, it is the well-formed-ness condition and handling of partial reification, unfinished reification and wrong reification. Suppose you have The impl has to undo plain storage, or buffer in memory, in case the partial triples are completed - neither is very nice. If we want to enable lightweight RDF toolkits that work with any data, if we want toolkits in as many languages and technology ecosystems as possible, adding to the implementation costs is a factor. The same occurs with RDF lists (collections). For lists, the encoding of a structure in RDF triples makes it a lot of work to optimize as well as to handle all the "incorrect" cases; two Code may still access these as RDF triples - what's The lesson I draw is that encoding in triples and having assumptions, yet still being general RDF, makes implementation harder, by which I mean, a significant amount of work, for what is, by and large, a corner case. If a system is specialised for a usage pattern of RDF and not the general case - different story (are there any apart from parsing RDF into OWL?). Encoding structures and the general case is not a free lunch. At the syntax level, let alone semantic interpretation, the advantage of adding a new kind of RDF term in Then there is the utility of syntax like PS |
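To make the well-formed-ness burden concrete, here is a minimal sketch (not Jena's code, and not anything proposed in this thread) of the check a store would need before it could treat a group of reification triples as one unit. The function name, the status labels and the sample data are all assumptions made for illustration; terms are plain strings, and a node counts as "complete" when it has exactly one subject, predicate and object, whether or not the `rdf:type rdf:Statement` triple is present.

```python
from collections import defaultdict

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_TYPE = RDF + "type"
RDF_STATEMENT = RDF + "Statement"
PARTS = (RDF + "subject", RDF + "predicate", RDF + "object")


def classify_reifications(triples):
    """Map every node that uses reification vocabulary to
    'complete', 'partial' or 'conflicting' (hypothetical labels)."""
    seen = defaultdict(lambda: defaultdict(set))
    for s, p, o in triples:
        if p in PARTS:
            seen[s][p].add(o)
        elif p == RDF_TYPE and o == RDF_STATEMENT:
            seen[s]  # register the node even if no parts ever arrive

    result = {}
    for node, parts in seen.items():
        if any(len(values) > 1 for values in parts.values()):
            result[node] = "conflicting"  # e.g. two rdf:subject values
        elif all(part in parts for part in PARTS):
            result[node] = "complete"
        else:
            result[node] = "partial"  # may or may not be completed later
    return result


# Illustrative data only: one complete, one unfinished, one wrong reification.
triples = [
    ("_:r1", RDF + "subject", "ex:a"),
    ("_:r1", RDF + "predicate", "ex:b"),
    ("_:r1", RDF + "object", "ex:c"),
    ("_:r2", RDF + "subject", "ex:a"),
    ("_:r3", RDF + "subject", "ex:a"),
    ("_:r3", RDF + "subject", "ex:x"),
    ("_:r3", RDF + "predicate", "ex:b"),
    ("_:r3", RDF + "object", "ex:c"),
]
status = classify_reifications(triples)
```

Note that the classification is only stable once all the data has been seen: `_:r2` is "partial" now but could become "complete" with two more triples, which is the buffering-or-undo problem described above.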
I was recently aggravating myself by imagining if and how much the SemWeb would be more useful if ordered, bounded collections were first class objects. IMO, their lack has cost us years and countless opportunities.
Can you say what that should mean? Does it just turn the :source arc around from
Not immediately; it just took 15 years. |
No, instead, see #9 |
Today we had a fruitful discussion on the link between RDF* and standard reification. A straw poll followed, showing that a majority in the call would be happy with RDF* being syntactic sugar for standard reification (possibly with more constraints). |
I believe that the question of whether RDF* is (or should be) just syntactic sugar for RDF reification may be confusing or misleading to some. From an implementer's perspective, "syntactic sugar" may be interpreted as something that would be replaced during parsing and not considered internally in a system. However, I guess that this is not the intention of the question. Instead, I assume that the question is more about the semantics of embedded triples rather than about requiring systems to actually convert RDF* graphs into RDF graphs. |
"However, I guess" and "Instead, I assume" don't generally lead to good results. Here, for instance, they appear to lead you in a direction with which I do not agree, and with which I believe many others in the RDF-using community will take exception. If you're not clear on what someone means, you should ask them to confirm your interpretation and/or otherwise clarify whatever they've said. From my perspective, syntactic sugar provides writers of a syntax with simplified ways to write complex things, where such simplified notation can always be losslessly translated to and from the complex notation. In other words, it is intimately and inseparably bound to that syntax which it is sweetening (hence, sugar). In Turtle, one such piece of syntactic sugar allows authors to write Another piece of sugar allows the use of Note that neither of these pieces of sugar do anything to the model they are expressing. Assuming the That is, it might connote --
On the other hand, it might require literals in the subject position, which, while part of non-normative Generalized RDF, is not otherwise part of the RDF TR. (See 3.1 Triples.) In this case, On the gripping hand, given my reading thus far, you do not really mean Turtle* to "just" or "simply" be syntactic sugar for Turtle. I think you cannot mean this, because then there would be no need for RDF*, which implies changes to the underlying model -- which require things be much more rigorously considered than they seem to have been to date -- and which cannot simply be syntactic sugar because the model is much more than any of the syntaxes used to express it. |
Personally, I'd like to consider using the Reification vocabulary as a reification of RDF* embedded triples, which may be useful for some implementations, but it's useful to be able to consider triples as their own entities. This differs from our recent straw-poll, and that ship may have sailed. But, IIRC, RDF/XML reification would create a separate blank node for each reified statement for each separate use of
Results in the following:
[ a rdf:Statement;
  rdf:subject :a;
  rdf:predicate :b;
  rdf:object :c;
  :d :e;
  :f :g
] .
And not the less-lean variant:
[ a rdf:Statement;
  rdf:subject :a;
  rdf:predicate :b;
  rdf:object :c;
  :d :e
] .
[ a rdf:Statement;
  rdf:subject :a;
  rdf:predicate :b;
  rdf:object :c;
  :f :g
] . |
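The difference between the two outputs can be sketched as a mapping choice: reuse one reification node per distinct embedded triple, or mint a fresh node per occurrence. The following is a rough illustration only. The `reify` function, the `shared` flag, the `_:tN` node labels and the input annotations are assumptions of mine, not part of any RDF* draft.

```python
def reify(annotations, shared=True):
    """Expand (embedded_triple, predicate, object) annotations into
    standard reification triples.

    shared=True  -> one node per distinct embedded triple (the lean form)
    shared=False -> a fresh node for every occurrence
    """
    nodes = {}      # embedded triple -> node label, used only when shared
    out = []
    counter = 0
    for embedded, p, o in annotations:
        node = nodes.get(embedded) if shared else None
        if node is None:
            counter += 1
            node = f"_:t{counter}"
            if shared:
                nodes[embedded] = node
            es, ep, eo = embedded
            out += [
                (node, "rdf:type", "rdf:Statement"),
                (node, "rdf:subject", es),
                (node, "rdf:predicate", ep),
                (node, "rdf:object", eo),
            ]
        out.append((node, p, o))
    return out


# Assumed input: two annotations on the same embedded triple.
annotations = [
    ((":a", ":b", ":c"), ":d", ":e"),
    ((":a", ":b", ":c"), ":f", ":g"),
]
lean = reify(annotations, shared=True)             # one statement node
per_occurrence = reify(annotations, shared=False)  # two statement nodes
```

With `shared=True` both annotations hang off a single statement node; with `shared=False` the result has two statement nodes that describe the same triple, matching the second variant shown above.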
Is
the same as
|
It is not an entirely settled issue, but IMO << :a :b :c >> identifies a unique triple, and so its reified representation using |
Straw polls are not binding in any case. In this case, any resolutions based on that straw poll are less than 7 days old, so should be tentative pending ratification (including by being ignored) by those of the group who were not present at the last meeting (which includes me, and, I guess, you). I'm not sure whether I agree with your description here, but I'm pretty sure I disagree with the straw poll, and I aim to join the next call. (Please bear with me, folks; I'm gradually returning to full duties after a rough year of cancer treatments and side effects [prognosis is that I should now be fully clear, with recurrence a low likelihood].) (more coming, addressed to all) |
Sentences occur in multiple places, without being the same sentence, without carrying the same meaning or import, because of their context, including author, time, and probably other attributes. So, too, with RDF triples (a/k/a RDF sentences).
Or is it important thus? Perhaps it is only important to be able to say that some utterance(s) were by a child during a game, and other utterance(s) were by subject matter experts in official analyses. This level of analysis does not require a specific identity for any of the utterances, only that each cluster of provenance triples be maintained as a cluster -- e.g., as a named graph -- each of which is a description of the same combination of I submit that you can cut each and every occurrence of a given simple sentence from each and every work in which it appears, shuffle them randomly, and replace them -- each landing in a different "original source" -- and nothing would change about any of their meanings or importance. The fact that this one was printed in Garamond, and this in Helvetica, and that in Times New Roman, is not important. Nor is the fact that snippet-14 originally came from book-14 and is now in book-8, and snippet-8 is now in book-9, and snippet-9 is now in book-14. I submit that the same is true for RDF triples, and even for quads. The fourth element of those quads is not inherently important; it is only important when analyzing (or emitting) provenance or quality or similar qualities of those triples. (This is part of why the fourth element was given such short shrift in the original RDF development efforts; getting the other three right was far more important.)
I would say "yes," because the |
No, I really meant "syntactic sugar" -- and not quite advocating it, by the way 😉 . I think that embedded triples as syntactic sugar work well as long as they contain only ground terms. Blank nodes (as usual), make things trickier... |
Well, consider
It is not the same as
despite the fact that So it is legitimate to ask that question for
If that was the case, then your answer should be "no" rather than "yes" (because of the |
Quite so. Syntactic sugar in general, and the Keeping track of the entities referred to by such pronouns (blank nodes) is too often a challenge. Hence why they (pronouns and blank nodes, both) should be avoided except when absolutely necessary, or when clarity is easy to achieve -- which situations do exist, just far less frequently than many seem to think. Your example does bring me to note that
-- and will as long as those three triples are found in the same graph. The glory of pronouns that give no hints as to their referents (gender, plurality, am I forgetting another?, a la "he" vs "she" vs "it" vs "they")! |
We should have both options. Let an embedded triple denote the same thing everywhere, as you suggest. This is a useful default as it is a semantically correct way to add facts to facts, and even if it isn't semantically correct it often still works as applications provide context and data is targeted at applications. So: |
@rat10 just to be clear (and because you made this comment in a thread about "syntactic sugar"): do you consider that |
We know that reification is considered too verbose; we ought to address this concern. My interpretation of "syntactic sugar" is "behaves like". That is not "exactly the same". A system that expanded the new RDF* syntax to reification (controlled, documented mapping - e.g. one reification per unique triple term) will capture the same information and can pass the translation around. Details may differ. Once translated, it responds to matching "rdf:subject" while a triple-term system does not; counting triples differs. Mixing RDF* and existing reification is undefined if they overlap. |
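A small sketch of "behaves like" versus "exactly the same", assuming a toy in-memory model where a triple-term system stores the embedded triple as a nested tuple and a translating system stores its expansion. The `match` helper, the `:source` predicate and the `_:t1` label are made up for the example; no real store's API is implied.

```python
def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


embedded = (":a", ":b", ":c")

# A system with triple terms: the embedded triple is itself the subject.
triple_term_store = [
    (embedded, ":source", ":doc1"),
]

# A system that translated the same statement to standard reification.
expanded_store = [
    ("_:t1", "rdf:type", "rdf:Statement"),
    ("_:t1", "rdf:subject", ":a"),
    ("_:t1", "rdf:predicate", ":b"),
    ("_:t1", "rdf:object", ":c"),
    ("_:t1", ":source", ":doc1"),
]

# Same information: each store yields exactly one :source annotation.
sources_a = match(triple_term_store, p=":source")
sources_b = match(expanded_store, p=":source")

# Different details: only the expanded store answers an rdf:subject pattern,
# and the triple counts differ.
subject_hits_a = match(triple_term_store, p="rdf:subject")
subject_hits_b = match(expanded_store, p="rdf:subject")
```

Both stores answer the `:source` query, but only the expanded one has anything to say about `rdf:subject`, and a count of triples gives 1 in one system and 5 in the other.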
In RDF* defined as syntactic sugar for RDF standard reification the
Some technicalities:
No violence, please! ;-) |
And if it was not (that is, if
? |
I'm a little worried about the I'm also wondering how @rat10 came to the conclusion that Virtuoso is a quint store. I'm pretty sure that we store |
Yes, of course. I'm happy without any RDF standard reification syntax altogether. @TallTed |
@rat10 I think this is the blog post you meant? Events have indeed overtaken some of its content (not surprising after 11 years). The default RDF indexing, for instance, is now a 2+3 combination over the SPOG table --
-- which is sufficient for most needs. Additional indexes are sometimes created for specific deployment needs, but adding columns to the table is vanishingly rare. There was no substantive interest in adding a |
This was discussed during the call on 2021-01-15 https://w3c.github.io/rdf-star/Minutes/2021-01-15.html#item03 |
In other words, does RDF* need its own abstract syntax and semantics, or can it be "encoded" in standard RDF?
It largely depends on the answer to issue #22.
a) either we promote RDF* triples as a new kind of term, as done in the original papers and the current version of the report, or
b) we extend RDF's semantics with a built-in datatype for representing IRIs and literals, and we represent RDF* triples using an adapted form of reification.
To illustrate the last bullet:
could be seen as syntactic sugar for
There are several reasons why I believe this modelling needs a small extension to RDF semantics, but I'll develop them if we come to a point where we consider this option seriously...