Include Annotation syntax in Turtle* and SPARQL* #9
Just for the record at this point: Related to the extension of the Turtle* format as discussed in this issue, the RDF* data model itself (i.e., the "abstract syntax") may also be extended along the same lines. In an email on the mailing list I have outlined how such an extension can be defined. |
@hartig I don't see any need to extend the current abstract syntax. As expressed by the examples above, my understanding of the Annotation syntax is that it is purely syntactic sugar, to avoid repeating the annotated triples. Or do you consider that the first snippet (using |
No, absolutely not. Here is the reason for me considering a related extension of the RDF* data model: Once you have added such an annotation syntax to Turtle*, the natural next step is to extend the (user-facing) syntax of SPARQL* in the same way. That is, to allow for queries to look as follows:
Then, of course, such expressions may also be considered purely as syntactic sugar for the following:
However, I am thinking now that it might be useful to define the query semantics for the first query pattern directly rather than having an implicit definition that relies on some syntactic rewriting. I mean, assume there are systems that employ physical data structures to directly support the first query pattern; such systems are not going to rewrite the pattern into the (equivalent) second version, and they are not going to rewrite the Turtle* annotation syntax (using {| ... |}) into Turtle* that uses only << ... >>. So, based on this thinking, a necessary first step for defining the query semantics directly for the first pattern, is to extend the RDF* data model. |
Ok, thanks for the explanation, I see more clearly your motivation now. As we agree that both pieces of data are essentially equivalent, then both queries above should give the same result, regardless of the concrete syntax used to feed the database. With your proposal, it seems to me that we would have to describe how to handle 4 different cases: {data-annotation, data-raw}×{query-annotation, query-raw}. Some implementers might want to go through that trouble for optimisation purposes, but on the specification side, I really think we should stay minimal and simple. |
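To make the "syntactic sugar" reading concrete, here is a minimal Python sketch of the rewriting discussed above. It assumes that an annotated triple `s p o {| a b |}` stands for the asserted triple plus one `<< s p o >> a b` statement per annotation. The function name `desugar` and the sample terms `:source` and `:census` are hypothetical and only serve the illustration.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]


def desugar(triple: Triple, annotations: List[Tuple[str, str]]) -> List[str]:
    """Rewrite `s p o {| a b ; ... |}` into statements that use only << ... >>.

    Assumption: the annotated triple is itself asserted, and every
    predicate/object pair in the annotation block describes that triple.
    """
    s, p, o = triple
    lines = [f"{s} {p} {o} ."]  # the base triple is asserted
    for pred, obj in annotations:
        # each annotation becomes a statement about the embedded triple
        lines.append(f"<< {s} {p} {o} >> {pred} {obj} .")
    return lines


if __name__ == "__main__":
    for line in desugar((":bob", ":age", "42"), [(":source", ":census")]):
        print(line)
```

Under this reading, data loaded from either surface form ends up as the same set of statements, so the four data/query combinations collapse into one case at the abstract level.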
Just a suggestion: Wouldn't a syntax like
look more human-readable, since it would recall annotation syntax in other languages? |
I am not entirely sure how a system which supports the PG mode only should then load Or is the SA semantics going to be a MUST for all RDF* implementations (that'd moot my question)? |
I wouldn't want to make SA a must. Instead, exactly as you suggest, I would propose that the parser of a PG-mode-only system rejects expressions of the form
and only accepts expressions of the form
|
Right, that would make our life easier (as a PG-only system) but I'm a bit skeptical that an SA system would be happy to export the data using the Put simply, interoperability between SA and PG systems could be tricky even for datasets on which their behaviour should coincide, i.e. where all embedded triples are asserted. |
I would say that it depends on the data structures that the system uses internally. For instance, if the triple (:bob, :age, 42) is asserted, then an SA-only system probably has this triple separately in its indexes (in addition to the nested triples that contain that triple as an embedded triple). |
I’d be wary about describing behavior based on surface syntax; I would rather understand this in terms of the resulting abstract syntax. What if a client retrieves the data in another format, such as N-Triples*? What if the input is streamed? Do you need to have parsed a base triple first before an annotation is valid? Also, having some systems support PG and others SA is a road to incompatibility. I’d say that if a client parses input which has an SA assertion, and there is no matching base triple, it would be a validity constraint violation. Similarly, if the semantics of an annotation implied that a triple does not exist, that would be a validity constraint violation when operating under such a regime. |
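One possible way to phrase such a validity constraint at the level of the abstract syntax is sketched below in Python. It is only an illustration under stated assumptions: triples are plain tuples of strings, the check runs after the whole input has been read (so the order of base triple and annotation in a stream does not matter), and the name `unasserted_annotations` is hypothetical.

```python
from typing import Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]
# (embedded triple, annotation predicate, annotation object)
Annotation = Tuple[Triple, str, str]


def unasserted_annotations(
    asserted: Set[Triple], annotations: Iterable[Annotation]
) -> List[Annotation]:
    """Return the annotations whose embedded triple has no matching base triple.

    Assumption: this runs once parsing is complete, so it is independent of
    the concrete syntax and of the order in which statements arrived.
    """
    return [ann for ann in annotations if ann[0] not in asserted]


if __name__ == "__main__":
    asserted = {(":bob", ":age", "42")}
    annotations = [
        ((":bob", ":age", "42"), ":source", ":census"),
        ((":alice", ":age", "30"), ":source", ":census"),
    ]
    # A non-empty result would be reported as a validity constraint violation.
    print(unasserted_annotations(asserted, annotations))
```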
I like the idea of reusing
Curly brackets, in SPARQL or TriG, have a history of containing full triples. Square brackets, on the other hand, contain predicate-object lists. |
Noting that the annotation block can be multiline and quite big, having paired delimiters makes the visual pairing easier and parsers can have better error messages. |
At least in Turtle, Other character choices like Some other characters are in SPARQL and while positional (property paths) it is visually confusing to overload. https://github.com/w3c/rdf-star/blob/main/tests/turtle/syntax/turtle-star-annotation-2.ttl
The point here is to use the blank nodes to have separate groups relating to one triple, carrying on the "use modelling" style for some use cases and keeping the groups apart. |
It may be obvious, but these are the changes I made to the Turtle* EBNF:
Note that this allows empty annotations, and potentially annotating members of a collection, although I do not support this in my parser. It also doesn't allow annotating an embedded triple (which uses Also, it could potentially allow recursive annotations within an annotation: :a :b :c {| :d :e {| :f :g |} |} . Certainly, these considerations are subject to discussion. Also, IMO, given that |
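As an illustration of what recursive annotations could mean, the following Python sketch expands the example `:a :b :c {| :d :e {| :f :g |} |} .` into statements that use only `<< ... >>`. It assumes that a nested annotation block describes the annotation triple that immediately precedes it. That interpretation, the `Annotation` class and the `expand` function are assumptions made for this sketch, not something the thread has settled.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Annotation:
    predicate: str
    obj: str
    # annotations attached to this annotation triple (may be empty)
    nested: List["Annotation"] = field(default_factory=list)


def expand(subject: str, predicate: str, obj: str,
           annotations: List[Annotation]) -> List[str]:
    """Expand an annotated triple into statements using only << ... >>."""
    lines = [f"{subject} {predicate} {obj} ."]
    embedded = f"<< {subject} {predicate} {obj} >>"
    for ann in annotations:
        # The annotation triple has the embedded triple as its subject and
        # may itself carry annotations, hence the recursion.
        lines.extend(expand(embedded, ann.predicate, ann.obj, ann.nested))
    return lines


if __name__ == "__main__":
    nested = [Annotation(":d", ":e", [Annotation(":f", ":g")])]
    for line in expand(":a", ":b", ":c", nested):
        print(line)
```

An empty annotation block, which the grammar above also permits, simply yields the base triple under this sketch.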
https://www.w3.org/TR/turtle/#grammar-production-predicateObjectList |
My reading (not checked by computer... yet!):
But - and this is the only other use of
To not have annotation syntax there, have |
We could do that, but it's not strictly necessary for the grammar to limit the usage inherently, it could be done with prose. Adding the grammar rule is simple enough, but hell is paved with a series of "simple" changes. |
prose is a last resort! |
The computer says "yes" (including no annotation in RDF collections, which is not too disruptive). I've experimented with the syntax in both a hand crafted parser (faster) and one that is JavaCC that follows the spec text, with already Turtle* is not a redesign of Turtle so what would be bad is non-local changes: having to rewrite a significant proportion of the turtle just for some RDF* feature or to localize to certain cases only. So far - that is not looking likely with the current general proposal for annotation. |
FWIW I made a new PR #58 with preview enabled -- this makes it easier to discuss on a concrete proposal: https://pr-preview.s3.amazonaws.com/w3c/rdf-star/pull/58.html |
@gkellogg -- I now built a Turtle parser that exactly follows the Turtle and RDF* grammars so as to check details. A different way to handle "annotation" and have it not appear in a collection is to put it in the "objectList" production rather than "object". Then "collection" is unchanged.
This contains the RDF* changes a little better than having a new "listElt" (was "objectC") that duplicates the plain Turtle object rule (object without annotation). Having the annotation like this made it a little easier to generate the triples (it's a streaming parser emitting triples as the parse runs); the object rule returns an RDF Term which is passed, with subject and predicate from the input to objectList, to the annotation production. |
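The control flow described here can be sketched in Python as follows. This is a toy illustration, not the actual parser: it assumes the input is already split into tokens, that an annotation body is a flat `predicate object (';' predicate object)*` list without nesting, and that malformed input is not handled gracefully. The name `parse_object_list` is hypothetical.

```python
from typing import Iterator, List, Tuple

Triple = Tuple[str, str, str]


def parse_object_list(subject: str, predicate: str,
                      tokens: List[str]) -> Iterator[Triple]:
    """Yield triples for `object annotation? (',' object annotation?)*`.

    Triples are emitted while the tokens are consumed, mimicking a
    streaming parser.
    """
    pos = 0
    while pos < len(tokens):
        obj = tokens[pos]  # the "object" rule only returns a term
        pos += 1
        yield (subject, predicate, obj)
        if pos < len(tokens) and tokens[pos] == "{|":
            # The annotation production receives subject, predicate and the
            # object that was just parsed.
            pos += 1
            embedded = f"<< {subject} {predicate} {obj} >>"
            while tokens[pos] != "|}":
                yield (embedded, tokens[pos], tokens[pos + 1])
                pos += 2
                if tokens[pos] == ";":
                    pos += 1
            pos += 1  # skip the closing '|}'
        if pos < len(tokens):
            if tokens[pos] != ",":
                raise ValueError(f"expected ',' but found {tokens[pos]!r}")
            pos += 1


if __name__ == "__main__":
    tokens = ":o1 , :o2 {| :a :b |}".split()
    for triple in parse_object_list(":s", ":p", tokens):
        print(triple)
```

Because the annotation is handled in the object list rather than in the object rule, a collection production that reuses the object rule never sees an annotation, and in an object list the annotation attaches only to the object directly before it.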
Looks good, Andy; I’ll update my parser as well. Do we have negative tests for the object-annotation-in-collection case? |
Excellent idea and I updated the EYE parser accordingly and the following
produces
Also the following
produces
|
Not yet - there are just the two annotation examples. They would be added if this is the direction the WG wishes to go. It would be helpful if the WG could decide that it wanted to explore the "annotation=PG" route so that we can start on evaluation tests. Given we are all time-poor, I think we have enough to indicate the direction, to the point where we can get some confirmation and then go back and complete the tests, rather than completing just the syntax and then moving to evaluation. Evaluation tests need a way to define results. If we define NT* with |
Yes - those are consequences.
because the annotation does not apply to (IMO Object lists are of lesser use in plain Turtle anyway!) |
Yes indeed,
annotates
|
I see how visually it can give the wrong impression. However, the rule is IMO simple enough to understand: annotations only apply to the object just before them. |
#58 is now merged. |
I am on it: #65 ;-) |
This has already been discussed on the mailing list.
The idea would be to have a notation like
as shortcut for