Grammar updates for triple terms and occurrences. #51

Merged 1 commit on Aug 1, 2024

gkellogg
Member

@gkellogg gkellogg commented Dec 20, 2023

  • Changes quotedTriple (and related) to tripleTerm (Note, this could perhaps be "Reified Triple Term", as "tripleTerm" and "reifier" have subtly different meanings).
  • Changes annotation to allow an identifier. (Note, this change makes the grammar no longer context free).
  • Depends on triple term being defined in RDF Concepts.

@gkellogg gkellogg added the spec:substantive Issue or proposed change in the spec that changes its normative content label Dec 20, 2023
@gkellogg
Member Author

My interpretation (BNF only) of @afs's proposed changes for triple terms and triple occurrences. No change to parser rules, thus far. Raw BNF in Files view, rendered via GitHack.

@gkellogg
Member Author

The nomenclature and wording in the Quoted Triples section will still require quite a bit of revision. Conceptually, we need to know how to talk about triple descriptors in relation to other triples in a graph, and how quoted triples/triple tokens/triple occurrences relate to triple descriptors and what they mean. Most of this needs to go in Concepts, but needs to be echoed in Turtle and other concrete syntaxes. Also, we may discourage the direct use of triple descriptors, favoring annotations and quoted triples/whatever.

The main point of this draft, so far, is to get the grammar and basic usage consistent with discussions.

@gkellogg gkellogg requested review from afs and TallTed February 26, 2024 21:20
@gkellogg gkellogg marked this pull request as ready for review March 6, 2024 22:10
Contributor

@pchampin pchampin left a comment

I have not finished my review of the Parsing section, but here are already a number of comments.

@gkellogg gkellogg requested a review from TallTed March 20, 2024 20:31
@gkellogg gkellogg marked this pull request as draft April 4, 2024 16:25
spec/turtle.bnf Outdated
reifier ::= '<<' ((iri | BlankNode) '|' )? subject predicate object '>>'
tripleTerm ::= '<<(' subject predicate ttObject ')>>'
ttObject ::= iri | BlankNode | literal | tripleTerm
annotation ::= '{|' ( (iri | BlankNode) '|' )? predicateObjectList '|}'

This comment was marked as outdated.

@TallTed
Member

TallTed commented Jul 17, 2024

I think Allow zero or many annotations should be Allow zero or more annotations, as I don't think there's any forbiddance of one annotation, which would seem to be excluded by "many". Possibly Allow any number of annotations though that might allow negative numbers, so maybe Allow any non-negative number of annotations....

@domel
Contributor

domel commented Jul 17, 2024

@gkellogg Could you elaborate on that. Maybe I missed something but why => (two characters) is better than | (one character)?

@gkellogg
Member Author

I think Allow zero or many annotations should be Allow zero or more annotations, as I don't think there's any forbiddance of one annotation, which would seem to be excluded by "many". Possibly Allow any number of annotations though that might allow negative numbers, so maybe Allow any non-negative number of annotations....

Can you point out where that text is in the document? I don't see it anywhere. That said, other places in this and other documents use "zero or more", so I'd be fine with that.

@gkellogg
Member Author

@gkellogg Could you elaborate on that. Maybe I missed something but why => (two characters) is better than | (one character)?

@niklasl pointed out that using | creates a problem in SPARQL, where | is a property path component, so (in SPARQL) something like {| a | p o |} could either be a reifier a on the predicate/object p o, or a property path a | p so changing that separating | is something to consider. I've seen => in some more recent notes passed around the group. It's hard to think of another single character that would seem to fit.

Whatever we do for annotation will also be necessary for reifier to be consistent.

@niklasl

niklasl commented Jul 17, 2024

See also w3c/rdf-star-wg#116. One important aspect (IMO) is that a prefix or wrapping notation is valuable for reading these, to avoid reading the name (which may be a long hash, as in Wikidata qualifiers) as a predicate (in annotations) or the subject (in the << ... >> form (triple ... descriptors?)).

We also need to be careful with whatever is added so it doesn't block any other future designs that we can foresee, or step on other syntaxes unnecessarily. In Notation 3 => is a shorthand for log:implies.

I've done some more evaluation (all the examples from the UCR plus a gamut of Wikidata data), and actually found that wrapping the name in |...| which @afs suggested (among some other alternatives) seems, with proper spacing, to be a fairly OK alternative even in annotations. I'm not sure how many other reasonable wrapping delimiters we have available.

(That's the Ruby/Rust lambda-style; also figuring in some musical and mathematical notations (|abs|), etc. I put examples in a gist.)

I've also suggested some more radical changes; but admittedly some of those cater less for what is likely the more common case (one- or many-to-one; names added for reference; annotation data still better to keep with the value). The naming-only form (with any SPARQL-compatible syntax) would still work well if you need to "tag" a bunch of triples with the name of a many-to-many reifier.

@domel
Contributor

domel commented Jul 17, 2024

@niklasl totally agree, => is used in N3. IMHO it's a bad choice.

@domel
Contributor

domel commented Jul 17, 2024

How about ~?

@gkellogg
Member Author

I've done some more evaluation (all the examples from the UCR plus a gamut of Wikidata data), and actually found that wrapping the name in |...| which @afs suggested (among some other alternatives) seems, with proper spacing, to be a fairly OK alternative even in annotations. I'm not sure how many other reasonable wrapping delimiters we have available.

That's pretty much what this version of the grammar does if we allow whitespace in the {| and |} tokens:

annotation            ::= '{' WS* '|' ((iri | BlankNode) '|')? predicateObjectList '|' WS* '}'

To make it a token, we'll need to define some terminals:

reifid               ::= ((iri | BlankNode) '|')
annotation           ::= ANNO_START reifid? predicateObjectList ANNO_END
ANNO_START           ::= '{' WS* '|'
ANNO_END             ::= '|' WS* '}'

I also have a version which defines a reifid rule ((iri | BlankNode) '|'), although I think we'll need to tweak the production names.

@niklasl

niklasl commented Jul 17, 2024

That's pretty much what this version of the grammar does if we allow whitespace in the {| and |} tokens:

annotation            ::= '{' WS* '|' ((iri | BlankNode) '|')? predicateObjectList '|' WS* '}'

I think that's still ambiguous though, unless using a preceding whitespace is to be significant? Otherwise, this: { |:x| :y :z |} still has the problem of matching as an AlternativePath in SPARQL.

I've tried the suggestion to wrap the name, like:

annotation	::=	"{|" embeddedName? predicateObjectList? "|}"
embeddedName	::=	'|' (iri | BlankNode) '|'

which parses the examples I linked above. A simplified sample:

<Q34851> :nominatedFor <Q103618> {| |<Q103618#6698506f>| a :Nomination ;
        :forWork <Q582281> ;
        :date "1958-02-18"^^xsd:date |} ,
    <Q103618> {| |<Q103618#6698cb58>| a :Nomination ;
        :forWork <Q713979> ;
        :date "1959-02-23"^^xsd:date |} .

This also helps spotting the identifier in named "quoted" triples (using more real wikidata to illustrate the problem):

<< |s:Q34851-05722875-6765-4486-9197-729D8AB780ED| <Q34851> wd:P4342 "Elizabeth_Taylor_-_filmskuespiller" >>
    wd:P2241 <Q45403344> ;
    :rank :Deprecated .

Since otherwise you'd have to scan (as a human reader) beyond the id to know if it's the subject of a triple, or a name followed by a triple. (I've mixed up name and subject in these even when editing my own "toy" examples.)

@afs
Contributor

afs commented Jul 18, 2024

Otherwise, this: { |:x| :y :z |} still has the problem of matching as an AlternativePath in SPARQL.

@niklasl - could you expand on that point please? In SPARQL 1.1 :s | :p :o ... is illegal. | is contextual - only in paths.

Isn't it, going left-to-right, { |:x| :y :z |} is {|, URI/prefixname/bnode, | and then the choice is made?

The SPARQL grammar targets LL(1) and LALR(1), which covers most of the available choices for many programming languages.

@gkellogg
Member Author

The annotation syntax gets a bit tricky if any number of reifiers/annotations can be added to a triple. This allows:

  • :s :p :o ~ :r
  • :s :p :o ~ :r ~:r1
  • :s :p :o ~ :r {| :p :q |}
  • :s :p :o ~ :r {| :p :q |} ~:r1
  • :s :p :o ~ :r {| :p :q |} ~:r1 {| :p1 :q1 |}
  • :s :p :o ~ :r {| :p :q |} {| :p1 :q1 |}
  • :s :p :o {| :p :q |} ~:r1 {| :p1 :q1 |}

If the grammar has the following rules:

objectList            ::= object annotation* ( ',' object annotation* )*
reifier               ::= '~' (iri | BlankNode)
reifiedTriple         ::= '<<' subject predicate object reifier? '>>'
tripleTerm            ::= '<<(' subject predicate ttObject ')>>'
ttObject              ::= iri | BlankNode | literal | tripleTerm
annotation            ::= reifier? ('{|' predicateObjectList '|}')?

Combining annotation* with reifier? ('{|' predicateObjectList '|}')? creates an LL(1) parser conflict, since both parts of annotation are optional and the rule can therefore match nothing at all. I'm sure there's a cleverer set of rules to avoid this.
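One rough way to see the problem is to check which rules can derive the empty string. The sketch below is only an illustration in Python, not part of the spec or of any parser in use here: the dictionary encoding of the rules, the `annotationBlock` helper rule, and the function name `nullable_symbols` are all made up for this example, and the optional parts of `annotation` are expanded by hand into separate alternatives.

```python
# Hypothetical encoding of the rules above. Each nonterminal maps to a list of
# alternatives; each alternative is a list of symbols. Any symbol that is not
# a key of GRAMMAR (e.g. "~" or "predicateObjectList") is treated as a
# terminal for the purpose of this check.
GRAMMAR = {
    "reifier": [["~", "iriOrBlankNode"]],
    "annotationBlock": [["{|", "predicateObjectList", "|}"]],
    # annotation ::= reifier? ('{|' predicateObjectList '|}')?
    "annotation": [
        [],
        ["reifier"],
        ["annotationBlock"],
        ["reifier", "annotationBlock"],
    ],
}


def nullable_symbols(grammar):
    """Return the set of nonterminals that can derive the empty string."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for name, alternatives in grammar.items():
            if name in nullable:
                continue
            # An alternative is nullable when every symbol in it is nullable.
            # Terminals are never added to `nullable`, so they block it.
            if any(all(sym in nullable for sym in alt) for alt in alternatives):
                nullable.add(name)
                changed = True
    return nullable


print(nullable_symbols(GRAMMAR))
```

Under this encoding only `annotation` comes out as nullable, which is what makes `annotation*` problematic: a repetition of something that can be empty has no single derivation.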

@niklasl

niklasl commented Jul 29, 2024

I like where this is going. (The SPARQL grammar indeed requires much more care!)

Triple reference << :s :p :o ~ :r >> . with postfix form looks in line with named annotation form (triple comes first).

The ~( name ) form could work with the many-to-one cases: ~( name1, name2 ) (with , to avoid confusion with lists, and also not looking like predicate object).

:s :p :o {| :q :z |} .

:s :p :o ~(:r) .

:s :p :o ~(:r, _:q) .

<< :s :p :o ~ :r >> .

@afs
Contributor

afs commented Jul 29, 2024

My only concern is that a trailing reifier identifier may be counter-intuitive, but that may just be my own bias from having worked with the << :r | :s :p :o >> syntax for a while. Using ~ in a postfix notation does seem cleaner.

Agreed. At one level, it is a shame to change what's been written about, but at the same time, it hasn't been universally adopted.
If we go postfix, then ~ vs | is pure choice and ~ disconnects from earlier writings and, for me, that is a reasonable decision to make.

@afs
Contributor

afs commented Jul 29, 2024

The annotation syntax gets a bit tricky if any number of reifiers/annotations can be added to a triple. This allows:

  • :s :p :o ~ :r
  • :s :p :o ~ :r ~:r1
  • :s :p :o ~ :r {| :p :q |}
  • :s :p :o ~ :r {| :p :q |} ~:r1
  • :s :p :o ~ :r {| :p :q |} ~:r1 {| :p1 :q1 |}
  • :s :p :o ~ :r {| :p :q |} {| :p1 :q1 |}
  • :s :p :o {| :p :q |} ~:r1 {| :p1 :q1 |}

I think that's good to allow.
That should be possible although not the way that grammar has it.

I can do that for SPARQL once the style is agreed and I can trim down the universal grammar. The different choices start to interact: writing the multi-occurrences starts to mix up with some of the alternative styles.

Tentative direction:

  • Declarations for asserted and occurrences (whatever the final names are)
  • ~:r reifier ids (or any choice that has a leading marker)
  • ~:r reifier id before an annotation block {| |} for one style of explicit reifier ids.

@afs
Contributor

afs commented Jul 29, 2024

Triple reference << :s :p :o ~ :r >> . with postfix form looks in line with named annotation form (triple comes first).

Yes - while it's not been the style up to now, the uniformity is appealing.

The ~( name ) form could work with the many-to-one cases: ~( name1, name2 ) (with , to avoid confusion with lists, and also not looking like predicate object).

Firstly, if we have this, we can have ~ :r and ~(:r1 :r2). The comma isn't necessary and, to me, it's odd to have in some places and not others. , is object lists (which aren't lists! ... let's not go there). YMMV.

I don't think that ~() and an annotation block is very helpful especially in SPARQL.

    :s :p :o ~(:r1 :r2) {| :q :z |} .
    :s :p :o ~(:r1 :r2) {| :q ?z |} .

These more complex cases may be better as a declaration pattern:

    :s :p :o ~ :r .
    :d :e :f ~( :r1 :r2 ) .
    :r1 :q :z .
    :r2 :q :z .

that is, ~(:r1 :r2) is only allowed in a declaration form.

@gkellogg
Member Author

The ~( name ) form could work with the many-to-one cases: ~( name1, name2 ) (with , to avoid confusion with lists, and also not looking like predicate object).

I'm not sure that this will be a common enough pattern to create special syntax for it, as it's fairly easy (and arguably clearer) to create separate statements for each reifier. Also, serializing a graph containing duplicate reifiers with some overlapping annotations would be pretty challenging. I'd say we start with the single reifier grammar and re-consider adding ~(name1 name2) if it becomes important.

@niklasl

niklasl commented Jul 29, 2024

Triple reference << :s :p :o ~ :r >> . with postfix form looks in line with named annotation form (triple comes first).

Yes - while it's not been the style up to now, the uniformity is appealing.

Agreed.

The ~( name ) form could work with the many-to-one cases: ~( name1, name2 ) (with , to avoid confusion with lists, and also not looking like predicate object).

Firstly, if we have this, we can have ~ :r and ~(:r1 :r2). The comma isn't necessary and, to me, it's odd to have in some places and not others. , is object lists (which aren't lists! ... let's not go there). YMMV.

Makes sense.

I don't think that ~() and an annotation block is very helpful especially in SPARQL.

    :s :p :o ~(:r1 :r2) {| :q :z |} .
    :s :p :o ~(:r1 :r2) {| :q ?z |} .

I agree (and find that combination harder to read too).

These more complex cases may be better as a declaration pattern:

    :s :p :o ~ :r .
    :d :e :f ~( :r1 :r2 ) .
    :r1 :q :z .
    :r2 :q :z .

that is, ~(:r1 :r2) is only allowed in a declaration form.

Yes, I think I'd readily accept that.

It's akin to blank nodes, where the embedded [ ... ] form works for simple cases. In complex cases (e.g. many-to-many), declaration patterns probably work better. (Both for serializing and, IMHO, for reading — I'm used to looking for a "top-level description" for names that are used in many places.)

@niklasl

niklasl commented Jul 29, 2024

The ~( name ) form could work with the many-to-one cases: ~( name1, name2 ) (with , to avoid confusion with lists, and also not looking like predicate object).

I'm not sure that this will be a common enough pattern to create special syntax for it, as it's fairly easy (and arguably clearer) to create separate statements for each reifier. Also, serializing a graph containing duplicate reifiers with some overlapping annotations would be pretty challenging. I'd say we start with the single reifier grammar and re-consider adding ~(name1 name2) if it becomes important.

Sure; there are pros and cons here (repeating only the object probably isn't too bad).

It might be somewhat important though, so let's keep it open for more feedback. For example, cases derived from Wikidata may map cleanly to multiple reifiers per triple — here's a sketch using the ~(name1 ...) form (and two non-asserted triple descriptions at lines 624 and 626).

Some serialization considerations. Given a "random" triple stream (here as pseudo-ntriples-with-pnames):

:s :p :o .
:r1 rdf:reifies <<( :s :p :o )>> .
:r2 rdf:type :Note .
:r1 rdf:type :Note .
:r2 rdf:reifies <<( :s :p :o )>> .
:r2 rdf:reifies <<( :s :p :q )>> .
:s :p :q .

A process with some memory of seen triples but no buffering nor indexing can still stream out "best effort" Turtle line-wise, making it more "well-formed":

:s :p :o ~ :r1 .
:r2 a :Note .
:r1 a :Note .
:s :p :o ~ :r2 .
<< :s :p :q ~ :r2 >> .
:s :p :q .

Whereas a pretty-printer with access to the entire graph could do:

:s :p :o ~( :r1 :r2 ) ,
    :q ~ :r2 .

:r1 a :Note .

:r2 a :Note .

It would, for each triple, serialize all rdf:reifies triples as annotation name "markers".

AFAICS, only if such markers are neither reifiers of any other triple, nor the subject of any other triple, can they be serialized using the blank {| ... |} annotation syntax. That may very well be in the absolute majority in practice, which is fine; and the syntax caters well for that. I hope these other declaration patterns will also work well for the other, more complex scenarios.
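The grouping step that such a pretty-printer would need can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not code from any existing serializer: triples are plain tuples of strings, a triple term is modelled as a nested tuple in object position, and `group_reifiers` and `TRIPLES` are names made up for the example.

```python
from collections import defaultdict


def group_reifiers(triples):
    """Separate rdf:reifies triples from the rest.

    Returns (plain, reifiers) where `plain` keeps the other triples in input
    order and `reifiers` maps each reified triple to the list of its
    reifiers, in first-seen order and without duplicates.
    """
    reifiers = defaultdict(list)
    plain = []
    for s, p, o in triples:
        if p == "rdf:reifies" and isinstance(o, tuple):
            if s not in reifiers[o]:
                reifiers[o].append(s)
        else:
            plain.append((s, p, o))
    return plain, dict(reifiers)


# The "random" triple stream from the comment above, as tuples.
TRIPLES = [
    (":s", ":p", ":o"),
    (":r1", "rdf:reifies", (":s", ":p", ":o")),
    (":r2", "rdf:type", ":Note"),
    (":r1", "rdf:type", ":Note"),
    (":r2", "rdf:reifies", (":s", ":p", ":o")),
    (":r2", "rdf:reifies", (":s", ":p", ":q")),
    (":s", ":p", ":q"),
]

plain, reifiers = group_reifiers(TRIPLES)
for triple, names in reifiers.items():
    print(" ".join(triple), "is reified by", ", ".join(names))
```

With the whole graph grouped this way, a serializer can decide per triple whether to emit a single `~ :r`, several markers, or fall back to explicit rdf:reifies statements.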

@afs
Contributor

afs commented Jul 30, 2024

The ~( name ) form could work with the many-to-one cases: ~( name1, name2 )

Am I reading Gregg's multiple annotations proposal correctly here and this can be done with:

  • :s :p :o ~:r ~:r1 .

?

@afs
Contributor

afs commented Jul 30, 2024

A question for clarification:

:s :p :o ~:r {| :p :q |} {| :p1 :q1 |} .

Is this case 1, which I prefer, and was my initial reading
(... then I wrote the parser rule trying to make it clear what was happening rather than merely passing the language ...)

The second annotation block has a generated reification id and would be the same as writing:

:s :p :o {| :p1 :q1 |} ~:r {| :p :q |} .

and making

:s :p :o {| :p1 :q1 |} {| :p2 :q2 |} .

two separate blank nodes?

:s :p :o .
_:b1 rdf:reifies <<(:s :p :o)>> .
_:b1 :p1 :q1 .
_:b2 rdf:reifies <<(:s :p :o)>> .
_:b2 :p2 :q2 .

or is it case 2,
where the ~:r applies to all following blocks until the next reifierId? If so, what about
:s :p :o {| :p1 :q1 |} {| :p2 :q2 |} . Does the second block "inherit" the blank node from the first?
The "same bnode" reading here in case 2 feels odd.

At some point, we have to say "don't rely on gnarly expressions to do what you want - write them clearly" and provide a justifiable reading.

Case 1 style would be explaining

:s :p :o {| :p1 :q1 |} {| :p2 :q2 |} .

as shorthand for

:s :p :o {| :p1 :q1 |} .
:s :p :o {| :p2 :q2 |} .

@gkellogg
Member Author

:s :p :o ~:r {| :p :q |} {| :p1 :q1 |} .

Is this case 1, which I prefer, and was my initial reading (... then I wrote the parser rule trying to make it clear what was happening rather than merely passing the language ...)

The second annotation block has a generated reification id and would be the same as writing:

:s :p :o {| :p1 :q1 |} ~:r {| :p :q |} .

That's my interpretation, and what I think makes sense.

and making

:s :p :o {| :p1 :q1 |} {| :p2 :q2 |} .

two separate blank nodes?

+1

:s :p :o .
_:b1 rdf:reifies <<(:s :p :o)>> .
_:b1 :p1 :q1 .
_:b2 rdf:reifies <<(:s :p :o)>> .
_:b2 :p2 :q2 .

or is it case 2, where the ~:r applies to all following blocks until the next reifierId? If so, what about :s :p :o {| :p1 :q1 |} {| :p2 :q2 |} . Does the second block "inherit" the blank node from the first? The "same bnode" reading here in case 2 feels odd.

To me, that doesn't make sense.

At some point, we have to say "don't rely on gnarly expressions to do what you want - write them clearly" and provide a justifiable reading.

Case 1 style would be explaining

:s :p :o {| :p1 :q1 |} {| :p2 :q2 |} .

as shorthand for

:s :p :o {| :p1 :q1 |} .
:s :p :o {| :p2 :q2 |} .

+1

@gkellogg
Member Author

There is a bit of ambiguity still in the proposed grammar. ~ :r {| :p :o |} could also be parsed as ~ :r ~ _:bn {| :p :o |}, as it's ambiguous if the annotation block stands alone or is intended to use :r as its reifier. One way to solve this would be to define the grammar as follows:

annotation            ::= '{|' predicateObjectList '|}'
                        | reifier ('{|' predicateObjectList? '|}')

This way ~ :r would always need to be followed by a (potentially empty) annotation block. The example above would become ~ :r {||} {| :p :o |} to generate the following triples:

:s :p :o ~ :r {||} {| :p :o |} .

# expands to

:s :p :o .
:r rdf:reifies <<(:s :p :o)>> .
_:b1 rdf:reifies <<(:s :p :o)>> .
_:b1 :p :o .

Alternatively, the ambiguity can be resolved in parser logic and use an alternative grammar:

annotation            ::= reifier | '{|' predicateObjectList '|}'

If a parser parses a reifier and subsequently parses the annotation block it would assign the previously parsed reifier to that annotation block, but the BNF itself is ambiguous which is concerning.

@niklasl

niklasl commented Jul 31, 2024

Is there a need to both name and describe the reifier in place? With bnodes it's either id or description block, so it would follow the general Turtle design to either id or describe an anonymous reifier here too.

@gkellogg
Member Author

Is there a need to both name and describe the reifier in place? With bnodes it's either id or description block, so it would follow the general Turtle design to either id or describe an anonymous reifier here too.

We need the ability to name a description block with either an IRI or a blank node. If not provided, the name (reifier) is automatically generated. Because the grammar allows both the description block and the reifier to be optional we have a conflict. Based on discussion, it seems that there is a need to both name and describe or just describe a description block.

@afs
Contributor

afs commented Jul 31, 2024

the BNF itself is ambiguous

The BNF is fine - what has to be defined is the translation from the syntax tree to triples output (section 7).

This is LL(1) for the multiple annotation case via the *

    Object           := GraphNode Annotation
    Annotation       := ( Reifier | AnnotationBlock )*
    AnnotationBlock  := <L_ANN> PropertyList <R_ANN>

(In SPARQL, PropertyList can be empty. GraphNode is anything that can go in the object position (it's not a very good name))

These parse rules do not try to associate the reifier with the annotation block. It is not showing as ambiguous because the sequence ~:e {| :x :y |} is just fine as concrete language - a reifier id followed by a {| |} block.

The meaning, the translation to triples, would have a state variable for the reifier id which is initially unset, then set by ~:e then cleared by |}. Similar to :s :p1 :o1 ; :p2 :o2 . passing the subject on until DOT.
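The state-variable translation described above could look roughly like the following Python sketch. It is a reading of the description, not the spec's processing rules: the `("reifier", id)` / `("block", [...])` item list is a made-up stand-in for a real syntax tree, `expand_annotations` is a hypothetical name, and emitting the rdf:reifies triple as soon as `~:e` is seen is an assumption of this example.

```python
from itertools import count


def expand_annotations(triple, items):
    """Translate the annotation items that follow `triple` into triples.

    `items` is a list of ("reifier", id) or ("block", [(p, o), ...]) pairs.
    """
    s, p, o = triple
    term = f"<<( {s} {p} {o} )>>"
    out = [(s, p, o)]
    fresh = count(1)
    current = None  # reifier id state variable, initially unset
    for kind, value in items:
        if kind == "reifier":
            current = value  # set by ~:e
            out.append((current, "rdf:reifies", term))
        elif kind == "block":
            if current is None:
                # No explicit id before this block: allocate a blank node.
                current = f"_:b{next(fresh)}"
                out.append((current, "rdf:reifies", term))
            out.extend((current, bp, bo) for bp, bo in value)
            current = None  # cleared by |}
        else:
            raise ValueError(f"unknown item kind: {kind!r}")
    return out


# :s :p :o ~:r {| :p :q |} {| :p1 :q1 |} .
for t in expand_annotations(
    (":s", ":p", ":o"),
    [("reifier", ":r"), ("block", [(":p", ":q")]), ("block", [(":p1", ":q1")])],
):
    print(*t, ".")
```

In this reading the second block gets its own fresh blank node, which matches the "case 1" interpretation discussed earlier in the thread.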

Writing

annotation            ::= '{|' predicateObjectList '|}'
                        | reifier ('{|' predicateObjectList? '|}')?

(I think there was a missing ? on the second line which I've included to allow :s :p :o ~:e .)

is a problem for multiple reifiers/annotation blocks. annotation* is concrete-language ambiguous.

~:e {| |} can be first clause, with ? empty then a second clause, or it can be first clause with non-empty ?.
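The double reading can be made concrete by counting derivations. The Python sketch below is an illustration only: it abstracts a reifier to the token "R" and a whole {| ... |} block to the token "B" (both made-up simplifications), treats one annotation as a bare block, a bare reifier, or a reifier followed by a block, and counts how many ways a token sequence can be split into a sequence of annotations.

```python
from functools import lru_cache

# One `annotation` under the two-clause rule with the trailing `?`:
# a bare block, a bare reifier, or a reifier followed by a block.
ALTERNATIVES = (("B",), ("R",), ("R", "B"))


def count_parses(tokens):
    """Count the ways `tokens` can be split into a sequence of annotations."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def ways(i):
        if i == len(tokens):
            return 1  # consumed everything: one complete derivation
        return sum(
            ways(i + len(alt))
            for alt in ALTERNATIVES
            if tokens[i:i + len(alt)] == alt
        )

    return ways(0)


print(count_parses(["R", "B"]))  # ~:e {| |}
print(count_parses(["B", "B"]))  # {| |} {| |}
```

Any count above one for a sequence means `annotation*` has more than one derivation for it, while the flat `( Reifier | AnnotationBlock )*` rule has exactly one split per sequence and leaves the association to the translation step.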

@afs
Contributor

afs commented Jul 31, 2024

To move forward I suggest moving this PR out of draft so as to merge it to get everything else into the doc even if the grammar isn't final.

Create a follow-up issue, or issues, for specific points in the grammar.

@gkellogg gkellogg marked this pull request as ready for review July 31, 2024 19:28
* Fix some references to non-existent term definitions.
* Spec updates (with placeholders) for reified triples and annotations.
* Update grammar for annotations and triple terms using `~` prefix for reifier.
* Remove extraneous statement on curSubject and curTriple.
* Note on old vs new reification.
* Fix duplicate example identifier.
* Remove reference to "asserted triple", and fix reference to "annotation" production.
* Update to use "reification" and "rdf:reifies" instead of "triple occurence" and "rdf:nameOf".
* Fix object of tripleOccurence to be `object`, not `ttObject`.
* Update description and processing instructions for triple terms, triple occurrences, and annotations.
* Grammar updates for triple terms and occurrences.

Co-authored-by: Ted Thibodeau Jr <tthibodeau@openlinksw.com>
Co-authored-by: Pierre-Antoine Champin <pierre-antoine@w3.org>
@gkellogg
Member Author

Squashed and force-pushed to rebase to main.

@gkellogg gkellogg dismissed pchampin’s stale review August 1, 2024 19:58

Agreed to merge in WG meeting on Aug 01

@gkellogg gkellogg merged commit ad74b94 into main Aug 1, 2024
2 checks passed
@gkellogg gkellogg deleted the triple-term-occurance branch August 1, 2024 19:58
Member

@TallTed TallTed left a comment

GitHub suddenly told me that this was merged... an hour ago. sigh

I'm thinking it will be faster/easier for you to put these into a new PR than for me, but I can do it if it's a burden.


<pre id="ex-quoted-triple"
<a href="#grammar-production-ttObject"><code>ttObject</code></a>,
optionally follwed by a <a href="#grammar-production-reifier"><code>reifier</code></a>
Member

I'm making up this "reifier delineator" (other wording may be better), but something like it must be called out as being here, between the ttObject and the IRIREF/BlankNode.

Suggested change
optionally follwed by a <a href="#grammar-production-reifier"><code>reifier</code></a>
optionally followed by a reifier delineator (tilde, `~`)
with a <a href="#grammar-production-reifier"><code>reifier</code></a>

Member Author

How about "optionally followed by a reifier, composed of a ~ followed by an IRIREF or BlankNode, and ending with >>".

Comment on lines +816 to +817
If the optional reifier is not present, a fresh RDF blank node is allocated,
as with `&lt;&lt; :subject :predicate :object &gt;&gt;`.</p>
Member

Can the reifier delineator (~) be present without a following IRIREF/BlankNode?

Member Author

Not allowed by the grammar.

Contributor

There is no technical problem with ~ used as Ted suggests. The use case of wanting to allocate the reifier id once, when the triple is asserted, for common use later has come up before.

It is useful to allocate the id at that point so that there is one reifier agreed between later updates.

:s :p :o ~ .
INSERT  { ?e :added ?t } WHERE { :s :p :o ~?e . BIND(NOW() as ?t) }

Member Author

This comment is in the context of reifiedTriple, so I don't see a case where << :s :p :o ~ >> does any more than << :s :p :o >>, and is certainly not allowed by the grammar (at least the grammar that is included now).

In the annotation use case, :s :p :o ~ could, indeed, allocate a new blank node, although one which can't be referenced for creating more triples. It could be equivalent to the following:

_:bn rdf:reifies <<( :s :p :o )>> .

Where the _:bn node is freshly allocated. Still not allowed in the existing grammar, and would require something like the following:

annotation            ::= (('~' (iri | BlankNode)?) | '{|' predicateObjectList '|}')*

IMO, better to have a single reifier rule for both reified triples and annotations.

Contributor

In Turtle, yes, <<:s :p :o >> and <<:s :p :o ~>> amount to the same thing.

The :s :p :o ~ form also asserts the triple, and preparing the reifier for later use is the more motivating case. Including it in <<>> is more a matter of regularity.

better to have a single reifier rule for both reified triples and annotations.

Agreed.

Comment on lines +820 to +821
like `&lt;&lt;&nbsp;&nbsp;:subject1&nbsp;:predicate1&nbsp;&lt;&lt;&nbsp;:subject2&nbsp;:predicate2&nbsp;:object2&nbsp;&gt;&gt;&nbsp;~:IRIREF1&nbsp;&gt;&gt;`
or `&lt;&lt;&nbsp;:subject4&nbsp;:predicate4&nbsp;&lt;&lt;&nbsp;&nbsp;:subject3&nbsp;:predicate3&nbsp;:object3&nbsp;~:IRIREF3&nbsp;&gt;&gt;&nbsp;&gt;&gt;`.</p>
Member

Why &nbsp; instead of a regular space? And why are they sometimes doubled?

Member Author

They should not be doubled, but obviously, the purpose of the &nbsp; is to keep the triple elements together.

Member

I think if the line is longer than the viewpane, it should be allowed to (and fairly clear that it did) wrap. Forcing no-wrap seems likely to cause more confusion than it saves.

which provides a convenient shortcut.
An annotation can be used to both assert a triple and have that triple be the
An annotation can be used to both assert a triple,
Member

Suggested change
An annotation can be used to both assert a triple,
An annotation can be used to simultaneously assert a triple,

of the <a href="#grammar-production-predicateObjectList"><code>predicateObjectList</code></a>
contained within the annotation delimeters.
If explicitly identified, the same reifier can then be used as the
<a data-cite="RDF12-CONCEPTS#dfn-object">object</a> of additional
Member

Suggested change
<a data-cite="RDF12-CONCEPTS#dfn-object">object</a> of additional
<a data-cite="RDF12-CONCEPTS#dfn-subject">subject</a> or
<a data-cite="RDF12-CONCEPTS#dfn-object">object</a> of additional

</p>

<p class="note">The annotation syntax is a syntactic short cut in Turtle,
<p class="note">The annotation syntax is a syntactic shortcut in Turtle,
and the RDF Abstract Syntax [[RDF11-CONCEPTS]] does not
distinguished how the triples were written.</p>
Member

Suggested change
distinguished how the triples were written.</p>
distinguish how the triples were written.</p>

@@ -1533,6 +1609,8 @@ <h3>Parser State</h3>

<p>Parsing Turtle requires a state of six items:</p>

<p class="ednote">Describe parser state for tracking reifier to associated with an annotation block.</p>
Member

Suggested change
<p class="ednote">Describe parser state for tracking reifier to associated with an annotation block.</p>
<p class="ednote">Describe parser state for tracking reifier to be associated with an annotation block.</p>

Comment on lines +1717 to +1720
The term constructed from this production
is composed of an identifier from either the <a href="#grammar-production-iri"><code>iri</code></a>
or <a href="#grammar-production-BlankNode"><code>BlankNode</code></a> productions,
if present, otherwise from a fresh RDF <a data-cite="RDF12-CONCEPTS#dfn-blank-node">blank node</a>.
Member

Suggested change
The term constructed from this production
is composed of an identifier from either the <a href="#grammar-production-iri"><code>iri</code></a>
or <a href="#grammar-production-BlankNode"><code>BlankNode</code></a> productions,
if present, otherwise from a fresh RDF <a data-cite="RDF12-CONCEPTS#dfn-blank-node">blank node</a>.
The term constructed from this production
is composed of an optional identifier from either the <a href="#grammar-production-iri"><code>iri</code></a>
or the <a href="#grammar-production-BlankNode"><code>BlankNode</code></a> productions;
otherwise, from a fresh RDF <a data-cite="RDF12-CONCEPTS#dfn-blank-node">blank node</a>.

Comment on lines +1728 to +1730
is composed of an identifier from either the <a href="#grammar-production-iri"><code>iri</code></a>
or <a href="#grammar-production-BlankNode"><code>BlankNode</code></a> productions,
if present, otherwise from a fresh RDF <a data-cite="RDF12-CONCEPTS#dfn-blank-node">blank node</a>.
Member

Suggested change
is composed of an identifier from either the <a href="#grammar-production-iri"><code>iri</code></a>
or <a href="#grammar-production-BlankNode"><code>BlankNode</code></a> productions,
if present, otherwise from a fresh RDF <a data-cite="RDF12-CONCEPTS#dfn-blank-node">blank node</a>.
is composed of an optional identifier from either the <a href="#grammar-production-iri"><code>iri</code></a>
or the <a href="#grammar-production-BlankNode"><code>BlankNode</code></a> productions;
otherwise, from a fresh RDF <a data-cite="RDF12-CONCEPTS#dfn-blank-node">blank node</a>.
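
As a reading aid for the wording above, here is a minimal Python sketch of the rule it describes: the term is the written identifier when one is present, and a fresh blank node otherwise. The names `reifier_term` and `fresh_blank_node` and the `_:bN` label scheme are hypothetical, chosen only for illustration; they come from neither the spec nor any existing parser.

```python
from itertools import count
from typing import Optional

# Hypothetical source of blank node labels; a real parser would use its own allocator.
_bnode_ids = count(1)


def fresh_blank_node() -> str:
    """Return a blank node label this module has not handed out before."""
    return f"_:b{next(_bnode_ids)}"


def reifier_term(identifier: Optional[str] = None) -> str:
    """Term for a reifier: the iri/BlankNode identifier if written, else a fresh blank node."""
    return identifier if identifier is not None else fresh_blank_node()
```

Calling `reifier_term(":r")` returns the identifier unchanged, while two calls without an argument return two distinct blank node labels.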

@gkellogg
Member Author

gkellogg commented Aug 1, 2024

GitHub suddenly told me that this was merged... an hour ago. sigh

I'm thinking it will be faster/easier for you to put these into a new PR than for me, but I can do it if it's a burden.

Sure, I can incorporate this into a followup PR.

@pchampin
Contributor

This was discussed during the rdf-star meeting on 26 September 2024.

syntax for reifiers

<doerthe> I have to leave, sorry

ora: I think the main point of contention is whether this is prefix or postfix

gkellogg_: that, and tilde versus pipe or other characters.

ora: AndyS, you make a point about ease of parsing

AndyS: not just that. The pipe is already used in SPARQL, although there are ways around that.
… Enrico made the point that in N-Triples, the reifier comes first (in the subject position).
… I think that internal consistency in Turtle is more important.

tl: I made a few proposals, including the use of pipe everywhere, and replacing the curly brackets in the annotation syntax.
… I think we should have looked at the problem that way.
… I find Enrico's argument about the position in N-Triples irrelevant.

ora: you are saying this is a usability issue.

tl: yes, it is the interface, it is important to get this right.

niklasl: I agree, affordances are important, that's why the pipe is tricky because of its use in SPARQL.

pchampin: I agree that this is turning into a broad discussion we can't do in a short amount of time.
… We can consider suffix vs. prefix and separately the tokens used.

niklasl: agreed, long prefix makes things hard to read

<ora> STRAWPOLL: Postfix?

<ora> +1

<gkellogg_> +1

<pchampin> +1

<tl> +1

<niklasl> +1

<pfps> 0

<Dominik_T> 0

<gtw> +1

<TallTed> +1

<AndyS> +1

<eBremer> +1

<ktk> +1

<ktk> Tpt: are you around?

<Tpt> I am back

Ora: there is still the question of which character we choose

ora: There are arguments against |

ora: There will be reifiers without annotation blocks and annotation blocks without reifiers

ora: if you see an annotation block after a reifier, it is related to this reifier so there is some memory needed

<tl> my 5cents on syntax: https://lists.w3.org/Archives/Public/public-rdf-star-wg/2024Sep/0073.html

AndyS: it's easier than doing RDF list

gtw: Do we have a concise summary of the various syntaxes?

<AndyS> https://github.com/w3c/rdf-turtle/blob/main/spec/turtle.bnf

<pchampin> << :s :p :o ~ :r >>.

<tl> Souri asked for that

<niklasl> I tried to have a bunch of variants appear "naturally" in https://niklasl.github.io/rdf-docs/presentations/RDF-reifiers-1/ Slide 19 uses that form.

tl: I would like to point to this syntax proposal, but I thought we would do syntax later: https://lists.w3.org/Archives/Public/public-rdf-star-wg/2024Sep/0073.html

<pchampin> :s :p :o ~ :r1 ~ :r2 {| :a :b |}.

gkellogg: you can insert more than one annotation or reifier

<pchampin> :s :p :o ~ :r1 ~ :r2 {| :a :b |} ~ :r3.

gkellogg: in any order

pchampin: if there is no reifier before annotations, the reifier is a blank node

AndyS: what I find odd is that the annotation block has to have at least one predicate-object pair inside

AndyS: it makes generating this kind of syntax from a program more complicated

<niklasl> That empty annotation blocks weren't allowed did trip me up in my introductory slide (8) for annotation sugar.

ora: Both Turtle and SPARQL use predicateObjectList+

<niklasl> So +1 from me for allowing it. Makes it easier to save hand-edited, unfinished turtle...

<tl> from my proposal: { :s1 :p :o . :s2 :p :o | :r1 } [| :a :b |] .

<AndyS> :s :p :o ~ :r1 ~ :r2 {| :a :b |} {| :c :d |}

<niklasl> <s> :p <o> ~ <r1> {| a :Named |} . <s> :p <o> ~ <r1> ~ {| a :NotNamed |} .

AndyS: Are you suggesting we have an empty annotation block to "cancel" the preceding reifier?

<niklasl> See above line. :)

gkellogg: you can do "~ {|" to get a blank node

<tl> from my proposal: <| :s :p :o | :r |> :a :b .

tl: We should keep {} for group of statements, not annotations

tl: If we change the abstract reified triple to <<| we use pipes everywhere

tl: That way the pipe would be everywhere we use RDF-*

gkellogg: I am afraid it collides with N3 where they use | for object paths

gkellogg: the triple object can be a path, and I believe it can include "|"

gkellogg: This would be against a bare pipe

<Dominik_T> gkellogg can you provide a link or an example where in N3 pipe can be used?

pchampin: I would like to come back to the previous topic. My personal opinion is that ~ without an identifier is a bit strange. I would argue it's not necessarily required; we can still write ~ []

gkellogg: A [] now means bnode property list

gkellogg: If we allow empty annotation blocks, it's also a way to avoid the empty ~

<gtw> I believe per the current Turtle draft spec, [] would be valid per the reifier rule: `reifier::='~' (iri | BlankNode)?` (via BlankNode)

AndyS: I think it's a bit confusing because it would be the only place where you can have [] but not [ propertyObjectList ]

ora: If we confuse users it's not going to lead to anything good

ora: We have this thing with multiple reifiers and annotations. Is it really relevant?

ora: I don't want people to start writing things and getting them wrong

<niklasl> Pro/con: <s> :p <o> ~ [ :date "2024" ] . # Pro: Regularity, same syntax for bnodes. Con: may be odd in combination with the naming-and-describing pairing mechanism.

ora: Syntax discussions are often more difficult than semantic discussions

<niklasl> +1 for syntax being more difficult (also: "there is only syntax")

ora: It would be nice if we can break this up into a series of decisions

ora: would be nice if somebody took the trouble to figure out which decisions we have to make; we would have examples of the variants

pchampin: if we keep "<<" we need to keep it consistent with what people expect from the CG

tl: << has been used also for asserted things

tl: what part of history do we refer to when we talk about user assumptions?

<pchampin> q.

pchampin: To be clear, I said "if we keep" the <<; getting rid of it altogether is a way to solve the problem

gkellogg: It would be nice to make a decision, everything depends on it

ora: it's unfortunate that the syntax PR has been open for such a long time without enough attention

ora: People often take little interest in syntax, way less than is warranted

ora: I am open to suggestions how to do this

AndyS: we should take this offline

ora: agree we do this offline; in a way we ended up in a place I did not want to end up, fighting over these things

ora: I suggest chairs will pick this up and will go from there

<pfps> which PR?

<pchampin> w3c/rdf-turtle#51

<gb> MERGED Pull Request 51 Grammar updates for triple terms and occurrences. (by gkellogg) [spec:substantive]

pchampin: In the interest of splitting into multiple decisions, I think we can bundle the brackets for triple term, unasserted triples and annotations
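
For readers following the reifier/annotation examples in the transcript, the pairing rule as discussed can be sketched in Python. This is only an illustration under stated assumptions, not the spec's parsing algorithm: an annotation block attaches to the reifier immediately before it, a block with no preceding reifier gets a fresh blank node, a bare `~` also yields a fresh blank node, and a block consumes the pending reifier so that a second consecutive block gets its own blank node. The function name `expand_annotations`, the `("~", id)` / `("{|", pairs)` item encoding, the `_:bN` labels, and the use of a plain tuple for the triple term are all hypothetical.

```python
from itertools import count
from typing import List, Optional, Sequence, Tuple

Triple = Tuple[object, object, object]


def expand_annotations(triple: Tuple[str, str, str],
                       items: Sequence[Tuple[str, object]]) -> List[Triple]:
    """Expand trailing reifiers and annotation blocks into plain triples.

    items holds ("~", identifier_or_None) for a reifier and
    ("{|", [(predicate, object), ...]) for an annotation block, in source order.
    """
    fresh = (f"_:b{n}" for n in count(1))  # blank node labels restart per call (sketch only)
    out: List[Triple] = [triple]           # the annotated triple itself is asserted
    pending: Optional[str] = None          # reifier waiting for a possible annotation block
    for kind, value in items:
        if kind == "~":
            # A reifier: use the written identifier, or a fresh blank node for a bare "~".
            pending = value if value is not None else next(fresh)
            out.append((pending, "rdf:reifies", triple))
        elif kind == "{|":
            # An annotation block: attach to the pending reifier, or allocate one.
            if pending is None:
                pending = next(fresh)
                out.append((pending, "rdf:reifies", triple))
            for predicate, obj in value:
                out.append((pending, predicate, obj))
            pending = None                 # assumption: the block consumes the reifier
        else:
            raise ValueError(f"unknown item kind: {kind!r}")
    return out
```

With this sketch, the transcript example `:s :p :o ~ :r1 ~ :r2 {| :a :b |} ~ :r3.` yields three `rdf:reifies` triples (for `:r1`, `:r2`, `:r3`) and attaches `:a :b` to `:r2` only, which is the "memory" between a reifier and the block that follows it.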


@rat10

rat10 commented Sep 27, 2024

@afs

If we go postfix, then ~ vs | is pure choice

I hope this is still valid, and it is good to know.

and ~ disconnects from earlier writings and, for me, that is a reasonable decision to make.

I beg to differ: we may have been working for years on this, but we're still not in the situation where we have to cater for an installed base. We can still do what we want, and we should strive for a design that is coherent and compelling. Updating our examples or getting confused in discussions by examples from different periods is a minor problem compared to users of the finished spec having to deal with the side effects of some tactical decisions forever.

Labels
spec:substantive Issue or proposed change in the spec that changes its normative content
7 participants