annotation syntax for SPARQL* #65

Closed
wants to merge 8 commits into from

hartig
Collaborator

@hartig hartig commented Dec 11, 2020

This PR is meant to address the SPARQL* part of #9



…e lines of #58; first step to address the SPARQL* part of #9
@hartig
Collaborator Author

hartig commented Dec 11, 2020

I am done with extending the spec to cover the annotation syntax for SPARQL*.

Preview: https://pr-preview.s3.amazonaws.com/w3c/rdf-star/pull/65.html

The parts that I have extended are:

Please take a look.

/cc @pchampin @afs @gkellogg

@hartig hartig marked this pull request as ready for review December 11, 2020 20:13
@gkellogg
Member

Note that, in the existing grammar, EmbTP is really not strict enough:

[174] EmbTP               ::= '<<' EmbSubjectOrObject Verb EmbSubjectOrObject '>>'
[175] EmbSubjectOrObject  ::= Var | BlankNode | iri | RDFLiteral | NumericLiteral |
                              BooleanLiteral | EmbTP

EmbSubjectOrObject includes literals, which can't exist as the subject of any triple. I think we previously had VarOrBlankNodeOrIriOrEmbTP and VarOrTermOrEmbTP for subject and object respectively, which have the appropriate restrictions.

[107s] VarOrBlankNodeOrIriOrEmbTP ::= Var | BlankNode | iri | EmbTP
[176] VarOrTermOrEmbTP            ::= Var | GraphTerm | EmbTP

@afs
Collaborator

afs commented Dec 13, 2020

SPARQL allows literals as subjects. They just never match.

They arise naturally - most clearly, with reverse paths.

A triple pattern is "(RDF-T ∪ V) x (I ∪ V) x (RDF-T ∪ V)"

https://www.w3.org/TR/sparql11-query/#sparqlTriplePatterns

VarOrTerm seems the place to add them because GraphTerm is without variables.
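
To illustrate the point that a literal subject is a legal triple pattern that simply never matches, here is a minimal Python sketch. The classes `Var`, `IRI`, `Literal` and the helpers `match_pattern` and `evaluate` are hypothetical names made up for this illustration; they are not part of any SPARQL implementation, and the sketch ignores blank nodes and embedded triples.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class IRI:
    value: str

@dataclass(frozen=True)
class Literal:
    value: str

def match_pattern(pattern, triple):
    """Return variable bindings if `triple` matches `pattern`, else None."""
    binding = {}
    for p_term, t_term in zip(pattern, triple):
        if isinstance(p_term, Var):
            # A variable matches anything, but must bind consistently.
            if binding.setdefault(p_term.name, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None
    return binding

def evaluate(pattern, graph):
    """Collect the bindings of all triples in `graph` that match `pattern`."""
    return [b for t in graph if (b := match_pattern(pattern, t)) is not None]

# RDF data never has a literal in the subject position.
graph = [
    (IRI(":bob"), IRI("foaf:name"), Literal("Bob")),
    (IRI(":bob"), IRI("foaf:knows"), IRI(":alice")),
]

# A literal subject is a well-formed pattern, but it has no solutions.
literal_subject = (Literal("Bob"), IRI("foaf:name"), Var("x"))
# The same literal in the object position does match.
literal_object = (Var("x"), IRI("foaf:name"), Literal("Bob"))

print(evaluate(literal_subject, graph))
print(evaluate(literal_object, graph))
```

Under these assumptions no special case is needed to reject literal subjects: the pattern is evaluated like any other and yields an empty result.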

@gkellogg
Member

Thanks, @afs, if I was aware of that, I've since forgotten.

@gkellogg
Member

I think we need a change to ObjectListPath as well:

[86]  ObjectListPath ::= ObjectPath AnnotationPattern? ( ',' ObjectPath AnnotationPattern? )*

When I try the following example, that branch is hit in my parser, at least:

PREFIX : <http://bigdata.com/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX ex:  <http://example.org/>

SELECT ?age ?c WHERE {
   ?bob foaf:name "Bob" {| ex:certainty ?c |}.
}

@hartig
Collaborator Author

hartig commented Dec 13, 2020

@gkellogg what do you mean by "hit in my parser"? What's wrong with the query? (I don't think anything.)

@gkellogg
Member

What I meant was that, when I parsed that example, the parser took the path including ObjectListPath instead of ObjectList. There are parallel paths through the grammar with and without Path. In this case, the path through the parser is the following:

Query
  SelectQuery
    WhereClause
      GroupGraphPattern
        GroupGraphPatternSub
          TriplesBlock
            TriplesSameSubjectPath
              PropertyListPathNotEmpty
                ObjectListPath

I believe the ObjectList production is used in CONSTRUCT, and ObjectListPath in WHERE.

@hartig
Collaborator Author

hartig commented Dec 14, 2020

Thanks Gregg! In fact, the PropertyListPathNotEmpty production in the original SPARQL 1.1 grammar uses both ObjectList and ObjectListPath.

[83]  PropertyListPathNotEmpty  ::=  ( VerbPath | VerbSimple ) ObjectListPath ( ';' ( ( VerbPath | VerbSimple ) ObjectList )? )*

Hence, in addition to extending the ObjectList production (as done in my PR so far), we also need to extend the production ObjectListPath as follows:

[86]  ObjectListPath  ::=  ObjectPath AnnotationPattern? ( ',' ObjectPath AnnotationPattern? )*

In this context, I have also discovered another error in the current SPARQL* grammar: the production ObjectPath has to be extended as well!

[87]  ObjectPath  ::=  GraphNodePath | EmbTP

I will add these two extensions to the grammar to this PR.

@gkellogg
Member

> In this context, I have also discovered another error in the current SPARQL* grammar: the production ObjectPath has to be extended as well!
>
> [87]  ObjectPath  ::=  GraphNodePath | EmbTP

Actually, this isn't required and causes a First/First conflict in my parser generator: GraphNodePath is defined as the following:

[105] GraphNodePath           ::= VarOrTermOrEmbTP | TriplesNodePath |

(Note extra | at the end, which is an error). So, VarOrTermOrEmbTP already covers the EmbTP case.
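
The First/First conflict mentioned here can be reproduced on a cut-down grammar. The following Python sketch is only an illustration: the `GRAMMAR` dictionary is a simplified fragment (token names such as `VAR`, `TERM` and `VERB` stand in for the real terminals, and `TriplesNodePath` is reduced to its opening brackets), and `first` assumes that no production in the fragment can derive the empty string.

```python
# Each nonterminal maps to a list of alternatives; each alternative is a
# list of symbols. Anything that is not a key is treated as a terminal.
GRAMMAR = {
    "ObjectPath": [["GraphNodePath"], ["EmbTP"]],
    "GraphNodePath": [["VarOrTermOrEmbTP"], ["TriplesNodePath"]],
    "VarOrTermOrEmbTP": [["VAR"], ["TERM"], ["EmbTP"]],
    "TriplesNodePath": [["["], ["("]],
    "EmbTP": [["<<", "VarOrTermOrEmbTP", "VERB", "VarOrTermOrEmbTP", ">>"]],
}

def first(symbol, grammar, seen=frozenset()):
    """FIRST set of `symbol`, assuming no alternative derives the empty string."""
    if symbol not in grammar:
        return {symbol}          # terminal
    if symbol in seen:
        return set()             # guard against left recursion
    tokens = set()
    for alternative in grammar[symbol]:
        tokens |= first(alternative[0], grammar, seen | {symbol})
    return tokens

def first_first_conflicts(nonterminal, grammar):
    """Tokens that can start more than one alternative of `nonterminal`."""
    alternatives = grammar[nonterminal]
    conflicts = set()
    for i in range(len(alternatives)):
        for j in range(i + 1, len(alternatives)):
            conflicts |= (first(alternatives[i][0], grammar)
                          & first(alternatives[j][0], grammar))
    return conflicts

# Dropping the extra EmbTP alternative removes the conflict.
FIXED = dict(GRAMMAR, ObjectPath=[["GraphNodePath"]])

print(first_first_conflicts("ObjectPath", GRAMMAR))
print(first_first_conflicts("ObjectPath", FIXED))
```

In this fragment the token `<<` starts both alternatives of ObjectPath, because GraphNodePath already reaches EmbTP through VarOrTermOrEmbTP. An LL(1) parser generator cannot choose between them on one token of lookahead.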

@hartig
Collaborator Author

hartig commented Dec 15, 2020

Sorry Gregg. My bad.

I have fixed the issue in the grammar now (see commit ecac9c4).

@afs
Collaborator

afs commented Dec 18, 2020

Annotations and paths:

ObjectList (used in template for CONSTRUCT and in SPARQL Update) is fine.

(Aside: unlike Turtle, it is possible to add to Object and ObjectPath because Collection uses GraphNode, not Object, but for the moment, let's stick to ObjectList*.)

For ObjectListPath, some forms cannot be handled as a syntax rewrite to <<>> and would need a change to evaluation: the {| |} can only be applied once the triple matched by the path is known.

  :s :p* :o {| :pp :oo |}
  :o ^:p :s {| :pp :oo |}
  :o !:p :s {| :pp :oo |}
  :s :p/:q :o {| :pp :oo |}
  :s (:p|:q) :o {| :pp :oo |}

The grammar is quite dependent on Path being recursive and including a single term as a path element.

One option is a text note saying "If annotation, must be simple path" or slightly more ambitiously, include trailing / case.
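
One way to read the "if annotation, must be simple path" option is as a check applied after parsing. The Python sketch below is a rough approximation only: `is_simple_link` and `check_annotated_path` are hypothetical helpers, and the regular expressions cover just full IRIs, basic prefixed names, variables and the keyword `a`, not the complete SPARQL token rules.

```python
import re

# Approximations of the tokens that can serve as a single-link predicate.
_IRIREF = re.compile(r"<[^<>\s]*>")
_PNAME = re.compile(r"(?:[A-Za-z][\w-]*)?:[\w-]+")
_VAR = re.compile(r"[?$]\w+")

def is_simple_link(path: str) -> bool:
    """True if `path` is a single predicate with no path operators."""
    path = path.strip()
    if path == "a":
        return True
    return any(p.fullmatch(path) for p in (_IRIREF, _PNAME, _VAR))

def check_annotated_path(path: str, has_annotation: bool) -> None:
    """Reject an annotation that follows a compound property path."""
    if has_annotation and not is_simple_link(path):
        raise ValueError(f"annotation not allowed after compound path: {path}")

# The compound forms listed above are all rejected; a plain predicate passes.
for candidate in [":p", ":p*", "^:p", "!:p", ":p/:q", "(:p|:q)"]:
    print(candidate, is_simple_link(candidate))
```

A check of this kind keeps the grammar itself unchanged, at the cost of moving the restriction out of the EBNF and into a note.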

Co-authored-by: Pierre-Antoine Champin <github-100614@champin.net>
@hartig
Collaborator Author

hartig commented Dec 18, 2020

Andy, you are right. I did not consider property path patterns. That's a problem.

Now that you point out this problem, I would even say that it is a bad idea in general to mix property path patterns and the annotation syntax. The idea of property path patterns is to match paths (including their respective endpoints). RDF* is not about annotations of such paths but about annotations of single triples. In this sense, combining the annotation syntax with property path patterns does not seem to make much sense at all.

So, the question is whether there is an easy way to modify and extend the grammar such that the resulting grammar forbids combining property path patterns with the annotation syntax? If not, we may have to add an explicit note in the text.

@afs
Collaborator

afs commented Dec 18, 2020

A lookahead on paths to distinguish property and path cases may be possible. Investigation required. SPARQL is designed to be parser-simple - it's plain LL(1) (and LALR(1)) so that the widest range of compiler tools can be easily used.

I'm keen to make the changes localised to keep the barrier to adoption low.

There is another implication with

:s :p :o {| :pp :oo |}

The embedded triple term is not available in a variable. Probably have to live with that; some things will require << >> usage.

@hartig
Collaborator Author

hartig commented Dec 18, 2020

> SPARQL is designed to be parser-simple [...] I'm keen to make the changes localised to keep the barrier to adoption low.

Yes, that's what I actually meant by "an easy way."

> There is another implication with
>
> :s :p :o {| :pp :oo |}
>
> The embedded triple term is not available in a variable. Probably have to live with that; some things will require << >> usage.

Right. In fact, for this purpose, just using << ... >> instead of the annotation syntax is not sufficient either. You would have to use the SPARQL* version of BIND instead. For instance, by assuming the original PG-mode-based evaluation semantics of BIND (as defined in my original paper), the corresponding query would be:

SELECT ?t WHERE {
   :s :p :o .
   BIND( <<:s :p :o>> AS ?t )
   ?t :pp :oo .
}

...and by assuming the evaluation semantics of BIND as defined in our spec now, the query would be:

SELECT ?t WHERE {
   BIND( <<:s :p :o>> AS ?t )
   ?t :pp :oo .
}

@pchampin
Collaborator

This was discussed during today's call: https://w3c.github.io/rdf-star/Minutes/2020-12-18.html#item02

@hartig
Collaborator Author

hartig commented Feb 19, 2021

I have tried to find a simple solution to extend the grammar in a way such that it permits the annotation syntax only in triple patterns but not in property path patterns, where "simple" means something that does not require either changing major parts of the existing grammar or parsers that can look ahead more steps than what is needed with the existing grammar. After looking again at the existing grammar in detail, I don't think that such a solution exists :-(

Therefore, my proposal is to keep the grammar extension as specified in this PR and add a note that states the restriction in text form (similar to the notes in Section 19.8 of the SPARQL 1.1 spec).

@afs
Collaborator

afs commented Feb 19, 2021

https://github.com/apache/jena/blob/main/jena-arq/Grammar/main.jj has the annotation extension added for object and objectpath and, yes, it uses a grammar note to limit the use for paths to simple links.

The alternatives look complicated: either additional lookahead for the path production (I haven't checked whether that works, because I assume it restricts which parser generators can be used) or splitting path into compound and simple cases, which becomes a widespread change in the grammar.

(I may even be able to produce a complete grammar if the toolchain for producing HTML still works after all this time).

<a data-cite="SPARQL11-QUERY#rObjectPath">ObjectPath</a> <a href="#rAnnotationPattern">AnnotationPattern</a>`?`
`(`
<code class="token">','</code>
<a data-cite="SPARQL11-QUERY#rObjectPath">ObjectPath</a> <a href="#rAnnotationPattern">AnnotationPattern</a>`?`
Collaborator

This has to be a separate production.

One "AnnotationPattern" that allows paths inside it (and a grammar rule not to allow compound paths) and one that does not have paths.

This parallels how triple patterns / path patterns work for WHERE (paths) and CONSTRUCT and update.

Collaborator Author

@hartig hartig Feb 19, 2021

Andy, I have added another commit which introduces such separate productions (commit 802ff10). Since the preview doesn't work anymore, here is what the corresponding productions look like:

[86] ObjectListPath ::= ObjectPath AnnotationPatternPath? (',' ObjectPath AnnotationPatternPath?)*

[178] AnnotationPatternPath ::= '{|' PropertyListPathNotEmpty '|}'

Is that what you meant?

With the grammar extended in this way, it is now possible to write the following:

SELECT * WHERE {
   :s :p :o {| :pp1/:pp2 ?oo |}
}

...which can be expanded to:

SELECT * WHERE {
   :s :p :o .
   <<:s :p :o>> :pp1/:pp2 ?oo .
}

So, this works. However, I am not sure how intuitive it is for users to be able to write such annotation patterns with path expressions inside. On the other hand, the expansion step that replaces the annotation pattern does not become more complex by this extension. Hence, it doesn't hurt to have it, I guess.

Another thing that is possible with the grammar extended in this way is the following:

SELECT * WHERE {
   :s :p :o {| :pp [:ppp1/:ppp2 ?o] |}
}

...which can be expanded to:

SELECT * WHERE {
   :s :p :o .
   <<:s :p :o>> :pp [:ppp1/:ppp2 ?o]
}

So, this is also okay I think.

Is it these things that you had in mind?
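
The expansion step used in the examples of this comment can be sketched as a plain text rewrite. This Python sketch is illustrative only: `expand_annotation` is a hypothetical helper, it handles a single pattern of the form `S P O {| BODY |}`, and it assumes that subject, predicate and object contain no whitespace. A real implementation would work on the parse tree rather than on strings.

```python
import re

# One triple pattern followed by one annotation block, e.g.
#   :s :p :o {| :pp1/:pp2 ?oo |}
_ANNOTATED = re.compile(r"(\S+)\s+(\S+)\s+(\S+)\s*\{\|\s*(.*?)\s*\|\}\s*\.?")

def expand_annotation(pattern: str) -> str:
    """Rewrite `S P O {| BODY |}` into `S P O .` plus `<<S P O>> BODY .`."""
    m = _ANNOTATED.fullmatch(pattern.strip())
    if m is None:
        raise ValueError(f"not a single annotated triple pattern: {pattern}")
    s, p, o, body = m.groups()
    return f"{s} {p} {o} .\n<<{s} {p} {o}>> {body} ."

print(expand_annotation(":s :p :o {| :pp1/:pp2 ?oo |}"))
print(expand_annotation(":s :p :o {| :pp [:ppp1/:ppp2 ?o] |}"))
```

The body of the annotation is copied through unchanged, which is why allowing path expressions inside it does not make the expansion itself more complex.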

@gkellogg
Member

@hartig if you rebase the PR branch on main (might not be pretty), the preview stuff should work again. You'll need to rebase in any case to resolve the conflicts.

@hartig
Collaborator Author

hartig commented Feb 20, 2021

Gregg, is this rebasing something that can be done automatically or do I have to do it manually?

@gkellogg
Member

I’m afraid rebasing is manual. However, you should be able to just merge main into your branch, which may be less clean, but will get the job done.

Rebasing is one of the most difficult and unintuitive parts of Git, IMO.

@hartig
Collaborator Author

hartig commented Feb 20, 2021

Thanks. I have never done such a rebasing before. Hence, my question.

Perhaps, in this case, it will be easier and less time consuming if I simply create a new branch from main, copy the changes over, and generate a new PR (as I had done with the SPARQL-star Update PR).

@hartig
Collaborator Author

hartig commented Feb 20, 2021

I have copied these changes into a new PR that I have created from the current main branch. See #106

I am closing this PR here.

@hartig hartig closed this Feb 20, 2021
@gkellogg gkellogg deleted the AnnotationSyntaxForSparqlStar branch November 13, 2021 22:57