N-Triples* #16

Closed
afs opened this issue Oct 14, 2020 · 18 comments · Fixed by #87
Labels: action (Action assigned during a meeting), concrete-syntax (About Turtle-star and other concrete syntaxes), help wanted (Extra attention is needed)

Comments

@afs
Collaborator

afs commented Oct 14, 2020

N-Triples fulfills the role of database dump format. As such it might be useful to define N-Triples* as exactly what is in the database, with only the << >> syntax (no annotation support as in issue #9) and without the automatic generation of the implied triple for << >>. This preserves the one line - one triple feature of N-Triples found in the wild.

This would also make it a format for writing tests for Turtle*.

@hartig
Collaborator

hartig commented Oct 14, 2020

Are you suggesting to add another section into the draft, parallel to the section that defines Turtle*?
If so, fine with me.

@gkellogg
Member

gkellogg commented Oct 14, 2020

Should probably extend to N-Quads* and TriG*, given the dataset implications. Don't know that a separate sub-section for each is warranted.

@pchampin
Collaborator

N-Triples* and N-Quads* do indeed make sense, as they are "raw" syntaxes that could be used in test suites, as @afs points out.

TriG* could be left "as an exercise for the reader", IMO, although of course it would be nicer to specify it explicitly.

@gkellogg
Member

> TriG* could be left "as an exercise for the reader", IMO, although of course it would be nicer to specify it explicitly.

Only if we can rely on N-Quads to specify whether a quoted triple used with a named graph is asserted to be in the same named graph or in the default graph. It should also state whether or not there is such a thing as a quoted quad.

@pchampin
Collaborator

> if a quoted triple used with a named graph is asserted to be in the same named graph, or in the default graph

In the current proposal, SA mode is the norm, i.e. <<s p o>> does not assert the triple s p o. PG mode would be available through a different syntax, see #9. But you are right, it should be specified how that syntax is interpreted in TriG*. I would find it natural that the annotated triple and the annotation triple belong to the same graph, but that needs to be explicit.

@gkellogg
Member

Even in SA mode, it’s important to know what triple is asserted, as it may appear (or not appear) in multiple graphs.

@afs
Collaborator Author

afs commented Oct 23, 2020

SA would be restricted to same-graph triples.

@VladimirAlexiev

@afs I completely agree that ntriples-star should be specified in the spec. But it's not obvious to me what you propose for the syntax. Could you give an example?

@afs
Collaborator Author

afs commented Nov 3, 2020

The << >> form as an RDF term for a triple reference.

#16 (comment)

@gkellogg
Member

gkellogg commented Nov 3, 2020

I think it's basically the restriction on Turtle* to have only subject, predicate, and object productions with limitations:

[1] ntriplesDoc   ::= triple? (EOL triple)* EOL?
[2] triple        ::= subject predicate object '.'
[3] subject       ::= IRIREF | BLANK_NODE_LABEL | embTriple
[4] predicate     ::= IRIREF 
[5] object        ::= IRIREF | BLANK_NODE_LABEL | literal | embTriple
[6] literal       ::= STRING_LITERAL_QUOTE ('^^' IRIREF | LANGTAG)?
[xx] embTriple    ::= "<<" subject predicate object ">>"

Note that this does allow for recursion on embTriple.

Arguably, embedded triples could have their own identifiers, to keep N-Triples* as a true triples format, but the added complexity this entails is probably not worth it.
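
As an illustration of the grammar above, here is a minimal recursive-descent sketch in Python. It is not taken from the draft: the token patterns are deliberately simplified stand-ins for the real N-Triples terminals (IRIREF, BLANK_NODE_LABEL, STRING_LITERAL_QUOTE, LANGTAG), and the names `tokenize`, `parse_term` and `parse_triple` are hypothetical. An embedded triple is returned as a nested tuple, so it remains a single term.

```python
import re

# Simplified token patterns (assumption: the real terminals are stricter).
TOKEN = re.compile(r'''
    \s*(?:
      (?P<open><<)
    | (?P<close>>>)
    | (?P<iri><[^<>\s]*>)
    | (?P<bnode>_:[A-Za-z0-9_]+)
    | (?P<literal>"(?:[^"\\]|\\.)*"(?:\^\^<[^<>\s]*>|@[A-Za-z]+(?:-[A-Za-z0-9]+)*)?)
    | (?P<dot>\.)
    )''', re.VERBOSE)

SUBJECT = {"iri", "bnode", "open"}
PREDICATE = {"iri"}
OBJECT = {"iri", "bnode", "literal", "open"}


def tokenize(line):
    """Turn one line into a list of (kind, text) pairs plus an eof marker."""
    line = line.rstrip()
    pos, tokens = 0, []
    while pos < len(line):
        m = TOKEN.match(line, pos)
        if not m:
            raise ValueError(f"unexpected input at column {pos}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    tokens.append(("eof", ""))
    return tokens


def parse_term(tokens, i, allowed):
    """Parse one term at index i; return (term, next index)."""
    kind, text = tokens[i]
    if kind not in allowed:
        raise ValueError(f"unexpected {kind} token {text!r}")
    if kind == "open":
        # embTriple ::= "<<" subject predicate object ">>" (recursive)
        s, i = parse_term(tokens, i + 1, SUBJECT)
        p, i = parse_term(tokens, i, PREDICATE)
        o, i = parse_term(tokens, i, OBJECT)
        if tokens[i][0] != "close":
            raise ValueError("expected '>>'")
        return (s, p, o), i + 1
    return text, i + 1


def parse_triple(line):
    """triple ::= subject predicate object '.'"""
    tokens = tokenize(line)
    s, i = parse_term(tokens, 0, SUBJECT)
    p, i = parse_term(tokens, i, PREDICATE)
    o, i = parse_term(tokens, i, OBJECT)
    if tokens[i][0] != "dot" or tokens[i + 1][0] != "eof":
        raise ValueError("expected a single final '.'")
    return (s, p, o)
```

For example, `parse_triple('<< <http://ex/s> <http://ex/p> "o"@en >> <http://ex/source> <http://ex/src1> .')` yields one triple whose subject is itself a three-element tuple; nothing is added for the embedded triple.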

@afs
Collaborator Author

afs commented Nov 3, 2020

If N-Triples* preserves the "one-triple, one line" feature, then it is like SA mode.

A recursive embTriple is still just a single RDF term.

@gkellogg
Member

gkellogg commented Nov 5, 2020

The N-Triples* grammar could be extended for N-Quads*, but it would still contain embTriple, which is why I raised the point above. It would need to be clear that any graphLabel in a quad also applied to embedded triples. This would have similar meaning in a hypothetical TriG*, as well as for use of embedded triples within SPARQL named graphs.

@pchampin
Collaborator

pchampin commented Nov 6, 2020

> It would need to be clear that any graphLabel in a quad also applied to embedded triples.

Sorry @gkellogg, but I don't see what you mean. Recall that we settled on SA mode for the << ... >> syntax. Being not asserted, an embedded triple does not need to have a graph label "applied" to it.

In other words, in TriG*, the two graphs below make assertions about the same triple, which does not "belong" to any graph in particular.

:g1 { << :s :p :o >> :source :src1. }
:g2 { << :s :p :o >> :certainty 0.5. }

At least, that's how I see things, given the way the abstract syntax of RDF* datasets is defined.
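
One way to picture this reading is the small Python sketch below. It is only an illustrative model, not the abstract syntax from the draft: the `Triple` type, the `graphs_asserting` and `subjects` helpers, and the use of plain strings for IRIs and for the literal are all assumptions. The embedded triple is just a value used as a subject, so both graphs refer to the same term while neither graph contains it as an asserted triple.

```python
from typing import NamedTuple, Union


class Triple(NamedTuple):
    s: "Term"
    p: str
    o: "Term"


# A term is a plain string (IRI, blank node or literal) or an embedded triple.
Term = Union[str, Triple]

quoted = Triple(":s", ":p", ":o")

# Hypothetical in-memory dataset: graph name -> set of asserted triples.
dataset = {
    ":g1": {Triple(quoted, ":source", ":src1")},
    ":g2": {Triple(quoted, ":certainty", "0.5")},
}


def graphs_asserting(dataset, triple):
    """Names of the graphs in which `triple` is itself asserted."""
    return [g for g, triples in dataset.items() if triple in triples]


def subjects(dataset, graph):
    """All subject terms used in one graph."""
    return {t.s for t in dataset[graph]}
```

Under this model, `graphs_asserting(dataset, quoted)` is empty, while `subjects(dataset, ":g1")` and `subjects(dataset, ":g2")` contain the same term.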

@gkellogg
Member

gkellogg commented Nov 6, 2020

This is precisely why this needs to be clarified. The only way to satisfy your assertion, which makes sense, is that the triple is asserted in the default graph. That had been my interpretation, but seems at odds with other comments in the thread. It also means that there is no way to assert anything about a triple within a named graph.

@VladimirAlexiev

The main characteristics of ntriples that make them useful for command line processing are

  • one triple per line
  • chop off the final " ."
  • s p o are separated by spaces (only o may include spaces)

Is there a way to design ntriples-star to conform to the same?
If not, what simple splitting patterns can be designed?

@afs
Collaborator Author

afs commented Nov 7, 2020

<< :s :p :o >> :source :src1 :g1 . does not mean :s :p :o is anywhere in the data. Triples are not owned, so a statement about a triple isn't implying where the triple is (that would be reification!)

I'm assuming that TriG* will have the {| ...|} syntax so the triple is asserted and annotated in the same graph.

@afs
Collaborator Author

afs commented Nov 7, 2020

> • one triple per line
> • chop off the final " ."
> • s p o are separated by spaces (only o may include spaces)

There is a defined canonical form TR/n-triples/#canonical-ntriples.

If <<..>> is SA, i.e. a term referring to a triple, not asserting it as well, it is one triple per line.
The canonical form rules can be applied recursively within << >>.
The splitting of s p o is "outside" any << >>.
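
A rough sketch of such an "outside << >>" split, in Python, is shown below. It assumes one triple per line and well-formed input, and the function name `split_spo` is hypothetical. It tracks the nesting depth of << >>, skips over IRIs and quoted strings so that their contents are never mistaken for delimiters, and cuts only on whitespace at depth 0.

```python
def split_spo(line):
    """Split one N-Triples* line into (subject, predicate, object) strings."""
    body = line.strip()
    if not body.endswith("."):
        raise ValueError("missing final '.'")
    body = body[:-1].rstrip()          # chop off the final " ."

    parts, start, depth, i = [], 0, 0, 0
    while i < len(body) and len(parts) < 2:
        if body.startswith("<<", i):   # entering an embedded triple
            depth += 1
            i += 2
        elif body.startswith(">>", i): # leaving an embedded triple
            depth -= 1
            i += 2
        elif body[i] == "<":           # IRI: skip to its closing '>'
            i = body.index(">", i) + 1
        elif body[i] == '"':           # literal: skip to the closing quote
            i += 1
            while i < len(body) and body[i] != '"':
                i += 2 if body[i] == "\\" else 1
            i += 1
        elif body[i].isspace() and depth == 0:
            parts.append(body[start:i])
            while i < len(body) and body[i].isspace():
                i += 1
            start = i
        else:
            i += 1

    parts.append(body[start:])         # everything left is the object
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected subject, predicate and object")
    return tuple(parts)
```

Only the subject and predicate need to be scanned; the remainder of the line is the object, which matches the observation that only the object may contain top-level spaces.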

@pchampin
Collaborator

pchampin commented Nov 9, 2020

Totally agree with @afs. However, it is true that parsing N-Triples*, even in canonical form, is significantly more complex than parsing N-Triples. In particular, the recursive nature of embedded triples rules out regular expressions, I believe.

@pchampin added the concrete-syntax and help wanted labels Nov 10, 2020
@pchampin added the action label Jan 22, 2021
@pchampin closed this as completed Feb 5, 2021

5 participants