Explicit mapping of JSON-LD terminology to RDF terminology in Appendix "B1. RDF" #157

Closed
msporny opened this Issue · 11 comments

4 participants

@msporny
Owner

We need to align the JSON-LD terminology with the RDF Concepts terminology. During a discussion with @cygri, it became apparent that he was the best person suited to write this text as he has reviewed the JSON-LD spec and is also the editor of the RDF Concepts document. @msporny and @cygri agreed that the best place to put this information might be "Appendix B1: RDF". This issue is meant to track the progress of that work.

@cygri cygri was assigned
@lanthaler lanthaler referenced this issue from a commit
@lanthaler lanthaler Create section "Relationship to RDF"
This addresses #157, #168.
53e570c
@cygri
Collaborator

This is somewhat blocked on #174.

@msporny
Owner

This issue has now been unblocked, with the work progressing here:

http://www.w3.org/2011/rdf-wg/wiki/JSON-LD_Data_Model

@cygri
Collaborator

I'm afraid I'm having trouble with this activity.

What is the JSON-LD data model, anyway? How would I know that I'm writing up the right thing?

I thought that the data model is isomorphic to the output of Expansion. But I'm no longer so sure. For example, multiple JSON objects with the same @id don't get merged by Expansion. So if the data model was indeed the result of Expansion, then that would mean that a JSON-LD graph can in fact have multiple nodes with the same IRI, which seems all wrong to me.

Also, Expansion treats everything as bags and nothing as sets; therefore, a document like ["a","a"] actually seems to encode a graph that contains two indistinguishable value nodes. At least with RDF goggles on, this seems very strange because everything in RDF is sets.

To illustrate why this is a problem: How many nodes does this JSON-LD document contain?

["a","a",{"http://example.com/p":"a"}]

There is certainly at least one blank node, and at least one literal/value node. Or is it two literal/value nodes? Or three? Since expansion doesn't remove duplicates, it would have to be three.
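The counting question can be made concrete with a minimal sketch. It is not part of any JSON-LD algorithm: the helper `string_occurrences` is a hypothetical name, and the code only counts how often plain strings occur in the parsed JSON, once as a bag and once as a set. Whether the top-level strings are valid members at all is questioned later in the thread.

```python
import json

# The document from the comment above, parsed as plain JSON.
doc = json.loads('["a", "a", {"http://example.com/p": "a"}]')


def string_occurrences(item):
    """Yield every plain string value found in the parsed document."""
    if isinstance(item, str):
        yield item
    elif isinstance(item, list):
        for member in item:
            yield from string_occurrences(member)
    elif isinstance(item, dict):
        # Keys are property IRIs here, so only the values are visited.
        for value in item.values():
            yield from string_occurrences(value)


occurrences = list(string_occurrences(doc))
bag_count = len(occurrences)       # every occurrence is its own value node
set_count = len(set(occurrences))  # equal values collapse into one node

print(bag_count, set_count)
```

Under bag semantics the count is three value nodes; under set semantics it is one.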

Another example: How many edges does the JSON-LD graph serialized here contain?

{
  "@context":{"a":"http://example.com/p","b":"http://example.com/p"},
  "a":{"id":"ooo"},
  "b":{"id":"ooo"}
}

I suppose we have two nodes here, right? A blank node and an IRI node? So, how many edges? One or two? They clearly have the same label. If we say that it's two edges, then it means that an edge isn't defined by its subject, property and object, but that it has identity beyond that.

Another example:

{"http://example.com/p":["a","a"]}

One edge or two?

My problems here:

  1. I have trouble working out what graph any given JSON-LD document actually serializes.
  2. I have trouble understanding what's significant and what isn't in a JSON-LD document. The spec says that order isn't significant; but it seems to be carefully retained. The spec implies (I think) that lists are in fact sets, and therefore duplicates should be removed; but in fact they are carefully retained.
  3. It seems there is lots of data that wouldn't survive toRDF. This worries me greatly. JSON-LD users who expect to be able to work with their data in an RDF environment will learn the hard way that toRDF discards all sorts of things that are part of the JSON-LD data model and that the users consider important.
  4. If too much significant stuff doesn't survive toRDF, then it's no longer appropriate to present JSON-LD as an RDF format (because that would raise user expectations that will not be met).
  5. If JSON-LD is not an RDF format, then it would be wrong for RDF-WG to push it through the W3C process.

Not really sure where to go from here—I can't even work out how many nodes and edges a given JSON-LD document contains.

To establish a baseline, can we confirm that the following statements are true?

  • Expansion does not change the semantics of a JSON-LD document.
  • The semantics of arrays in JSON-LD is bag semantics (except with @list).

@gkellogg
Owner

What is the JSON-LD data model, anyway? How would I know that I'm writing up the right thing?

The result of Flatten is closer to the data model, as that consolidates multiple node definitions. Flatten is also what FromRDF produces.
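As a rough illustration of what "consolidates multiple node definitions" means, the sketch below merges node objects that share an `@id`. It is an assumption-laden simplification, not the Flattening algorithm from the spec: `merge_by_id` and the sample data are hypothetical, blank nodes and nested nodes are ignored, and duplicate values are kept.

```python
# Hypothetical expanded-style input: two node objects with the same @id.
expanded = [
    {"@id": "http://example.com/s", "http://example.com/p": [{"@value": "a"}]},
    {"@id": "http://example.com/s", "http://example.com/q": [{"@value": "b"}]},
]


def merge_by_id(nodes):
    """Merge node objects sharing an @id into a single node object each."""
    merged = {}
    for node in nodes:
        target = merged.setdefault(node["@id"], {"@id": node["@id"]})
        for key, values in node.items():
            if key == "@id":
                continue
            # Append to the property's value list; duplicates are NOT removed.
            target.setdefault(key, []).extend(values)
    return list(merged.values())


flattened = merge_by_id(expanded)
print(flattened)
```

After merging, the graph holds one node for `http://example.com/s` carrying both properties, which matches the intuition that an IRI identifies a single node.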

Also, Expansion treats everything as bags and nothing as sets; therefore, a document like ["a","a"] actually seems to encode a graph that contains two indistinguishable value nodes. At least with RDF goggles on, this seems very strange because everything in RDF is sets.

We chose not to eliminate duplicates when expanding, but I think perhaps we should.

The example you provide is an odd document, and it's interesting that both the playground and my processor treat it the same way. It should not be valid, though: all members of a top-level array should be treated as node definitions, and "a" is not a node definition.

When used as a value, the multiple "a" values are treated as literals; we chose not to eliminate duplicates in expansion (or compaction), but I'm not sure this is the right thing; it certainly makes the implementation more difficult, however.

Another example: How many edges does the JSON-LD graph serialized here contain?

{
  "@context":{"a":"http://example.com/p","b":"http://example.com/p"},
  "a":{"@id":"ooo"},
  "b":{"@id":"ooo"}
}

Two nodes, one edge (changed "id" to "@id" as I think you intended). The fact that it looks like two values in expansion is a consequence of not eliminating duplicates. I could see an argument for doing duplicate elimination in flatten instead of expand.
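The "two nodes, one edge" reading can be sketched by resolving both terms through the context and collecting (subject, property, object) triples. This is only an illustration under stated assumptions: the blank node label `_:b0` is made up, the relative IRI `ooo` is left unresolved, and no real JSON-LD processor is involved.

```python
doc = {
    "@context": {"a": "http://example.com/p", "b": "http://example.com/p"},
    "a": {"@id": "ooo"},
    "b": {"@id": "ooo"},
}

context = doc["@context"]
subject = "_:b0"  # hypothetical label for the unnamed top-level node

edges = []
for term, value in doc.items():
    if term.startswith("@"):
        continue  # skip keywords such as @context
    prop = context.get(term, term)  # map the term to its property IRI
    edges.append((subject, prop, value["@id"]))

edge_set = set(edges)  # an edge is identified by its three components
nodes = {subject} | {obj for _, _, obj in edges}

print(len(edges), len(edge_set), len(nodes))
```

Kept as a list, the two terms yield two identical entries; as a set of triples they collapse into one edge between two nodes.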

@cygri
Collaborator

Okay, sorry, forget what I wrote above.

I used to know that I need to look at Flatten, but then found that all the action is in Expansion and somehow forgot about Flatten. I'm feeling like I'm in a maze of twisty little algorithms, all alike…

@lanthaler
Owner

For the record: we discussed the removal of duplicates in issue #129.

@msporny
Owner

@cygri - What's the status on this section of the spec? It is now the biggest blocker we have for a Last Call.

@cygri
Collaborator

Not sure if I can finish it this week, but certainly before the Nov 28 RDF-WG call.

@msporny
Owner

@cygri, status update on this issue?

@lanthaler lanthaler referenced this issue from a commit
@lanthaler lanthaler Update Data Model and Relationship to RDF sections
This is a first attempt to specify JSON-LD's data model in all details and compare it to RDF's data model.

This addresses #157, #181, and #168.
714c507
@lanthaler
Owner

@msporny, in last week's RDF telecon I agreed to take @cygri's draft and try to come up with something. In 714c507 I just pushed an update to the syntax specification which includes a first draft.

@cygri, could you please have a look at the changes?

A few comments on some statements in your draft in the RDF WG wiki:

@type is special; a bag of IRIs

It's not, it's just an alias for rdf:type when used in node objects.

Everything (edges, nodes) is bags, not sets
Graph requires that a node in the default graph with the same IRI exists

Nope, but I think you know that already and you just haven't updated it.

Graph names can be blank nodes.

Yes, I included a "SHOULD be an IRI" whenever possible. I do see some value in allowing bnodes as well but wouldn't be opposed to changing it.

Unconnected nodes (IRIs, blank nodes or values) are supported.
Edge labels may be blank nodes.

Added SHOULD statements to discourage it.

Language maps are part of the data model.

In my opinion they are not, so I didn't include them. They are just syntactic constructs to simplify the serialization of language-tagged strings.

Graphs/datasets are "allowed" in the object position.

Yes, at the syntactic level but not in the data model. The only thing in the object position is the graph's name.

Supports “plain” strings, numbers and booleans, separately from typed literals.

Yes, added a note about it in the Relationship to RDF section.
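Two of the points above, that @type is an alias for rdf:type and that language maps are syntactic sugar for language-tagged strings, can be illustrated with a small sketch. The helper names `type_triples` and `expand_language_map` and the sample data are hypothetical, and the code is a simplification for illustration rather than the behaviour of any particular processor.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def type_triples(node):
    """Express a node object's @type entries as rdf:type triples."""
    types = node.get("@type", [])
    if isinstance(types, str):
        types = [types]
    return {(node["@id"], RDF_TYPE, t) for t in types}


def expand_language_map(language_map):
    """Rewrite a language map as a list of language-tagged string objects."""
    result = []
    for language, values in sorted(language_map.items()):
        if isinstance(values, str):
            values = [values]
        for value in values:
            result.append({"@value": value, "@language": language})
    return result


node = {"@id": "http://example.com/s", "@type": "http://example.com/T"}
print(type_triples(node))
print(expand_language_map({"en": "Hello", "de": ["Hallo", "Servus"]}))
```

In both cases the keyword-based syntax disappears and only ordinary edges and language-tagged strings remain, which is the argument for keeping these constructs out of the data model.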

@lanthaler
Owner

As announced on the mailing list, I'll close this issue now.

@lanthaler lanthaler closed this