Skip to content
This repository

JSON-LD API Review by Robin Berjon #200

Closed
msporny opened this Issue · 7 comments

5 participants

Manu Sporny Robin Berjon Richard Cyganiak Markus Lanthaler Niklas Lindström
Manu Sporny
Owner

A review of the JSON-LD API by @darobin:

• Probably not something you can do anything about, but you folks must be the last people on Earth still talking about IRIs.

• If you add the "highlight" class to your code examples, they get some syntax highlighting.

• "JSON Object". I find this definition confusing. It's unclear whether you're talking about something in memory or the syntax. If the latter, you should just refer to RFC4627. If the former, then refer to WebIDL (or define it in WebIDL). Same all the way up to "null".

• You should put parentheses around your union types (everywhere).

• [NoInterfaceObject] is usually a red flag. In the case of JsonLdProcessor I'm not sure what to advise because it's unclear to me how this is expected to be exposed. The examples just reference a "jsonld" object but don't say where it's coming from. How about you drop the [NIO] and instead add a [Constructor]? That way you can "new JsonLdProcessor" and also do feature detection to know if it's supported in a context.

• You shouldn't just in prose say that it's okay to not use the asynchronous approach. If you really want a synchronous interface, then you should define it properly in it own section (as something that JS contexts would only support in e.g. Workers). But really, an asynchronous interface is probably the way to go.

• You list things that look like error constants (INVALID_SYNTAX) but they don't appear to be defined anywhere. I'm guessing that you probably want to have an enum somewhere for those. You don't have to go all uppercase on our arses, too "invalid syntax" is nice and simple!

• The error cases should be linked to specific algorithmic steps that will cause them to be raised.

• Should it really be an error if the compaction would lose some information? As a developer, I would much rather the algorithm compact as much as possible, but no further. So I wouldn't lose information, but I also wouldn't get an error. The result might be less compact, but IMHO more useful.

• You might want to be clearer about what you mean by copying the input. Make sure implementers don't modify the structure in place.

• WebIDL does not have "string" and "number". Use DOMString and one of the numeric types for that (I'm guessing double?).

• [NoInterfaceObject Callback] is gone, you should use the WebIDL callback construct instead. Loving the Node-style callbacks.

• I this toRDF() is probably designed wrong. Why are you sending one callback per Quad? Wouldn't it be simpler to callback with Quad[]? Either way, if there is a good reason to have multiple callbacks then you also need a way to signal that all the data has been processed — otherwise I can never know how many to expect and do whatever work I need to do next.

• Is ConformanceCallback really needed? It seems overly complicated for no good reason. If it's recoverable, then recover. If people want to write linting tools, then let them. But a processor shouldn't have to deal with this, and a developer against the API should never have to care. It looks like useless old school pedantry.

• You also don't say what happens if I don't call the passed callback back.

• Why have a type for IRI? It's just a bloody string! Have the relevant bits accept DOMString, and define what happens when they're not valid.

• Don't use "function" as a type, define a callback type.

• Seriously, you might want to state somewhere at the beginning of the document that "Throughout this specification, the term 'URL' refers to a IRI as used in RDF." There are maybe five people who know what an IRI is, and half of them think it's a silly idea. No developer will have a clue.

• If Quad is just meant to convey some data, it should be a dictionary. If you really want it to be an object, then it should expose an interface through a Constructor. In the latter case it should probably take the four fields as params. But really, this looks like a dictionary to me. I would just want to pass { subject: "...", ... } and not have to construct anything.

• Why define a Node? It serves no purpose whatsoever. Drop it!

• You've now defined an IRI object twice! Don't. This is a string.

• Same for BlankNode!

• Literal clearly seems to be a dictionary.

• Ah I get it now, you're using the object type to know what it is you're dealing with in the Quad. That's actually bad (but in a different way), it won't work at all.

First, if you're on the receiving end of a Quad, I'm presuming you're expecting people to go "if (q.subject instanceof IRI)..." Except you've defined the "IRI" interface to be NoInterfaceObject so there is actually nothing to put on the right hand side of that instanceof. Second, Even if you remove the NoInterfaceObject, it's still problematic. If I pass the Quad to another frame because I'm delegating processing, the interface object will be different and so the comparison fails (yay!). And we probably don't really want to have global "IRI" objects (and we already have Node).

I think you're better off making each of those a dictionary with keys type and value (except for Literal, that gets type and its three fields). That's easier to test for, safer, and eliminates the interfaces.

Perhaps better still: do you really need this RDF stuff? Isn't it more the job of a library to output some given syntax based on processing one of the normalised forms?

Robin Berjon

FWIW I'm watching this thread so if there are questions don't hesitate to post them here as I'll get them.

Richard Cyganiak
Collaborator

FWIW, I think that IRIs and HTML-style URLs are equivalent in UTF-8 encoded documents.

I'm actually glad to hear that the Node class thingy doesn't work, as it seemed a bit like overkill. A dictionary sounds simpler. You could probably re-use a lot of the existing spec machinery by saying that its keys are @id, @value, @type and @lang. Except that the @ is inconvenient as it makes things like if (myquad.object.lang && myquad.object.lang == 'en') harder to write.

Markus Lanthaler
Collaborator

• Probably not something you can do anything about, but you folks must be the last people on Earth still talking about IRIs.

What would you suggest instead? URL?

• If you add the "highlight" class to your code examples, they get some syntax highlighting.

• "JSON Object". I find this definition confusing. It's unclear whether you're talking about something in memory or the syntax. If the latter, you should just refer to RFC4627. If the former, then refer to WebIDL (or define it in WebIDL). Same all the way up to "null".

You are right, this is unclear in the API spec. We did a better job in the syntax spec: http://json-ld.org/spec/latest/json-ld-syntax/#terminology. The reason we call it "JSON object" is to avoid confusion with the object in a triple.

Edit: Improved in fb6cc19

• You should put parentheses around your union types (everywhere).

So object or object[] or IRI input should be (object or object[] or IRI) input?

Edit: Fixed in 1120c85. I didn't do it for dictionaries and the callback definition as that seems to break ReSpec.

• [NoInterfaceObject] is usually a red flag. In the case of JsonLdProcessor I'm not sure what to advise because it's unclear to me how this is expected to be exposed. The examples just reference a "jsonld" object but don't say where it's coming from. How about you drop the [NIO] and instead add a [Constructor]? That way you can "new JsonLdProcessor" and also do feature detection to know if it's supported in a context.

I changed it in e04560a

• You shouldn't just in prose say that it's okay to not use the asynchronous approach. If you really want a synchronous interface, then you should define it properly in it own section (as something that JS contexts would only support in e.g. Workers). But really, an asynchronous interface is probably the way to go.

That prose is there since we have implementations in Ruby, PHP, and a few other languages which generally use synchronous interfaces. Would you nevertheless define it explicitely?

• You list things that look like error constants (INVALID_SYNTAX) but they don't appear to be defined anywhere. I'm guessing that you probably want to have an enum somewhere for those. You don't have to go all uppercase on our arses, too "invalid syntax" is nice and simple!
• The error cases should be linked to specific algorithmic steps that will cause them to be raised.

That's true, the whole error handling is not properly specified at the moment. This is being tracked in #153.

• Should it really be an error if the compaction would lose some information? As a developer, I would much rather the algorithm compact as much as possible, but no further. So I wouldn't lose information, but I also wouldn't get an error. The result might be less compact, but IMHO more useful.

I agree completely with you.

• You might want to be clearer about what you mean by copying the input. Make sure implementers don't modify the structure in place.

OK. What is meant is to copy the input so that it can then be modified in place by the algorithms

• WebIDL does not have "string" and "number". Use DOMString and one of the numeric types for that (I'm guessing double?).

Are you talking about JsonLdProcessingError? That is being discussed as part of #153 as well.
Update: Fixed in 1120c85 and e8f0cfc

• [NoInterfaceObject Callback] is gone, you should use the WebIDL callback construct instead. Loving the Node-style callbacks.

Do you know how that's done with ReSpec? I couldn't find it.
Update: Fixed in 261c94b

• I this toRDF() is probably designed wrong. Why are you sending one callback per Quad? Wouldn't it be simpler to callback with Quad[]? Either way, if there is a good reason to have multiple callbacks then you also need a way to signal that all the data has been processed — otherwise I can never know how many to expect and do whatever work I need to do next.

+1, edit: changed in 077d498
Update: and removed completely in 261c94b :-)

• Is ConformanceCallback really needed? It seems overly complicated for no good reason. If it's recoverable, then recover. If people want to write linting tools, then let them. But a processor shouldn't have to deal with this, and a developer against the API should never have to care. It looks like useless old school pedantry.

• You also don't say what happens if I don't call the passed callback back.

I also think it's way too complex. Again, see issue #153.
Update: Removed in 7a8e01e.

• Why have a type for IRI? It's just a bloody string! Have the relevant bits accept DOMString, and define what happens when they're not valid.

Good point
Update: Fixed in 1120c85

• Don't use "function" as a type, define a callback type.

Update: Fixed in 261c94b

• Seriously, you might want to state somewhere at the beginning of the document that "Throughout this specification, the term 'URL' refers to a IRI as used in RDF." There are maybe five people who know what an IRI is, and half of them think it's a silly idea. No developer will have a clue.

Would be fine for me.. Don't know what others think of this.

• If Quad is just meant to convey some data, it should be a dictionary. If you really want it to be an object, then it should expose an interface through a Constructor. In the latter case it should probably take the four fields as params. But really, this looks like a dictionary to me. I would just want to pass { subject: "...", ... } and not have to construct anything.

• Why define a Node? It serves no purpose whatsoever. Drop it!

• You've now defined an IRI object twice! Don't. This is a string.

• Same for BlankNode!

• Literal clearly seems to be a dictionary.

• Ah I get it now, you're using the object type to know what it is you're dealing with in the Quad. That's actually bad (but in a different way), it won't work at all.

First, if you're on the receiving end of a Quad, I'm presuming you're expecting people to go "if (q.subject instanceof IRI)..." Except you've defined the "IRI" interface to be NoInterfaceObject so there is actually nothing to put on the right hand side of that instanceof. Second, Even if you remove the NoInterfaceObject, it's still problematic. If I pass the Quad to another frame because I'm delegating processing, the interface object will be different and so the comparison fails (yay!). And we probably don't really want to have global "IRI" objects (and we already have Node).

I think you're better off making each of those a dictionary with keys type and value (except for Literal, that gets type and its three fields). That's easier to test for, safer, and eliminates the interfaces.

I think this whole stuff came from a never finished RDF API, @gkellogg knows more about it.

Update: All of this has been removed in 261c94b

Perhaps better still: do you really need this RDF stuff? Isn't it more the job of a library to output some given syntax based on processing one of the normalised forms?

Very good question :-)

Update: All of this has been removed in 261c94b

Markus Lanthaler
Collaborator

RESOLVED: The callback signature for the .toRDF() method should accept Quad[]. That is, the callback is called once after all processing has been completed.

Markus Lanthaler lanthaler referenced this issue from a commit
Markus Lanthaler lanthaler Update QuadCallback to use array of Quads instead of just a single quad
The toRDF() algorithm hasn't been updated yet to call the callback.

This addresses #200.
077d498
Markus Lanthaler
Collaborator

RESOLVED: Remove the .toRDF() and .fromRDF() WebIDL API calls into a separate document that will not be a part of the JSON-LD 1.0 work. The to/from RDF algorithms will still be a part of the JSON-LD API 1.0 work.

Niklas Lindström
Collaborator

When continuing work on toRDF/fromRDF (in this separate document), I agree on using just dictionaries to represent RDF nodes. I suggest to base these on the JSON representation for SPARQL variable bindings. (Though perhaps changing xml:lang to just lang...)

Gregg Kellogg gkellogg referenced this issue from a commit
Gregg Kellogg gkellogg Move RDF API methods to new json-ld-rdf document.
Cleanup references within RDF algorithms to not require RDf WebIDL definitions.
This addresses issue #200.
e8f0cfc
Markus Lanthaler lanthaler referenced this issue from a commit
Markus Lanthaler lanthaler Remove conformance callback
This addresses #153 and #200.
7a8e01e
Markus Lanthaler lanthaler referenced this issue from a commit
Markus Lanthaler lanthaler Fixed all WebIDL datatypes and wrapped union types in parentheses
I didn't put parentheses around the union types in interfaces and the callback as that seems to break ReSpec.

This addresses #200.
1120c85
Markus Lanthaler
Collaborator

I've addressed most of the issues outlined above. I created a separate issues for the remaining two, namely #205 and #206. Error handling is being discussed in #153.

As per our usual procedure I will therefore close this issue in 24 hours unless I hear objections.

Markus Lanthaler lanthaler referenced this issue from a commit
Markus Lanthaler lanthaler Collect error codes in an enum
This addresses #153 and #200
9efe211
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.