Drop empty arrays (sets) and empty lists in expansion #220

Closed
lanthaler opened this issue Feb 17, 2013 · 16 comments

Comments

@lanthaler
Member

Now that we remove free-floating values and nodes during expansion, shouldn't we also drop empty arrays (sets) and empty lists?

For example:

{
  "@context": {
    "name": "http://xmlns.com/foaf/0.1/name",
    "homepage": {
      "@id": "http://xmlns.com/foaf/0.1/homepage",
      "@type": "@id"
    }
  },
  "@id": "http://me.markus-lanthaler.com/",
  "name": "Markus Lanthaler",
  "homepage": [ ]
}

Shouldn't we drop the homepage property when expanding?

[
  {
    "@id": "http://me.markus-lanthaler.com/",
    "http://xmlns.com/foaf/0.1/name": [ { "@value": "Markus Lanthaler" } ]
  }
]

PROPOSAL 1: Drop empty arrays (sets) when expanding

PROPOSAL 2: Drop empty lists when expanding

Lists are a bit special here since an empty list is actually a value. So it might make sense to keep them but drop empty sets.
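
For illustration only, the two proposals could be sketched as a post-processing step over an already-expanded node object. This is not code from any JSON-LD processor: the names `drop_empty`, `is_empty_list`, `drop_sets`, and `drop_lists` are hypothetical, and the sketch only looks at top-level properties of a single node. It relies on the fact that an empty list shows up in expanded form as `{"@list": []}` inside the property's array, whereas an empty set is just `[]`.

```python
def is_empty_list(item):
    """True for the expanded form of an empty list: {"@list": []}."""
    return isinstance(item, dict) and item.get("@list") == []


def drop_empty(node, drop_sets=True, drop_lists=False):
    """Return a copy of an expanded node object without empty-valued properties.

    drop_sets  corresponds to PROPOSAL 1 (drop properties whose value is []).
    drop_lists corresponds to PROPOSAL 2 (drop properties holding only empty lists).
    """
    result = {}
    for key, values in node.items():
        if key.startswith("@"):
            result[key] = values  # keywords such as @id and @type are kept as-is
            continue
        if values == [] and drop_sets:
            continue  # PROPOSAL 1: nothing to point at, so the property goes away
        if drop_lists and values and all(is_empty_list(v) for v in values):
            continue  # PROPOSAL 2: the only values are empty lists
        result[key] = values
    return result


expanded = {
    "@id": "http://me.markus-lanthaler.com/",
    "http://xmlns.com/foaf/0.1/name": [{"@value": "Markus Lanthaler"}],
    "http://xmlns.com/foaf/0.1/homepage": [],
}
print(drop_empty(expanded))
```

With the defaults shown here (drop empty sets, keep empty lists), the homepage property disappears while a property whose value is `[{"@list": []}]` would survive.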

@lanthaler
Member Author

PROPOSAL 1: +1
PROPOSAL 2: -0.5

@gkellogg
Member

PROPOSAL 1: Drop empty arrays (sets) when expanding

+0

PROPOSAL 2: Drop empty lists when expanding

-1 as you note, an empty list does express information, and is consistent with every other RDF serialization.

@dlongley
Member

PROPOSAL 1 and 2:

-1. I would find this annoying when working with JSON.

@lanthaler
Member Author

More annoying than the fact that properties which are not mapped to an IRI are dropped? More annoying than the fact that free-floating nodes are dropped? More annoying than the fact that null is dropped? :-)

The point is, it doesn’t mean anything. Actually we can’t even represent such data with our data model. It’s a subject-predicate tuple -- there's no object.

@dlongley
Member

More annoying than the fact that properties which are not mapped to an IRI are dropped? More annoying than the fact that free-floating nodes are dropped? More annoying than the fact that null is dropped? :-)

Yes, no, ...yeah.

I'm willing to live with dropping null because if null is a value you expect to see for a property in your application it's not that much more work to deal with its non-existence (sometimes the check is exactly the same). Dropping free-floating nodes is not an issue at all with me (that I can think of). Dropping properties that are not mapped to an IRI is perfectly fine ... my application won't be looking at them anyway.

Now dropping properties that my application wants to see ... and where it is expecting an array, I find that annoying should they disappear. Now I have to permit validators to accept input that is missing the property and then either re-add it myself or do another check for its existence. Why? What does that buy anyone? Suggesting that several other layers of software could alleviate this issue is also a non-starter for me. I don't understand the utility of removing the properties.

@lanthaler
Member Author

Now I have to permit validators to accept input that is missing the property and then either re-add it myself or do another check for its existence. Why? What does that buy anyone?

Effectively the property doesn't exist if it has no value. There's no arc in the graph because there's no node it could point to. As soon as you round-trip to RDF you would lose it (I know, you are not concerned about that) for exactly that reason.

What if the incoming data doesn't contain the property? Is that data then invalid according to your validator? Is there a "must contain this property even if it has no value" requirement somewhere? If there isn't, you need both checks: property not there, and empty array. If you require that the property exists and has a value, you also need more checks: property exists and value != empty array, instead of just property exists.
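
As a rough sketch of the point about extra checks (the helper name `has_value` is made up for this example and works on a plain dict standing in for a processed node): as long as empty arrays are preserved, a consumer that wants "property exists and has a value" has to test for absence, null, and the empty array separately.

```python
def has_value(node, prop):
    """True only when the property is present and has at least one value."""
    value = node.get(prop)
    if value is None:  # property missing, or explicitly null
        return False
    if value == []:    # property present but an empty array
        return False
    return True


print(has_value({"homepage": []}, "homepage"))
```

If the processor dropped empty arrays, the second check would collapse into the first, which is the simplification being argued for here.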

@dlongley
Member

I can reject inputs that don't have the property if I want to, yes. Maybe I want to ensure people are very explicit when they say they don't have any values for property X... and if they aren't, I won't accept their data.

JSON-LD is, primarily, about JSON, IMO. I would expect that most applications that consume JSON but use JSON-LD to preprocess their data do the following:

  1. Only consider those terms that are in the default JSON-LD context for their application. This means that dropping values that aren't mapped to properties likely isn't an issue.
  2. Have situations where the existence of certain properties is expected, and if they don't exist, the application raises an error at the validation layer. This means that having to detect that a property doesn't exist vs. it had no values is an extra step that has to be considered if a JSON-LD processor drops the property. I'd prefer to avoid having to add this extra step because I don't see the utility of dropping the property.

This seems like a case where consistency between processing output and the graph in the abstract is a bad idea. There is extra meaning that is useful to applications that needn't be dropped just for consistency's sake.

@lanthaler
Member Author

Maybe I want to ensure people are very explicit when they say they don't have any values for property X... and if they aren't, I won't accept their data.

They are not saying that - they say nothing. They would need to express it explicitly by using something like owl:Nothing.

  1. Have situations where the existence of certain properties is expected, and if they don't exist, the application raises an error at the validation layer. This means that having to detect that a property doesn't exist vs. it had no values is an extra step that has to be considered if a JSON-LD processor drops the property. I'd prefer to avoid having to add this extra step because I don't see the utility of dropping the property.

Linked Data (RDF) and thus also JSON-LD are based on the open world assumption. Thus, the scenario you explain makes absolutely no sense semantically, IMO at least. We had discussions quite some time ago about what null means and we decided that it means nothing. If we hadn't made that decision, you could use it to explicitly express that the property has no value. But we didn't. An empty array is the same as null. It means nothing.

So, why is it that "homepage": null is dropped but "homepage": [ ] isn't? That is clearly inconsistent IMO and we should resolve it. If you really need that property to be there, frame your data and give it a default value.

@dlongley
Member

They are not saying that - they say nothing. They would need to express it explicitly by using something like owl:Nothing.

This is the point at which JSON developers stop using JSON-LD.

IMO, there is a very large group of developers that JSON-LD can meet the needs of and bring into the linked data world. When we start bringing in esoteric concepts like owl:Nothing instead of letting people use homepage: [], we quickly whittle away the size of that group. While consistency between processing output and the abstract data model is important, I believe it is trumped by usefulness and adoptability. Obviously, we don't want to introduce glaring inconsistencies, but I really don't think leaving properties w/empty arrays in the output does that. I do think that requiring JSON developers to now grasp owl:Nothing begins to impose too onerous a learning curve.

This isn't about getting the exact semantics correct for "really really has no value" vs. "i didn't specify anything", this is about getting people to use linked data in JSON without giving themselves a headache.

"When I don't have any values for a property in JSON, I use an empty array. If I want to preserve that array when I run it through a JSON-LD processor I have to link to the "owl" vocabulary and use the owl:Nothing property? Forget it, I'll just use JSON."

@dlongley
Member

I'll just say this -- I think if we take this sort of direction, we're going to end up requiring framing everywhere. When that's coupled with the fact that we didn't include framing in version 1.0 of the API (not questioning that decision right now, btw) it seems problematic to me ... like a potential barrier to adoption.

@dlongley
Member

We have several different ways of saying "nothing" in the JSON-LD syntax. It seems to me that we ought to pick the one that is most advantageous to JSON developers in our processing output. There's nothing incorrect about that, it only has an upside, IMO.

@lanthaler
Member Author

Well, I see it differently. I don't have a reasonable explanation at hand for the fact that we drop properties with a value that equals null and we don’t do the same for properties whose value is an empty array. It becomes even more confusing when you consider the fact that we compact arrays containing just one element... but it stays an array if there’s no element.

We don’t need to bring in "esoteric concepts like owl:Nothing" at all to explain the behavior. I just mentioned it to illustrate the difference here in this discussion. All we have to say is that properties without a value are dropped, just like free-floating values (values that are not connected by a property to another value). Draw it as a graph and it's even easier to understand.

We have several different ways of saying "nothing" in the JSON-LD syntax. It seems to me that we ought to pick the one that is most advantageous to JSON developers in our processing output. There's nothing incorrect about that, it only has an upside, IMO.

The more ways you have to express something, the more checks you need to find out what has been said.

@msporny
Member

msporny commented Feb 18, 2013

PROPOSAL 1: -1 (we may have to fix the data model, which seems to be broken in this regard)
PROPOSAL 2: -1

@lanthaler I don't think that specifying the empty set is meaningless. I definitely don't think that we should make developers use owl:Nothing. I also agree with @dlongley that we're skirting dangerously close to forcing JSON developers to do something that is very strange in JSON.

So, I think that being able to express empty sets is almost as important as being able to express empty lists in JSON-LD. Empty sets were supported in the original RDF/XML Grammar Event Matching Notation. I don't have a strong opinion yet on whether it should round-trip to RDF or not. I don't think it has to, or if we think it has to, we might want to generate a blank node to represent the set that is an rdfs:Container. We can fix the JSON-LD data model by allowing an object to be the empty set, which I'd expect is still aligned with RDF because you can do _:subject predicate _:object . _:object a rdfs:Container . We could also use rdf:Bag, but I think that's been deprecated.
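
A minimal sketch of the blank-node idea above, assuming one simply emits N-Triples strings by hand. The function `empty_set_triples` and the blank-node labeling scheme are hypothetical; only the `rdf:type` and `rdfs:Container` IRIs are standard. It mirrors the pattern `_:subject predicate _:object . _:object a rdfs:Container .` for a property whose value is the empty set.

```python
import itertools

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_CONTAINER = "http://www.w3.org/2000/01/rdf-schema#Container"


def empty_set_triples(subject, predicate, counter):
    """Represent an empty set as a fresh blank node typed rdfs:Container."""
    bnode = f"_:b{next(counter)}"  # hypothetical labeling scheme
    return [
        f"<{subject}> <{predicate}> {bnode} .",
        f"{bnode} <{RDF_TYPE}> <{RDFS_CONTAINER}> .",
    ]


counter = itertools.count()
for line in empty_set_triples(
    "http://me.markus-lanthaler.com/",
    "http://xmlns.com/foaf/0.1/homepage",
    counter,
):
    print(line)
```

Whether such triples should be produced at all is exactly the open question in this comment; the sketch only shows that the shape is expressible in RDF.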

@lanthaler
Member Author

RESOLVED: Do not drop empty arrays (sets) and empty lists in expansion and compaction.

@lanthaler
Member Author

We discussed this in today's telecon and decided to not change the current behavior, i.e., to keep empty arrays (representing sets & lists) when expanding/compacting a JSON-LD document.

Unless I hear objections, I will close the issue in 24 hours.

@garpinc

garpinc commented Oct 14, 2022

I am not understanding the resolution here. What if I do want empty lists to be turned into owl:Nothing? Is there a hook I can add to enable this translation?

Debugging the code, it seems that https://github.com/jsonld-java/jsonld-java/blob/master/core/src/main/java/com/github/jsonldjava/core/RDFDataset.java provides no hook, via JsonLdOptions or otherwise, to change what happens when values is an empty list. There is also no clear way to provide your own implementation of RDFDataset so you can do what you want. It seems to me that providing a callback via JsonLdOptions for the implementation of com.github.jsonldjava.core.RDFDataset.graphToRDF(String, Map<String, Object>) would be the way to go.
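
One possible workaround that needs no hook in the library is to rewrite the expanded document before handing it to RDF conversion. The sketch below is in Python for brevity and is not jsonld-java API: `mark_empty_as_nothing` is a hypothetical helper, and it assumes that "empty list" here means a property whose expanded value is the empty array `[]`. Only the owl:Nothing IRI is standard.

```python
OWL_NOTHING = "http://www.w3.org/2002/07/owl#Nothing"


def mark_empty_as_nothing(node):
    """Replace empty-array property values with a reference to owl:Nothing."""
    return {
        key: [{"@id": OWL_NOTHING}]
        if not key.startswith("@") and value == []
        else value
        for key, value in node.items()
    }


expanded = {
    "@id": "http://me.markus-lanthaler.com/",
    "http://xmlns.com/foaf/0.1/homepage": [],
}
print(mark_empty_as_nothing(expanded))
```

The same transformation could be applied to each node of an expanded document in any language before calling the processor's to-RDF step, so the property survives as a triple pointing at owl:Nothing.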
