How to include custom fields? #50
Comments
Looks like a reasonable plan |
I'm not convinced yet. I think people don't want to use prefixes like that, especially when such custom fields never make it into an "approved" spec. They would always be second-class citizens, which should not be the case.
So... how about looking at it from a software library dependency management point of view? Let's say everyone is allowed to define an extension (a bunch of properties etc.) and publish it somewhere, maybe GitHub. Each extension is versioned and each version has a URI (which should be a resolvable URL), like https://github.com/Reading-eScience-Centre/stats-for-covjson/releases/tag/0.1.0
How could someone express that they are using that extension in a CovJSON document? I would say this is what profiles are for:
{
"type" : "Coverage",
"profiles": ["GridCoverage", "https://github.com/Reading-eScience-Centre/stats-for-covjson/releases/tag/0.1.0"],
...
}
So maybe GridCoverage etc. are not really profiles after all? Maybe it really is something like "coverageType" and "domainType". And maybe the basic CovJSON structure itself is just a profile:
{
"profiles": ["http://coveragejson.org/1.0", "https://github.com/Reading-eScience-Centre/stats-for-covjson/releases/tag/0.1.0"],
"type" : "Coverage",
...
}
I would even go so far as to take the profiles apart, turn them into an object and make version handling easier:
"profiles": {
"http://coveragejson.org": "1.0",
"https://github.com/Reading-eScience-Centre/stats-for-covjson": "1.0"
}
If you then require extensions to follow semantic versioning, it is easy for clients to know which versions they understand (e.g. if they support 1.0, they will understand 1.1 as well, just not all the new features, but 2.0 would be breaking).
How does this fit in with the 'profile' Link relation type (as in HTTP Link headers or the "profile" parameter in the media type)? I would say it fits very well. You could imagine that one logical coverage is offered in multiple profiles, e.g. different CovJSON versions or different statistics extensions. A client can request those variants:
curl http://example.com/cov -H "Accept: application/prs.coverage+json; profile=\"http://coveragejson.org/2.0 https://github.com/Reading-eScience-Centre/stats-for-covjson/1.0\""
# could redirect to http://example.com/cov_with_uor_stats.covjson2
So each extension/profile must define a URI for each version, but also a base URL for use in the CovJSON document itself for easy handling. If the client didn't send any preference for a certain profile, the server can send whatever it thinks is best. If the server doesn't have a perfect version match available, it again sends the best it can, so it would actually parse the profile URI and extract the version number to do calculations on it (I think this is fine, even though URIs are opaque... in this case there is a clear idea and purpose behind it, and the extension base URI will be the same between versions). And that's content negotiation.
About JSON-LD... the point of all this is to be independent of any complicated JSON-LD processing. However, that does not mean there can't be a JSON-LD context for all the used extensions that are embedded in the document. For example, someone might put a lot of provenance metadata in the root of a CoverageJSON coverage document, which could use the PROV-O ontology as it is. This is a perfect fit for embedding a JSON-LD context since it's clean JSON-LD/RDF by nature, but in addition the profile would prescribe a certain structure so that as many clients as possible can use that provenance data in an easy way.
Opinions please! |
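To make the versioning idea above a bit more concrete, here is a minimal sketch of how a client might check the object form of "profiles" against what it supports. The function names and the `client` support table are hypothetical, and comparing only the major version is a simplification of semantic versioning (it ignores, for example, the special status of 0.x releases).

```javascript
// Hypothetical helpers, not part of any CovJSON library.
// Parse "major.minor" (missing parts default to 0).
function parseVersion(v) {
  const [major, minor = '0'] = String(v).split('.');
  return { major: Number(major), minor: Number(minor) };
}

// Simplified semver rule: a client written for 1.0 can read any 1.x document
// (it just ignores newer features); a different major version is breaking.
function isReadable(supported, declared) {
  return parseVersion(supported).major === parseVersion(declared).major;
}

// Return the profile base URIs in the document that the client cannot handle.
function unsupportedProfiles(docProfiles, clientSupport) {
  return Object.keys(docProfiles).filter(
    (uri) =>
      !Object.prototype.hasOwnProperty.call(clientSupport, uri) ||
      !isReadable(clientSupport[uri], docProfiles[uri])
  );
}

// Example document using the object form of "profiles" (versions are made up).
const doc = {
  profiles: {
    'http://coveragejson.org': '1.1',
    'https://github.com/Reading-eScience-Centre/stats-for-covjson': '2.0',
  },
  type: 'Coverage',
};

// Assumed client capabilities.
const client = {
  'http://coveragejson.org': '1.0',
  'https://github.com/Reading-eScience-Centre/stats-for-covjson': '1.0',
};

const missing = unsupportedProfiles(doc.profiles, client);
console.log(missing);
```

With these assumed versions, the core profile 1.1 is readable by a 1.0 client, while the statistics extension at 2.0 is reported as unsupported. A server doing content negotiation could apply the same comparison to the versions it extracts from the requested profile URIs.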
One more thing: |
I can't pretend I've understood all of that, but it strikes me that we're probably not the first people to have a problem that could be addressed in this way. Is there any precedent for this approach? I think Rob Atkinson has been interested in modular profiles for a while. Also is there a danger that all this becomes a bit too "meta" and makes client development harder because all clients will have to anticipate the possibility of multiple profiles being in play? For CovJSON v1, is it perhaps safer just to say that custom fields can be included but will probably be ignored by most clients? Perhaps in future we will see what kinds of extensions people propose (or whether they propose them at all) and use this to inform the design of a more sophisticated mechanism. |
Very related, especially the comments: https://www.mnot.net/blog/2011/10/12/thinking_about_namespaces_in_json |
If we used profiles for identifying extensions/conventions, this would really be a rehash of JSON-LD contexts and probably not ideal. The problem with JSON-LD, however, is that in the current version with a root context you cannot define a property that is only valid in some subtree; it is always defined globally. And of course, it would still add complexity if clients needed to parse that JSON-LD context, so we would possibly have to put some restrictions on it, like only allowing links to contexts and not in-place definitions: "@context": [
"http://example.com/my-extension"
] This is similar to the profiles idea, with the advantage that the terms could be described in a more technical manner. So, a client could then check for an extension with if (doc['@context'].includes('http://example.com/my-extension')) { ...use extension... } |
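As a rough sketch of that client-side check: JSON-LD allows "@context" to be a single string, an object, or an array mixing both, so a client probably wants to normalise before testing. The helper name `usesExtension` is hypothetical, and the sketch assumes the restriction suggested above, namely that extensions are only ever linked by URL and never defined in place.

```javascript
// Hypothetical helper: does the document link to the given extension context?
function usesExtension(doc, extensionUrl) {
  const ctx = doc['@context'];
  if (ctx === undefined || ctx === null) return false;
  // Normalise the single-value forms (string or object) to an array.
  const entries = Array.isArray(ctx) ? ctx : [ctx];
  // Only linked contexts (plain strings) can match; inline objects are ignored.
  return entries.includes(extensionUrl);
}

const withExtension = {
  '@context': ['http://example.com/my-extension'],
  type: 'Coverage',
};
const withoutContext = { type: 'Coverage' };

console.log(usesExtension(withExtension, 'http://example.com/my-extension'));
console.log(usesExtension(withoutContext, 'http://example.com/my-extension'));
```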
I thought about this again and re-read the relevant sections in the Activity Streams 2.0 core spec: 2.1 JSON-LD and especially 5. Extensibility. I think in the longer run we are better off just doing what they do. We have to adapt the text slightly since CovJSON is not JSON-LD-only in its core, but I don't see a problem with that. And they also mention the case that extensions may include properties not formally defined with JSON-LD:
One detail I really like about how they handle non-JSON-LD defined extensions is that they have So in the end this means extension developers have multiple options, starting from just adding random properties as they like, then maybe putting them under a namespace with a prefix, like |
I think we just have to take a decision here. There's no clear "best" solution, so I think the way to go is to have a solution which allows a backwards-compatible evolution of the format (meaning, old documents are still valid in newer spec versions) and at the same time allow people to add custom extensions in varying degrees of being interoperable/forward-compatible. And this is completely independent of RDF/JSON-LD, though it integrates it. My suggestion would be:
Based on that, there are several levels at which people can add or use extensions:
So, in a way, this extension concept is quite simple because it relies on just two main things on how to do it: either use non-namespaced extensions (use case: unshared data used in own web app etc.), or use a namespace (use case: be future-proof and/or share data). The only problem may be the reliance on https://prefix.cc since this is meant for RDF and wouldn't fit for things like new domain or range types I think. And apart from that it is not normative and a moving target. The alternative would be to have our own registry in the form of a JSON-LD context file (just containing namespace prefixes) which people could extend via GitHub pull requests and which would be imported by the main CovJSON context file so that newly registered extension namespaces are automatically available for JSON-LD clients. Not a bad idea actually. Also, an advantage of using namespaces for new types and then registering the namespaces, e.g. What do you think? |
That all sounds sensible to me. I tend to think that having our own registry as a JSON-LD context file is better than using prefix.cc, for the reasons you give. |
I just thought about this again and I think the strong focus on namespace prefixes is only a good idea for new fields, but not necessarily for types. For example, if someone invents a new domain type and has a single URI for it, then it probably doesn't make much sense to force that person to invent a prefix/namespace and register it with us, as this may be a bit arbitrary. So, I think for types it should be either a compact URI (prefix:name), a full URI, or a simple name. The important part is that the extension author has to decide on one of those, which is then considered the normative one for the extension. And only the first two variants are allowed to be added to our extensions registry. |
Sounds reasonable. Could it be generalised - e.g. "use prefix:name for property names, and full URIs for property values"? |
I'm just thinking about the same. But it may not be appropriate in some cases, for example if you have: "dct:publisher": {
"type": "http://xmlns.com/foaf/0.1/Organization",
"foaf:name": "Vista GmbH"
} instead of "dct:publisher": {
"type": "foaf:Organization",
"foaf:name": "Vista GmbH"
} |
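The two variants above identify the same type once the prefix is expanded. A minimal sketch of that expansion, assuming a small prefix table like the registry discussed earlier in the thread (the `prefixes` object and the `expand` function are hypothetical; the two namespace URIs are the usual Dublin Core Terms and FOAF namespaces):

```javascript
// Hypothetical prefix registry.
const prefixes = {
  dct: 'http://purl.org/dc/terms/',
  foaf: 'http://xmlns.com/foaf/0.1/',
};

// Expand a compact URI (prefix:name) to a full URI.
// Full URIs, simple names and unknown prefixes are returned unchanged.
function expand(value, registry) {
  const idx = value.indexOf(':');
  if (idx === -1) return value; // simple name, nothing to expand
  const prefix = value.slice(0, idx);
  const rest = value.slice(idx + 1);
  if (rest.startsWith('//')) return value; // already a full URI such as http://...
  return Object.prototype.hasOwnProperty.call(registry, prefix)
    ? registry[prefix] + rest
    : value;
}

const compact = expand('foaf:Organization', prefixes);
const full = expand('http://xmlns.com/foaf/0.1/Organization', prefixes);
console.log(compact === full);
```

A client that compares types after expansion would treat both forms of the "dct:publisher" example as equivalent, which is one argument for not forcing a single form on extension authors.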
Yes, true. |
I've tried to change the spec accordingly and added an Extensions section, together with adjusting the JSON-LD section afterwards. What do you think? I think it could work. |
I think it looks good. To summarise, does it work like this:
Can you combine 2 and 3, i.e. use a compact URI whose definition is given in JSON-LD? Is this the purpose of the example in section 8? I guess people might ask why they would need to use "dct:license" when the context file could simply define the meaning of plain "license". |
Yes, it works like that. I added a paragraph in the JSON-LD section on your last comment, hopefully clarifying it. |
I added this page now: http://covjson.org/prefixes/ Should be fairly simple for people to add new namespace prefixes. Let me know if you think it's too complicated. |
Looks v good to me |
OK, let's close this one then. Finally! |
For cases where the spec doesn't define properties, for example, how to express certain statistical information, how should these be included in a custom extension-like way?
I don't really like the way of having a separate container like "properties" or "extensions", it feels a bit heavyweight.
How about forcing a certain naming scheme for custom properties? Like browser vendors do for CSS fields.
For example:
So the requirement would be to have an arbitrary prefix (identifying the organization/person that created the custom property) followed by an underscore followed by the actual property name. If it gets included in the spec, then people would transition to the official names, maintaining compatibility in their clients with the custom name. Since we use camel case everywhere, those custom properties would not collide with anything.
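A minimal sketch of how a client might handle this naming scheme. The key `uor_standardDeviation`, the prefix `uor` and both function names are made up for illustration and do not come from the spec.

```javascript
// Hypothetical: split a key of the form "<prefix>_<name>" into its parts.
// Keys without an underscore (e.g. camel-case spec properties) return null.
function parseCustomKey(key) {
  const idx = key.indexOf('_');
  if (idx <= 0 || idx === key.length - 1) return null;
  return { prefix: key.slice(0, idx), name: key.slice(idx + 1) };
}

// Hypothetical: prefer the official name once it exists in the spec,
// falling back to the older custom key for compatibility.
function readProperty(obj, officialName, customKey) {
  return Object.prototype.hasOwnProperty.call(obj, officialName)
    ? obj[officialName]
    : obj[customKey];
}

const oldDoc = { type: 'Coverage', uor_standardDeviation: 0.5 };
const newDoc = { type: 'Coverage', standardDeviation: 0.7 };

console.log(parseCustomKey('uor_standardDeviation'));
console.log(readProperty(oldDoc, 'standardDeviation', 'uor_standardDeviation'));
console.log(readProperty(newDoc, 'standardDeviation', 'uor_standardDeviation'));
```

The fallback in `readProperty` is the "maintaining compatibility" step: a client keeps reading the custom key until documents have moved to the official name.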