using/defining the "schema" link relation type #522
Comments
@dret thanks! Yes, we were planning to register it as part of the JSON Schema draft. Or motivated by the JSON Schema draft, or whatever the appropriate mental model is. But I have no idea what approach is likely to be successful, and any advice would be appreciated. Should we try to register it now, or is that something that must wait until we're at RFC or at least adopted by a working group? |
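I almost put in an extension link relation like tag:json-schema.org,19-11-2017:schema or something, but that seemed more likely to confuse people, and then folks would have to change it again if/when we got the link relation registered. |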
definitely never ever put identifiers in specs and then change them. that’s the X- disaster all over again...
… On Dec 19, 2017, at 19:32, Henry Andrews ***@***.***> wrote:
I almost put in an extension link relation like tag:json-schema.org,19-11-2017:schema or something, but that seemed more likely to confuse people, and then folks would have to change it again if/when we got the link relation registered.
|
Hah! Yeah, makes sense. We only changed from "profile" to "schema" due to the discussions here that "profile" really wasn't quite right. Given that "schema" is a pretty common concept applying to many media types, it seems like a good candidate for link relation / media type parameter. I did just see dret/I-D#92 about maybe using HTTP `Prefer` or some other proposal instead / in addition to media type parameters for "profile", so I'd be interested in considering similar options for "schema" if that makes sense. I see "profile" and "schema" as similar, but while "profile" identifies a broadly applicable subset, "schema" identifies a set of values suitable for a much more specific purpose. It can also identify something broad, but the ability to be specific is what (I think) distinguishes it. |
On 2017-12-19 19:31, Henry Andrews wrote:
@dret <https://github.com/dret> thanks! Yes, we were planning to
register it as part of the JSON Schema draft. Or motivated by the JSON
Schema draft, or whatever the appropriate mental model is. But I have no
idea what approach is likely to be successful, and any advice would be
appreciated.
writing up a draft of its own may be a bit much. it would have the
advantage of being more visible, and not creating the impression that
"schema" is for JSON only.
but it's perfectly fine to leave it in there and make it part of the
draft's IANA considerations. the only advice i have is to create a
section for it that defines and describes it, and that is very careful
at making it independent of JSON schema. that way, anybody interested in
the definition could just read that section and wouldn't have to dive
into the rest of the spec.
Should we try to register it now, or is that something that must wait
until we're at RFC or at least adopted by a working group?
process-wise, if you want to register it now, a separate draft is the
best way to go (it's not the only way, as there is no formal requirement
that only RFCs can register link relation types). you can also leave it
in there and then formally speaking, it would only become registered
once JSON schema becomes an RFC.
while the IANA registry is only updated after RFC status, at
http://webconcepts.info/concepts/link-relation/ i am compiling quite a
number of link relation types that are defined in drafts. the reason is
that these often are already in use, so keeping them "under wraps"
probably is less useful than telling people that these are things that
are under consideration/development. it is then up to them to decide
whether they already want to get on board, or wait for the official
registry to be updated.
|
On 2017-12-19 20:33, Henry Andrews wrote:
Hah! Yeah, makes sense. We only changed from "profile" to "schema" due
to the discussions here that "profile" really wasn't quite right. Given
that "schema" is a pretty common concept applying to many media types,
it seems like a good candidate for link relation / media type parameter.
it certainly has popped up quite often, in a variety of communities. i
am not quite sure what the best word is. some communities might call
this a vocabulary or an ontology, but it may be hard to find a word that
works perfectly for everybody.
I did just see dret/I-D#92 <dret/I-D#92> about
maybe using HTTP |Prefer| or some other proposal instead / in addition
to media type parameters for "profile", so I'd be interested in
considering similar options for "schema" if that makes sense.
i am not sure that it makes the same sense. a profile is an additional
"layer" of constraints on top of a schema, so to speak. it is always
safe to ignore it, because the baseline always is the schema (i.e., the
media type).
for complex scenarios, you might need something more complex, such as
the `Profile` header proposal (which to my mind uses the term "profile"
in a rather different way from RFC 6906 anyway and to some extent simply
may be actually a schema negotiation proposal).
I see "profile" and "schema" as similar, but while "profile" identifies
a broadly applicable subset, "schema" identifies a set of values
suitable for a much more specific purpose. It can also identify
something broad, but the ability to be specific is what (I think)
distinguishes it.
a schema sets the basics in how to represent concepts. if you cannot
agree on a schema (which to me is nothing but the implementation of a
domain model), then there is no mutual understanding.
a profile assumes that there is a baseline, and the profile is an
optional layer of conventions. it should always be safe to go back to
the baseline and still have the mutual understanding established by that
baseline. (that's pretty much the atom example from the profile spec)
|
This seems like a good key distinction: the schema is about representation. You may have different valid and useful representations of an abstract type (which might clarify the distinction between "type", registered as describing the "abstract semantic type") vs "schema" (describing a particular way to represent that type in a data model, which may then be encoded in one or more media types). So you could potentially negotiate both the schema (describing the data model for a representation) and media type (the concrete encoding of that data model into a JSON/YAML/XML/whatever document). |
On 2017-12-30 11:35, Henry Andrews wrote:
So you could potentially negotiate both the schema (describing the data
model for a representation) and media type (the concrete encoding of
that data model into a JSON/YAML/XML/whatever document).
at least for my terminology this is not how i think about schemas.
schemas are defining models and their representation, let's say an XML
model in terms of how it is represented in XML concepts. that tells you
everything you need to know in terms of how this will look on the wire.
same for JSON schemas.
for me, RDF is the outlier here, as an exception to the usual way
people use media types. RDF is an abstract model and you need to
know a media type to determine how to serialize it. but the general idea
that cultivating a zoo of model serializations is a good thing is more
the exception than the rule.
one way or the other, i would be clear about how you use these terms,
and if possible try to align them with "standard usage". but of course
that's not a super well-defined concept to begin with...
|
JSON Schemas work on a data model that is derived from but not specific to JSON. This is why I view the schema and media type as separate. JSON Schema keeps them separate, and the expectation is that people may use JSON, JSON5, YAML, TOML, CBOR, Protobuf, whatever with them. Some of those are a better fit than other, but the extensibility of JSON Schema allows for describing concepts more precise than can be directly expressed in the data model. You could probably make it work for XML to some degree, but I can't imagine why anyone would. So the data model and media type are not entirely separate, but they are definitely not identical. Any definition of "schema" that requires them to be would be problematic for JSON Schema. |
On 2017-12-30 12:00, Henry Andrews wrote:
JSON Schemas work on a data model
<https://tools.ietf.org/html/draft-handrews-json-schema-00#section-4.2.1> that
is derived from but not specific to JSON. This is why I view the schema
and media type as separate. JSON Schema keeps them separate, and the
expectation is that people may use JSON, JSON5, YAML, TOML, CBOR,
Protobuf, whatever with them. Some of those are a better fit than others,
but the extensibility of JSON Schema allows for describing concepts more
precise than can be directly expressed in the data model. You could
probably make it work for XML to some degree, but I can't imagine why
anyone would.
interesting info, thanks. so it's more "JDM" (keeping in line with the
XDM) than actually strictly "JSON schema". that makes sense as for
example XSD also technically is an XDM schema language. it's just that
nobody ever cared (much) because XML is the only relevant serialization
out there.
it's just that in my mind, when you go this route, this recurring idea
of being able to indicate a "model" to me terminologically makes more
sense than the term "schema", but that's really just a question of how
people typically use these terms in their own scenarios.
(and of course you could argue that "model" is too abstract and that you
could have multiple schemas for the same model, which of course is true.
it's complicated. ;-)
So the data model and media type are not entirely separate, but they are
definitely not identical. Any definition of "schema" that requires them
to be would be problematic for JSON Schema.
ok, fair enough. i would just make all of this explicit so that people
understand the terms and how they are used in that context.
|
Thanks, @dret this discussion is very helpful. It's definitely occurred to me that "schema" is not necessarily the best name for JSON Schema, particularly given that people use it for so many things including data model definitions (code generation, doc generation, ui generation) to the extent that we are looking at adding vocabularies specifically for those purposes. Then again, the project's been called JSON Schema for many years, and changing the name would likely kill any momentum of the larger ecosystem, so... ¯\_(ツ)_/¯ |
I was only casually paying attention, but I wasn't actually aware the specification changed the link relation we were using. No existing link relation was sufficient?
For example
https://tools.ietf.org/html/rfc5988 (HTTP Link header, including description of the "type" attribute) I know @dret suggested "profile" was incorrect for us but my reading of RFC6906 suggests we were using it exactly correctly. Perhaps I wasn't describing what we're intending to do very clearly? |
@awwright duuuuuuuuude....
At some point, I can't keep waiting. I know I pinged you on email and slack because I know you are interested in this area. This is the 2nd time in the past month that someone has wanted to revisit something that I begged and pleaded for comments on. I don't know what to do when people won't reply in any meaningful way for over a month. I try direct email, @-mentions, mailing list, IRC, slack, anything and everything I can think of. I announced the final review period everywhere, often repeatedly, and it was open for a solid month. I did everything I could think of.

In all seriousness, what am I missing? What do people need in order to make a timely review before publication? Or at least skim the change log, which lists this change?

Regarding "profile", if you want to keep arguing with @dret about it go ahead, but I found his reasoning quite clear and see no reason to revisit it. I don't know how you read him clearly stating that schemas are not profiles and come to a different conclusion. You also cited HTML's "profile", which does seem much closer to what we would want, but the response to that is that just because HTML uses the same word, that does not mean that it is using it in the RFC 6906 sense, and you never replied to that concern (or made any further replies on the issue at all for the remaining ten months that it was open). There is also the

Regarding "type", as far as I can tell we both agreed that it was not quite right. It is specified to identify the "abstract semantic type", which is a more generic concept. If I identify something's "type" as "car", there may be numerous schemas that describe different ways of representing a car. In particular, as the representation evolves, new schema versions will be published and used, but the "abstract semantic type" remains the same. So I see "type" as serving a clear purpose of abstracting away concrete representation details, while the relation we need is specifically about concrete representation.
As for "describedBy", it's still in the spec and still doing exactly what it was doing before. Although I'm not actually sure we should continue using it. AFAICT it is more often used for things like human-readable documentation, which would be useful alongside of schema links. We were also using it because "profile" is specifically an identifier and not a locator, so "describedBy" was to be the locator. But we could define things differently with our own media type parameter / link relation if we want. I'm definitely not dead-set on getting rid of it, but once we settle on the behavior we want from our own relation type we should make sure our usage still makes sense. |
@handrews Actually I do remember a lot of that now that you mention it. Have we pinged @RubenVerborgh to get his take on how/if his usage is similar to ours & RFC 6906? Edit: Yeah you did that too, nice. Edit: My reading of rel="type" is that it's extremely generic:
So while something more specific would be preferable, I still think it could be relevant |
Jup, pinging happened here: https://github.com/ProfileNegotiation/I-D-Accept--Schema/issues/13#issuecomment-354633650 |
@awwright I think that "type" is relevant, just not sufficient. I would actually use it alongside of a more specific link relation, such that in an API that evolves (or even does coarse-grained versioning in the base URI) the resources would have a stable "type" but different schemas. So I could use the same "type" values across the current kinda-REST-ish version of the API I'm working with, and also with a replacement fully RESTful API, even though the schemas would be very different. I'm not sure how that would be of use to clients off the top of my head, but it seems like a worthwhile distinction. I think we need to figure out what level of specificity we want. I have been going for something more specific, particularly to enable content negotiation. But I think that `Accept-Profile` is the most promising avenue for schema-based content negotiation, which makes me a bit less concerned. Still, we need a media type that will work for us in non-HTTP environments (e.g. will CoAP adopt `Accept-Profile` as well?) |
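To make the "stable type, changing schema" idea concrete, here is a minimal sketch of how a server might emit both relations in an HTTP `Link` header. Everything specific in it is an assumption for illustration: the example.com URIs are invented, the helper names are hypothetical, and "schema" is not a registered link relation type at the time of this thread.

```python
def format_link_header(links):
    """Serialize (target URI, relation type) pairs as one Link header value."""
    return ", ".join(f'<{target}>; rel="{rel}"' for target, rel in links)


def links_for_person(schema_version):
    """Hypothetical helper: the "type" target stays fixed, the "schema" target is versioned."""
    return [
        # Abstract semantic type: the same across representation versions.
        ("https://example.com/types/person", "type"),
        # Concrete representation: changes as the schema evolves.
        # Assumption: "schema" used as an (unregistered) relation type.
        (f"https://example.com/schemas/person/{schema_version}", "schema"),
    ]


old_header = format_link_header(links_for_person("1.0.0"))
new_header = format_link_header(links_for_person("1.1.0"))
print(old_header)
print(new_header)
```

A client comparing the two header values would see the same "type" target and a different "schema" target, which is the distinction being argued for here.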
On 2018-01-02 09:29, Henry Andrews wrote:
I think we need to figure out what level of specificity we want. I have
been going for something more specific, particularly to enable content
negotiation. But I think that |Accept-Profile| is the most promising
avenue for schema-based content negotiation, which makes me a bit less
concerned. Still, we need a media type that will work for us in non-HTTP
environments (e.g. will CoAP adopt |Accept-Profile| as well?)
fwiw, i have started working on an update of RFC 6906, the idea being to
improve the wording, and to add an HTTP preference for profiles. that
would not be as complex as the Accept-Profile mechanism, but should be
better than having to depend on media type parameters.
any feedback on that feature as well as on what other improvements of RFC
6906 are possible are very welcome (ideally in the form of issues).
https://github.com/dret/I-D/tree/master/rfc6906bis
|
@dret Will this updated version of the RFC explain how to achieve such a preference with existing means, or will it introduce a new header (as we're planning for https://github.com/profilenegotiation/I-D-Accept--Schema/)? |
On 2018-01-03 05:06, Ruben Verborgh wrote:
@dret <https://github.com/dret> Will this updated version of the RFC
explain how to achieve such a preference with existing means, or will it
introduce a new header (as we're planning for
https://github.com/profilenegotiation/I-D-Accept--Schema/)?
it will not introduce a new header, but a new registered value for an
existing header: the plan is to register a "profile" HTTP preference,
which then allows HTTP-level profile negotiation.
|
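As a rough illustration of what a "profile" HTTP preference could look like from the client side, the sketch below builds (but does not send) a request carrying a `Prefer` header. This is only a guess at the shape: the thread does not give the syntax, so the `profile="<URI>"` form, the example.com URLs, and the `build_request` helper are all assumptions. The only thing taken from existing specs is the general `name=value` form that the `Prefer` header allows.

```python
import urllib.request


def build_request(resource_url, profile_uri):
    """Build a request that states a preference for a given profile.

    Assumption: a "profile" preference whose value is the profile URI as a
    quoted string. No such preference is registered in this thread's timeframe.
    """
    prefer_value = f'profile="{profile_uri}"'
    return urllib.request.Request(resource_url, headers={"Prefer": prefer_value})


request = build_request(
    "https://api.example.com/people/1",
    "https://example.com/profiles/some-profile",
)
# Inspect the header locally; nothing is sent over the network.
print(request.get_header("Prefer"))
```

Because preferences are optional hints, a server that does not understand the preference could ignore it and return its default representation, which matches the "always safe to ignore" view of profiles described earlier in the thread.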
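@dret but to clarify, your "profile" preference would use your definition of "profile", which excludes schemas, so could not be used for JSON Schema, correct? So our options are:
* Register a "schema" preference
* Jump on the "Accept-Profile" bandwagon since it uses a broader definition of "profile" that includes schemas
* Come up with something else (like our own media type with its own parameters such as a "schema" parameter)
Is this correct? |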
On 2018-01-05 09:27, Henry Andrews wrote:
@dret <https://github.com/dret> but to clarify, your "profile"
preference would use your definition of "profile", which excludes
schemas, so could not be used for JSON Schema, correct?
that's just my conceptualization and i know that others see things
differently. for me:
- a schema is based on some metamodel and has a way to define a model.
thus a schema introduces a new abstraction layer.
- a profile is a specialized way to use a model based on additional
constraints. it adds to the model (by adding constraints), but doesn't
add a new abstraction layer.
So our options are:
* Register a "schema" preference
* Jump on the "Accept-Profile" bandwagon since it uses a broader
definition of "profile" that includes schemas
* Come up with something else (like our own media type with its own
parameters such as a "schema" parameter)
Is this correct?
i don't want to be the one telling you what your options are. it seems
that people are trying to do somewhat related things in this area, but
then again it often seems like they're far apart enough to make it hard
for people to join forces. in particular, "schema" seems to be the
walking dead among the link relations.
i'd be more than happy to see if a revised RFC 6906 may be more useful
for you. but if i recall correctly, you very specifically wanted to use
profiles for the use cases of schemas that i wanted to separate them
from. that may make it hard to find changes that fit your needs.
FYI: https://github.com/ProfileNegotiation/I-D-Accept--Schema
|
I don't quite get the "schema-based content negotiation" part. Can you clarify? In my understanding, an instance can only have one schema. This contrasts with profiles (in the sense of "application profiles" a la Dublin Core); an instance may be represented through several profiles. |
@dlax certainly! What I'm thinking of is a REST API that evolves representations at the per-resource granularity, rather than doing coarse grained URI-based versioning. In this view, resources (the abstract things on the server) are not versioned. If I have Person resources, the resource is always Person. Person is a concept, it does not change. However, the representation of a Person in a given API will likely change. Fields are added, enum sets are changed, fields may even be removed in a compatible way if they were not required. The versioning of the representation is expressed by assigning a new schema to each successive version (and the URIs for the schemas probably do have some sort of version in the URI- semantic versioning, a date-time stamp, whatever). So let's think particularly of the case where the representations are all compatible with each other, and the schemas are designed to support that, meaning:
There are other constraints you can use on schema design, but these illustrate the point. So:

{
  "$id": "https://example.com/schemas/some-entity/1.0.0",
  "type": "object",
  "properties": {
    "foo": {"type": "integer"},
    "bar": {"type": "string"}
  }
}

{
  "$id": "https://example.com/schemas/some-entity/1.1.0",
  "type": "object",
  "properties": {
    "foo": {"type": "integer"},
    "stuff": {"type": "boolean"}
  }
}

The instance:

{
  "foo": 42,
  "bar": "hello",
  "stuff": false,
  "nonsense": null
}

validates against either the 1.0.0 or 1.1.0 schemas. If you're doing something with

So if I'm an API client using the "some-entity" resource, I may have started out while it was on version 1.0.0. I would use content negotiation (media type parameter, HTTP Prefer header preference, or a new Accept-Profile or Accept-Schema header) to ask for a representation matching schema version 1.0.0. The response can come back with the instance above and link to both the 1.0.0 and 1.1.0 schemas, because it is valid according to both and therefore usable as either. I can use this information to log that the resource has been updated, and then a human can decide whether to start asking for 1.1.0 (because stuff is needed) or stay with 1.0.0 (because bar is still needed).

I can go into more detail but I'll pause here to see if this is making sense. |
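The claim that one instance validates against both schema versions can be checked with a small sketch. The validator below is a deliberately tiny stand-in, not a JSON Schema implementation: it handles only the `type` and `properties` keywords that appear in the two schemas above, and the function names are made up for this illustration.

```python
def matches_type(value, type_name):
    """Check a decoded JSON value against one JSON Schema primitive type name."""
    if type_name == "integer":
        # bool is a subclass of int in Python, but a JSON boolean is not an integer.
        return isinstance(value, int) and not isinstance(value, bool)
    if type_name == "string":
        return isinstance(value, str)
    if type_name == "boolean":
        return isinstance(value, bool)
    if type_name == "object":
        return isinstance(value, dict)
    raise ValueError(f"type not handled in this sketch: {type_name}")


def is_valid(instance, schema):
    """Apply only the "type" and "properties" keywords of the given schema."""
    if not matches_type(instance, schema["type"]):
        return False
    for name, subschema in schema.get("properties", {}).items():
        # Listed properties are optional here (no "required" keyword is used),
        # and properties the schema does not mention are allowed.
        if name in instance and not matches_type(instance[name], subschema["type"]):
            return False
    return True


SCHEMA_1_0_0 = {
    "$id": "https://example.com/schemas/some-entity/1.0.0",
    "type": "object",
    "properties": {"foo": {"type": "integer"}, "bar": {"type": "string"}},
}
SCHEMA_1_1_0 = {
    "$id": "https://example.com/schemas/some-entity/1.1.0",
    "type": "object",
    "properties": {"foo": {"type": "integer"}, "stuff": {"type": "boolean"}},
}
INSTANCE = {"foo": 42, "bar": "hello", "stuff": False, "nonsense": None}

# Collect the identifiers of every schema the instance satisfies.
matching = [s["$id"] for s in (SCHEMA_1_0_0, SCHEMA_1_1_0) if is_valid(INSTANCE, s)]
print(matching)
```

The instance passes both because neither schema forbids unlisted properties, so a response carrying it could reasonably link to both schema URIs.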
On Jan 6, 2018, at 13:31, Henry Andrews ***@***.***> wrote:
What I'm thinking of is a REST API that evolves representations at the per-resource granularity, rather than doing coarse grained URI-based versioning.
....
I can use this information to log that the resource has been updated, and then a human can decide whether to start asking for 1.1.0 (because stuff is needed) or stay with 1.0.0 (because bar is still needed).
I can go into more detail but I'll pause here to see if this is making sense.
all of this makes sense and as you can tell, our general design thinking is very close: http://dret.typepad.com/dretblog/2016/04/robust-extensibility.html
what i am wondering is why you then wouldn’t simply call it what it is (semantic versioning, as bad as that name is), represent that in your instance, and be done. why the need to piggyback all the versioning ideas you have onto schemas and their naming conventions? it seems like you’re mixing things a little.
|
On Jan 6, 2018, at 12:52, Denis Laxalde ***@***.***> wrote:
I don't quite get the "schema-based content negotiation" part. Can you clarify? In my understanding, an instance can only have one schema. This contrasts with profiles (in the sense of "application profiles" a la Dublin Core); an instance may be represented through several profiles.
ever heard of DSDL? there almost by definition each instance has multiple schemas. it’s actually a very clever approach: modularize validation like any other non-trivial task, and allow schemas to be composed of multiple languages with each language having a specific focus.
|
This makes sense, thanks!
Actually, I was more thinking about "schema as a data model" rather than "schema as a validation tool". |
@dret <https://github.com/dret>
ever heard of DSDL? there almost by definition each instance has
multiple schemas. it’s actually a very clever approach: modularize
validation like any other non-trivial task, and allow schemas to be
composed of multiple languages with each language having a specific
focus.
Actually, I was more thinking about "schema as a data model" rather than
"schema as a validation tool".
that's what it looked like to me. but i think then you're more moving
into the direction of the "type" link, which opens up a different can of
worms (and probably should be treated as something different, as a
schema typically is an implementation of a type).
even something as widely known as HTML has different schemas, or at
least it used to have when it was more declaratively defined. there was
a "strict" schema intended to be used for production, and a "loose"
schema intended to be used for consumption. regardless of the details,
it just demonstrates that "schema" and "type" are different things.
@handrews <https://github.com/handrews> explained his intended usage of
content negotiation for the former "definition" and I now see how a
client may ask for a resource that follows a given data model.
that again sounds very much like "type".
So about multiple validation and DSDL, yes, a resource may have several
complementary schemas, but it's not clear to me how negotiation would
come into play as far as validation is concerned. Would a client ask for
a resource that validates with a particular technology? Would that
resource's representation be different if another technology had been
asked? Maybe it's just irrelevant and only the data model (or profile)
point of view matters for negotiation.
well, DSDL probably is not a thing anymore. but yes, it used to be the
case that you would validate for the concern you had at the moment. if
you wanted to see if character ranges were respected, you would validate
with that particular schema. to figure out how namespaces were being
used you used the language specialized in that, and so forth. it was
actually quite an elegant design.
but really, let's not talk about DSDL. i just wanted to point out that
there can be multiple schemas, and that what you seem to be talking
about is more concerned with just versioning and possibly "type".
|
https://json-schema.org/latest/json-schema-core.html#rfc.section.11.1 proposes: But then the final paragraphs of https://json-schema.org/latest/json-schema-core.html#rfc.section.11.2 has: I don't understand why these use different link relations? Sorry if I should have raised this as a new issue, @handrews! |
it seems that -07 recommends using `schema`-typed links (as one method) to link an instance to a schema. this link relation is not registered (yet). this means either there should be some spec doing this, or maybe the JSON schema draft should do it. if the latter and you need some help, let me know.