This repository has been archived by the owner on Apr 13, 2019. It is now read-only.

Profile header or profile link? #11

Closed
RubenVerborgh opened this issue Apr 14, 2017 · 61 comments

Comments

@RubenVerborgh
Member

Should entity bodies be described by:

  1. a Profile HTTP header?

  2. a Link HTTP header with a profile relation?

  3. both?

@larsgsvensson
Member

If I remember the discussion in Amsterdam correctly, the idea was to use the Profile header and the Link header with the keyword "profile" (not rel="profile" but rel="self"; profile="<urn:example:whatever>")

@RubenVerborgh
Member Author

not rel="profile"

Then we go against RFC6906, which seems problematic.

@larsgsvensson
Member

Do we know how widely deployed RFC 6906 is? (or put differently, is it really a problem if we propose another solution to the problem RFC 6906 tries to solve if the new solution is better (which I think it is since we can supply more information about the referenced resource, e. g. the combination of media type and profile(s)))?

@RubenVerborgh
Member Author

RubenVerborgh commented Apr 18, 2017

We'd have to very carefully weigh our options.

We should ask @dret if he knows about usage statistics of RFC6906.

@larsgsvensson
Member

OK, can you contact him?

@RubenVerborgh
Member Author

Yes, the ping above should work.

@dret

dret commented Apr 20, 2017 via email

@hvdsomp
Collaborator

hvdsomp commented Apr 20, 2017

Hey Erik, thanks for commenting. Unfortunately, despite your blog post, I (and I assume others) remain in the dark as to what "profile" really is about and what it is not about, despite the RFC, your blog post and your response to my comment to it. I'm not trying to be thick, just plain honest. What would, for example, help is to show what language in the RFC excludes "profile" from being used for e.g. schema.

Cheers

Herbert

@dret

dret commented Apr 20, 2017 via email

@larsgsvensson
Member

Hi Erik and thanks for your thoughts.

Like @hvdsomp, I'm also a bit at odds figuring out exactly what a profile is. The posts you've shared provide some clarification, but still leave some questions open.

I'm particularly interested in your views on the choice of "profile" vs "schema" since you use both terms in your posts. What I want to be able to exchange information on are things like XML Schemas, SHACL shapes or Dublin Core Application Profiles, allowing a server to say that an XML document would validate against a specific XSD document, or an RDF representation (irrespective of serialisation (RDF/XML, Turtle...)) would adhere to the constraints specified in a SHACL or ShEx document. Would you consider the SHACL, ShEx or XSD files to be profiles or schemas?

In the first version, the I-D was about "schema negotiation" (cf #1) and later we decided to call it "profile negotiation" instead. Perhaps we need to reconsider that. I've re-opened #1 until we've decided on that.

@RubenVerborgh
Member Author

I would propose to follow the description of a profile from RFC 6906:

For the purpose of this specification, a profile can be
described as additional semantics that can be used to process a
resource representation, such as constraints, conventions,
extensions, or any other aspects that do not alter the basic media
type semantics.

And from XML Schema:

An XSD schema is a set of components such as type definitions and element declarations. These can be used to assess the validity of well-formed element and attribute information items (as defined in [XML Infoset]), and furthermore to specify additional information about those items and their descendants.

Following the above, it seems to me that schemas are a proper subset of profiles. In other words, everything that is a schema is also a profile, but not every profile is a schema.

Given that we likely also want to support several JSON scenarios, and that most cases there do not have formal schemas that allow validity assessment, I would opt for "profile". Also, I see no need to restrict us unnecessarily to schemas, when we can also support the wider set of profiles with the same effort.

If we want our own definition of "profile", I would (following my SDSVoc contribution) propose something like:

A profile is a set of structural and semantic constraints of representations, which can apply in addition to syntactic, structural, and semantic constraints defined by a MIME type.

@dret

dret commented Apr 24, 2017 via email

@dret

dret commented Apr 24, 2017 via email

@RubenVerborgh
Member Author

but you are thinking in terms of "there must be one resource that represents
the profile in a machine-processable way that is the one true
representation of the profile".

That is not the case, so we should be good then.

@larsgsvensson
Member

like i said, i seem to have a problem expressing this. one of the main aspects of profile said very explicitly in the RFC is that it is an identifier, not a link. these are different things.

OK, sloppy terminology on my side: the proposal we're discussing here is about profile (or schema...) identifiers, not about links

@larsgsvensson
Member

to achieve the biggest impact in practice my suggestion would be to go for profiles, but to treat them in their RFC 6906 spirit: they are identifiers only (and profiles can be implemented by 0-n schemas, but these are outside of the scope of profile negotiation).

I'd say that even if profiles and schemas are different things, we cannot completely ignore the schema negotiation. When a client has requested (and received) a resource it also might need to find a corresponding schema against which it can validate the resource and that requires negotiation, too (e. g. for an XML document to decide if the profile should be implemented by a DTD, an XSD or a RELAX NG document, or for an RDF document if the schema is implemented by a SHACL or a ShEx shape).

@RubenVerborgh
Member Author

When a client has requested (and received) a resource it also might need to find a corresponding schema against which it can validate the resource and that requires negotiation, too

So how about we use the same mechanism for negotiating profiles and schemas? (Given that "a schema is one possible way to implement a profile.")

@dret

dret commented Apr 24, 2017 via email

@dret

dret commented Apr 24, 2017 via email

@RubenVerborgh
Member Author

how would that reflect situations where the number of schemas for a profile is != 1

The idea is that multiple profiles/schemas can be indicated.

@dret

dret commented Apr 24, 2017 via email

@dret

dret commented Apr 24, 2017 via email

@RubenVerborgh
Member Author

how would you know which ones are profile identifiers and which ones are schema links?

The assumption is that the client has them hard-coded.

per RFC, multiple profile URIs mean that multiple profiles are signalled as being present by using multiple identifiers.

+1

@dret

dret commented Apr 24, 2017 via email

@RubenVerborgh
Member Author

how would a client know if it encounters an
unknown URI if this is a profile identifier it does not know or a schema
link it can follow to find a schema resource?

With the current design, it would not. We can add this to requirements (#16) if this is a necessity.

@dret

dret commented Apr 24, 2017 via email

@RubenVerborgh
Member Author

The current scenario is:

  • clients ask for profiles
  • server responds with a subset of those profiles

So clients know what to expect.
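
To make the "clients know what to expect" point concrete, here is a minimal Python sketch of the check a client could run on a response. The function name and the idea of treating unrequested profile URIs as a warning sign are my own assumptions for illustration, not something the draft specifies.

```python
def unexpected_profiles(requested, returned):
    """Return the profile URIs in the response that the client never asked for."""
    asked = set(requested)
    return [uri for uri in returned if uri not in asked]


# Hypothetical exchange: the client asked for two profiles,
# the server answered with one of them.
requested = ["http://example.org/profile-1", "http://example.org/profile-2"]
returned = ["http://example.org/profile-2"]
surprises = unexpected_profiles(requested, returned)
```

An empty result means the server stayed within the requested set, so the client only ever sees identifiers it already has hard-coded.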

@dret

dret commented Apr 25, 2017 via email

@RubenVerborgh
Member Author

Yes, that's what we aim to do.

@dret

dret commented Apr 25, 2017 via email

@larsgsvensson
Member

frankly, it seems to me you're trying to engineer around one of RDF's oddities, which is the assumption that everything is just RDF

This draft is not intended to be specific for RDF at all (see #13).

+1 (that's why I use XML examples)

A "profile" could also signal, for instance, the presence of a certain key/value pattern in a JSON document. It could also be HTML with a certain microformat. It could be many, many things, only a couple of which are specific to RDF (certain vocabularies, certain JSON-LD frame).

JSON is definitely a use case (particularly since Ruben's talk in Amsterdam was titled "Your JSON is not my JSON"). The reason I use XML examples (with profiles implemented as DTDs, XSDs and RELAX NG schemas) is simply that I don't know enough about the capabilities of JSON Schema.

@dret

dret commented Apr 25, 2017 via email

@larsgsvensson
Member

On 2017-04-25 14:34, Ruben Verborgh wrote:
Yes, that's what we aim to do.

i sense a disconnect between what you're saying and what lars is talking about. maybe it would be good to clarify how this mechanism is supposed to be useful and to work on the web in general.

My vision is that profile negotiation works just like negotiation on media type (Accept / Content-Type) or language (Accept-Language / Content-Language). If we assume (and that's a large if, just to show how I intended this to work) that the profile content negotiation is made using the Accept-Profile and Profile headers (analogous to Accept-Language and Content-Language), the interaction would be along these lines:

GET /some/resource HTTP/1.1
Host: example.org
Accept: application/xml, text/xml
Accept-Language: en;q=0.8, fr;q=0.7
Accept-Profile: <http://example.org/profile-1>;q=1.0, <http://example.org/profile-2>;q=0.7

HTTP/1.1 200 OK
Content-Type: text/xml
Content-Language: fr
Profile: <http://example.org/profile-2>

Here, <http://example.org/profile-1> and <http://example.org/profile-2> are profiles that can be implemented using different schemas (DTD, XSD, SHACL, DCMI Application Profile, ...). Originally, I intended those profiles to be applicable to specific media types but having the profile as an abstract model that has specific implementations in specific schema languages is fine, too, as long as we can get from the profile to the schema.
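
For what it's worth, the server side of such an exchange could be sketched roughly as follows in Python. This assumes the hypothetical Accept-Profile syntax from the example above (URIs in angle brackets, optional q parameter defaulting to 1.0); the function names and the tie-breaking rule (header order) are my own assumptions, not part of any spec.

```python
import re


def parse_accept_profile(header):
    """Parse '<uri>;q=0.7, <uri2>' into a list of (uri, q) pairs."""
    prefs = []
    for match in re.finditer(r"<([^>]*)>((?:\s*;\s*[^,;]+)*)", header):
        uri, params = match.group(1), match.group(2)
        q = 1.0  # assumed default weight when no q parameter is given
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                q = float(value.strip())
        prefs.append((uri, q))
    return prefs


def choose_profile(header, available):
    """Return the available profile with the highest q > 0, or None."""
    candidates = [(q, uri) for uri, q in parse_accept_profile(header)
                  if uri in available and q > 0]
    if not candidates:
        return None
    best_q = max(q for q, _ in candidates)
    # On equal weights the profile listed first in the header wins.
    return next(uri for q, uri in candidates if q == best_q)


header = ("<http://example.org/profile-1>;q=1.0, "
          "<http://example.org/profile-2>;q=0.7")
# A server that can only produce profile-2 for this resource.
chosen = choose_profile(header, {"http://example.org/profile-2"})
```

The server would then echo the chosen URI back in the Profile response header, mirroring how Content-Language reports the outcome of language negotiation.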

@larsgsvensson
Member

sorry for being a broken record, but just to make sure we keep the eye on the prize: XML and DTD/XSD/RELAXNG do not have the same relationship as media types and profiles. DTD/XSD/RELAXNG are vocabularies (defined in specific schema languages) based on a specific metamodel (and as you pointed out, just getting any XML isn't going to cut it if you depend on a specific vocabulary being used). profiles are constraints on vocabularies (constraining/extending them) and you always get the same vocabulary, with or without profile. the profile is just a "style" of using it.

What I hear is that my suggestion is more tied to what you call "schema" than to "profile". Is that correct?

@dret

dret commented Apr 25, 2017 via email

@dret

dret commented Apr 25, 2017 via email

@hvdsomp
Collaborator

hvdsomp commented Apr 25, 2017 via email

@larsgsvensson
Member

i guess the only use case i can come up with if people wanted to negotiate on both levels:

  • their API supports various metamodels such as XML and JSON.
  • they also support various vocabularies in each of these metamodels.
    i have never seen anybody asking for this when they designed an API.

I think (but I'm not quite sure) that the use case that @dvh submitted yesterday (cf #17) goes in this direction. The spatial data is available in several formats (including GML (application/gml+xml) and RDF) and the customer can choose if they want the coordinates in EPSG:28992 (the Dutch reference system), in EPSG:4668 (the European reference system) or plain old CRS84 aka EPSG:4326. Is that correct, Dimitri?

A side note: GML is available in different flavours (that they call "profiles" as opposed to "application schemas"), too, cf. https://en.wikipedia.org/wiki/Geography_Markup_Language#Profile

@larsgsvensson
Member

what i am wondering: apart from RDF

OK, my original use case comes from RDF and that's where I have an itch to scratch. In the DNB, we have (for historical reasons) two services publishing data about the same authority entities (people, places and things). One is a "traditional" linked data service (LDS) using content negotiation to serve HTML, Turtle and further RDF serialisations. The canonical URIs we mint for our entities resolve to this service and all RDF serialisations it offers use the same data model. The other service (EntityFacts) offers a beefed up version of the same data, enriched with images, further links etc. It originally only served JSON, which was later converted to JSON-LD. The URIs (really URLs) and the content model of EntityFacts are different from those of LDS. We also have a similar issue with two bibliographic formats (DINI-KIM and BIBFRAME), so it isn't specific to just one type of data.

Since we want to use the LDS URIs as the entry point to all our services, the goal is to integrate EntityFacts into the LDS. That would, however, mean that we need a way for the client to specify if they want the LDS format or the EntityFacts format (or profile, or schema, or whatever). That's why I wrote this spec and yes, it was heavily geared towards that use case. Then I got the feedback that the spec shouldn't be too geared towards RDF so I started to make it more abstract. And as Herbert wrote, there are non-RDF use cases for this from the cultural heritage domain.

@larsgsvensson
Member

as long as we can get from the profile to the schema.

not in any way supported by the profile spec. and as discussed earlier,
profiles are not necessarily implemented by a schema. they may be
defined in prose, for example. and that's fine, too.

Yes, profiles defined in prose are fine. But to perform validation or build interfaces (e. g. Java XMLBeans) we need an actual schema, too. That means that we need a way to get from the profile to the schema. If that's not part of the profile spec then we need another solution.

@dret

dret commented Apr 25, 2017 via email

@larsgsvensson
Member

On 2017-04-25 20:03, Lars G. Svensson wrote:
Yes, profiles defined in prose are fine. But to perform validation or build interfaces (e. g. Java XMLBeans) we need an actual schema, too.
why is that? if a profile is defined in prose that's what you have. you'll have to write code implementing that prose. you might prefer to have a schema and if you have the energy and skills you could write one yourself, but that doesn't change the fact that the profile is not defined by a schema.

I'm not saying that the schema defines the profile, I only say that for some uses we need a schema. And while the user could write a schema I'd say that if the profile publisher also publishes a schema that implements the profile, that would improve interoperability.

That means that we need a way to get from the profile to the schema. If that's not part of the profile spec then we need another solution.

i think assuming that profiles always have one authoritative schema associated with them is not an assumption backed by how things work in practice. profiles have to be identifiable. that's it.

I'm not assuming that profiles always have one authoritative schema. But it would be helpful for profile users if there is a standardised mechanism that allows them to find schemas associated with a certain profile. That could be done using the Link header, too:

Link: <http://example.org/someProfile/schema1.xsd>; rel="schema"; type="application/xml",
    <http://example.org/someProfile/schema1.rng>; rel="schema"; type="application/xml",
    <http://example.org/someProfile/schema2.ttl>; rel="schema"; type="text/turtle",
    <http://example.org/someProfile/schema2.jsonld>; rel="schema"; type="application/ld+json"

(that would require the registration of the relation type "schema", but that should be easy; I find it more worrisome that I couldn't figure out a way to say that the first schema is an XSD document, the second a RELAX NG, the third uses SHACL (in Turtle) and the fourth ShEx (in JSON-LD)).

So what I'm trying to say is that we need a standardised mechanism to go from a profile document to associated schemas. A document talking about how to use that would then say something like "Profile publishers MAY want to link from their profile documents to associated schemas. If they do so, they SHOULD use the standardised mechanism". So I'm not saying that everyone has to do that. I'm only saying that if someone wants to do that, we should give guidance on how to do that.
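
To illustrate both the idea and the limitation, here is a rough Python sketch of a client grouping such links by media type. Note that rel="schema" is not a registered relation type; it is only the value proposed above. The parser is deliberately simplified (it does not handle every Link header syntax corner case), and all function names are made up for this example.

```python
import re

# One ';name=value' parameter, with the value either quoted or bare.
PARAM = r';\s*([^=;,\s]+)\s*=\s*(?:"([^"]*)"|([^,;]*))'
# A '<target>' followed by any number of parameters.
LINK = re.compile(r"<([^>]*)>((?:\s*" + PARAM + r")*)")


def parse_link_header(value):
    """Split a Link header value into dicts with 'target' plus its parameters."""
    links = []
    for match in LINK.finditer(value):
        link = {"target": match.group(1)}
        for name, quoted, bare in re.findall(PARAM, match.group(2)):
            link[name.lower()] = quoted or bare.strip()
        links.append(link)
    return links


def schemas_by_type(value):
    """Group the targets of rel="schema" links by their 'type' parameter."""
    grouped = {}
    for link in parse_link_header(value):
        if "schema" in link.get("rel", "").split():
            grouped.setdefault(link.get("type"), []).append(link["target"])
    return grouped


link_value = (
    '<http://example.org/someProfile/schema1.xsd>; rel="schema"; type="application/xml", '
    '<http://example.org/someProfile/schema1.rng>; rel="schema"; type="application/xml", '
    '<http://example.org/someProfile/schema2.ttl>; rel="schema"; type="text/turtle", '
    '<http://example.org/someProfile/schema2.jsonld>; rel="schema"; type="application/ld+json"'
)
by_type = schemas_by_type(link_value)
```

With this input the XSD and the RELAX NG document both end up under application/xml, so a client has nothing but the file extension to tell them apart, which is exactly the missing piece described above.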

@dret

dret commented Apr 26, 2017 via email

@dvh

dvh commented Apr 26, 2017

@larsgsvensson wrote:

I think (but I'm not quite sure) that the use case that @dvh submitted yesterday (cf #17) goes in this direction. The spatial data is available in several formats (including GML (application/gml+xml) and RDF) and the customer can choose if they want the coordinates in EPSG:28992 (the Dutch reference system), in EPSG:4668 (the European reference system) or plain old CRS84 aka EPSG:4326. Is that correct, Dimitri?

Yes, that's correct, except right now the spatial data is only available in JSON (as embedded GeoJSON). However for the statement of @dret:

  • they also support various vocabularies in each of these metamodels. i have never seen anybody asking for this when they designed an API.

I don't think CRS negotiation has anything to do with a vocabulary. I just want to tell the server which CRS I prefer, just like using the Accept-Language to tell the server which language I prefer, apart from the mime-type returned.

@dret

dret commented Apr 26, 2017 via email

@dvh

dvh commented Apr 26, 2017

I know, this is exactly the reason why I came here :)
But as we both seem to agree that CRS is not a profile-like thing, should we conclude that my challenge can't be solved using Profile negotiation?

@dret

dret commented Apr 26, 2017 via email

@dvh

dvh commented Apr 26, 2017

About the GeoJSON RFC I'm glad this is also part of it, otherwise we couldn't use GeoJSON at all:

However, where all involved parties have a prior arrangement, alternative coordinate reference systems can be used without risk of data being misinterpreted.

In my interpretation this means that arrangements between clients and servers about the CRS being requested/provided are no part of the GeoJSON RFC but should be made somewhere else, e.g. with request/response headers. @lvdbrink, seems like OGC/W3C needs to come up with an alternative anyway ;)

@dret

dret commented Apr 26, 2017 via email

@dvh

dvh commented Apr 26, 2017

True, I'm off to https://github.com/w3c/sdw ;)

@larsgsvensson
Member

larsgsvensson commented Apr 27, 2017

On 2017-04-26 10:33, Lars G. Svensson wrote:
I'm not saying that the schema /defines/ the profile, I only say that
for some uses we /need/ a schema. And while the user /could/ write a
schema I'd say that if the profile publisher also publishes a schema
that implements the profile, that would improve interoperability.

you may need a schema, but how to find it is a different matter. profile
URIs are identifiers. so logically speaking, you would need to find the
specification document for the profile, and then see if that contains
any authoritative schema. markus lanthaler had similar questions and
concluded that a profile registry would help with making profile specs
easier to find. the current contents of that registry speak for themselves.

https://www.iana.org/assignments/profile-uris/profile-uris.xhtml

Indeed not very much in there...

but that doesn't mean that the idea itself doesn't have merit. only that
maybe a central registry isn't what's really needed here. but come to
think of it, profile URIs are a web concept and with or without registry
should be the kinds of things people care about when they manage their
API landscape. adding to web concepts!

I'm not assuming that profiles /always/ have one authoritative schema.
But it would be helpful for profile users if there is a standardised
mechanism that allows them to find schemas associated with a certain
profile. That could be done using the |Link| header, too:

that's assuming there's a standard way to first find anything from a
profile URI. there isn't. it's just an identifier.

Yes, I shouldn't assume that things are as webby as they should be... If the profile URI is e. g. a URN, then there isn't much to do (except perhaps trying to use the IANA registry), but if the profile has an http URI and there is a document available on the web, then it's at least theoretically possible that the publishers can use a standardised way to point agents at schema definitions.

(that would require the registration of the relation type "schema", but
that should be easy; I find it more worrisome that I couldn't figure out
a way to say that the first schema is an XSD document, the second a
RELAX NG, the third uses SHACL (in Turtle) and the fourth ShEx (in
JSON-LD)).

the schema link relation was suggested many times, but apparently there
never was enough momentum to really get it spec'ed and registered. i
find that one of the more promising ideas (but quite different from a
profile). if you think that would be worth pursuing, i'd be willing to
help.

Let's see where this spec takes us and then we can decide.

regarding the missing schema language media types: that's indeed a bit unfortunate and part of the general problem that many spec authors don't think about (and aren't encouraged to think about when the specs go through review processes) how their vocabularies can be identified on the web level.

Do you think there is a solution to that problem at all or is everything lost already?

@larsgsvensson
Member

BTW, I just saw in the PR for Linked Data Notifications that they use the NS Document for ActivityStreams as a JSON-LD profile URI, adding it to the media type in the Content-Type headers

POST /inbox/ HTTP/1.1
Host: example.org
Content-Type: application/ld+json;profile="https://www.w3.org/ns/activitystreams"
Content-Language: en

Cf. https://www.w3.org/TR/2017/PR-ldn-20170321/#sending-notification-request. To me, this feels a bit weird since the namespace document defines a term "profile" but never says that it is a profile (nor what restrictions it imposes...)
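
For comparison with the header-based approach, this is roughly how a receiver could pull the profile URIs out of such a Content-Type value in Python. It is only a sketch: it leans on the standard library's email.message.Message for parameter parsing, the function name is made up, and it assumes the profile parameter holds a space-separated list of URIs, as the JSON-LD media type registration describes.

```python
from email.message import Message


def media_type_profiles(content_type):
    """Return (media type, list of profile URIs) for a Content-Type value."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_param() strips the surrounding quotes and returns None when absent.
    profile = msg.get_param("profile")
    return msg.get_content_type(), (profile.split() if profile else [])


media_type, profiles = media_type_profiles(
    'application/ld+json;profile="https://www.w3.org/ns/activitystreams"'
)
```

The receiver gets back an opaque identifier and nothing more; whether that URI actually denotes a profile is something the code cannot tell, which is the oddity noted above.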

@dret

dret commented Apr 27, 2017 via email

@dret

dret commented Apr 27, 2017 via email

@larsgsvensson
Member

you could write up drafts for registering specific schema language media types. maybe that's actually a worthwhile thing to do, i had the same issue with XSD many times.

Thinking of it, would you agree that XSD could be seen as a profile of XML? The specification does add syntactic and semantic constraints...

for the specs still under development, just raise issues with their editors. but then again, for the RDF-based ones, minting media types is not something these communities ever do (unless there's a new RDF serialization), so you might get some resistance to this idea. for other communities i'd say they probably are open to the idea, but getting enough momentum behind these drafts would take quite a bit of energy and persistence.

The upcoming ODRL vocabulary specification (soon going to CR) also talks about how to create and document profiles and I think they've got it right. They could serve as a good use case since they specify three different encodings of their model: RDF/OWL, XML and JSON-LD.

@dret

dret commented Apr 28, 2017 via email
