New mime types for RDF-star serializations (inc. SPARQL results) #43
Comments
I think we do need a new media type for these formats. We don't want an RDF*-unaware client to query a SPARQL* endpoint and choke on the results it gets from the server. I would keep the same XML namespace, however, as we essentially only extend the original one. |
It might be abusing the system, but given the number of different mime types potentially affected, perhaps a profile parameter would be easier to handle. We’ve used these in JSON-LD, for example. |
@gkellogg A difference between JSON-LD profiles and what we're doing here is that with JSON-LD, regardless of the profile used, the document produced is always syntactically valid JSON-LD. Whereas here we are extending the syntax in a way that makes it actually syntactically invalid when considered by a non-extended parser. |
Not entirely, the profile But, otherwise, consider if we wanted to make a sub-type. This might involve creating the following sub-types (not necessarily suggesting
But, then you get to things like But, if we used a profile, it could be applied uniformly across a variety of mime-types:
(And, yes, you can specify multiple profiles, IIRC). |
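To make the "applied uniformly" idea concrete, here is a minimal sketch of how a client or server might read a profile parameter off any media type. The profile URI `http://example.org/profiles/rdf-star` is purely hypothetical (no such URI has been agreed in this thread), and the parser is deliberately naive: it assumes no `;` appears inside a quoted parameter value.

```python
# Hypothetical profile URI, used for illustration only; nothing in this
# discussion has settled on an actual identifier.
STAR_PROFILE = "http://example.org/profiles/rdf-star"


def parse_media_type(value):
    """Split a media type string into (type, params).

    Naive: assumes ';' never occurs inside a quoted parameter value.
    """
    parts = [p.strip() for p in value.split(";")]
    media_type = parts[0].lower()
    params = {}
    for part in parts[1:]:
        if "=" not in part:
            continue
        name, _, raw = part.partition("=")
        params[name.strip().lower()] = raw.strip().strip('"')
    return media_type, params


def has_profile(value, profile=STAR_PROFILE):
    """True if the space-separated profile list contains `profile`."""
    _, params = parse_media_type(value)
    return profile in params.get("profile", "").split()
```

The same check works unchanged for `text/turtle`, `application/ld+json`, or a SPARQL results type, and a value carrying several space-separated profiles is handled by the final `split()`.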
Fair enough, wasn't aware of that. In any case I'm not necessarily against the use of profiles, I just thought I'd point out what I thought was a distinction. I am tempted by the notion of not having to introduce separate MIME-types for everything yet still having a meaningful way to distinguish. |
@gkellogg, GraphDB and RDF4J have defined new MIME types: https://graphdb.ontotext.com/documentation/free/devhub/rdf-sparql-star.html#mime-types-and-file-extensions-for-rdf-in-rdf4j. I'm against using profiles for this, for the reasons expressed by @jeenbroekstra plus:
|
@VladimirAlexiev - what do you propose for the issue I mentioned on #55? The same situation happens. The issue is that one format is an extension/superset of the other, "quite different" does not have that nuance. What is the MIME type of the result of
Note that it is not only the client library that is the problem - in some libraries, they let the application set the I don't think there is a perfect answer; we are making a "least bad" choice. |
Another factor is what we want the end state to be like when all (or a substantial majority) of clients and servers have adopted RDF-star. MIME types do not go away. Introducing a new MIME type is a permanent commitment. I can't think of a practical transition phase if the long term outcome should be one MIME type (existing or new) because transition introduces two points of change - start and finish. So we have to ask: are new MIME types for results or content and file extensions the desired long-term outcome? c.f. |
In the case of SPARQL, there is some argument that .rq could be different for SPARQL*, but not really the result formats. In general, if you make a query using SPARQL* features then you would be expecting SPARQL* results. In my experience, the client is the one initiating the request and controlling the query, so content-negotiation for Star features doesn't seem too useful. That's somewhat different for Turtle*, but per Andy's reasoning, I'd be wary of introducing a new content type specifically to enable such features. If RDF 1.2 were to introduce new syntax, not necessarily Turtle*, I wouldn't expect that working group to use a new content-type/file extension. And, as I've said before, we didn't do this for JSON-LD 1.1. In general, it's sufficient for future proofing that clients raise an error if they see features they can't handle, which any non-star Turtle client would do when faced with the new syntax. Worse would be (as was the case for JSON-LD 1.0) that clients would silently ignore new features and generate different results. |
Yet another reason that RDF(etc)* appears to me to be ill-considered.
I think it's useful to have all considerations on one page, so here's a reproduction of the table in the GraphDB/RDF4J docs --
Perhaps it's time to recognize that RDF(etc)* is taking shape as a fork of RDF 1.1 (and all its serializations), not an extension of (un-versioned) RDF (nor its serializations), and thus REALLY TRULY HONESTLY needs an entirely distinct name, or a drastic rethink, and pursuit through the sparql-12 project (which isn't really just about SPARQL). (I'm more in favor of the latter.) |
I asked:
@TallTed - are you proposing that the MIME type of content that may contain RDF-star syntax is (note: |
I'm not proposing anything. I'm making observations. The table of MIME types above did not come from me; it was relayed by me from GraphDB/RDF4J. "The desired, long term outcome" from my seat is that the RDF(etc)* effort be (re-)connected to RDF(etc). |
The proposal I was referring to is:
A new MIME is not required; it is an option to consider by working through the advantages and disadvantages. There is no perfect answer here. RDF-star is one additional feature and current Turtle data remains valid. Of the 4 cases, "old client - new server" is the hardest. On the web, client and server do not move together where they just might in an enterprise setting if necessary. Who moves first? What are the consequences? Consider how to roll out a server which supports RDF-star. What is the MIME type of content that may contain RDF-star syntax? If it is Or SPARQL results? When |
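As a small aid for working through the "4 cases", the following sketch simply enumerates the client/server combinations and flags the one singled out above as the hardest. The labels and the `hardest` flag are my own shorthand, not terminology agreed in this thread.

```python
from itertools import product


def deployment_cases():
    """Enumerate old/new client and server pairings.

    The boolean marks the combination described in the discussion as the
    hardest: an RDF-star-unaware client talking to an RDF-star-aware server.
    """
    cases = {}
    for client, server in product(("old", "new"), repeat=2):
        hardest = client == "old" and server == "new"
        cases[f"{client} client - {server} server"] = hardest
    return cases


if __name__ == "__main__":
    for label, hardest in deployment_cases().items():
        print(label, "(hardest)" if hardest else "")
```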
It's generally not helpful to pluck tiny phrases from their larger context. That larger context:
Existing clients will (hopefully) be indicating Your other questions are worth further exploration in a broader sphere than MIME types and XML namespaces -- and likely should be treated as distinct issues in the overall consideration of whether RDF(etc)star should continue as an RDF fork or reunite with the evolutionary path of "conventional" (i.e., unstarred) RDF(etc). |
The list of RDF-related languages is getting long-ish (more in the W3C wiki):
¹ constrained by XML I believe that every one of these formats and implementations thereof would have to be updated to accept some encoding of embTriple (a triple in
Apart from
Proposals
Using Turtle, JSON-LD and RDF/XML to stand in for the unconstrained, JSON-constrained and XML-constrained formats:
Meta: I (or anyone with edit privs) can edit this to keep it representative of the proposals. (long live application/x-www-form-urlencoded) |
@ericprud - nice work, links and all! We need NTriples and NQuads to write test cases! |
Yeah, I kinda confronted that after I edited this. Given the use cases, I wonder if it's a change to NTriples or another language a lot like NTriples. |
@ericprud --
SPARQL Results are a serialization/materialization of results, i.e., of data, not of the query that produced those results. Therefore, I think this bullet should be reduced to SPARQL, and the SPARQL Results variants should be moved to the preceding bullet. I don't believe the proposed "prefixed" MIME types fit with the standard MIME rules for fallback interpretation, which use the E.g., E.g., To the contrary, Turtle* cannot be interpreted as plain-Turtle by an older Turtle processor. I'm expecting that JSON-LD* will not be interpretable by a plain JSON-LD processor, and I think it likely that the same will be true for RDF/XML*, i.e., that RDF/XML* will not be interpretable by a plain-RDF/XML processor, though these are reasonably likely to be (incompletely and imperfectly; i.e., without LD or RDF features) interpretable by plain-JSON or plain-XML processors. I think this makes the "prefixed" MIME types non-starters for (what I understand to be) their intended purpose. The proposed Indeed, the RFC you cite says explicitly that what is contemplated here is forbidden --
I don't see what benefit there is to having I think this makes the proposed "profile" MIME types a small improvement, if any, over the proposed "prefixed". The proposed "embedded" MIME types seem generally workable as MIME types, but I don't see them doing anything for usability of RDF-classic tools on RDF-star data. Which is fine, IFF we embrace that RDF(etc)star is a fork in the road of RDF and related tech. |
The Turtle* tests use NT*: N-Triples with added |
I moved them into their own <li/> 'cause they're primarily tabular structures with terms in the cells. |
FTR, a mediatype of the form IIRC the grammar of media-types allows for several |
@pchampin -- I think your IETF media type info is outdated. The DID WG is pursuing registration of |
This was discussed during today's call https://w3c.github.io/rdf-star/Minutes/2021-01-15.html#item03 (end of the discussion) |
For some perspective on the long-term association of MIME types and changing formats, consider text/html and text/css, and others. Both specs have evolved significantly since introduced, and a 2000's era HTML client would not be able to properly parse HTML5, much less interpret, without knowledge of the tag change over time. Even the announcement DOCTYPE has been deprecated. I see Turtle* etc. as a logical evolution of the RDF formats, and the principle of follow-your-nose (once part of an RDF REC) would hold. Of course, this is a CG publication, and can't have formal weight, but early versions of HTML5 would have still used text/html prior to standardization. I repeat my suggestion that a profile parameter (if anything) is most appropriate. |
@gkellogg, I'm not sure that the HTML precedent applies because, while HTML has evolved enormously, any step change which went from something an SGML parser could consume to something it couldn't (e.g. dropping DOCTYPE) didn't occur until long after SGML/HTML parsers were obsolete. The high cost of changing media types (c.f. With Turtle et al, I don't think this community can afford the same resource investment that dotcoms made in HTML. I think tactically, it's better to invent an explicitly incompatible media type (i.e.
Unlike the cost of supplementing HTML media types, I think the cost of duplicating the RDF media types is almost purely aesthetic. |
+1 Is there an example of a MIME type which evolved over time so there is a MIME type for "v1" and a MIME type for "v2"? How different were the versions? (Turtle itself changed between initial registration 2007 and W3C REC 2014) |
The way I see it:
Granted, in the second case, the client may also crash miserably (but would that be advisable anyway, regardless of RDF*?), or consume a lot of resources before it realises that it cannot parse the whole content. However, and I think that is @afs' point, the second option buys us a smooth transition for all RDF* content that happens to also be plain RDF (i.e. not containing embedded triples). |
The idea is that any "star" format will be a superset of the respective traditional format. So any traditional content is also valid "star" content. But not vice versa: if "star" features are used, then the content is not backward compatible.
My oh my, I didn't realize all these complications exist :-( After reading the above, I completely agree with Andy's sentiment. Andy makes strong points that a server may not know what content it holds or what result it returns, or it may be too expensive to determine the type of result precisely. So that's an argument FOR keeping only the traditional formats. Now let's examine the CONS, i.e. arguments for introducing "star" formats. (Out of @ericprud's classification, I think the "profile" or "embed" styles of expressing the formats can fill the bill.)
I think here's a good compromise:
Sorry, this is very imprecise, and a bit TL;DR. In brief, I believe that:
|
We had a straw poll during our last call and there seems to be general agreement to keep the old mime-types, but augment them with a profile or another parameter. |
Two use cases that a profile does not address:
|
@gkellogg, that "precedent" sounds to me like a violation of the whole idea of a profile, which is that the profile language is a subset of the original language: every legal sentence in the profile language should be a legal sentence in the original language. In fact, the IANA Considerations section of JSON-LD 1.1 I am not seeing a good justification for serving an RDF-star document with a MIME type for which it cannot be parsed. The value of having a MIME type is diminished if it does not accurately describe the actual content. |
It says that the frame document is processed the same with, or without the profile parameter. The frame document is processed by a JSON-LD processor when provided as the "frame" for transforming an input document. The framing algorithm will operate the same with or without any profile parameter. If a frame document were processed as the input for an algorithm such as Expansion or Compaction, or even Framing, a processor could complain about the presence of keywords such as In any case, that was the decision of the WG and goes back to the 1.0 version of Framing (which wasn't a REC). Perhaps it was made in error, but certainly at the time, something like |
I think @dbooth-boston's point was less about JSON-LD Framing, and more about Turtle vs Turtle-star, and this has come up before. To wit: The Turtle-star media type doesn't make sense as a profile of the Turtle media type, because Turtle-star is not a subset of Turtle. I agree strongly with this, and the same applies to the rest of the RDF-star serializations and their media types. Put simply: Profiles on (standard) RDF media types won't work for media types of RDF-star serializations. Parameters (which were the other option in the air for the non-binding straw poll) somehow being attached to (standard) RDF media types might work for future tools which understand RDF-star, but parameters are not part of the IANA media type registration at all, and there's no standard way to communicate them. One major expressed concern about having new types has been that an RDF-star server might not know whether it was going to deliver RDF-star data until late in a query response where it was delivering (standard) RDF data in response to a request for such (standard) RDF media type. I think the only answer we can have to this is, if a request specifies Turtle or other non-RDF-star media type, the server must then either commit to (and follow through on) delivering that, or reject the request. RDF-star data must only be delivered in RDF-star serializations, with appropriate media types. If an RDF-star server cannot commit to such a delivery -- such is life. |
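The "commit or reject" policy described above could be sketched roughly as follows. This is only an illustration under several assumptions: the media type name `text/x-turtle-star` is hypothetical, the server is assumed to know up front whether the response contains embedded triples (which, as noted, is exactly what may be hard in practice), and `Accept` handling is simplified to ignore q-values and wildcards.

```python
TURTLE = "text/turtle"
# Hypothetical media type name, for illustration only.
TURTLE_STAR = "text/x-turtle-star"


def accepted_types(accept_header):
    """Media types named in an Accept header (q-values and wildcards ignored)."""
    return {
        item.split(";")[0].strip().lower()
        for item in accept_header.split(",")
        if item.strip()
    }


def negotiate(accept_header, data_has_embedded_triples):
    """Return (status, content_type) under a strict commit-or-reject policy."""
    accepted = accepted_types(accept_header)
    if data_has_embedded_triples:
        # RDF-star data is only ever sent in an RDF-star serialization.
        if TURTLE_STAR in accepted:
            return 200, TURTLE_STAR
        return 406, None  # reject rather than mislabel the content
    if TURTLE in accepted:
        return 200, TURTLE
    if TURTLE_STAR in accepted:
        # Plain Turtle content is also valid under the star serialization.
        return 200, TURTLE_STAR
    return 406, None
```

Under this policy a client asking only for `text/turtle` never receives embedded triples; it gets a 406 instead, which is the "such is life" outcome.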
It seems to me that there are two related but distinct questions here:
I am not clear whether people in favor of a new media-type are of this opinion because they answer "no" to the 1st question (making the 2nd moot), or because they answer "yes" to the 1st, but still "no" to the 2nd. Note that |
Good point, separating those will help us distinguish between where we want to end up and what we are comfortable doing now.
True: the 2011-03-28 text/turtle registration did not include SPARQL The RDF 1.1 changes are arguably different in kind from the change proposed here. They liberalized syntax, which gave content producers latitude to produce more readable documents, at the expense of breaking deployed parsers. If the consumer upgraded their parser, they'd be able to consume the data without further changes to their infrastructure. The current change extends the model, which has implications throughout the consuming toolchain. Replacing a parser won't allow you to stick |
I believe that the long term ideal is one RDF, not RDF with a separate extension. We should aim for the ideal outcome, then document the migration issues. To introduce MIME types is, in effect, to make a permanent distinction of RDF-star as a separate extension. A MIME type is permanent/very-long-time (c.f. It is more than a pair-wise migration. There are three parties: application, client library, server. Undoing a MIME type is getting harder! We could propose a file extension to indicate the use of RDF-star in downloadable data files. This would be a useful indication when retrieving dumps and more likely human-in-the-loop. Also, a mapping to/from reification so adapting to un-upgraded software (HDT example) can be delegated rather than built into the proposed changes. My concern is for "old application, new server" situations where new MIME types may interfere with existing applications even when the applications are asking for data they previously successfully accessed. I believe breaking what works is more damaging and harder on support. |
Agreed, but I think RDF-star falls short for inclusion as a permanent part of RDF, because (IMO) it does not add enough functionality to justify the added complexity. Specifically:
In short, it feels like RDF-star goes part way toward addressing these needs, but not all the way. That means that if RDF survives and these issues are addressed more fully in the future by some other syntax or mechanism, then we would be stuck with the remnants of RDF-star in addition to the more general solution. But maybe I'm just wishing RDF were more like N3, for its elegance and power. |
In my view, the purpose of this CG is to explore the space for Property-Graph related technologies and RDF-star has fair adoption among providers, at least conceptually, and Notation-3, for all of its great attributes, does not. But, it's up to a future WG to consider the various alternatives. You may, or may not like the approach, but it's worth making sure that the work is complete for it to be of value in the future. I've encouraged others to describe their issues with the solution (mostly semantic), and I think that the RDF-star final report should include both majority and minority positions, so that we can reduce the time that a future Working Group would need to spend on re-hashing the arguments. So, please consider such a constructive contribution. Regarding a mime type, I think it would be premature to establish what would be about 10 different mime types to deal with the different serialization formats, result formats, SPARQL syntax. I would argue that if they were adopted by future WG(s), as @afs and @pchampin have suggested, they could arguably fall within the range of the existing mime types, much as other formats have changed over time without requiring new mime types. Consider that what this group is ultimately producing is a final CG report that is intended to be considered for future standardization. Specifying too much (such as mime types and permanent namespaced URIs) doesn't help the effort of future adoption, and if there is no future adoption, then it will be irrelevant, anyway (or the standard efforts become irrelevant to implementors, which would be even worse). |
@dbooth-boston It does not block N3. And, of course, |
Good point. Maybe I am being overly concerned. |
this was discussed during today's call : https://w3c.github.io/rdf-star/Minutes/2021-04-23.html#t03 |
@ericprud -- See relevant minutes section... Any comment about |
Of course, the Turtle 1.1 spec does have an IANA Section "Internet Media Type, File Extension and Macintosh File Type". But, perhaps it was never actually sent to IANA? Probably not too late to send that in. |
Given that the last media type I registered took just shy of six months, this might take a while and run afoul of a great idea or collective resignation (aka "consensus") we arrive at in the next few months. I'd be tempted to hold off on that update unless we think that the risk that we stumble across a better plan is low. (It's also possible that such an update would take a day. The bottleneck is the IANA expert reviewers.) |
Here's an LTS (let's update with new scenarios) list of relevant axes for a common HTTP GET scenario: Emitter Use Case Axes:
link to e.g. Requester Use Case Axes:
link to e.g. Proxy behavior is unlikely to be an issue; we can always add what we need to a Vary header. PUT and PATCH are basically a forced GET with the roles of client and server reversed. In short, I think that GET axes on client and server should enable most of the necessary analysis.
exploration
This is a large search space; will iterate with updates to axes.
TODO (many, but here are a couple):
Should "old" be called "flat"? Anyways, here's where I've gotten so far. |
this was discussed during today's call https://w3c.github.io/rdf-star/Minutes/2021-05-21.html#t06 |
In addition to defining the extended formats for serializing the result of a SPARQL* SELECT query (#12 and #13), we have to decide whether we need/want new mime types for these extended formats. Similarly, do we need/want to introduce another namespace for the extended XML result format?