Required RDF serialization of WebID resource #3

Closed
acoburn opened this issue Aug 13, 2021 · 273 comments · Fixed by #66

@acoburn
Member

acoburn commented Aug 13, 2021

The current draft WebID specification document asserts that WebID documents must be serialized as Turtle:

in section 2:

The server must provide a text/turtle [turtle] representation of the requested profile

and in section 5:

WebID requires that servers must at least be able to provide Turtle representation of profile documents

and in section 6:

The Agent requesting the WebID document must be able to parse documents in Turtle

This presents implementation challenges in cases where Turtle is not the relevant serialization. In Solid-OIDC, for instance, we would like to build client identifiers on top of WebID, but the baseline serialization for us is JSON-LD. Requiring Turtle in addition to this places a higher burden on server implementations to support content negotiation, when such negotiation is not needed in the context of Solid-OIDC.

It would be much easier to build on WebID if a particular serialization was not required. Rather, if the specification simply required an RDF serialization of the WebID resource, it would make building on this specification much easier.

@bblfish
Contributor

bblfish commented Aug 13, 2021

I'll add that WebId-TLS has -

A WebID Profile is an RDF document which must uniquely describe the Agent denoted by the WebID in relation to that WebID. This document must be available as Turtle [turtle]. This document may be available in other RDF serialization formats, such as RDFa [RDFA-CORE] or RDF/XML [RDF-SYNTAX-GRAMMAR] if so requested through content negotiation.

The WebID draft spec also states:

A WebID Profile is an RDF document which uniquely describes the Agent denoted by the WebID in relation to that WebID. The server must provide a text/turtle [turtle] representation of the requested profile. This document may be available in other RDF serialization formats, such as RDFa [RDFA-CORE], or [RDF-SYNTAX-GRAMMAR] if so requested through content negotiation.

So currently all other RDF formats are allowed via content negotiation. That is, the requirement on Turtle serves only as a default format for when the request does not ask for a particular media type, perhaps with an Accept header such as */*.
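The "Turtle as default" reading described here can be sketched in a few lines. This is a minimal illustration, not the draft's algorithm: the `SUPPORTED` list, the function names, and the simplified precedence rules (q-values only, no specificity ordering) are all assumptions made for the example.

```python
from typing import List, Optional, Tuple

# Formats this hypothetical server can produce (an assumption for the sketch).
SUPPORTED = ["text/turtle", "application/ld+json"]
# Turtle is the fallback, mirroring the "default format" reading of the draft.
DEFAULT = "text/turtle"


def parse_accept(header: str) -> List[Tuple[str, float]]:
    """Parse an Accept header into (media_range, q) pairs, highest q first."""
    ranges = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_range = fields[0].lower()
        if not media_range:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        ranges.append((media_range, q))
    # sorted() is stable, so equal q-values keep the client's own ordering.
    return sorted(ranges, key=lambda item: item[1], reverse=True)


def choose_representation(accept: Optional[str]) -> Optional[str]:
    """Return the media type to serve, or None when nothing matches (HTTP 406)."""
    if accept is None or not accept.strip():
        return DEFAULT
    for media_range, q in parse_accept(accept):
        if q <= 0:
            continue
        if media_range == "*/*":
            return DEFAULT
        if media_range.endswith("/*"):
            prefix = media_range[:-1]  # "text/*" becomes "text/"
            for candidate in SUPPORTED:
                if candidate.startswith(prefix):
                    return candidate
        elif media_range in SUPPORTED:
            return media_range
    return None
```

With these assumptions, a request carrying `Accept: */*` or no Accept header at all receives Turtle, while `Accept: application/ld+json` receives JSON-LD. A server that cannot run logic like this (a static host, for example) is exactly the case the issue is about.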

We should definitely add json-ld to the list of serializations.

@acoburn
Member Author

acoburn commented Aug 13, 2021

I understand that Turtle is "just a default format", but the way the specification is written, it means that if a server wants to use some other serialization type, then it MUST also support content negotiation and include support for Turtle. That is what I am suggesting is the problem. Content negotiation is not always "just" an easy thing in all cases.

@acoburn
Member Author

acoburn commented Aug 13, 2021

Here is a specific example about content negotiation.

In the case I outlined above with client identifiers, many of these clients exist as static JavaScript clients. That is, all resources are static and there is no content negotiation. In most cases, these applications would host their "WebID" resource as a static resource, too, but per the WebID specification it would be a violation to serialize that as JSON-LD.

@bblfish
Contributor

bblfish commented Aug 13, 2021

@acoburn wrote:

It would be much easier to build on WebID if a particular serialization was not required.

It would be easier for whom?
As far as I know you are writing a web server, Trellis, that does content negotiation out of the box. Apache servers don't require much setup for content negotiation either.

Is the risk not, that if we don't specify a default, that we instead have every client that needs to parse all the serializations of RDF that exist?

Your use case seems to be people deploying Solid Apps that want to authenticate using a tweaked version of OAuth especially for Solid. It seems to me highly unlikely that people deploying such RDF-aware clients or those OIDC servers are going to have trouble with either the content negotiation, or making a request with the JSON-LD media type.

I am not against being open on RDF serialization on the whole. It just seems to be a case of making things for some people easy can make it more difficult for others. And the text can definitely be improved.

@acoburn
Member Author

acoburn commented Aug 13, 2021

OK. I suppose the best option in this case would be to not rely on WebID.

@acoburn
Member Author

acoburn commented Aug 13, 2021

As far as I know you are writing a web server, Trellis, that does content negotiation out of the box.

That is irrelevant in this case. Just because system X handles feature Y doesn't mean that everyone out there uses system X.

Is the risk not, that if we don't specify a default, that we instead have every client that needs to parse all the serializations of RDF that exist?

That does not follow from the argument I am making. A client only needs to understand the Content-Type header of HTTP, which surely HTTP clients generally do.
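To illustrate the point about the Content-Type header, here is a minimal sketch of a client that ships a single parser and dispatches on the response's media type. The `PARSERS` registry and function names are hypothetical, and the JSON-LD "parser" only loads the JSON syntax; turning it into RDF triples would need a real JSON-LD processor, which is not shown.

```python
import json
from typing import Callable, Dict


def parse_jsonld(body: str) -> object:
    # JSON-LD is JSON, so the standard library can load the syntax.
    # Expansion to RDF triples is out of scope for this sketch.
    return json.loads(body)


# Hypothetical registry: a lightweight client registers only what it ships.
PARSERS: Dict[str, Callable[[str], object]] = {
    "application/ld+json": parse_jsonld,
}


def media_type(content_type: str) -> str:
    """Strip parameters such as charset and normalise case."""
    return content_type.split(";", 1)[0].strip().lower()


def parse_profile(content_type: str, body: str) -> object:
    """Pick a parser from the Content-Type header, or fail loudly."""
    kind = media_type(content_type)
    parser = PARSERS.get(kind)
    if parser is None:
        raise ValueError(f"no parser available for {kind!r}")
    return parser(body)
```

A client built this way knows immediately whether it can process a response. What it cannot know in advance is whether a given server will offer a format it supports, which is the concern raised on the other side of this exchange.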

Your use case seems to be people deploying Solid Apps that want to authenticate using a tweaked version of OAuth especially for Solid. It seems to me highly unlikely that people deploying such RDF-aware clients or those OIDC servers are going to have trouble with either the content negotiation, or making a request with the JSON-LD media type.

Today, these apps are widely deployed as static clients, as I mentioned above. Content negotiation doesn't come into play.

From your comments, it does appear that WebID is not the right technology for what we are trying to build.

@bblfish
Contributor

bblfish commented Aug 13, 2021

OK. I suppose the best option in this case would be to not rely on WebID.

I guess it depends on

  • what others who have WebID deployments think of this,
  • if there are other ways to get around the problem that would work too

It is not really for me to decide on this. I was just trying to tease out what the problems were.

In the case I outlined above with client identifiers, many of these clients exist as static JavaScript clients. That is, all resources are static and there is no content negotiation. In most cases, these applications would host their "WebID" resource as a static resource, too, but per the WebID specification it would be a violation to serialize that as JSON-LD.

yes, I can see that being a tricky situation.

I would not myself have trouble with parsing json-ld, turtle, rdf/xml, Trig, N3 (I'd like good implementations of that to tell the truth). But you can see that things are also getting very complicated there.

Perhaps one could say MUST for JSON-LD or Turtle. That would at least reduce the number of formats down to 2.

@csarven
Member

csarven commented Aug 13, 2021

It would be much easier to build on WebID if a particular serialization was not required. Rather, if the specification simply required an RDF serialization of the WebID resource, it would make building on this specification much easier.

Key considerations:

For extensibility purposes, if the other specifications are explicit about the required formats, then I agree. In this case, the WebID 1.0 spec needs to be clear about its position, namely that higher-level specifications need to call out the required formats, as for example Solid-OIDC does with its JSON-LD use of WebID. Good, we can combine specs and have a clear path for implementations.

For cases when the WebID document is needed by software without being driven by the requirements of any specification, then stating "an RDF serialization" is required may fall short. In this case, WebID 1.0's position is that software needs to be able to work with all formats. That's theoretically possible for interop but sets a high bar - I think generally impractical.

The current setup requiring one format (Turtle) for minimal interop is great but obviously less flexible about it being used in other specs.


Perhaps MUST a concrete RDF syntax in addition to SHOULD Turtle and JSON-LD (or whatever is desired). That's just shy of saying MUST Turtle and JSON-LD, and so perhaps might as well at that point. It is rather sketchy considering existing software and deployment. Some of the existing software/deployment still conforms. New software may not be able to work with profile documents published in Turtle if there is any possibility that Turtle is not required. Then again I don't expect drastic changes in the spec.

@bblfish
Contributor

bblfish commented Aug 13, 2021

Part of the discussion regarding OIDC and WebID continued in solid-oidc issue 35, "Generalize Solid-OIDC protocol beyond WebID", where the aim is to generalise to DIDs.

@jonassmedegaard

Please do not misrepresent others, @bblfish: Your main opponent in that referenced discussion explicitly says

I am not suggesting a generalization of WebID to include DIDs. You are correct that such a thing is not possible.

The issue raised here seems to me to be that the current draft WebID spec requires all implementations to implement a Turtle parser which is problematic especially for lightweight agents constrained in the programming languages available to them.

I love Turtle and am no particular fan of JavaScript. Regardless, I strongly support this request to relax the dependency on Turtle from a MUST to a SHOULD.

@bblfish
Contributor

bblfish commented Aug 14, 2021

Thanks @jonassmedegaard for your suggestion to relax it to SHOULD.

I am not sure how it helps with lightweight agents though, since by relaxing the number of mime types published, we have a requirement on clients to support more parsers.

We seem to have a simple law: the more flexibility we allow for the mime types on publication, the more complex the clients have to be.

As I suggested above I think having 2 RDF parsers in this day and age, such as JSON-LD and Turtle, seems quite reasonable. But you seem to think that would be too much.

@jonassmedegaard

Exactly: I think it is too much to require that all agents support both JSON-LD and Turtle.
I think it would be most helpful if the spec mandated either/or, and I fail to understand what speaks against that.

@jonassmedegaard

If one single RDF serialization needs to be mandated (e.g. for minimal interop, as mentioned by @csarven), then JSON-LD better serves that purpose as it was explicitly designed for use in web environments "in this day and age".

(since you bring up that phrase...)

@jonassmedegaard

jonassmedegaard commented Aug 14, 2021

I am not sure how it helps with lightweight agents though, since by relaxing the number of mime types published, we have a requirement on clients to support more parsers.

We seem to have a simple law: the more flexibility we allow for the mime types on publication, the more complex the clients have to be.

I suspect we are talking past each other.

Your point seems to be that if we relax at the server side, then we must tighten at the client side - like this:

  • The server must provide either a text/turtle [turtle] or an application/ld+json [JSON-LD] representation of the requested profile
  • WebID requires that servers must at least be able to provide either Turtle or JSON-LD representation of profile documents
  • The Agent requesting the WebID document must be able to parse documents in both Turtle and JSON-LD

The suggested change seems to be to relax at both server and client side - like this (exemplified specifically for Turtle versus JSON-LD - the suggestion by @acoburn seems to be to permit any RDF serialization):

  • The server must provide either a text/turtle [turtle] or an application/ld+json [JSON-LD] representation of the requested profile
  • WebID requires that servers must at least be able to provide either Turtle or JSON-LD representation of profile documents
  • The Agent requesting the WebID document must either be able to parse documents in Turtle or to parse documents in JSON-LD

I changed my mind: Makes sense to mandate a single serialization, but I think that one default serialization format should be changed to be JSON-LD: in some environments (notably web browsers executing JavaScript code), implementing and running a Turtle parser is more painful than implementing and running a JSON-LD parser, but there are no environments with the opposite pain.

I see one argument for Turtle over JSON-LD as default: Some existing implementations in the wild might support Turtle but not JSON-LD. WebID is however only a draft spec, so such early adopters should be expected to adapt.

@jonassmedegaard

jonassmedegaard commented Aug 14, 2021

So to be clear, I concretely propose to change to this:

  • The server must provide an application/ld+json [JSON-LD] representation of the requested profile
  • WebID requires that servers must at least be able to provide JSON-LD representation of profile documents
  • The Agent requesting the WebID document must be able to parse documents in JSON-LD

...and

  • A WebID Profile is an RDF document which must uniquely describe the Agent denoted by the WebID in relation to that WebID. This document must be available as JSON-LD [json-ld]. This document may be available in other RDF serialization formats, such as Turtle [turtle] or RDFa [RDFA-CORE] or RDF/XML [RDF-SYNTAX-GRAMMAR] if so requested through content negotiation.

@bblfish
Contributor

bblfish commented Aug 14, 2021

Ok thanks for the input @jonassmedegaard .

We already have quite a large deployment of WebIDs, I think, so we have to take that into account. If we were the size of Google, we'd try to collect the data on installation, size of clients, availability of them, etc. to test your claims and make an informed decision. Let's see what others think.

@jonassmedegaard

You must mean that a large number of draft WebIDs exist - because the spec is not yet finalized.

It certainly makes sense to try to take into account the number of identifiers in the wild or on the horizon which "comply" with unofficial early drafts of the WebID spec.

My point is that it also makes sense to try to take into account the number of would-embrace-webid-if-not-too-heavy-burden identifiers in the wild or on the horizon.

I say "try" for both - I disagree that the latter is questionable while the former not.

@bblfish
Contributor

bblfish commented Aug 14, 2021

@jonassmedegaard wrote:

You must mean that a large number of draft WebIDs exist - because the spec is not yet finalized.

You get the order of causality the wrong way round. First you have implementations that work together, then you have a finalized spec. The Web appeared before it was finally specified. Indeed it is never finally specified, as it is an ongoing process.

Kingsley Idehen has stated that his company "OpenLink has been using WebIDs since forever". He would rather be completely agnostic on formats, but at a minimum have Turtle and JSON-LD (see his recent statement).

Then there is the Solid Project that is using WebIDs with at least 3 servers implementing it in use, and I am writing another in Scala. There are communities writing clients of all types for Solid. Every month there is a Solid World event I recommend people attend. Those communities would need to be asked too.

@jonassmedegaard

You get the order of causality the wrong way round. First you have implementations that work together, then you have a finalized spec.

Ok. :-)

(yes, I am aware of the works of Kingsley Idehen and you and Solid)

@jonassmedegaard

jonassmedegaard commented Aug 14, 2021

Let me try again, then - with causality adjusted:

We have no spec yet, so we cannot unambiguously say WebID yet.

You must therefore mean that a large number of WebID-as-drafted exist.

It certainly makes sense to try to take into account the number of WebID-as-drafted in the wild or on the horizon.

My point is that it also makes sense to try to take into account the number of WebID-not-depending-on-turtle in the wild or on the horizon.

I say "try" for both - I disagree that the latter is questionable while the former not.

Apparently Solid uses WebID-not-depending-on-turtle with their recent switch to use Solid-OIDC spec.

I guess Solid represents the largest use of WebID.

@jonassmedegaard

Correction: I guess Solid represents the largest use of some-form-of-WebID.

@jonassmedegaard

jonassmedegaard commented Aug 14, 2021

Reading (again) the thread on the WebID list makes me reconsider (again) whether any serialization really needs to be a MUST at all.

I am using the image of passports containing same facts but written in different scripts - latin, cyrillic, devanagari, etc.

My previous thinking was that yes, we need all passports to be in same script, otherwise we will not have interoperability.
But now I suspect my reasoning was wrong in two ways: a) we already permit varying scripts, and only the default is fixed; and b) the default scenario really is comparable to what most people in the world (excluding maybe diplomats and secret agents) experience with passports: we have only one passport, not one written for each language region of the world.

So please try to spell it out for me: Why does WebID need to have a fixed default at all? Why not simply s/MUST/SHOULD/ for both server and agent?
I mean, the worst thing that can happen is that an agent and a server fail to exchange data because one end didn't state the wanted serialization and the other end provided an unusual serialization, i.e. the exact same type of failure as if one end explicitly requested a serialization that the other end was unable to provide, no?

@bblfish
Contributor

bblfish commented Aug 14, 2021

As you see @jonassmedegaard, you have moved from one extreme to the other in a matter of a few hours. So that would indicate that we need a lot more data to be able to come to a stable conclusion, or else we will be changing the spec every few days as people come along and argue for the opposite point of view.

@jonassmedegaard

I don't think your description is accurate.

I would say that I have moved from "yes, let's relax to SHOULD" to "no wait, if we really cannot relax to SHOULD then let's at least change default to align with current hype" back to "yes, let's relax to SHOULD".

I have visited neither the option of "let's stick with Turtle as default" nor the option of "let's relax to MAY". Those would be extremes in my opinion.

I would appreciate an answer to my question: Why does WebID need to have a fixed default at all?

@bblfish
Contributor

bblfish commented Aug 14, 2021

I would appreciate an answer to my question: Why does WebID need to have a fixed default at all?

Because there are people who are arguing like your former self from 8 hours ago that

Makes sense to mandate a single serialization, but I think that one default serialization format should be changed to be JSON-LD: in some environments (notably web browsers executing JavaScript code), implementing and running a Turtle parser is more painful than implementing and running a JSON-LD parser, but there are no environments with the opposite pain.

You were arguing that this ruled out a Turtle parser because it would be too heavy for light-weight parsers, and you went on to argue for json-ld. Now you are arguing for SHOULD which does not help a client know which parsers it will need to have available, and so it means the client should have all parsers available, which could be 8 or so.

The use case put forward by Aaron at the beginning of the thread was that some people are using GitHub to publish and therefore can't get content negotiation on resources. (Though somehow they will be OK with blockchain and DID.) We don't know how many of these people we have, though. And of course GitHub could at any time add content negotiation. Perhaps we could make a request for that.

I put forward that we could mandate two content types on the client to allow for interoperability.

We now have a list of all the following options for default:

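| # | Server default | Client must be able to parse at least |
|---|----------------|---------------------------------------|
| 1 | MUST Turtle | Turtle |
| 2 | MUST JSON-LD | JSON-LD |
| 3 | MUST Turtle OR JSON-LD | Turtle AND JSON-LD |
| 4 | Any RDF serialization | All RDF serializations |
| 5 | SHOULD for Turtle | ??? |
| 6 | SHOULD for JSON-LD | ??? |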

Other options
A. ask GitHub to add content negotiation
B. allow some other subset of serializations specifically
C. don't specify the serialisation and let the market decide (perhaps keep a tracker of mime types used, and see which wins) - but that is the same as any RDF serialization

All of these have pros and cons.
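One way to compare these options is to ask whether interoperability is guaranteed, meaning that every conforming server shares at least one format with every conforming client. The sketch below models that question for options 1 to 3 and for the "relaxed on both sides" variant discussed earlier in the thread. The modelling itself (each rule reduced to its minimal conforming servers and clients) and all names are assumptions made for illustration.

```python
from typing import FrozenSet, Iterable

Formats = FrozenSet[str]

TURTLE: Formats = frozenset({"text/turtle"})
JSONLD: Formats = frozenset({"application/ld+json"})


def interop_guaranteed(servers: Iterable[Formats], clients: Iterable[Formats]) -> bool:
    """True when every conforming server shares a format with every conforming client."""
    client_list = list(clients)
    return all(bool(server & client) for server in servers for client in client_list)


# Each entry lists the minimal conforming servers and the minimal conforming
# clients under that rule, as sets of media types.
OPTIONS = {
    "1: MUST Turtle": ([TURTLE], [TURTLE]),
    "2: MUST JSON-LD": ([JSONLD], [JSONLD]),
    "3: server either, client both": ([TURTLE, JSONLD], [TURTLE | JSONLD]),
    "relaxed on both sides": ([TURTLE, JSONLD], [TURTLE, JSONLD]),
}

for name, (servers, clients) in OPTIONS.items():
    print(name, "->", interop_guaranteed(servers, clients))
```

Under this model, options 1 to 3 guarantee a shared format, while the either/or rule on both sides does not: a Turtle-only server and a JSON-LD-only client both conform yet cannot exchange a profile. That is the trade-off between the two positions in this exchange, stated as a set intersection.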

@jonassmedegaard

Now you are arguing for SHOULD which does not help a client know which parsers it will need to have available, and so it means the client should have all parsers available, which could be 8 or so.

No, a client could have a single parser.

@acoburn
Member Author

acoburn commented Aug 14, 2021

The use case put forward by Aaron at the beginning of the thread was that some people are using GitHub to publish and therefore can't get content negotiation on resources

What I stated earlier is that many app developers publish their software as static resources. I never mentioned GitHub, though that is surely one example. There are hundreds of such examples.

@bblfish
Contributor

bblfish commented Aug 14, 2021

@acoburn your problem would be solved by options 2, 3 or 4 above right?

@jonassmedegaard

Assuming options 5 and 6 are ones I propose, then option 5 should read "Any RDF Serialization" for both server and client, and there is no option 6.

@kidehen

kidehen commented Jan 14, 2024

Those exist because content-negotiation isn't supposed to be such a distraction. This issue is the direct consequence of this straw poll document that I serendipitously stumbled upon while trying to locate TimBL's profile document -- which uses the same portable approach via "#".

Defaulting to RDF-Turtle (i.e., the MUST directive) has now been conflated with a narrow interpretation of content-negotiation that involves 303 redirection on the part of publishers that want to support other document types, i.e., totally ignoring options presented by "#" and the use of response headers (so-called "signposting").

@kidehen

kidehen commented Jan 14, 2024

Calling everyone to join in the fun: any strong objection to a MUST on ConNeg, Turtle and JSON-LD for publishers?

MUST on content-negotiation is the poor outcome from the straw poll at the bottom of this document.

Content-negotiation is supposed to be about how clients and servers negotiate content types using a variety of heuristics -- many of which end up becoming distracting to both casual users and developers. Mandating it in a spec guarantees the very problem we are grappling with right now.

Content-negotiation should never be a MUST in this kind of endeavor. A publisher can handle content-type negotiations using a variety of techniques, including the use of HTTP response headers and Link: relations (a.k.a. signposting).
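As an illustration of the signposting route, the sketch below reads a Link response header and maps each advertised media type to the URL of that representation. The regular expressions cover only the common `<url>; name="value"` shape rather than the full Web Linking grammar, and the example.org URLs and function names are hypothetical.

```python
import re
from typing import Dict, List

# One link-value: <target> followed by zero or more ;name=value parameters.
_LINK_VALUE = re.compile(
    r'<([^>]*)>((?:\s*;\s*[^\s=;,]+(?:\s*=\s*(?:"[^"]*"|[^\s;,"]+))?)*)'
)
# One parameter: ;name, optionally followed by a quoted or bare value.
_PARAM = re.compile(r';\s*([^\s=;,]+)(?:\s*=\s*(?:"([^"]*)"|([^\s;,"]+)))?')


def parse_link_header(value: str) -> List[Dict[str, str]]:
    """Split a Link header into dicts holding the target and its parameters."""
    links = []
    for match in _LINK_VALUE.finditer(value):
        link = {"target": match.group(1)}
        for name, quoted, bare in _PARAM.findall(match.group(2)):
            # Keep the first occurrence of a repeated parameter.
            link.setdefault(name.lower(), quoted or bare)
        links.append(link)
    return links


def alternates(value: str) -> Dict[str, str]:
    """Map media type to URL for each rel="alternate" link that declares a type."""
    found: Dict[str, str] = {}
    for link in parse_link_header(value):
        rels = link.get("rel", "").lower().split()
        if "alternate" in rels and "type" in link:
            found.setdefault(link["type"].lower(), link["target"])
    return found


# Hypothetical header a static host could be configured to send.
EXAMPLE = (
    '<https://example.org/profile.ttl>; rel="alternate"; type="text/turtle", '
    '<https://example.org/profile.jsonld>; rel="alternate"; type="application/ld+json"'
)
print(alternates(EXAMPLE))
```

A client that wants Turtle can then fetch the advertised URL directly, without the server having to vary its response by Accept header.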

@kidehen

kidehen commented Jan 14, 2024

a loosening of requirement of either ConNeg or Turtle, or both if possible.

Yes, because they never should have been tightly coupled in the first place.

@kidehen

kidehen commented Jan 14, 2024

I feel the opposite. We actually all seem to be agreeing on Turtle and JSON-LD.

Correct, albeit via different routes.

What matters here is the final destination, i.e., Turtle and JSON-LD.

@jacoscaz
Collaborator

jacoscaz commented Jan 14, 2024

/chair hat on

I feel the opposite. We actually all seem to be agreeing on Turtle and JSON-LD.

@woutermont indeed, I'm almost tempted to call this a joyous occasion :)

What matters here is the final destination, i.e., Turtle and JSON-LD.

@kidehen yeah, and my bad for putting together things that shouldn't be put together. Just so that I can better understand your objection, do you reckon you could live with a formulation like this?

  • publishers MUST use Turtle or JSON-LD whenever expressly requested by consumers

I'll also point out that the question is not well formed, as it does not say WHAT spec you want to change, i.e., is the proposal backwards compatible or not, and does it bump the version number?

@melvincarvalho as I've been saying for quite some time, how to get to Turtle and JSON-LD, assuming no major objections, is something we're going to figure out. Personally, I'd be inclined to do so in a new 2024 ED. For now, it's very good to see that the group appears to actually be willing to converge on a shared path forward, at least in abstract. May I ask you to please let me know whether that -1 is a "blocker" - i.e., an indication that you would not be able to live with a spec having a MUST on Turtle and JSON-LD - or whether it's an indication that there are ways to get there that you would be ok with and ways that you wouldn't?

In any case, time for me to update my draft and start shifting a little bit of focus on process...

@kidehen

kidehen commented Jan 14, 2024

@kidehen yeah, and my bad for putting together things that shouldn't be put together. Just so that I can better understand your objection, do you reckon you could live with a formulation like this?

  • publishers MUST use Turtle or JSON-LD whenever expressly requested by consumers

Yes, that's fine.

@TallTed
Member

TallTed commented Jan 23, 2024

This VERY large comment addresses about two weeks, and about 150 of the 255 comments that preceded this one. I choose to make this one large comment rather than a bunch of smaller ones because I'm a horrible person. I'm sorry for the challenge reading it may pose. But I think it's mostly signal with relatively little noise, and I hope it helps bring this thread to a conclusion that a WebID server SHOULD support ConNeg, and MUST support delivery of (and optimally upload of) Turtle AND JSON-LD according to client request/need, the latter of which could be addressed via ConNeg, signposting, or other means.


A friendly reminder of the Priority of Constituencies found in the Web Platform Design Principles. In short, we (the WebID specification writers) bear the pain, so that developers of WebID servers/clients don't, so that WebID users/consumers don't. (In long, User needs come before the needs of web page authors, which come before the needs of user agent implementors, which come before the needs of specification writers, which come before theoretical purity.)

I strongly believe that the ED's minimum requirement of servers delivering a Turtle serialized profile document when clients dereference a WebID, with its unstated (which perhaps should be stated) option of servers delivering that profile document as JSON-LD or other serialization including non-RDF serializations based on client request (i.e., Content Negotiation), should be retained. I have met many humans (usually non-programmers) who have no trouble comprehending a Turtle document, when the same data expressed in a JSON-LD document is immediately confusing (sometimes just because of the {} wrapper). (Software also usually has less issue with Turtle than JSON-LD, mostly because bugs happen.)

Web (including but not limited to LDP and Solid) server developers SHOULD have no problem adding serialization translation support to client-requested content types (JSON-LD, Microdata, N-Quads, TriG, N-Triples, etc.) via relevant libraries which are already available as prebuilt libraries for many runtime environments, as server add-ons for many popular Web servers, and as open-source code which can be ported to or recompiled to serve new runtime environments and/or Web servers.


[@kidehen] Moving content-type support requirements to the client is another route to letting developers CHOOSE their preferred structured data representation format.

This would invert at least part of the Priority of Constituencies, and I am strongly against that. In my opinion, broad content-type support (including content negotiation) requirements MUST be imposed on server implementations, not on client implementations, and certainly not on the user.


[@melvincarvalho] [@timbl] thought web developers would want JSON-LD data islands, but that he personally would want turtle, and it may be possible to have both (which could be messy).

It is absolutely possible to have both JSON-LD and Turtle data islands. It can take a bit of work to ensure that correct HTML syntax surrounds both islands and that HTML character escapes are handled properly within the islands, but I've done it. (Regrettably, the pages on which I did this work were not exposed to the world, and were not archived in that form.)
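For readers unfamiliar with data islands, the sketch below shows how a consumer might pull both a JSON-LD and a Turtle island out of one HTML page using only the Python standard library. The class name, the `SAMPLE` page, and the choice of `text/turtle` as the script type for the Turtle island are assumptions for illustration; the islands are returned as raw text and are not parsed into RDF here.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional

# Script types treated as data islands in this sketch.
ISLAND_TYPES = {"application/ld+json", "text/turtle"}


class DataIslandParser(HTMLParser):
    """Collect the raw text of <script> elements whose type is an RDF media type."""

    def __init__(self) -> None:
        super().__init__()
        self.islands: Dict[str, List[str]] = {}
        self._current: Optional[str] = None
        self._chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            media_type = (dict(attrs).get("type") or "").strip().lower()
            if media_type in ISLAND_TYPES:
                self._current = media_type
                self._chunks = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so accumulate it.
        if self._current is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._current is not None:
            text = "".join(self._chunks).strip()
            self.islands.setdefault(self._current, []).append(text)
            self._current = None


def extract_islands(html: str) -> Dict[str, List[str]]:
    parser = DataIslandParser()
    parser.feed(html)
    parser.close()
    return parser.islands


# Hypothetical profile page carrying the same statement in both syntaxes.
SAMPLE = """<!DOCTYPE html>
<html><head>
<script type="application/ld+json">
{"@id": "#me", "@type": "http://xmlns.com/foaf/0.1/Person"}
</script>
<script type="text/turtle">
<#me> a <http://xmlns.com/foaf/0.1/Person> .
</script>
</head><body>Profile</body></html>"""

islands = extract_islands(SAMPLE)
```

Because script content is not entity-decoded by HTML parsers, the publisher's main job is the one described above: making sure nothing inside an island is mistaken for the closing script tag.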


[@melvincarvalho] If you request RDF from a server, the RDF should be consistent independent of the mime type

This might be a goal in some spheres, but it has never been a universal requirement. Note that when you dereference a URI, you receive a representation of the resource identified by that URI. If you issue an HTTP GET for a JPG (image/jpeg) representation of a resource that is stored as a GIF (image/gif), there will be some difference between what's on the server and what you receive -- because, for instance, a GIF may have a transparent background which is not supported by JPG, and JPG is a lossy format which blurs color transitions while GIF specifies the color of each and every pixel. Similarly, if you issue an HTTP GET for a plain-text (text/plain) representation of an HTML (text/html) resource, you might receive the HTML source with a declared media type of text/plain, and you might receive an ASCII rendition of the text that is visible when the HTML is rendered by a browser, among other possible representations of the HTML resource your HTTP GET requested.

Similarly, if you request an RDF serialization which would require the server to make a transformation it does not fully support, or a transformation to a serialization which does not handle all the data the server has, the server might decide (well, the server programmers might decide) to deliver a partial transformation, perhaps omitting attributes and values which are not fully supported by the requested serialization. Along similar lines, a server might be set up to only deliver full representations in compacted/compactable RDF serializations that minimize whitespace like Compacted JSON-LD, and partial representations in uncompacted/uncompactable RDF serializations like N-Triples or N-Quads.

All of these are valid behaviors of generalized Web (or LDP or Solid) servers, and I expect them to be valid behaviors of WebID servers.


[@jacoscaz] would you be ok with adding to the above that, in the event that a consumer does not request a specific format via the Accept header, a publisher MAY respond with whatever format it pleases?

This is already the case for generalized Web servers, and I can think of no reason it should not be the case for WebID servers.


[@jacoscaz] Most people still need to weigh in

Yeah, there's been a LOT of activity on WebID — including this (currently 255, soon to be 256) HUGE thread! — and I've got dozens of other repos to also try to keep up with, several of which are rapidly approaching Charter-end and hence CR/PR transition, with some thorny aspects in At-Risk status, all of which require careful consideration and negotiation with many participants.


[@jacoscaz] Unless we were to restrict the list of formats to Turtle and JSON-LD, optionally within data islands, I would find this too onerous as it would virtually mean that publishers should support all formats.

Servers should support as many serializations as easily available transformation libraries can. We should not restrict the list of acceptable or deliverable RDF serialization formats at all. The point of requiring that WebID server implementations support Turtle and/or JSON-LD (I believe we should require both today, where I believed it should only be Turtle circa 2014, because transformation library capabilities have improved substantially in the past decade) was (and is) to provide a baseline that any client can emit and expect to receive. WebID servers that provide more options to WebID clients have always been compliant, and should remain so.


[@stain] May I suggest also using Signposting to navigate to the documented RDF serializations

Absolutely. Embrace the power of AND! You may notice that Virtuoso (produced by my employer) supports both conneg and signposting (try curl -LkI https://dbpedia.org/resource/Virtuoso_Universal_Server or similar on your local commandline). Typically, anything that's available via conneg is also signposted, for the broadest range of client support.


[@jacoscaz] it may be done using either link headers or link elements

Again, embrace the power of AND! A typical DBpedia page, hosted by Virtuoso, includes both! For example, check out the files saved by these commandline commands --

curl -LkI https://dbpedia.org/page/Virtuoso_Universal_Server -o test-head.txt ; grep "Link:" test-head.txt

curl -Lk https://dbpedia.org/page/Virtuoso_Universal_Server -o test.html ; grep "<link" test.html
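
For clients that would rather inspect signposted alternates programmatically than grep for them, a rough Python sketch of Link header parsing follows. The function names, the sample header, and the example.org URLs are all hypothetical (this is not a copy of what DBpedia returns), and the regular expressions cover only the common `<target>; name="value"` shape, not every corner of the Web Linking syntax (for example, escaped quotes inside quoted values).

```python
import re

# One link-value: <target> followed by zero or more ;name=value parameters.
LINK_RE = re.compile(
    r'<([^>]*)>((?:\s*;\s*[\w*-]+\s*=\s*(?:"[^"]*"|[^\s;,"]+))*)'
)
# One parameter: ;name="quoted value" or ;name=token
PARAM_RE = re.compile(r';\s*([\w*-]+)\s*=\s*(?:"([^"]*)"|([^\s;,"]+))')


def parse_link_header(value):
    """Return a list of (target, params) tuples from a Link header value."""
    links = []
    for target, raw_params in LINK_RE.findall(value):
        params = {}
        for name, quoted, bare in PARAM_RE.findall(raw_params):
            params[name.lower()] = quoted or bare
        links.append((target, params))
    return links


def alternates(value, rel="alternate"):
    """Map media type -> target URL for links carrying the given rel and a type."""
    found = {}
    for target, params in parse_link_header(value):
        if rel in params.get("rel", "").split() and "type" in params:
            found[params["type"]] = target
    return found


# Hypothetical header value, made up for this example.
SAMPLE = (
    '<https://example.org/profile.ttl>; rel="alternate"; type="text/turtle", '
    '<https://example.org/profile.jsonld>; rel="alternate"; type="application/ld+json", '
    '<https://example.org/license>; rel="license"'
)

turtle_url = alternates(SAMPLE).get("text/turtle")
```

The same `alternates` lookup could be pointed at the `href`, `rel`, and `type` attributes of HTML link elements, which is one reason serving both headers and elements costs a publisher little.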

[@melvincarvalho] It's not accurate to say that WebID is a work in progress, it was regarded as complete and ready to go.

I was an active member of the WebID WG circa 2014. I do not remember the 2014 ED being "regarded as complete and ready to go", at all. If the ED had been "regarded as complete and ready to go", I am quite sure we would have requested transition to CR and thence to PR in 2014–2015. Even now, if the ED were "regarded as complete and ready to go", we could request transition today as 1.0 as is, hopefully followed fairly quickly by a 1.1 or 2.0 (depending on the degree of change made by whatever WG adopts it, Solid WG or otherwise). (Though I do not think the Solid WG wants to take on a heavy revision task, the WebID CG could hand them both the ED as 1.0 and its revision as 1.1 or 2.0, with hope they would be transitioned together, which could probably be justified based on the history of this project.)

say that every mime type returns the same RDF triples

As above (search this web page for "universal requirement"), I am strongly against this as a MUST. I might be OK with it as a SHOULD, but everybody has to remember that SHOULD means "do it unless you have a good reason not to and fully understand the impact of not doing it", and implementers sometimes feel that "I don't want to" is a good reason and "interop may fail" is sufficient understanding of the impact.


[@melvincarvalho] Regarding signposting, I dont know. It seems new

Signposting is not at all new. See this archive.org capture from 2016-06-16. Also a current capture from 2024-01-12.


[@kidehen] rearrangement of the heading (WebID 1.0) and subheading (WebID Identity and Discovery) in [the 2014 Editors Draft].

I don't know how, or if, it's possible to see the commit history on the 2014 Editors Draft. I'd like to see when that subhead was added, not least because there are precisely 2 occurrences of the word "discovery" in that document. One is in the subhead; the other is in the following sentence:

HTTP Content Negotiation can be employed to aid in publication and discovery of multiple distinct serializations of the same graph at the same URL, as explained in [COOLURIS]

I don't think "discovery" really belongs in the subhead, since it's not really discussed at all, and certainly not in any way specific to WebID, as it should be for this document.


[@jacoscaz] This is clearly a breaking change, yes, and would likely justify working on a new document rather than the 2014 ED.

I will note that MOST version incrementing specs (and there are at least a few) at W3C were started with the "existing" spec and relevant edits were made to it, rather than starting from a blank page for each new version.

And, as has been noted elsewhere, that Editors Draft is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress. It is unfortunate that many chose to implement based on that DRAFT document, and more unfortunate that their implementations MAY not function going forward as they do today, but perhaps such implementers will be encouraged by such breakage to join W3C and WebID-relevant WGs/CGs/etc., and more to the point, actively participate in the process so we know they're out there and such that spec revisions going forward allow for their implementation's continued function.


[@jacoscaz] There's a risk that Solid has settled on a MUST on Turtle because of WebID

I'm fairly certain Solid has settled on a MUST for Turtle because of @timbl. As I recall, the Turtle MUST in the WebID 2014 ED was also because of @timbl. I have no cites for these, just memory of conversations and participation in related groups.


[@jacoscaz] Calling everyone to join in the fun: any strong objection to a MUST on ConNeg, Turtle and JSON-LD for publishers?

Some objection to MUST on ConNeg, because it requires server-level configuration which ordinary users may not be able to implement.

A fairly strong degree of support for MUST on delivering Turtle and JSON-LD (and optimally accepting both for updates by user or user delegate), whether via ConNeg, signposting, or other means.


whew

@jacoscaz
Collaborator

/chair hat off

because I'm a horrible person

@TallTed for what it's worth, I strongly prefer reading posts like this one!

Some objection to MUST on ConNeg, because it requires server-level configuration which ordinary users may not be able to implement.

@kidehen raised this already and I proposed the following alternative wording in #3 (comment) to generalize:

  • publishers MUST use Turtle or JSON-LD whenever expressly requested by consumers

A fairly strong degree of support for MUST on delivering Turtle and JSON-LD (and optimally accepting both for updates by user or user delegate), whether via ConNeg, signposting, or other means.

Very happy to see this! Another confirmation that this is a good way forward.

I will note that MOST version incrementing specs (and there are at least a few) at W3C were started with the "existing" spec and relevant edits were made to it, rather than starting from a blank page for each new version.

There's discussion about this in #45 , which is the first actual change to the spec that's being discussed. Generally, I think some of us are concerned (rightfully, in my opinion) that iterating on the current draft without a "clean slate" approach might make for exceedingly slow progress. I think it's worth giving it a try first and only consider a "clean slate" approach if that actually proves to hinder progress.


/chair hat on

To everyone: would you be ok with me locking this thread? We have finally converged towards a selection of formats and, although this cannot be closed until an actual change to the spec is made, I think it'd be better if additional ideas / proposals / issues WRT formats were to be discussed in separate threads without requiring participants to read through this many comments.

@melvincarvalho @webr3 @woutermont @namedgraph @jonassmedegaard @kidehen @TallTed

@jonassmedegaard

One minor nit:

  • publishers MUST use Turtle or JSON-LD whenever expressly requested by consumers

The above can be read in two ways: One reading is that Turtle is always a must and JSON-LD is a must when explicitly requested (which implies that both are optional, if conneg is optional), the other reading is that both are a must when requested (which implies that both are optional, if conneg is optional).

Possibly there is no doubt to a native English speaker, but to clarify/emphasize/restate which of the meanings we are aiming for, I propose to place a comma in either of two places:

  • publishers MUST use Turtle, or JSON-LD whenever expressly requested by consumers
  • publishers MUST use Turtle or JSON-LD, whenever expressly requested by consumers

@webr3

webr3 commented Jan 24, 2024

Perhaps reverse?

Whenever Turtle or JSON-LD is expressly requested by consumers, publishers MUST respond with that requested media type.

@TallTed
Member

TallTed commented Jan 24, 2024

Perhaps...

WebID publishers/servers MUST accept and/or provide Turtle and JSON-LD, whenever either is expressly requested (e.g., GET with Accept: text/turtle) or submitted (e.g., POST or PUT with Content-Type: application/ld+json) by a consumer/client

@melvincarvalho
Copy link

melvincarvalho commented Feb 3, 2024

  • i.e., an indication that you would not be able to live with a spec having a MUST on Turtle and JSON-LD

I had thought I had given this indication many times, but let me do so again. Can't live with JSON-LD = MUST because Turtle is already MUST

Introducing Conneg is an absolute NO for me representing the worst of all worlds.

It's also, on top of that, unclear about the versioning. Leaving it as 1.0 with such a massive change and changing 2014 to 2024 is not going to fly.

To everyone: would you be ok with me locking this thread?

Yes, please do. We will not get consensus on this.

It can be revisited after the Solid WG is up and they state their preferences.

We still have many avenues to do productive work.

I would be keen on a JSON-LD extension profile, so that at least we can all know what it looks like, what the format is, what the example is, which predicates, how to test etc. After that is done, it could easily be imported into a consensus driven.

EDIT: I have repeated this objection a few times. Let me say that the tone is strong, and that is mainly because I don't feel I was heard, rather than having such an emphatic distaste for the ideas. For me at this point, I feel the only way to get my point through is to state it emphatically. I would certainly compromise on things in the future, but not now. The Solid WG, clear messaging around the version number of ED 2024, and progress on extension profiles, would be new information from my POV.

@woutermont
Contributor

@melvincarvalho, repeatedly shouting 'I don't want this' is not an objection. I have scrolled back through all your comments the past months, and you have never pointed out a real issue. Given that everyone is converging towards a consensus on this, I suggest you stick by your last message, and bow out of this discussion.

As for the versioning: we work on a running Editors Draft of which snapshots have been published and archived. Given that this is not even a WG document, providing any kind of backward compatibility is a bonus. By converging on the current consensus, we actually preserve compatibility with all existing clients! That's a major achievement. On the server side, there will be ample time to implement conneg as the specification progresses through the recommendation track (e.g. in the Solid WG).

@melvincarvalho

and you have never pointed out a real issue. Given that everyone is converging towards a consensus on this, I suggest you stick by your last message, and bow out of this discussion.

No, they are not. I bowed out with a strong objection. I can't live with a change to the spec JSON-LD=MUST. Conneg MUST is a non-starter.

This has been consistently the case for years, which is why there will never be consensus.

I simply ask that this strongest possible objection be taken into account when assessing the controversial nature of proposed changes.

providing any kind of backward compatibility is a bonus

This also is not something I can live with. It is dismissive of the fact that the spec has been in broad use for a decade. And other ecosystems are based on it.

On the server side, there will be ample time to implement conneg

Not simple in the slightest, and a breaking change.

-1000 on all

@melvincarvalho

melvincarvalho commented Feb 3, 2024

FWIW: of the ideas proposed JSON-LD = MUST w/ conneg seemed to me to be the worst option by far

The one that might have a chance, imho is sign posting.

The subset / superset spec with extension profiles that we all agreed to, would of course work for everyone.

Anyway, I think I've at this point, now been very clear!

@woutermont
Contributor

woutermont commented Feb 3, 2024

I've responded in #58, even though it is rehashing the discussion we had here the past months, and on which all active participants have already reached a consensus. The only thing still up for discussion was the precise phrasing (which is actually up to the editor).

@kidehen

kidehen commented Feb 3, 2024

On the server side, there will be ample time to implement conneg as the specification progresses through the recommendation track (e.g. in the Solid WG).

I gave a plus one to your comment, but assuming what you actually meant by the above is as follows:
On the server side, there will be ample time to deal with techniques that are compatible with MUST for both Turtle and JSON-LD as the specification progresses through the recommendation track (e.g. in the Solid WG).

Turtle and JSON-LD parity can be spec'd in a manner that relegates content-negotiation to an implementation detail.

@namedgraph

I have to agree that issue #58 has some merit: MUST on both Turtle and JSON-LD can break backwards compatibility.

I know this sounds counter-intuitive to many here, but I still think that removing any hardcoded serializations (including Turtle) might be the only way to resolve this conflict, because it would restore the orthogonality between the WebID and HTTP specifications.

@kidehen

kidehen commented Feb 4, 2024

MUST for Turtle was always a bad idea.

@TallTed
Member

TallTed commented Feb 5, 2024

@kidehen

MUST for Turtle was always a bad idea.

MUST for Turtle may have always been a bad idea, but that can only be evaluated with hindsight.

In 2014, JSON-LD 1.0 had only just become a REC. It was not yet in wide use, nor what I would consider mature. Remember, for instance, that JSON-LD 1.1 was needed in 2020 in order to make JSON-LD fully support relative URIs.

In 2014, MUST for Turtle support — which was included in both WebID's 2014 ED and LDP's 2015 REC — was necessary at the time to have a lowest-common-denominator for interoperability, and was considered a minimal burden for users, far lower than JSON-LD would have been and remains for many non-programmers, including beginner (and even experienced) web developers who only work in HTML.

Yes, advanced web developers tend to have some familiarity with JSON, and they can make the leap to JSON-LD relatively easily. But these are not the beginners both the WebID WG and LDP WG hoped to empower with the specs we produced.

@kidehen

kidehen commented Feb 5, 2024

MUST for Turtle was always a bad idea.

MUST for Turtle may have always been a bad idea, but that can only be evaluated with hindsight.

Yes, of course.

My comment had more to do with the vote we had on the matter, in which our votes against content-type specificity were defeated.

In 2014, JSON-LD 1.0 had only just become a REC. It was not yet in wide use, nor what I would consider mature. Remember, for instance, that JSON-LD 1.1 was needed in 2020 in order to make JSON-LD fully support relative URIs.

Yes, JSON-LD only became viable starting with version 1.1.

Others:

Turtle's status was a consequence of the matter highlighted here, not some draconian attempt at pushing Turtle.

Ultimately, I don't believe TimBL will have any issues with support for JSON-LD 1.1 alongside Turtle.

@melvincarvalho

melvincarvalho commented Feb 5, 2024

We have finally converged...

I don't understand the term "we" here.

Please do read the original post from 2021:

It would be much easier to build on WebID if a particular serialization was not required. Rather, if the specification simply required an RDF serialization of the WebID resource, it would make building on this specification much easier.

There's no indication that you've even satisfied the OP of this issue, let alone the whole group.

It's typical to close an issue when the original poster is satisfied. There is an opportunity to do so. But what was suggested 200+ posts later does not do that.

@TallTed
Member

TallTed commented Feb 5, 2024

I give you, the Priority of Constituencies!

In case of conflict, consider users over authors over implementors over specifiers over theoretical purity. In other words costs or difficulties to the user should be given more weight than costs to authors; which in turn should be given more weight than costs to implementors; which should be given more weight than costs to authors of the spec itself, which should be given more weight than those proposing changes for theoretical reasons alone. Of course, it is preferred to make things better for multiple constituencies at once.

In other words, end-user life should be made as easy as possible, passing the burdens back to client apps (and client app developers; the authors), thence to servers (and server developers; the implementors), and thence to us (the specifiers). I think we should consider theoretical purity to be out of scope.

A MUST on Turtle support for all involved software sets a pretty low bar across the board — and Solid developers should have little to no trouble adding a library to each and every Solid server implementation, such that clients who request Turtle get Turtle, whether it's returned as a document from a flatfile store, automagically converted from Solid's preferred JSON-LD, again where Solid is using flatfiles instead of a triple/quad store, or automagically generated on-the-fly just like the JSON-LD would be when such a triple/quad store is holding the data.

Proof of concept is found in Virtuoso's implementation. You can store flatfiles of Turtle or JSON-LD (or other serializations), or you can write the triples/quads to the RDF store. You can request Turtle, JSON-LD, and various other serializations, and get what you asked for, regardless of what the Virtuoso DB actually holds.

@melvincarvalho

In our ongoing discussions about the specification, it's essential to follow the principle of not repeating ourselves (DRY) to make our work more efficient. This requires a common WebID definition across all specifications for coherence and compatibility.

Establishing a shared WebID definition would allow for different extension profiles, ensuring no one is pushed towards an unwanted direction. A quick refactoring leaves everything the same, but provides a point of flexibility for going forward. Something that should have been done a long time ago, and if it had, we would not have any issues now.

After defining WebID, my preferences would be:

  1. Integrating JSON-LD and Data Islands as a MAY replacing RDFa: This non-disruptive strategy could likely gain widespread support, meeting our needs while preserving backward compatibility. Non-breaking. Acceptable.

  2. Optional Inclusion of JSON-LD as a MAY: Simply adding JSON-LD as an option maintains backward compatibility and introduces flexibility, a strategy that should face little resistance and could help achieve broad agreement. Non-breaking. Acceptable.

  3. Requiring JSON-LD Exclusively: This more significant change modernizes our specifications effectively. It allows those who prefer Turtle to continue using the 2014 spec, while advancing with JSON-LD presents an inclusive, forward-thinking approach. Breaking. Acceptable.

  4. Mandatory Adoption of Both JSON-LD and Turtle: Although this inclusive option might overly burden server resources and complicate maintaining parallel serializations, a nuanced approach, like an extension profile, could provide a viable solution without forcing universal compliance. Breaking. Could not live with, at this time.

These are my ranked preferences if choosing only one were necessary. It's important to note that pushing the group to settle on a single option too soon could be a mistake. Utilizing extension profiles can address everyone's needs, and insisting on one choice could disappoint many and potentially fracture the group.
