Specify/advise semantics of default Container resources (index.html/.ttl) #69
A topic that causes a lot of implementor and app-developer confusion is the handling of default resources for Containers -
In either case, we should either add it to spec or provide non-normative text advising devs what to do.
There is rough consensus in solid/solid-spec#134 that handling of LDPC resources is an implementation detail.
I've created #109 to provide some answers to the general issue around representation handling.
What kind of non-normative text do you think may help? If we can, I prefer to not include such text.
I had made a comment on gitter about creating a /public/index.html file in a pod and that it can cause problems for some users, so I wanted to give an update on how to restore your public folder if someone has done this.
Log into your pod, select "your stuff" on the drop down menu, select the "your storage" tab, locate and open the index.html file, click the gear icon and select delete, and your public page will be restored to the default view.
Documenting a clarification on this issue (based on meeting with @dmitrizagidulin ):
I think we should first have agreement on the interaction, ie. should request to
Seen in isolation, I feel that this issue should have been resolved without exposing the existence of
The risk that people will use both
If we do come up with a consistent way to deal with representations more generally, (#109), then it does make sense to apply that to this problem too. I think it is crucial that the minimal container triples and containment triples are in all representations, and that we minimize the risk that people will use different URIs for what is actually the same resource.
Forgive me if I'm off topic, but I think this is the same issue:
It seems to me like there are 2 types of resources in a Solid pod.
To me, it seems the problems described in this issue are caused by the lack of this distinction in NSS and in Solid in general right now. The Server managed resources should not be able to be managed by the user.
I think this paradigm of separating the User managed from the Server managed is helpful. It could even be advertised by the server. For example, servers could list their server-managed resource(s) (i.e. their "entry point" for users) when receiving a
I also discussed
There are some problems that I see with this approach (that we didn't have time to discuss further), since there are some things that are important in a container's representation, notably the containment triples, the minimal server-managed LDP triples, and the "POSIX metadata", i.e. metadata to indicate ctime, mtime, atime, etc, which would come in handy if Solid was used as an actual file system, e.g. through FUSE.
So, I am uncomfortable about not having them in the representation, even though people could get at them by adjusting the
From an LDP purist's point of view (which I am not), the HTML representation is also a non-RDF source, so it breaks LDP's model in which LDPC isa LDP-RS.
One possible resolution to this problem is to require the inclusion of the RDF as RDFa. Even though people might
I see three ways to resolve this:
My preference would be the latter.
Any representation of a container is required to be RDF bearing; RDF Source; LDP-RS... having triples that can be parsed by an RDF parser... Full stop. Whether it has a certain number of triples or even the "right" triples is orthogonal. That is a different requirement about representations and equivalences we can work out.
Not necessarily. If an HTML document has no RDFa in it, an RDFa parser will obviously find 0 triples. That's exactly the same as a Turtle document with no triples in it. Attempting to create or update
#45 (comment) proposes how to handle equivalent representations:
Right, so, @timbl saw this as behaving like Apache does, i.e., there's nothing on the surface, the server just silently takes the
That'd be the LDP purist's view, ;-) this stems from the LDP's hierarchy of interaction models, it is not a concern of Solid as currently designed.
Right, but I take the pragmatic view: What are the useful pieces of LDP that we actually need? The mentioned RDF is something we need, and from that, it follows that they must be included with the container representation.
Now, I'm sufficiently purist myself to be wary of a design that doesn't include the container representation with all possible representations, regardless of
However, neither of these views is consistent with the view that @timbl gave me of his motivation behind the
Right, but my (unstated) assumption is that the HTML would also contain content that is not represented as RDF, that's the whole point of putting HTML there, which makes it non-RDF in LDPs (flawed) model where LDP-RS and LDP-NR are mutually exclusive, regardless of whether there is RDF content hosted in it.
No, that does not appear to address the problem, the point is that the
Moreover, the UNIX filesystem analogy extends to containers too, in that the container just contains resources, it is not intended to have an extensive representation on its own.
Edit/Note: I wrote the prior to seeing Kjetil's comment above / in #69 (comment)
This issue was intended to work out whether
I still think the handling of
#109 (comment) actually tries to resolve this:
The point of that was to have a clear split between resources that come to existence via Solid's prescribed interaction vs. by other means, and so re "exists by other means" ie. some implementation happens to have
What databrowser, Apache are doing with
RDF 1.1 says that RDFa in markup languages qualifies it as RDF. If there is any information in there that's not part of an RDF graph, then it doesn't suddenly become non-RDF. If two RDF graphs are isomorphic, that's all that counts towards representation equivalence.
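To make the equivalence claim concrete, here is a minimal sketch, not taken from any Solid implementation. It assumes both representations have already been parsed into (subject, predicate, object) tuples and that the graphs contain no blank nodes, in which case graph isomorphism reduces to plain set equality. The function name and the `pod.example` URLs are hypothetical.

```python
from typing import Iterable, Tuple

Triple = Tuple[str, str, str]

LDP_CONTAINS = "http://www.w3.org/ns/ldp#contains"


def same_ground_graph(a: Iterable[Triple], b: Iterable[Triple]) -> bool:
    """Compare two graphs by set equality of their triples.

    Assumption: no blank nodes. With blank nodes, a real isomorphism
    check (blank-node relabelling) would be needed instead.
    """
    return set(a) == set(b)


# Triples as they might come out of a Turtle parser ...
from_turtle = [
    ("https://pod.example/", LDP_CONTAINS, "https://pod.example/notes.ttl"),
]
# ... and out of an RDFa parser run over the HTML representation.
# Any HTML content not marked up as RDFa simply never shows up here.
from_rdfa = [
    ("https://pod.example/", LDP_CONTAINS, "https://pod.example/notes.ttl"),
]

equivalent = same_ground_graph(from_turtle, from_rdfa)
```

Under this reading, prose or layout in the HTML that is not expressed as RDFa plays no part in the comparison.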
Eh, well, it makes it a not RDF source, at least, from LDP:
So, HTML+RDFa is an RDF Source iff it is fully represented by the hosted RDF. If it contains information that is not represented by the hosted RDF, it is neither an LDP-RS nor an LDP-NR. Then, my proposition is that the whole point with having an
I don't think so, because what happens then if you round-trip between the HTML+RDFa and Turtle?
Right. So we agree that it is a good thing if the
The intended state of an HTML+RDFa representation is what corresponds to an RDF graph. Round-tripping is a non-issue because information that's not marked up in RDFa is neither intended nor expected to be preserved. HTML+RDFa is an LDP-RS.
There is no need to specify
I think the original issue title reflects what's discussed and linked. I find the one you've changed to omits key information. Can we revert?
Really? I don't think many if any HTML+RDFa creators would agree with you. Certainly, I would find it a very large problem if my HTML+RDFa documents were to suddenly lose all HTML content and be reduced to RDF in any serialization.
No, it is not! Please comment on each of my points if you disagree!
I strongly disagree! This is very much a special issue, as it describes a feature that is in Solid and has been in Solid since the dawn of ages. You will then have to argue for the removal of a feature that people rely on and that the Director has voiced a clear opinion on.
Sure, but then, please be a little sensitive to the fact that some of this happened in a F2F discussion that you didn't attend. I have tried to explain it in clear terms, but you seem to simply dismiss the discussion without considering the merits of the arguments.
If the intention is to preserve content in the RDF graph universe of things, it needs to emit itself to get picked up. What wrote HTML+RDFa and what decisions did it make?
Dmitri's initial comment is the original issue covering a bunch of related stuff, hence the title!
When you say "No Content-Location or anything", that can be taken as a response to #69 (comment) :
being the preferred interaction from that list. [Perhaps that's another way of looking at "Please comment on each of my points if you disagree!" ;)]
Note how not requiring
The point of resolving issues like #119 is so that we understand the scope and methods in which resources make their way into a system. How did
If an implementation accepts
That's a misinterpretation of LDP-RS, LDP-NR and the RDF Source that it links to (RDF 1.1). RDF 1.1 is clear about RDFa:
I think you're mistakenly overloading the term "fully". LDP-RS uses that term to differentiate from LDP-NR. Just as LDP-NR uses "do not have useful RDF". The use of the term "fully" alone is inadequate to cover all the intricacies or even to create a new constraint with a whole set of ramifications without definitions. It is not LDP's place to do that because the simplest explanation is that it respects spec orthogonality. LDP is merely classifying the kind of documents for its interaction model - the intended semantics being "RDF-bearing" or not.
Proposing that (HTML+)RDFa is somehow incompatible with, falls between, or depends on conditions (?) for RS and NR is nonsensical and renders things useless for no practical benefit.
If it helps to be sure, use the LDP-RS interaction model when communicating representations with RDFa - something I've already suggested elsewhere. If we don't need LDP's interaction models in the end, nothing fundamentally changes, so there is nothing else to do here. Great. If
This is going to go down in history as an example of why some stuff should be agreed on F2F, as we now ended up generating more heat than illumination ;-) So, we're in "violent agreement" for the most part, I guess. There's just one thing I still feel like responding to here:
since it does have a practical consequence, that "fully" means it is round-tripable, i.e. you can choose any RDF serialization, and the semantics will be the same.
That aside, I think we have an agreement on the following:
Any misrepresentation in the above? If not, I think the main question is whether the HTML representation MUST have the container's triples as RDFa or if it should be a SHOULD.
I could not disagree more strongly with the above, which I do not believe can be found in any RDFa specification nor guidance.
You are approaching RDFa from the wrong side. HTML+RDFa is embellished HTML (hence, HTML plus RDFa), it is not embellished RDF (which would be RDFa plus HTML).
Regarding LDP-NR vs LDP-RS classification —
As a member of the LDP WG, I understood us to be saying that LDP-NR might include RDF content, but always includes non-RDF content which is meant to be preserved, so the document must be preserved as PUT or POSTed.
LDP-RS are 100% RDF, and might be stored by the back-end in their original form, or transformed into another RDF serialization, or loaded into a graph store and not preserved as a document per se — though always retrievable in either Turtle or JSON-LD serialization.
(I was a minority in thinking that Turtle — which may include out-of-band, non-RDF comments and statement order — should be considered LDP-NR, and thus should be preserved entirely.)
I'll try to comment just on the behalf of myself, and since I woke up early and couldn't sleep. :-)
First, @elf-pavlik ,
Yes, but that is, I think, an important feature, as a container has, per definition, an RDF representation, which will always be significant. Moreover, it is important for Databrowser, because it can easily embed itself in the HTML document at an appropriate place if the RDF is there, which I also think is desirable. However, I think we can allow more embeddable RDF formats; it doesn't need to be constrained to RDFa.
I'm all for being use case driven, but I am concerned that we are spending too much time on this pretty edge-case feature (this will be the 79th comment), so I would much prefer to find an urgent resolution to it. I think the use case is pretty clear, people have maintained HTML representations of containers since the dawn of ages, and we don't want to break that even when there is a client-side generated view.
I'm afraid you didn't capture the discussion very well, because I fundamentally disagree with the design ;-)
What I do agree upon is:
I also think this is inconsistent with the design that manipulations go through
Yes, but if you insist that
Yeah, but it needs to be stronger, since it just cannot update the server-managed triples, so it doesn't capture the nuances.
So, this proposal is not consistent with Trellis (which is in my 3.ii space), NSS (which doesn't manipulate through
No, once again, the original issue, including the title and the comment that Dmitri created, is about, and I quote: "index.html and/or index.ttl". If you're only interested in focusing on the index.html bit, that's fine, but the issue still needs to address index.ttl or at least see index.* as a specialisation of #109 for starters - there is a reason why I created that first, so we can revisit this (also mentioned that before). "index.*" was used as an alias for both of those indexes (and others, obviously). There are also numerous references to both (if not more) index formats elsewhere.
You've misunderstood. The base requirement is that interactions go through /. If however a server exposes the representation URLs (see issue 109), interactions can still go through /. That doesn't exclude index.* being their own resources... and whether a read or write can happen. Authz policy on the representations is still.. literally what's set for /. Issue 109 and issues involving ACL and representations is clear about being set on the primary resource (as opposed to the representation).
The whole point of "RDF bearing" was if client/server deems a resource to be so, the rules on containment applies. Heck, we can even go all the way back to this: https://github.com/solid/solid-spec/issues/202#issuecomment-512902223 . However server handles / with text/html, the rest follows. It would literally allow /'s text/html representation to be treated as RDF bearing or non-RDF bearing.
Clearly you are mistaken:
Obviously not all implementations are doing the same. Some of the informal criteria that I've mentioned are what NSS does and what Trellis either does or can do. Your 3. is a non-starter based on... guess what "we've already highlighted use cases where some implementations may want to expose representation URLs." So, you can't just ignore that and still try to force your preference.
I've suggested that I can clarify and expand. I've also suggested that you should take the comment as a whole and see how it connects with the agreements made elsewhere. That wasn't an arbitrary decision and I didn't mention all that for fun.
I'm reverting the issue title until there is consensus without obvious objections or at least minimal approval from the creator of this comment.
Clearly we are talking past each other. I am as frustrated (if not more) as anyone else. We can pick this issue up in a call or a F2F :)
Let me first apologize for the fast turn of events here. It was truly not intentional: This issue is interesting as it has brought out pretty much all the tensions between the different components and philosophies of Solid, the LDP, the UNIX file system, the roles of representations on the Web, etc. However, it is also an issue quite far on the edges, and we cannot keep it open just for the undeniable intellectual exercise it provides. With more than 80 comments, I think it has had an extensive hearing, and suddenly today, I had the rare opportunity to take it up with Tim in real life, and so I hope people aren't too annoyed if we can take that conversation as the guide. Moreover, you will quickly notice that my own favorite, the "3.ii." direction, was quickly dismissed. So, here we go:
Only when a read operation is executed on
The ACLs will be applied as follows: First, the container's ACL will be applied. If the client has read access to the container, an internal redirect is made. If the
Time ran out as we started to discuss what should be done if the client is authorized to read the container, but not the
The diff to @csarven's comment is that index.* is not considered. index.ttl has had a different mission in Solid and has had so for some time (it has certainly nothing to do with my preference, I have not even been aware of this before Tim told me about it, we've just basically had a collective misunderstanding around it). Most operations on
So, this wasn't my favorite resolution, but with the clarification on how the ACLs are applied, it resolves the initial issues that prompted many of the reports on this. It also eases some tensions on the method definitions, as
Housekeeping: The reference to "index.ttl" in this issue should be left as a representation. The data augmentation case raised by data browser (which happens to use index.ttl) will be addressed in #144 .
The use cases that are brought up are among the most common practices on the Web. Solid must be able to address them.
With the proposal that's brought up:
I'll respond to the questions I've raised in #69 (comment) :
The proposal suggests that the representations are at the very least expected to be equivalent based on RDF graph. The proposal also suggests that they will be tracked and updated independently.
The details indeed need to be worked out:
The discussion in #108 shows that server interference ie. injection of containment information into HTML is not particularly practical or mature. There is also no implementation experience. Moreover, when updating non-RDFa RDF bearing resources, server interference is not expected, that is the server will either allow the request or reject. Explained further below based on existing consensus.
The intended state of
helps to clarify client and server expectations. That is, client needs to ensure the integrity of the containment information in an HTML with RDF bearing representation when making changes and the server verifies the request. This is the same criteria for all RDF bearing representations.
As no details are provided on equivalence or particular information persistence beyond encoded RDF graph, it can be deemed to be compatible with #69 (comment) :
In #69 (comment) , I proposed a relaxed version of representation equivalence which can be determined by the agreement between a server and a client:
It meant that a representation in HTML may or may not be RDF bearing. If deemed to be RDF bearing, there are specific expectations. This is a far simpler design for both servers and clients. For instance, updating an RDF bearing representation does not entail that the non-RDF bearing representations need to be updated for some "equivalence", and vice-versa.
To summarise suggestions (preferably selecting one from this order):
No objection, however I think that should be inherited from the general case in #109 .
Do I understand you correctly that the order of this check is different from that for resources in general? That is, if a resource in a container has its own ACL, it will be applied; otherwise, the inheritance algorithm is applied. What I'm not clear about in the proposal is whether index.html's ACL, if it exists, will be applied instead of the container's ACL. Can you clarify that bit?
I don't particularly see why index.html's ACL needs to exist (as opposed to just inheriting container's). It opens up more complications than it actually helps. While a container's representations are resources in their own right, it doesn't mean that they must have their own ACL. Simply use container's (fixed reference). What's the actual use case to be different?
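As a rough illustration of the "simply use container's" suggestion, the sketch below maps a representation URL path to the resource whose ACL would govern it. It describes one possible reading of the comment, not agreed behaviour: the set of index names and the function name `acl_target` are assumptions made for the example.

```python
from posixpath import basename, dirname

# Assumption: these are the representation file names a server exposes.
INDEX_NAMES = {"index.html", "index.ttl"}


def acl_target(path: str) -> str:
    """Return the path of the resource whose ACL governs `path`.

    A container representation such as /public/index.html is governed
    by its container (/public/); any other resource is governed by
    itself (ACL inheritance is out of scope for this sketch).
    """
    if basename(path) in INDEX_NAMES:
        return dirname(path).rstrip("/") + "/"
    return path


targets = [
    acl_target("/public/index.html"),
    acl_target("/index.html"),
    acl_target("/public/notes.ttl"),
]
```

With a fixed reference like this there is no separate ACL document for index.html to create, read, or keep in sync.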
I just want to comment very briefly on this:
That's wrong. Trellis already supports it as indicated above. We also support it trivially with Perl:
use RDF::RDFa::Generator;
my $gen = RDF::RDFa::Generator->new;
$gen->inject_document($dom, $model);  # inject the model's triples into the DOM as RDFa
There is plenty of implementation experience, and it is very mature. I would be very surprised if it is not equally simple in JS, it is just about generating the RDF and adding it to the DOM tree, that's all there is to it.
Edit: Didn't notice your comment before sending mine, so here is a quick reply:
It is not an arbitrary injection. Obviously triples can always be thrown in somehow. Even possible with
In any case, I've noted below that this part of the update is an implementation detail.
Revisiting this.. there is more commonality in the approaches than they seem.
Appending a resource to a container:
It should include a triple like:
Following are equivalent:
Is this correct:
Effectively establishes container's HTML to be RDF bearing.
Influences required RDF serializations: #45 ie. adding RDFa (or script with RDF) to Turtle and JSON-LD.
When a new resource is appended to or removed from a container, all of its representations (including HTML) need to incorporate the changes to the containment.
Appending another resource:
How exactly a server includes the containment triples is an implementation detail. Same level of requirement for Turtle and JSON-LD.
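One way a server might do this, sketched under stated assumptions: the same list of members is rendered once as Turtle and once as an HTML fragment carrying RDFa. The function names and the markup shape (a `ul` with `about` and `rel` attributes) are illustrative choices, not something the thread prescribes, and the sketch assumes the URLs are already valid absolute IRIs.

```python
from html import escape
from typing import List

LDP = "http://www.w3.org/ns/ldp#"


def to_turtle(container: str, members: List[str]) -> str:
    """Serialize the containment triples as Turtle."""
    lines = [f"@prefix ldp: <{LDP}> ."]
    for member in sorted(members):
        lines.append(f"<{container}> ldp:contains <{member}> .")
    return "\n".join(lines)


def to_rdfa(container: str, members: List[str]) -> str:
    """Serialize the same triples as an HTML list annotated with RDFa."""
    items = "".join(
        f'<li><a rel="ldp:contains" href="{escape(m, quote=True)}">'
        f"{escape(m)}</a></li>"
        for m in sorted(members)
    )
    subject = escape(container, quote=True)
    return f'<ul prefix="ldp: {LDP}" about="{subject}">{items}</ul>'


container = "https://pod.example/public/"
members = ["https://pod.example/public/a.ttl"]
members.append("https://pod.example/public/b.ttl")  # appending another resource

turtle = to_turtle(container, members)
html_fragment = to_rdfa(container, members)
```

Both outputs are derived from the same member list, so a newly appended resource shows up in every representation the next time it is generated.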
This may be less of an issue when /index.html is updated directly (eg. PUT /index.html) because the container's integrity falls on the server-imposed constraint ie. listing containment triples.
Server should reject if update to /index.html changes containment triples (aligned with global rule on updating containers). That entails that an RDFa parser is required. Having an RDFa parser makes it possible to serialize to other RDF.
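A minimal sketch of that check, assuming the RDFa in the stored and in the submitted /index.html has already been parsed into (subject, predicate, object) tuples by some RDFa parser (not shown). The function names and URLs are hypothetical.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]

LDP_CONTAINS = "http://www.w3.org/ns/ldp#contains"


def containment(triples: Iterable[Triple], container: str) -> Set[Triple]:
    """Keep only the container's own ldp:contains triples."""
    return {
        (s, p, o)
        for (s, p, o) in triples
        if s == container and p == LDP_CONTAINS
    }


def update_allowed(current: Iterable[Triple],
                   proposed: Iterable[Triple],
                   container: str) -> bool:
    """Allow the update only if the containment triples are unchanged."""
    return containment(current, container) == containment(proposed, container)


root = "https://pod.example/"
current = [
    (root, LDP_CONTAINS, root + "notes.ttl"),
    (root, "http://purl.org/dc/terms/title", "My pod"),
]
# Changing the title is fine; dropping a containment triple is not.
retitled = [
    (root, LDP_CONTAINS, root + "notes.ttl"),
    (root, "http://purl.org/dc/terms/title", "Home"),
]
tampered = [
    (root, "http://purl.org/dc/terms/title", "Home"),
]

ok = update_allowed(current, retitled, root)
rejected = not update_allowed(current, tampered, root)
```

The same comparison would apply to a Turtle or JSON-LD update, since it operates on parsed triples rather than on the serialization.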
If we treat all representations of a container as equivalent (based on underlying RDF graph), then adding new resources to a container ultimately requires an update to container's description. So the representations need to include the containment triples.
The following criteria can work for updating resources but are not sufficient for appending or deleting a resource from a container:
So, we do need the following anyway:
Having said that, there is the question of whether the HTML representation of / may be exempt from this and so a SHOULD instead of MUST. Relaxing the requirement definitely simplifies server and client implementations and allows more flexibility on / (eg. as arbitrary homepage or directory listing). Keeping it strict means more consistency, but there is also a chance that the resulting HTML is not necessarily what the client would like to see/interpret.
TL;DR: Please explain benefit motivating the requirement of embedding RDF in HTML representation of a container.
I honestly don't understand benefit of requiring HTML representation of a container to include RDF (embedded via
There is much repetition at this point. I'm responding below but it'd be great if we can continue the recurring themes and ideas in public chat, calls, F2F etc.
Once again, if the representations are expected to encode equivalent RDF graphs, then being consistent is a reasonable design decision. The contrary is easy to raise: why should formats x and y be expected to be equivalent but not z, especially when the resource is expected to be RDF bearing with containment information to begin with? This case is not to be conflated with a resource that's deemed to be non-RDF bearing (like an image) and then also providing an RDF-bearing representation. That's not what we want, nor should it be practiced. The homepage case not only predates the root storage or container but is also more widely deployed. Hence, if the distinction between a homepage and root container is so important, then it only makes sense to leave the homepage or a directory index alone at / and simply use another URI for the root container. Certainly that's not a nice option to some people and it only complicates the situation. So, for the time being, we have to look into how to accommodate both cases - which are actually quite similar - using the same URI.
I am a user. That's not what I want. I do not want to switch between applications just to update different aspects of a resource, or worse, have it switch perspectives based on a particular representation - the classic "pure RDF" and "not so pure RDF". Nothing like that is in practice or can be considered a good design. I want to be able to use an application like dokieli to update / where it works as storage root (including containment information) as well as a human and machine-readable homepage including my WebID Profile. I consider my WebID Profile in HTML+RDFa to be canonical because that gives the most utility. I want to be able to authenticate using that WebID. If a server can't provide an RDF bearing representation of my WebID Profile or a client is unable to parse an RDF bearing representation (as in RDFa), I can't authenticate. Currently, only Turtle (and JSON-LD) are acknowledged by some servers and authentication clients. We need to close this gap so that people are not prevented from publishing as they wish while adhering to minimal global requirements.
clears throat Soooo, since we're not aligning here...
May I just throw out a totally breaking greenfield idea...?
Let's make containers server managed, but have each one link to any metadata and any other data it might point to, so that a client can make a reasonable representation of it.
Having containers that have protected data, but also data that can be changed has caused all kinds of problems. Compound state LGTM, but probably not something we can find consensus around.
Then, a more elaborate aux resource system is under consideration, and so, it seems straightforward that a client will pull in data from various sources after
Interacting directly with it, which was @timbl 's preference can be done trivially if
But so much would be simpler if we just made the container server managed and told clients the resources they might want to get for a given application.
Could we have an exception for
We could. We could also say that the container RDF needs to be brought along by injecting RDFa into the resulting representation. But is that exception really worth the bother given that UAs tend to slurp in a large number of resources anyway?
Nooooooooooooo (sorry I couldn't help it @kjetilk)
I do have to say that I'm strongly -1 on this approach at the moment, because it would have an immediate breaking impact on a lot of code (including mine), and specifications like shape trees and application interoperability. All of that said, if you provided some concrete examples of what you're proposing, specifically in cases where there is a dependency on and usage of data in the graph of the container resource, I'd be happy to look at ways to reconcile.