How does the server determine whether a resource is an ACL resource? #31

Closed
RubenVerborgh opened this issue Aug 11, 2019 · 21 comments

@RubenVerborgh
Contributor

RubenVerborgh commented Aug 11, 2019

No description provided.

@pmcb55

pmcb55 commented Aug 11, 2019

This issue seems specifically aimed at addressing 'ACL-ness' of resources from a file-based resource store perspective (i.e. Ruben is specifically saying 'the filename ends in...', and not necessarily the IRI of the resource itself).

In my opinion, the 'correct' position on this is that the 'ACL-ness of a resource' is represented explicitly as a triple within that resource, e.g. a simple <resource> a wac:Acl. triple. For file-based backends, this has the drawback that we only know upon reading that resource whether the agent has read access to it. That is different from other resources.

But my initial thinking is that representing 'ACL-ness' shouldn't infect the architecture or our low-level interfaces (especially fundamental ones like ResourceIdentifier), but that file-based stores may need optimization workarounds (that should be isolated and unique to them, e.g. creating a class FileBasedResourceIdentifier extends ResourceIdentifier, or something like that...).

And indeed, the ACL-ness of a resource is really just one important piece of meta-data associated with a resource. So I think we need a flexible, extensible way to express meta-data about resources (without needing to always store that meta-data 'within' the resource itself). We already have that notion in Trellis (i.e. various forms of resource meta-data, e.g. user-provided meta-data (like the GPS coordinates for a JPG photo, or the camera aperture setting), server-provided meta-data (e.g. the region a request was processed in, or the host-name of the server that authorized a GET request), ACL meta-data, etc.).

Perhaps the best approach is to allow an arbitrary number of named graphs to be associated with any resource (which means those resources can only ever contain triples, and not quads!), and for file-based systems, those named graphs could indeed be stored as files with special extensions (like Ruben's suggested .acl for ACL meta-data, for instance).

@RubenVerborgh
Contributor Author

This comment is not intended as an opinion either way; I just want to emphasize the consequences of this.

For file-based back-ends, this has the drawback that we only know upon reading that resource whether the agent has read access to it.

This would imply that a server with a filesystem-based back-end:

  • would need to parse every RDF resource in order to determine its permissions (so just echoing files from disk is no longer possible)
  • cannot stream files to the client (since the very last triple might be the wac:Acl one)
  • would need to support all RDF syntaxes (if ACLs can be in any syntax)

The above is to me a strong suggestion that ACL-ness could/should be a server-specific (or backend-specific) decision, which should not be mandated by the ACL spec. If a server wants to assume that (only) resources with a .acl extension are ACL resources, it should be able to do so. Or any other way of doing that (including the presence of a triple).
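To make the trade-off concrete, here is a minimal sketch of two interchangeable detection strategies. The names `AclDetector`, `bySuffix`, and `byTypeTriple`, the simplified `Triple` shape, and the placeholder IRI used for `wac:Acl` are assumptions made for illustration; none of them come from an existing server or from the spec.

```typescript
// Simplified triple; a real server would use an RDF library's term types.
interface Triple { subject: string; predicate: string; object: string; }

const RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";
// Hypothetical class IRI standing in for the `wac:Acl` type mentioned above.
const WAC_ACL = "http://example.org/wac#Acl";

// A strategy decides ACL-ness from the identifier and, only if it needs them,
// from the parsed triples of the document.
type AclDetector = (resource: string, loadTriples: () => Triple[]) => boolean;

// Suffix strategy: answers from the identifier alone and never reads the content.
const bySuffix = (suffix: string): AclDetector =>
  (resource) => resource.endsWith(suffix);

// Type-triple strategy: has to parse and scan the whole document,
// because the marker triple may be the very last one.
const byTypeTriple: AclDetector = (resource, loadTriples) =>
  loadTriples().some(t =>
    t.subject === resource && t.predicate === RDF_TYPE && t.object === WAC_ACL);
```

With the suffix strategy the content loader is never called, which is what keeps streaming from disk possible; the type-triple strategy always calls it.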

However, even the latter is not optimal, because then storage components would need to be aware of ACLs, whereas they can currently be agnostic. Also, this would introduce an extra call to the storage component for every single request.


And indeed, the ACL-ness of a resource is really just one important piece of meta-data associated with a resource. So I think we need a flexible, extensible way to express meta-data about resources

Fully agree.

And that solves the interface problem mentioned above. Instead of ACL-awareness, the store then just needs metadata awareness. And how the metadata is stored could be specific to the store itself.
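A metadata-aware store along these lines could be sketched as follows. The interface name `MetadataAwareStore`, its two methods, and the graph name `"acl"` are hypothetical; the point is only that the store interface talks about named metadata graphs and never about ACLs specifically.

```typescript
type MetaTriple = { subject: string; predicate: string; object: string };

// The store only knows about named metadata graphs; "acl" is just one graph name.
interface MetadataAwareStore {
  // Returns the triples of one named metadata graph, or undefined if absent.
  getMetadata(resource: string, graph: string): MetaTriple[] | undefined;
  setMetadata(resource: string, graph: string, triples: MetaTriple[]): void;
}

// In-memory implementation. A file-based store could instead map the graph
// name to a sibling file such as `<resource>.acl`; that choice stays internal.
class InMemoryMetadataStore implements MetadataAwareStore {
  private graphs = new Map<string, MetaTriple[]>();

  private key(resource: string, graph: string): string {
    return JSON.stringify([resource, graph]);
  }

  getMetadata(resource: string, graph: string): MetaTriple[] | undefined {
    return this.graphs.get(this.key(resource, graph));
  }

  setMetadata(resource: string, graph: string, triples: MetaTriple[]): void {
    this.graphs.set(this.key(resource, graph), triples);
  }
}
```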


The conclusion for me currently is that determining ACL-ness is specific to the implementation and should not be in the spec.

@acoburn
Member

acoburn commented Aug 11, 2019

A related question is whether the lifecycle of an ACL resource is tied to the lifecycle of the resource it controls. There are arguments on both sides of this issue, neither of which is mandated by any existing specification. And the assumptions about ACL lifecycle have implications for how this question is answered.

@justinwb
Member

Another factor to keep in mind is data portability. When I take my data with me, I should also be able to take the permissions structure (and to @pmcb55's point - other related metadata) along with it.

@kjetilk
Member

kjetilk commented Aug 12, 2019

It seems to me that the key problem now is that the link encoded in the header just points to an arbitrary resource, i.e. <.acl>, which may or may not exist. Instead, it should run the inheritance algo, so that each resource points exactly to the relevant ACL resource using the Link header, and then we could rely on that.

This would tie the resource to the ACL, but not necessarily the other way around, which is indeed a question we also have to clarify.

@michielbdejong
Contributor

michielbdejong commented Aug 12, 2019

For the server reading ACLs from the pod data

At the spec level, the ACL doc that applies to a resource is defined by the link header, as @kjetilk correctly stated, and @kjetilk made a good point that maybe it should link to an existing ACL doc, but currently it's https://github.com/solid/web-access-control-spec#acl-inheritance-algorithm. If the ACL manager does not know what the ACL URL scheme of the data is, then it needs to do a lot of HEAD requests. For instance:

HEAD /foo/bar.txt // look for link header -> /.acl-docs/foo/bar.txt.ttl
HEAD /.acl-docs/foo/bar.txt.ttl // test for existence -> no
HEAD /foo/ // look for link header -> /.acl-docs/foo/index.ttl
HEAD /.acl-docs/foo/index.ttl // test for existence -> no
HEAD / // look for link header -> /.acl-docs/index.ttl
HEAD /.acl-docs/index.ttl // test for existence -> yes
GET /.acl-docs/index.ttl // get the ACL doc for /, that applies with `acl:default` for /foo/bar

If it does know that, then it can skip half of those:

HEAD /.acl-docs/foo/bar.txt.ttl // test for existence -> no
HEAD /.acl-docs/foo/index.ttl // test for existence -> no
HEAD /.acl-docs/index.ttl // test for existence -> yes
GET /.acl-docs/index.ttl // get the ACL doc for /, that applies with `acl:default` for /foo/bar

If the ACL then contains acl:accessTo triples for the resource, then those are what you look at. If it has acl:default triples for an ancestor (in the ldp:contains LDP-BC sense) of the resource, then you look at acl:default. If neither is present then access is denied.
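The lookup for the case where the URL scheme is known can be sketched like this. The `/.acl-docs/` layout is the one from the example requests above; the function names, the `Authorization` shape, and the `exists` callback (standing in for a HEAD request) are assumptions for illustration only.

```typescript
// One authorization rule; `defaultFor` stands in for the acl:default predicate.
interface Authorization { accessTo?: string; defaultFor?: string; modes: string[]; }

// ACL URL scheme from the example above: /foo/bar.txt -> /.acl-docs/foo/bar.txt.ttl
function aclPathFor(path: string): string {
  return path.endsWith("/") ? `/.acl-docs${path}index.ttl` : `/.acl-docs${path}.ttl`;
}

// Parent container of a path, or undefined for the root.
function parentOf(path: string): string | undefined {
  if (path === "/") return undefined;
  const trimmed = path.endsWith("/") ? path.slice(0, -1) : path;
  return trimmed.slice(0, trimmed.lastIndexOf("/") + 1);
}

// Walks from the resource up to the root and returns the first ACL doc that
// exists, together with the number of existence checks that were needed.
function findEffectiveAcl(
  path: string,
  exists: (aclPath: string) => boolean,
): { aclPath: string; checks: number } | undefined {
  let checks = 0;
  for (let current: string | undefined = path; current !== undefined; current = parentOf(current)) {
    const aclPath = aclPathFor(current);
    checks++;
    if (exists(aclPath)) return { aclPath, checks };
  }
  return undefined;
}

// Modes granted for `resource` by the rules of the ACL doc that was found:
// acl:accessTo rules win; otherwise acl:default rules of an ancestor apply;
// otherwise nothing is granted (access denied).
function grantedModes(resource: string, rules: Authorization[]): string[] {
  const direct = rules.filter(r => r.accessTo === resource);
  const chosen = direct.length > 0
    ? direct
    : rules.filter(r => r.defaultFor !== undefined
        && r.defaultFor.endsWith("/")
        && resource !== r.defaultFor
        && resource.startsWith(r.defaultFor));
  const modes: string[] = [];
  for (const rule of chosen) {
    for (const mode of rule.modes) {
      if (modes.indexOf(mode) === -1) modes.push(mode);
    }
  }
  return modes;
}
```

For `/foo/bar.txt` with only the root ACL doc present, this performs the same three existence checks as the shorter request list above before settling on `/.acl-docs/index.ttl`.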

Impact on 'move your pod'

Each server can invent their own schema for it. Note that this means that if you're unlucky, this may lead to clashes during data migration from one server implementation to another. So you would need a migration script, then.

For the server determining whether to require acl:Control

It's basically impossible for the ACL manager to know whether /.acl-docs/index.ttl is an ACL doc or not without knowing about the scheme used by the data. It would theoretically need to test each resource on the entire pod with HEAD requests, to find out if any of those resources maybe point to /.acl-docs/index.ttl as their ACL doc. So that's why, when building a Solid pod server, you need to make sure that the code that reads ACLs from the pod data and the code that controls whether to require acl:Control have a common 'AclUriScheme' dependency on some code that maybe exposes two methods:

  • AclUriScheme.getAclUriForResourceUri (url => url), and
  • AclUriScheme.isAclUri (url => boolean)
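A minimal sketch of such a shared dependency follows. The two method names are the ones suggested above; the suffix-based implementation, the `requiredMode` helper, and its simplified mapping from HTTP methods to modes are assumptions for illustration, not how any particular server does it.

```typescript
// The shared dependency with the two methods suggested above.
interface AclUriScheme {
  getAclUriForResourceUri(url: string): string;
  isAclUri(url: string): boolean;
}

// One possible scheme: the ACL doc for <url> lives at <url> plus a fixed suffix.
class SuffixAclUriScheme implements AclUriScheme {
  private readonly suffix: string;

  constructor(suffix: string = ".acl") {
    this.suffix = suffix;
  }

  getAclUriForResourceUri(url: string): string {
    return url + this.suffix;
  }

  isAclUri(url: string): boolean {
    return url.endsWith(this.suffix);
  }
}

// The permission check uses the same scheme object as the ACL reader, so the
// two can never disagree about what counts as an ACL doc.
// Simplification: every non-read method is treated as requiring Write.
function requiredMode(scheme: AclUriScheme, url: string, method: string): string {
  if (scheme.isAclUri(url)) return "Control";
  return method === "GET" || method === "HEAD" ? "Read" : "Write";
}
```

Swapping in a different scheme (for example the `/.acl-docs/` layout) would then only require another implementation of the same interface.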

@michielbdejong
Contributor

@acoburn also asked an important question about life cycle. There is, in the current spec, no link between the life cycle of a resource and the life cycle of its ACL doc. This implies that an ACL doc could already exist for a resource you have yet to create, and that a 404 response also needs to give a link header pointing to the ACL doc that would start applying when the resource is created.

@RubenVerborgh
Contributor Author

currently it's solid/web-access-control-spec#acl-inheritance-algorithm.

Good that you bring this up, because this algorithm does put constraints on the form of ACL IRIs and their relations.

Interestingly, there is an explicit mention of:

It is considered an anti-pattern for a client to perform those steps, however. A client may discover and load a document's individual resource (for example, for the purposes of editing its permissions). If such a resource does not exist, a client SHOULD NOT search for, or interact with, the inherited ACLs from an upstream container.

So the pattern seems restricted to servers. Which makes me wonder whether it belongs in the spec at all, given that the specs govern the interaction between clients and servers, but not their internals.

Another take on this is that the pattern is specific to a Solid filesystem, which also might need a spec then for interoperability reasons (to @justinwb's point).

@michielbdejong
Contributor

the pattern seems restricted to servers. Which makes me wonder whether it belongs in the spec at all

Good point, it should not be used by clients. But it should be in the spec because it needs to be used by any ACL editing app, so that the user knows what effect their edits will have.

@kjetilk
Member

kjetilk commented Aug 12, 2019

Indeed, I think it should be restricted to servers, but that it should run when a resource is requested. That way, you will never do searching like the ones detailed by @michielbdejong .

If you do

GET /foo/bar/baz.ttl

it will respond with the exact ACL resource it relates to as per the inheritance algo, e.g.

Link: </foo/.acl>; rel="acl"

It is a good point that ACL editing apps need to understand it; I didn't think about that. I think we could use a brainstorm around that problem. It is quite likely that ACL resources will evolve to be pretty large beasts, so the server should be able to exert some control over how they are formulated, and it is the server's task to simplify them if it can.

Say that different editing apps put a read rule for the same group in both /foo/baz/ and /foo/bar/, so that /foo/baz/.acl and /foo/bar/.acl end up being identical. Then the server could find that commonality and join them in a rule in /foo/.acl. I could envision this becoming a major optimization discipline in Solid, so I think it makes sense to leave the responsibility of placing the ACL resources to the server. Perhaps we could have a protocol where the editor asks the server what the URL of the ACL resource needs to be?

Anyway, I'm diverging from the original topic here, sorry about that.

@acoburn
Member

acoburn commented Aug 12, 2019

If the response headers link to the effective ACL, there needs to be a very clear distinction between that and the (possibly-non-existent) ACL for the current resource.

From the perspective of a client, there may be two different questions being asked:

  1. what is the ACL governing the current resource?
  2. if I want to add or adjust an ACL specific to this resource, where do I go to do that?

If a client simply follows the link header, treating the URL as opaque (which it arguably should be doing), how does that client know that the target ACL resource will affect only the desired ACL?

Imagine the case where there is a single ACL resource at the Pod root. A client wants to make a particular sub-container world-readable. If the link header displays the effective ACL (the root ACL), and the client adds an Authorization rule such that an AuthorizedAgent can read the given resource, that client may inadvertently give read access to the entire Pod. This is, IMO, the best argument for providing only a link header to the resource-scoped ACL, whether or not it exists (and not the "effective ACL").

@kjetilk
Member

kjetilk commented Aug 12, 2019

If the response headers link to the effective ACL, there needs to be a very clear distinction between that and the (possibly-non-existent) ACL for the current resource.

There will always be an effective ACL, since the root ACL is required to exist.

From the perspective of a client, there may be two different questions being asked:

1. what is the ACL governing the current resource?

Yes, and that will always be in the Link header.

2. if I want to add or adjust an ACL specific to _this_ resource, where do I go to do that?

AFAICS, it could add that to the ACL governing the current resource, it would just have to be specific about that in the acl:accessTo predicate.

If a client simply follows the link header, treating the URL as opaque (which it arguably should be doing), how does that client know that the target ACL resource will affect only the desired ACL?

It would know that there are no closer ACLs in existence, and therefore, it can add a rule to the current ACL resource. For example, it can well add a rule into /foo/.acl saying:

<#auth1> a acl:Authorization ;
  acl:accessTo </foo/bar/baz.ttl> 
[...]

The problem that could/will occur is that while the client is doing this, a different client adds /foo/bar/.acl, thus disabling /foo/.acl for /foo/bar/baz.ttl.

Imagine the case where there is a single ACL resource at the Pod root. A client wants to make a particular sub-container world-readable. If the link header displays the effective ACL (the root ACL), and the client adds an Authorization rule such that an AuthorizedAgent can read the given resource, that client may inadvertently give read access to the entire Pod. This is, IMO, the best argument for providing only a link header to the resource-scoped ACL, whether or not it exists (and not the "effective ACL").

That's not how I see it. I think it is overly primitive to rely on clients making the right decision; we can have that in the first iteration, but I don't think it will scale. ACL editors will need to be specific about what they grant access to, in their acl:accessTo and acl:default. So, obviously, the server does check the Control mode, which is one of the lines of defense against that situation, but granted, it is not a complete defense.

What I envision is that servers will rewrite the ACLs to respond to situations like the above, where concurrent clients would override each other, and to optimize ACL lookups, and this could provide another line of defense against the situation you outline. Perhaps we should indeed only let clients modify the most specific .acl and let the server decide where to actually store that? It might confuse people to see that their recently submitted .acl "file" doesn't show up though... Not an ideal solution, for sure, but I do feel quite strongly that client-side management of ACLs isn't going to scale.

@timbl
Contributor

timbl commented Aug 12, 2019

How a Solid server stores ACL resources and normal resources is completely up to the server.

For one which is file system backed and NSS compatible, it is specified in "NSS compatible server" spec which includes the file mapping, and which we should publish separately. Data portability is very important, and applies to the NSS-compatible-file-based-solid-server spec, not the solid spec itself.

From the perspective of a client, there may be two different questions being asked:

  1. what is the ACL governing the current resource?
    Yes, and that will always be in the Link header.

No. The link header always contains the ACL for the resource itself, not the one up the tree whose defaults apply.

To find the one up the tree to which defaults apply, we could add another header.

How do you find the ACL which is applying the default, or one where one could add a default? The clients find the ACLs of parent containers by navigating to the parent container by using the '/' in the URI. In solid, slashes have semantics shared by client and server. The bit in the spec which discourages that is wrong and needs to be changed.

This is WRONG: "It is considered an anti-pattern for a client to perform those steps, however. A client may discover and load a document's individual resource (for example, for the purposes of editing its permissions). If such a resource does not exist, a client SHOULD NOT search for, or interact with, the inherited ACLs from an upstream container"

The client navigates the folder hierarchy all the time, and understands ACL defaulting.
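Client-side navigation of the hierarchy through slashes can be sketched as below. The function names and the example URL are hypothetical, and the sketch assumes an http(s) URL without query string or fragment.

```typescript
// Parent container of a resource URL, or undefined at the pod root
// (or when the input is not an http(s) URL with a path).
function parentContainer(url: string): string | undefined {
  const match = /^(https?:\/\/[^\/]+)(\/.*)$/.exec(url);
  if (match === null) return undefined;
  const origin = match[1];
  const path = match[2];
  if (path === "/") return undefined;
  const trimmed = path.endsWith("/") ? path.slice(0, -1) : path;
  return origin + trimmed.slice(0, trimmed.lastIndexOf("/") + 1);
}

// All ancestor containers, nearest first. These are the containers whose
// ACLs a client would inspect to find where a default comes from.
function ancestorContainers(url: string): string[] {
  const ancestors: string[] = [];
  for (let c = parentContainer(url); c !== undefined; c = parentContainer(c)) {
    ancestors.push(c);
  }
  return ancestors;
}
```

For `https://alice.example/foo/bar/baz.ttl` this yields `/foo/bar/`, `/foo/`, and `/` on the same origin, in that order; the client would then read each container's own `rel="acl"` link to locate its ACL.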

Remember that Solid is a system in which the minimum smarts are in the server, and the growth of smarts is in the clients. Client apps will set up folder structures for users and set the ACLs so that the right things happen. They will rely on the default system working as it does.

@RubenVerborgh
Contributor Author

RubenVerborgh commented Aug 12, 2019

@timbl Just for clarity: you 👎'd the original issue; but it was not a proposal, rather an open question. Does it mean that .acl (and ,acl?) is indeed the way to determine whether a document is an ACL resource?

For one which is file system backed and NSS compatible, it is specified in "NSS compatible server" spec which includes the file mapping, and which we should publish separately.

Created #34 to track that.

In solid, slashes have semantics shared by client and server.

Do we currently have a description of that semantics? It's currently not present in the v0.8 draft spec, but we can add it in this repository. Created #35 to track.

The bit in the spec which discourages that is wrong and needs to be changed.

Reported in solid/solid-spec#211.
There are still quite a few mistakes in v0.8 though, hence issues like this one to check (and hopefully these documents can take over soon).

The client navigates the folder hierarchy all the time, and understands ACL defaulting.

I myself would have preferred links to parent ACLs here, but you likely feel more strongly about this than I do.

@pmcb55

pmcb55 commented Aug 14, 2019

The client navigates the folder hierarchy all the time, and understands ACL defaulting.

I myself would have preferred links to parent ACLs here, but you likely feel more strongly about this than I do.

I too would prefer explicit links to parent ACLs here (but perhaps we can do both, e.g. make them SHOULD's). My reason is pretty simple though: being explicit where you can is nearly always better than being implicit.

@csarven
Member

csarven commented Sep 24, 2019

A foo-compatible-file-based-solid-server spec can reuse the notion of .. (parent directory) like in Unix with a child-to-parent property: #35 (comment)

@csarven csarven added this to the December 19th milestone Oct 4, 2019
@Mitzi-Laszlo Mitzi-Laszlo added this to To Do in Specification Oct 11, 2019
@RubenVerborgh RubenVerborgh assigned timbl and unassigned kjetilk Oct 29, 2019
@RubenVerborgh RubenVerborgh moved this from To Do to Under discussion in Specification Oct 29, 2019
@RubenVerborgh
Contributor Author

RubenVerborgh commented Oct 31, 2019

Proposal following F2F meeting of 2019-10-30 with @csarven @timbl @kjetilk @RubenVerborgh:

  • A resource X is an ACL document for resource R if a response for R contains a Link header with rel="acl" to X.
  • From the perspective of a client, the extension does not play a role in determining whether a document is an ACL resource.
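From a client's point of view, the first bullet could be checked roughly as follows. The helper names are hypothetical, and the Link header parser is deliberately simplified: it assumes no commas inside `<...>` targets or quoted parameter values.

```typescript
// Extracts the targets of all links carrying the given relation type
// from a Link header value. Simplified parser, see the assumptions above.
function linkTargets(linkHeader: string, rel: string): string[] {
  const targets: string[] = [];
  for (const part of linkHeader.split(",")) {
    const match = /^\s*<([^>]*)>(.*)$/.exec(part);
    if (match === null) continue;
    const relMatch = /;\s*rel\s*=\s*"?([^";]*)"?/i.exec(match[2]);
    if (relMatch === null) continue;
    // A rel value may list several space-separated relation types.
    const rels = relMatch[1].trim().split(/\s+/);
    if (rels.indexOf(rel) !== -1) targets.push(match[1]);
  }
  return targets;
}

// Per the proposal: X is an ACL document for R if the response for R carries
// a Link header with rel="acl" pointing to X. Targets are resolved against R;
// the file extension of X plays no role.
function isAclDocumentFor(x: string, r: string, linkHeaderOfR: string): boolean {
  const wanted = new URL(x).href;
  return linkTargets(linkHeaderOfR, "acl")
    .some(target => new URL(target, r).href === wanted);
}
```

Note that this only tells a client that X is the ACL document for one particular R; it does not answer the open question of what Link header X itself should carry.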

Open question:

  • What Link header should an ACL document have? None, or point to itself, or…?

@RubenVerborgh RubenVerborgh moved this from Under discussion to Rough consensus in Specification Oct 31, 2019
@kjetilk
Member

kjetilk commented Oct 31, 2019

Mmmm, but that doesn't capture an ACL document relevant to resource R that has been found through inheritance, does it?

@RubenVerborgh
Contributor Author

@kjetilk #106 does now 😉

But the above statement means that the ACL-ness of a document follows from its usage as a target in a rel="acl" relationship, not from any other means.

@csarven
Member

csarven commented Nov 14, 2019

What Link header should an ACL document have? None, or point to itself, or…?

I think an assigned interaction model for an acl (and meta) resource will help address a number of interrelated issues. Proposed here: #105 (comment)

@Mitzi-Laszlo Mitzi-Laszlo modified the milestones: December 19th, February 19th Jan 14, 2020
@csarven csarven modified the milestones: February 19th, ~First Public Working Draft Jan 24, 2020
@csarven csarven moved this from Rough consensus to Drafting in Specification Mar 25, 2020
@csarven
Member

csarven commented Jun 22, 2020

Rough consensus in PR commit: dd137e3

Follow up on additional link relation for the effective/inherited ACL: #106 .
