Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add support for groups in WebAclAuthorizer #900

Closed
wants to merge 5 commits into from

Conversation

iosonopersia
Copy link
Contributor

@iosonopersia iosonopersia commented Aug 11, 2021

Closes #239.

Hi everyone!
Together with @LudovicoGranata, we've implemented the ACL.agentGroup feature for the WebAclAuthorizer.ts module.
In our use case, we needed the possibility to define groups of users so that giving them permissions could become easier.

We are definitely aware that our implementation is still not 100% foolproof, as we decided (for the moment) not to tackle the "Infinite Request Loops in Group Listings" problem.

This implementation works for us, with some limitations:

  • vCard group files are not syntactically validated;
  • vCard group files MUST be publicly accessible: unauthenticated requests will be made to fetch them, avoiding at the root the "Infinite Request Loops" problem cited above. Future changes removing this limitation should also deal with finding a solution to that problem.

We hope that this contribution could be useful for the CSS project and we're open for suggestions on how to improve it!

@RubenVerborgh
Copy link
Member

Thanks @iosonopersia and @LudovicoGranata. This is a great contribution.
Will review in detail and get back to you.

@RubenVerborgh RubenVerborgh added the semver.minor Requires a minor version bump label Aug 11, 2021
Copy link
Member

@RubenVerborgh RubenVerborgh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work, and definitely well on the way to make it into the codebase.

Apart from the detailed comments, I also have an architectural comment.

You'll note from the code that it gets significantly longer, and that length will only increase if we also support acl:origin and acl:trustedApp at a later stage.

This is why I would suggest to move the group behavior to a new class, in such a way that WebAclAuthorizer then makes use of that class.

Concretely (and apart from the async changes, which I agree with), we should change hasAccess such that it asks an external component to do the group check. The code of this method then becomes (draft):

    // Check if authenticated access is allowed
    if (this.isAuthenticated(agent)) {
      // Check if any authenticated agent is allowed
      if (acl.countQuads(auth, ACL.agentClass, ACL.AuthenticatedAgent, null) !== 0) {
        return true;
      }
      // Check if this specific agent is allowed
      if (acl.countQuads(auth, ACL.agent, agent.webId, null) !== 0) {
        return true;
      }

      if (await this.accessChecker.handleSafe({ acl, auth, agent })) {
        return true;
      }
    }

where accessChecker is an AsyncHandler that is passed as an optional argument to the constructor.

Then you move implement the group code into a GroupAccessChecker.

At a later stage of the code, we can then replace all of that code block with

    // Check if authenticated access is allowed
    if (this.isAuthenticated(agent)) && await this.accessChecker.handleSafe({ acl, auth, agent })) {
        return true;
    }

and even the entire method by

private async hasAccess(agent: Credentials, auth: Term, acl: Store): Promise<boolean> {
  return this.accessChecker.handleSafe({ acl, auth, agent });
}

because the other checks can move into AccessChecker classes of their own. (@joachimvh: for semver.minor compatibility, we'd need to set the default access checkers in the constructor until the next semver.major).

With the above suggestion, we partition the code into smaller chunks that are easier to reuse and test, without increasing the complexity of the authorizer class (which is already quite long). As an added bonus, people can selectively enable or disable group support.

I hope this makes sense; I'm sure @joachimvh is willing to help with the refactor if you have doubts.


we decided (for the moment) not to tackle the "Infinite Request Loops in Group Listings" problem.

Let's put a (configurable) limit to how many documents we fetch to determine group access. That's also easy to add when the group checker is a separate class.


Thanks again for your contribution!

src/authorization/WebAclAuthorizer.ts Outdated Show resolved Hide resolved
src/authorization/WebAclAuthorizer.ts Outdated Show resolved Hide resolved
src/authorization/WebAclAuthorizer.ts Outdated Show resolved Hide resolved
src/authorization/WebAclAuthorizer.ts Outdated Show resolved Hide resolved
src/authorization/WebAclAuthorizer.ts Outdated Show resolved Hide resolved
src/util/Vocabularies.ts Outdated Show resolved Hide resolved
Copy link
Member

@joachimvh joachimvh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this, this looks quite clean already. My main issue is the cache for which I added an extensive comment.

@@ -56,9 +73,12 @@ export class WebAclAuthorizer extends Authorizer {
const modes = (Object.keys(permissions) as (keyof PermissionSet)[]).filter((key): boolean => permissions[key]);
this.logger.debug(`Checking if ${credentials.webId} has ${modes.join()} permissions for ${identifier.path}`);

// Reset the cache of vCard group files at the start of each interaction
this.groupsCache = {};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not use a class variable for this but instead pass this along where necessary, preferably in a way that not all functions need it as a parameter, but even if that can't be avoided it would still be better that way. If multiple connections happen at the same time they could overwrite this variable while another connection is still making use of it, which would delete the cache (while a server with that many connections would need it even more).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having gone through the code now, the issues of the above comment still stand, but I'm going to suggest a different solution. I would use a KeyValueStorage here. Specifically because such a storage can be wrapped in a WrappedExpiringStorage which allows temporary data. That would solve the comment about expiring data below. Config example: https://github.com/solid/community-server/blob/e1ed9c823e2943c2d71e9220f9dd086a0752bb9a/config/identity/handler/key-value/storage.json#L4-L9

Note that a WrappedExpiringStorage needs to be added to the list of Finalizers so it can be cleaned up once the server stops. The same config mentioned above also has this: https://github.com/solid/community-server/blob/e1ed9c823e2943c2d71e9220f9dd086a0752bb9a/config/identity/handler/key-value/storage.json#L10-L14

How the storages they are wrapped around are configured you can see in https://github.com/solid/community-server/tree/main/config/storage/key-value . Since you are storing Store objects though, this would have to be configured to always be a Memory storage (since we can't store JS objects in the ResourceStore), so its configuration does not have to happen in the folder above since it can't change. We could potentially change the cache to store the body of the fetch call instead of the Store, that way we could use the different storage, but that would be a more drastic change of this PR and not immediately required.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @joachimvh and @RubenVerborgh ,
I'm the co-author of this PR. We are trying to implement the cache for groupAccessChecker. Since the handle method of this class will be executed in parallel multiple times, we would need a locking mechanism for accessing the cache. Can you suggest us a way to do this? have you already implemented something like that in this project? Thank you very much

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concurring with Ruben below, I don't think locking is necessary here. In the end this is also data that doesn't change once it is written to the cache, the only modification that happens is that it gets deleted at some point which will simply return undefined on a get. In the case of the memory map storage, objects get added or removed to a simple Map object, and in the case of the resource storage it goes through all the resource stores which includes a locking mechanism, so both of these are safe.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @joachim. I've force-pushed this branch some minutes ago.
Please, have a look at the GroupAccessChecker.ts file (in particular, at the isMemberOfGroup method). That method is called inside a "promiseAny" call, which means that potentially many instances of it could be executed in parallel.

What happens, when we try to use a cache inside that method, is that every instance reads the cache almost at the same time (discovering that it does not contain the required object) and so each of them proceeds by fetching that resource.

We thought about some locking mechanism because we wanted to prevent multiple fetches of the same resource, but maybe there's a different way to handle the cache that avoids this problem altogether... We don't know 🤷‍♂️...

src/authorization/WebAclAuthorizer.ts Outdated Show resolved Hide resolved
src/authorization/WebAclAuthorizer.ts Outdated Show resolved Hide resolved
test/unit/authorization/WebAclAuthorizer.test.ts Outdated Show resolved Hide resolved
test/unit/authorization/WebAclAuthorizer.test.ts Outdated Show resolved Hide resolved
@joachimvh
Copy link
Member

@RubenVerborgh

Let's put a (configurable) limit to how many documents we fetch to determine group access. That's also easy to add when the group checker is a separate class.

I don't see how we can do this, since the group checker (or the authorizer for that matter) would not be aware if the call it is getting now is part of an ongoing chain or part of a new request.

@RubenVerborgh
Copy link
Member

Thanks @iosonopersia, good stuff.

@iosonopersia iosonopersia marked this pull request as draft August 13, 2021 12:09
@iosonopersia
Copy link
Contributor Author

I've fixed almost everything. I'm now too tired, I'm going to implement the cache for GroupAccessChecker tomorrow.
This PR is becoming bigger and bigger... 😎

Comment on lines 2 to 10
* A replacement for the Promise.any() function, which is only supported
* by NodeJs starting from version 15.0.0.
*
* @remarks
*
* Predicates provided as input must be implemented considering
* the following considerations:
* 1. if they throw an error, it won't be propagated;
* 2. throwing an error should be logically equivalent to return false.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is not an identical replacement for Promise.any. Which is fine (we need different functionality), but the text is confusing.

Given the we are implementing more like the equivalent of Array.some for promises, I wonder if we can

export function promiseSome(predicates) {
  return new Promise((resolve, reject) => {
    for (const predicate of predicates) {
      predicate.then(value => value && resolve(true));
    }
    Promise.all(predicates).then(values => resolve(false), reject);
  });
}

to also no lose errors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If promiseSome propagates errors, then it will appear another little problem: there are legitimate situations in which accessChecker.handleSafe() can throw a NotImplementedHttpError (i.e. when the given acl rule does not contain a triple with predicate acl:agentGroup). This particular case should be handled.

Anyway, I see that an overall refactoring of the WebAclAuthorizer could be made to avoid certain functions to be executed potentially many times. Just an example: createAuthorization calls 2 times determinePermissions, which call 4 times hasPermission. hasPermission calls hasAccess for every rule that refers to a particular access mode (these rules could potentially be A LOT). Then, hasAccess calls once the accessChecker, which in turn calls isMemberOfGroup for every group referenced by a particular rule (these groups could, again, be potentially A LOT).

Considering the worst case, some lines of code could be executed 2x4xNxM times. It feels like some of this complexity can be reduced, hopefully leading to code easier to follow and understand. I'll take a look and reason about it in the following days.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are legitimate situations in which accessChecker.handleSafe() can throw a NotImplementedHttpError

Right, then we probably want to .catch(x => false) them

@RubenVerborgh
Copy link
Member

RubenVerborgh commented Aug 14, 2021

@iosonopersia @LudovicoGranata thanks so much for the great work on this.
It's high-quality code that we can directly use. Really excellent.

I've asked @joachimvh to look at it, so don't feel obliged to put more work in,
we're totally happy to handle the rest ourselves.

Since the handle method of this class will be executed in parallel multiple times, we would need a locking mechanism for accessing the cache. Can you suggest us a way to do this?

I don't think we need a lock for that, actually. You could have a cache of promises, so from URL: string to Promise<…> and thenawait it every time you use it.

@iosonopersia
Copy link
Contributor Author

@joachimvh , @RubenVerborgh ,
I only ask you to wait a bit 🙏: I found a way to refactor the WebAclAuthorizer, which should simplify subsequent work and which could slightly improve the performances.

I need some time to fix a few problems that are still there, then we will most probably leave this PR to you so that you can implement a cache for GroupAccessChecker.

@RubenVerborgh
Copy link
Member

Super, ping us when done!

@iosonopersia
Copy link
Contributor Author

iosonopersia commented Aug 16, 2021

Done 🎉! @RubenVerborgh , @joachimvh , @LudovicoGranata

What remains to do:

  • Edit commit "feat: Add function promiseAny": changes proposed by @RubenVerborgh should be applied;
  • Add another commit named "feat: Add cache for GroupAccessChecker", where a cache should be properly implemented for the GroupAccessChecker class.

Edit: both tasks were solved by @joachimvh in #923

@iosonopersia iosonopersia marked this pull request as ready for review August 16, 2021 09:54
Copy link
Member

@RubenVerborgh RubenVerborgh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, LGTM!

Just want @joachimvh to check whether the algorithm rewrite is indeed what we want.

src/authorization/WebAclAuthorizer.ts Outdated Show resolved Hide resolved
src/authorization/WebAclAuthorizer.ts Outdated Show resolved Hide resolved
src/authorization/WebAclAuthorizer.ts Show resolved Hide resolved
src/authorization/accessCheckers/AgentAccessChecker.ts Outdated Show resolved Hide resolved
Comment on lines 47 to 50
/**
* If an error is thrown, we don't want this function to propagate it,
* because of the semantics of promiseAny.
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/**
* If an error is thrown, we don't want this function to propagate it,
* because of the semantics of promiseAny.
*/
/**
* If an error is thrown, we don't want this function to propagate it,
* because of the semantics of promiseAny.
*/

(This component does not know whether it will be used with promiseAny.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A warning/error should also be logged here.

Copy link
Member

@joachimvh joachimvh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't comment on the Promise.any implementation since there are upcoming changes for this. I will look into the Storage later when the code is mostly stable. The solution is going to probably be a storage that has Promises as values that resolve to the needed result so only 1 fetch happens if there are multiple requests to the same path, as Ruben suggested.

src/authorization/WebAclAuthorizer.ts Outdated Show resolved Hide resolved
src/authorization/WebAclAuthorizer.ts Outdated Show resolved Hide resolved
const rules = new Set(acl.getSubjects(null, null, null));

for (const rule of rules) {
const hasAccess = await promiseAny(this.accessCheckers!
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

accessCheckers was made optional to not make this a breaking change. But this line would cause an error if it was not defined so it is still breaking. This line is going to have to change anyway in case it gets changed to a single checker as mentioned above, but it should not crash if the access checker is undefined.

@RubenVerborgh Since the access check behaviour is moved away to a different class this is actually still a breaking change since after this the Authorizer is not going to support the normal behaviour it did before anymore unless that parameter is defined. So I'm still more a fan of not making this optional and accepting that this PR requires a major version change.

src/authorization/accessCheckers/AgentAccessChecker.ts Outdated Show resolved Hide resolved
src/authorization/accessCheckers/GroupAccessChecker.ts Outdated Show resolved Hide resolved
src/index.ts Outdated Show resolved Hide resolved
src/util/Vocabularies.ts Outdated Show resolved Hide resolved
Comment on lines 76 to 78
new AuthenticatedAccessChecker(),
new AgentAccessChecker(),
new GroupAccessChecker(converter),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be consistent with our unit test structure, all three of these classes need their own unit tests, and in this file there should only be a mock AccessChecker. Testing the access check behaviour is no longer the responsibility of this class.

But I can look into this after this PR is mostly finished if you think this would be too much hassle for you.

const userPermissions = this.determinePermissions(agent, acl);
private async createAuthorization(agent: Credentials, acl: Store): Promise<WebAclAuthorization> {
// Get public permissions
const publicPermissions = this.determinePublicPermissions(acl);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After having gone through the code I don't think splitting determinePermissions up into two different functions is the way to go. determinePublicPermissions now implements some of the access check behaviour while determineAgentPermissions has offloaded that behaviour to the access checkers.

My suggestion would be to revert back to a single determinePermissions function that always uses the access checkers. I'm guessing the main reason for this split is because you don't want any heavy fetch calls to happen for the public part, but if you check the presence of the webid in those handlers then this is not going to be an issue, and each checker supports all agents, not just those with an identified WebID.

We would also need a checker to check the public agent class. I suggest renaming AuthenticatedAccessChecker to AgentClassAcessChecker, always checking for FOAF.Agent and only checking for ACL.AuthenticatedAgent if the webid field is defined.

@iosonopersia
Copy link
Contributor Author

iosonopersia commented Aug 17, 2021

Hi @joachimvh ,
I tried to fix everything I could. Test will fail because of coverage: the two lines (41-42) that throw an error in BooleanHandler are never reached because of the current problematic implementation of promiseAny.

Parameter accessChecker of WebAclAuthorizer is now mandatory.

Could you please try to fix this PR? I'm starting to feel a bit lost... 😖

iosonopersia and others added 5 commits August 17, 2021 18:29
Co-Authored-By: Ludovico Granata <Ludogranata@gmail.com>
Co-Authored-By: Ludovico Granata <Ludogranata@gmail.com>
Co-Authored-By: Ludovico Granata <Ludogranata@gmail.com>
Co-Authored-By: Ludovico Granata <Ludogranata@gmail.com>
Co-Authored-By: Ludovico Granata <Ludogranata@gmail.com>
@RubenVerborgh
Copy link
Member

Great work!

the two lines (41-42) that throw an error in BooleanHandler are never reached

A unit test for this class should fix that.

Would promiseSome still be of interest? (#900 (comment))

@RubenVerborgh
Copy link
Member

@joachimvh Could you do a final pass? I can re-review then.

@RubenVerborgh RubenVerborgh self-requested a review August 17, 2021 17:03
@joachimvh
Copy link
Member

Will look into finishing this up soon once I'm done with the other PRs, thanks already for all the work you put into this @iosonopersia and @LudovicoGranata

@joachimvh
Copy link
Member

Finished in #923

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
semver.minor Requires a minor version bump
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Set groups for a specific access control rule
4 participants