Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

user-initiated sharing #4594

Merged
merged 67 commits into from Feb 7, 2024
Merged

user-initiated sharing #4594

merged 67 commits into from Feb 7, 2024

Conversation

minrk
Copy link
Member

@minrk minrk commented Oct 3, 2023

Implements the possibility of users granting each other permission to access their servers.

Summary of the design for review:

  • sharing is disabled by default
  • sharing can be enabled by granting the shares permission to all or a subset of users
  • there is no sharing UI, only an API
  • sharing can be explicit via user or group name (requires read:users:name or read:groups:name permission in order to grant specific access)
  • sharing can be granted via "share code", which only requires the shares permission, no other permissions
  • share codes must expire, and expire after 24 hours by default
  • share codes are exchanged for updating the Share for the given (user, server) combination
  • users can review/remove themselves from shares with the users:shares scope (part of default user scopes)
  • A user's permission on a server is always represented by exactly one Share row in the database. Accepting more shares updates this row, which makes expiration potentially confusing, so Shares do not currently have a way to expire.

Questions to consider:

  • default permissions (off by default, seems clear)
  • separate permissions for codes/explicit shares? It's currently not possible to enable sharing by name without also enabling sharing via code
  • limiting sharing? There is no mechanism by which to limit who a given user can share with (via code, at least). A limitation on read:users:name limits who they can share with by name, but there is not a mechanism to limit sharing by name if the user has other reasons to have read:users:name
  • there is no consent confirmation step for sharing by name. That means if you have scopes shares,read:users:name!user=xyz, then you can grant xyz access to your server without their participation at all. This is a major avenue for abuse and spam in Google Docs / Calendar, but I'm not sure that applies to JupyterHub? There's nothing you can do with a shared server once you grant the permission until the shared-with user visits the server and proceeds through the oauth confirmation steps.
  • Expiring Shares themselves. This is complicated by the fact that there is at most one Share for a given (user, server). What should happen if a user has expiring permissions and is then granted different permissions with different expiry? The current choices are: ignore updated expiration (surprising) or override old expiration (also surprising). Current solution is to avoid the situation by not expiring Shares at all, which matches ~all sharing I've ever seen (e.g. GitHub, Google Docs, slack, etc.) where access must be revoked explicitly once granted.

Scope limitations:

  • Shares are per-server, so you cannot use sharing to grant access to all of a user's named servers via sharing
  • Shares do not expire
  • Shares cannot grant access to anything except for

Original PR description below

Absolutely not close to working or ready for review (I'll be rewriting history a bunch), but opening as a placeholder since folks are asking about it.

The closest to ready is the new doc explaining how sharing is meant to work.

The idea is that a 'share' is a granting of permission (usually access) associated with a specific server to

shares are essentially mini roles, which always grant permissions restricted to a single server (usually just access:servers!server, but potentially including custom scopes, e.g. for read-only access). See #4349 for my thoughts on why I think this is probably going to work better than the more powerful but much more complex full user-initiated roles in #3858.

A Share

  • grants permission(s) for ONE server to ONE user OR group
  • record who created them
  • can expire

And users should be able to:

  • grant access to their servers (create Shares)
  • revoke access to their servers (delete Shares)
  • see who has access to their servers (list Shares by server / owner)
  • see what's been shared with them (list Shares by assigned user/group)

Managing sharing permissions is tied only to the server being shared, there is no limit placed on who it can be shared with. This also means an admin can grant access and the user can revoke that same access.

Questions:

  • should users be able to delete "shared with me?"
  • do we need a single 'list all the shares for everyone' endpoint? I started to implement this, but the permission filter is complicated (we have a similar implementation in listing users and groups, so it's doable, but nice to avoid
  • how do we address leaking user and group names via sharing (i.e. if i try to share with xyz, I can know whether there is a user with that name. Is that a problem?)?
  • should users have permission to share access to their servers by default, or should it be opt-in for admins (i.e. in the default user role?

I'd like to start as minimal as possible while still being actually useful.

As with other things, I plan to implement this as 'api-only' first, and UI can be its own task. Sharing UI is hard, and I'd like to try it in an extension first.

closes #4349

@minrk minrk added the new new features label Oct 3, 2023
@minrk
Copy link
Member Author

minrk commented Oct 17, 2023

Adding some notes from today's meeting (thanks everyone!)

  • invite links let us avoid needing username discovery in most cases
  • invite links should expire
  • sharing permission should be opt-in, not opt-out
  • sharing with users/groups should be limited by existing discovery permissions (e.g. read:users:name) to mitigate leaking and Google Docs / Calendar-style spam and abuse.
  • limiting who can be shared with may end up required by some deployments, but wait for someone to require this before attempting to implement it

@minrk
Copy link
Member Author

minrk commented Nov 13, 2023

I've got some time to put into this again before getting buried in the end of the semester, and I'm wondering if the created_by field is useful. Right now, I track who created a sharing permission, and I was thinking of reviewing permissions that you've granted, plus. But this isn't really something I see elsewhere, and as soon as modifying permissions starts to enter into it, it becomes super complicated.

Removing created_by from the db and leaving it to logging for audit purposes simplifies things, and then the "permissions granted by" question becomes one for logs, not for the db itself.

Another question is about modifying permissions. Normally, permissions are a single dict per user (i.e. a user has X permissions) and don't expire. But as written, a Server can be shared with a User more than once with different permissions and different expiration. This enables the possibility of e.g. "alice has read-only permissions to bob's server, but bob can grant alice elevated permissions for one hour, which will then revert to before."

Again, this is a weird situation, and I don't really like being in the business of weird situations, permissions-wise. So I see two main options in terms of permissions modification:

  1. modify a single record of permissions for a single user on a single server, which makes expiration weird
  2. allow multiple permission grants per user, which makes modifications like revoking permissions odd, but expiration of bundles of permissions sensible

So it seems to me that permissions should be a single record per (user, server) combination, and 'expiration' is purely a single number for expiration of all access to that server. The one weird case I can see for that is if a user already has permanent access to a server with permission e.g. access:servers!server and then accepts temporary elevated permissions admin:servers!server with an expiration date, what should happen? I see 3 choices:

  1. elevated permissions are granted, but permanently (exceeds request)
  2. elevated permissions are granted and all permissions expire (after share, permissions are less than before)
  3. error - if you already have permissions granted, you can't accept an invitation to modify your permissions, permissions for (user, server) must be modified explicitly via the API

Expiration is really the only thing that complicates this, because otherwise they are just additions/subtractions from a list. I'm thinking 3 makes the most sense, because there's no guessing involved. It's also not common for permissions to expire.

Finally, one last option is a bit in-between, but more complex to implement, which is to remove the concept of expiration from user permissions entirely, and only apply them to invitation codes. I.e. I can use the InviteCode approach to share permissions, and those permissions can expire, and they are tied to the expiration of the invite code. But only permissions granted that way. Right now, I have invitations as a temporary mechanism that exchanges a code for a real change in permissions, but once exchanged, there is no relationship between the code and the permissions granted (i.e. it's exactly the same as sharing with a specific user, you just didn't need to know the username beforehand). Changing this, such that permissions-from-invitations are more substantially different from permissions-granted-directly, would I think make expiration sensible (it makes a bit of sense to be 'invited' a few times), such that when a code expires or is revoked, all permissions granted by that code are also revoked.

I really want to make this first version as simple as possible, but combining invite codes and permission expiration is making this hairier than I'd like. I think it should be on the table to drop expiration of permissions entirely. It's quite a rare feature, and much more common for tokens than users.

Yet another possible simplification is to make the invite code singular per server, again matching Google Docs - where "people with the link" permissions are singular, and can't vary (they also don't expire, but can be revoked).

So here's a proposal that I think simplifies things, assuming we want to preserve the desire for sharing to expire:

  1. a 'share' can be with a user, group, or 'code'
  2. there can be only one 'share' for a given server per user, group, or code
  3. only codes can expire, so if you want to share permissions temporarily, you must use a code
  4. when a code expires, all permissions granted via that code are revoked

I'm on the fence about whether a given server should be allowed to have more than one code at a time. It makes sense structurally and in principle, but again this is something I haven't seen elsewhere and I don't like being in the business of unique things permissions-wise.

@manics
Copy link
Member

manics commented Nov 13, 2023

I agree with starting with something simple, especially if it's designed so that other features can be added without breaking changes.

Token sharing with automatic expiry is nice, how much demand is there for it? Is it essential for a first release?

Looking at the overall plan for sharing, does all of it need to be in core JupyterHub or could some of it be done in extensions/services?

@minrk
Copy link
Member Author

minrk commented Nov 14, 2023

Looking at the overall plan for sharing, does all of it need to be in core JupyterHub or could some of it be done in extensions/services?

I think it needs to be in Hub, because there is currently no mechanism by which an extension can grant permissions at runtime.

If I were to make the simplest user-initiated sharing proposal, it wouldn't include expiration:

  • share codes are exchanged for permissions, but not linked after permissions are granted (i.e. a share-code is identical to granting permission to a user by name, it's just letting the shared-with user enter their own name)
  • single permissions per (user, server) pair
  • share codes expire (usually short-lived), but permissions do not

That is: permissions are permanent (can be revoked, but do not expire automatically), and sharing via code can permanently expand a user's permissions, but not grant anything temporarily. I believe that solves most of the challenges. The other simpler alternative is:

  • single permissions per (user, server) pair (same)
  • single share code with single permissions (Google Docs-style "people with link can...") per server
  • access-via-code recorded separately, deleted when share code goes away
  • no expiry anywhere, but single link can be easily revoked

I think both make sense and are similarly simple. The first is more of an "invite code" model, where you are inviting folks to a server to stay, whereas the second is a "people with the link" model. The latter is easier to revoke permissions, while the former is easier to have short-term invite links. I don't know what folks would find most comfortable / familiar.

Another possibly simpler approach from the Hub perspective would be to implement a generic API for roles (#3858), then a 'sharing service' could layer on top the higher-level representation of what a 'share' is. That's not simpler for me, who needs to implement both, but it might be simpler to implement in the Hub (roles in general present their own challenges if they can be user-created). This is ultimately more complex overall.

@minrk
Copy link
Member Author

minrk commented Nov 21, 2023

Updating notes from today's meeting:

  • reiterating keeping it simple
  • that likely means only supporting invite codes permanently exchanged for 'shares'
  • single permissions per (user, server), so adding permissions is updating this field, not adding a record
  • no expiration for now, that can come later (invites expire, not permissions)

I think this is simple and clear enough that I can get something working before needing to bug folks for more feedback.

- create share via api
- revoke share via api
- permissions granted and tested

use pydantic for request validation
@minrk
Copy link
Member Author

minrk commented Dec 8, 2023

Still some work to do, but I now have sharing fully working where a user can grant and revoke access to their server to another user.

Still todo:

  • update doc
  • update REST API spec
  • test coverage
  • share code API

The fact that shares can be referenced both by the server and the user it's granted to has been surprisingly hard to make sense of in the REST API.

As it is now, I have:

By server:

  • grant permission: POST /shares/:user/:server (shared-with, scopes in body)
  • revoke permission: PATCH /shares/:user/:server (shared-with, scopes in body) (not delete because delete can't have body and don't have a URL that fully specifies owner, server, and shared_with)
  • revoke all shares for a server DELETE /shares/:user/:server

By user/group:

  • list shared-with-me (user): GET /api/users/:name/shared
  • list shared-with-me (group): GET /api/groups/:name/shared
  • opt-out of share: DELETE /api/users/:name/shared/:owner/:server (separate endpoint from PATCH above because this requires different permissions - users can always un-share with themselves)

@minrk
Copy link
Member Author

minrk commented Dec 13, 2023

Making progress, working on sharing codes, and came up with another question:

Is a sharing code more like a token, which should be stored hashed and irretrievable (must issue new code if you need access) or a more permanent, retrievable value.

I think how folks use are expected to use them informs this. If we're thinking of slack or otherwise invitation-style links, I think the token format is better.

If we're thinking of Google docs style "share via code", then being able to retrieve the code so you don't reissue codes too often makes sense.

Given that share codes are immediately exchanged for sharing permission, and permissions granted via the code cannot be readily tracked or revoked, I'm inclined to go with the token approach.

@minrk
Copy link
Member Author

minrk commented Dec 14, 2023

Still quite a few tests to write and things to update, but now I've tested the full process of issuing and accepting a new sharing code, and things are reasonably complete (not finished, but complete).

For this PR, what I plan to finish:

  • documentation explaining how it works and how to use it
  • sharing with individuals via API
  • list/modify/revoke shares via API
  • issue/list/revoke share codes via API
  • accept share code invitations via human page (the only HTML endpoint I plan for this PR)

What I don't plant to do in this PR:

  • UI for managing shares

UI is its own whole can of worms, and I'd like to be able to make a 5.0 release with sharing made possible via the API, and consider UI on a longer timescale (and via extensions in the meantime)

Copy link
Member

@consideRatio consideRatio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a blocker. I've not reviewed all changes in detail due to limited capacity =/

I've glossed over changes in orm.py and scopes.py, and hardly looked at shares.py. Nothing seems off from what I've seen in all other code sections.

Amazing effort into this @minrk! There are some unaddressed reivew points above, but none is critical so I'll leave it to you to decide if you want to take action on them before going for a merge.

@minrk
Copy link
Member Author

minrk commented Feb 6, 2024

I've deliberated on shares being tied to (user, server), an an alternative like (user, server, creation-uid) where creation-uid is generated on creation by perhaps being a hash of all relevant fields to ensure no duplicate entries are created. No change suggested, but I was thinking with such system, it may be possible to transition to time-outable shares etc. But, how important is this? I figure one may want to allow temporary read-only permissions or similar to view a session without letting someone view future sessions.

Implementation-wise, it would just be like other tables (User, group, etc.), where there's a unique id primary key, and drop the restriction on uniqueness of (user, server). This was in fact the original design, where you could grant expiring permissions. Short-term read-only access was indeed the use case I had in mind. Where things got really hairy was when we started to talk about revoking permissions, especially when a user can be granted the same permission multiple times with multiple expiration dates. What does a revocation look like when there are multiple Share objects per (user, server)? It is revoking a whole 'share', which may or may not revoke permissions? Updating a single record in-place makes this all a lot simpler, but at the expense of short-term expiring grants not being feasible. This will be a hard thing to change if we want to create expiring shares, because several assumptions in the API change - e.g. GET /user/:name/shared would have to change from a single Share model to a list of shares.

I've understood that there is a way for users to accept shares via share codes, but is this workflow unique for users, or is it something that could/should apply for groups as well? For example, could a user accept a share as a group its part of etc?

share codes are just for users. Sharing with groups can only be done explicitly by name. I'm not sure it makes sense use-case-wise for a user to be able to accept a code on behalf of a group. That's essentially delegating sharing permission, where the owner of the server is granting anyone with the code permission to share the owner's server with a group by name. It's not impossible to implement, and wouldn't require any changes to the database, but I'd like to see a real use case requirement first.

Permission issues with custom read-only scopes?

It is conceivable to have two images where one supports read-only scopes and the other doesn't, and the result would likely be ignoring scoped permissions and granting full access. I think this ought to be the responsibility of admins. E.g. if admins (and admins are required for read-only access, since they must define the custom scopes), then they shouldn't allow launching images that don't have read-only support. Allowing user control over the server environment (e.g. user-controlled images) means granting users the power to completely eliminate all security protections.

For server-session-scoped permission grants, this is feasible globally, but not on a per-permission basis because of the single (user, server) record. That choice - to make it easy to list, manage, and revoke permissions - means that all permission revocation must be manual (just like it is for ~all permission-granting I've ever seen, such as github, google docs, etc.). Also note that deleting a server, which is distinct from stopping it, does indeed revoke all share permissions because the relationship is removed. It is only typical to delete named servers:

Screenshot 2024-02-06 at 11 10 59

I'll address other comments in-line.

Thanks for reviewing!

@minrk
Copy link
Member Author

minrk commented Feb 6, 2024

but "active" isn't?

this was my confusion mixing internal variables with the rest API. .active is a Python variable, but not in the REST API at all (active = ready or pending, by definition). But I see that it's documented in the REST API, despite not being present! I've updated the rest docs to reflect reality.

@minrk
Copy link
Member Author

minrk commented Feb 6, 2024

In this example response I see an expires_at an created_at equal to each other, does that imply that it won't expire? It seems a bit more natural that the field expires_at wouldn't be included or that its extremely far into the future or something.

This is because unless we specify individual examples for everything, all timestamp fields use the same value (the current time). We can specify manual examples. I've expanded the description to state that expires_at is always in the future, and null means no expiration.

@consideRatio
Copy link
Member

consideRatio commented Feb 6, 2024

1. Warn that a read-only constraint is enforced in user environments

Permission issues with custom read-only scopes?

It is conceivable to have two images where one supports read-only scopes and the other doesn't, and the result would likely be ignoring scoped permissions and granting full access. I think this ought to be the responsibility of admins. E.g. if admins (and admins are required for read-only access, since they must define the custom scopes), then they shouldn't allow launching images that don't have read-only support. Allowing user control over the server environment (e.g. user-controlled images) means granting users the power to completely eliminate all security protections.

I think this info is critical info that should be surfaced very intentionally to admins.

  • Check if this info can be surfaced even more clearly to avoid security issues

Edit: handled via #4699

requires consolidating redirect validation into _validate_next_url to allow re-use
@minrk
Copy link
Member Author

minrk commented Feb 6, 2024

I think this info is critical info that should be surfaced very intentionally to admins.

I'll think about where to highlight it. I don't think it's particularly relevant to sharing, so I wouldn't add anything to the sharing docs, but it is relevant to the custom_scopes definitions. The user-control over the environment is discussed already in the security doc, with the statement that:

JupyterHub administrators must ensure that:
A user does not have permission to modify their single-user notebook server,

including installing packages, modifying $PATH, etc., all of which enable users to compromise security of their own servers.

@minrk
Copy link
Member Author

minrk commented Feb 6, 2024

Thanks for the reviews! I believe CI will be all green and all reivew has been addressed (at least that I didn't forget about).

This is a big PR with lots of iterations and merges from main to preserve the review history, so I was planning to squash & merge when folks approve it, rather than merging all the iterations and testing in the history.

Copy link
Member

@consideRatio consideRatio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing effort into this @minrk!!! 👍 For squash/merge/get this pr into main branch!

Copy link
Contributor

@sgaist sgaist left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing work !

@minrk minrk merged commit 41fff71 into jupyterhub:main Feb 7, 2024
20 checks passed
@minrk minrk deleted the user-shares branch February 7, 2024 07:34
@minrk
Copy link
Member Author

minrk commented Feb 8, 2024

Thanks so much for the reviews!

@krassowski
Copy link
Contributor

Very happy to see this merged 🎉 What is the release plan for this feature? Is it backwards compatible and could land in say 4.1 or will it need to wait for 5.0?

@minrk
Copy link
Member Author

minrk commented Feb 9, 2024

This is the main feature for 5.0, which I plan to start prereleases for shortly. Working on one more fix before that can start.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
new new features
Projects
None yet
Development

Successfully merging this pull request may close these issues.

User-initiated sharing, distinct from roles
5 participants