Cross-user operation - umbrella issue #171

Closed
hjoliver opened this issue Feb 5, 2021 · 8 comments

Comments

@hjoliver
Member

hjoliver commented Feb 5, 2021

(And related topics, such as how UI Servers behave when not looked at, etc.). See also:

This is to record an @oliver-sanders "brain dump" on the topic, in response to questions from @jarich, from Element chat, as that seems to be the current best record of earlier discussions (primarily the Feb 2020 workshop). To be refined and replaced by more focused Issues as we go...

Brain dump time (apologies in advance), contains some extra details and contents of discussions not currently documented...

If 2 or more users want to view Bob's workflows, will there be only one UI server running for Bob?

If I attempt to spawn a UI server from the command line for Bob, and there is already a UI server running for Bob, is that a no-op?

  • There will only ever be one UI server (UIS) per "owner" running under one hub.
  • When you visit the URL for that server, JupyterHub will spawn the UIS if it is not running and then redirect; otherwise I think it is a simple redirect at the reverse proxy (very efficient).
  • The UIS knows about open connections so is able to tell when it is not in use. We anticipate building in a configurable shutdown timeout.
  • An idle UIS should have a low footprint and starting up a UIS should be cheap.
  • We currently anticipate that startup/shutdown for the UIS will be efficient and that aggressively shutting-down unused UIServers is likely a good idea.
  • This will help to keep the UIServer flock on the latest version.
  • UIServers can be manually shutdown through the web interface.

Additionally:

  • The websocket connections between the UIS and UI have a "heartbeat" which gives us a robust way of telling when connections are inactive from the UIS end.
  • We have a plan for the UI to detect inactivity and suspend connections, freezing the UI but maintaining its state until the user interacts with it again.
  • When this happens the UI would probably go grey with a message in the middle saying "paused, click to resume" or something like that.
  • The time it takes to sync a workflow through to the UI is pretty good, even for larger flows so it should be fast to resume.
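As a rough illustration of the heartbeat and shutdown-timeout ideas above, here is a minimal sketch of how a server might track live connections and decide that it has been idle long enough to shut down. The class, its method names, and the default timeouts are all hypothetical; none of them come from Cylc or JupyterHub.

```python
import time


class IdleTracker:
    """Track connection heartbeats and decide when a server is idle.

    Hypothetical helper for illustration only (not a Cylc or JupyterHub API).
    """

    def __init__(self, heartbeat_timeout=30.0, shutdown_timeout=600.0,
                 clock=time.monotonic):
        self.heartbeat_timeout = heartbeat_timeout  # seconds without a beat
        self.shutdown_timeout = shutdown_timeout    # seconds of idleness
        self._clock = clock
        self._last_beat = {}            # connection id -> time of last beat
        self._last_activity = clock()   # time of the most recent beat overall

    def heartbeat(self, conn_id):
        """Record a heartbeat received from one UI connection."""
        now = self._clock()
        self._last_beat[conn_id] = now
        self._last_activity = now

    def active_connections(self):
        """Return connections whose last heartbeat is recent enough."""
        now = self._clock()
        self._last_beat = {
            conn: beat for conn, beat in self._last_beat.items()
            if now - beat <= self.heartbeat_timeout
        }
        return sorted(self._last_beat)

    def should_shut_down(self):
        """True once no connection is live and the idle timeout has passed."""
        if self.active_connections():
            return False
        return self._clock() - self._last_activity >= self.shutdown_timeout
```

A real server would call `heartbeat()` from its websocket handler and poll `should_shut_down()` periodically. Making `shutdown_timeout` configurable (or disabling the check entirely) would cover the case where a server has to persist with no connections.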

If the UI Server is not being run as a user with a password, then it will need to persist even with no connections. Or at least some part of it will have to. This could potentially be done with a flag on start up?

We will want to make the "shutdown on inactivity" feature configurable anyway so should be good there, however, if the UIS went down for some other reason it would have to be brought back up manually.

What is the pathway the user takes?

  • The URL for the UI contains the username, these URLs can be shared and should work for other users.
    Visiting a URL for someone else's UIS will cause you to (spawn if necessary and) connect to their UIS (providing you are authorised).
  • The UI loads to a default dashboard page.
  • In the top left-hand corner of the UI are two boxes for manually entering a username and hub address to facilitate changing between users and potentially hubs.
  • We can do user-name discovery from the user's authorisation configuration file (i.e. list user names that the user has themselves configured, low security impact).

If UI servers run on machines that are not running the/a hub, how does the hub find them? Can we load balance UI servers.

  • This comes down to the JupyterHub "spawner". The spawner is the thing that starts the UIServers in the Cylc Hub (notebooks in regular JupyterHub).
  • We would like to build a special spawner which uses the scheduler-distribution logic from cylc-flow allowing load balancing of UIServers on startup.
  • This load-balancing system can rank hosts by psutil metrics e.g. cpu, memory, server load, etc as well as setting hard limits.
  • This plugin would just do the load balancing then defer to a regular distributed-spawner plugin to start the workflow (i.e. it wraps a spawner of your choice).
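The host-selection step described above could look something like the following sketch. It is not the cylc-flow scheduler-distribution logic; the function name, the metric keys, and the tie-breaking rule are assumptions made for illustration. The metrics are passed in as plain dictionaries; in practice they might be gathered on each host with psutil (for example `psutil.getloadavg()` and `psutil.virtual_memory().available`).

```python
def choose_host(metrics, max_load=None, min_free_memory=None):
    """Pick the best host from {host: {"load": float, "free_memory": int}}.

    Hypothetical sketch: hard limits filter hosts out first, then the
    remaining hosts are ranked by load (lower is better), with free memory
    (higher is better) and host name as tie-breakers.
    """
    candidates = []
    for host, values in metrics.items():
        if max_load is not None and values["load"] > max_load:
            continue  # hard limit: host is too busy
        if min_free_memory is not None and values["free_memory"] < min_free_memory:
            continue  # hard limit: not enough memory
        candidates.append(host)
    if not candidates:
        raise RuntimeError("no host satisfies the configured limits")
    return min(
        candidates,
        key=lambda host: (
            metrics[host]["load"],
            -metrics[host]["free_memory"],
            host,
        ),
    )
```

A wrapping spawner would call something like this once at spawn time and then hand the chosen host to whichever distributed spawner the site has configured.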

I'm going to use unix permissions here, to simplify things. So read (can view the workflow), write (can do edit runs, change anything we can change through the interface about how the workflow runs) and execute (can stop, start, restart, pause, release etc the workflow).

  • That's what we were thinking.
  • There isn't much written up, though there is something on cylc-admin.
  • I think the plan was to provide a two-tier system consisting of a site and user configuration.
  • This would enable the site to control the level of permissions that users are able to give away.
  • Permissions would be grouped into read/write/execute, the main use case would be for a user to give a certain level of permissions to another user.
  • However we would like these permissions to be fine grained for more particular use cases.
  • E.g. you could give a user read access plus the ability to start the workflow (an execute-level permission) without giving them write (or any other execute) permissions in the process.
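To make the two-tier idea concrete, here is a minimal sketch of how site and user configuration might combine. The group contents, operation names, and function names are all hypothetical; the actual permission model had not been written up at the time of this discussion.

```python
# Hypothetical grouping of individual operations into read/write/execute.
GROUPS = {
    "read": {"view"},
    "write": {"edit"},
    "execute": {"start", "stop", "restart", "pause", "release"},
}


def expand(grants):
    """Expand group names into operations; other names are single operations."""
    operations = set()
    for grant in grants:
        operations |= GROUPS.get(grant, {grant})
    return operations


def effective_permissions(site_limit, user_grants):
    """Operations actually granted: the user's grants capped by the site limit."""
    return expand(user_grants) & expand(site_limit)


def is_allowed(site_limit, user_config, requester, operation):
    """Check whether `requester` may perform `operation` on the owner's workflows.

    `user_config` maps requester names to the grants the owner has given them.
    """
    grants = user_config.get(requester, ())
    return operation in effective_permissions(site_limit, grants)
```

For example, with a site limit of `["read", "execute"]`, an owner who grants `["read", "start"]` to another user lets them view and start workflows but not stop or edit them, and any `write` grant is ignored because the site does not allow it to be given away.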

There is a mechanism to start a workflow from the command line that will spawn a UI server (or at least a workflow daemon?) for the user who owns it

  • There shouldn't be any need to start a UIS when a Scheduler starts, the UIS can be auto-started whenever its URL is visited.

list of UI servers that all users can find after they log in

There isn't a list per-se, the user would either need to:

  • Know the user name of the person whose UIS they want to connect to and type it into the username box (top left of the UI).
  • OR configure this user name in their authorisation configuration (which should cause the UI to list it).
  • OR visit the URL of the user's UIServer (derived from the hub address and user name).

Bonus points if I can also interact with the UI server of another user from the command line for at least read access (potentially subject to constraints)

For that I point you to this cylc-admin proposal and this cylc-flow issue.

Long-story short we plan to achieve multi-workflow and cross-user functionality on the command line by going through the UIS, examples:

# stop a single flow (can be done without the UIS in cases where the user can interact with the flow directly)
$ cylc stop myflow//

# stop multiple flows in one command
$ cylc stop myflow1// myflow2// myflow3//

# stop all my flows
$ cylc stop '*'

# stop another user's flow (subject to authorisation)
$ cylc stop ~otheruser/theirflow//

Not a top priority right now; however, the new syntax has been sorted out and a tokeniser that can support the old syntax in tandem with the new system has been written. Hopefully we can deliver this for Cylc 8.0.0.

Note that there will be a simple bash CLI completion to fill in flow//cycle/task names making this syntax more convenient.
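To show how IDs like those in the examples above break down, here is a small parsing sketch. It is not the Cylc tokeniser mentioned above and does not handle the old syntax; the function name and the returned dictionary keys are assumptions for illustration, based only on the `~user/workflow//cycle/task` shape seen in this thread.

```python
def parse_id(text):
    """Split an ID of the form [~user/]workflow//[cycle[/task]] into parts.

    Hypothetical sketch: missing parts are returned as None, and glob
    characters such as '*' are passed through unchanged.
    """
    user = None
    if text.startswith("~"):
        # "~otheruser/theirflow//" -> user "otheruser", rest "theirflow//"
        user, separator, text = text[1:].partition("/")
        if not user or not separator:
            raise ValueError("expected ~user/workflow")
    workflow, _, relative = text.partition("//")
    if not workflow:
        raise ValueError("missing workflow name")
    cycle = task = None
    if relative:
        cycle, _, task = relative.partition("/")
        task = task or None
    return {"user": user, "workflow": workflow, "cycle": cycle, "task": task}
```

A command such as `cylc stop` could parse each argument this way and route any ID that has a `user` part through that user's UIS, subject to authorisation.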

@hjoliver
Member Author

hjoliver commented Feb 5, 2021

And earlier, from me:

Trying to recall the plan off the top of my head ...

  • Cylc 8 schedulers run as the user and will only accept connections from that user, so no authorization is needed there.
  • (They can be started from the command line or the UI Server)
  • UI Servers run as the user too, and my UI Server will only interact with my schedulers
  • UI Servers, however, can accept connections from other users, and will control authorized access to their schedulers
  • To view or interact with other users' schedulers, my UI will connect to the other users' UI Servers
    • Note in general this requires the UI/hub to be able to spawn UI Servers on other user accounts, not just mine, in case there's no UI Server running there already. It can do this, because it is the only privileged part of the system (not technically any different than spawning my own UI Server). We just have to ensure that it can spawn ONLY Cylc UI Servers.
    • I'm sure it is technically possible to connect to existing UI Servers rather than spawn them, but [I'm not sure of] the details of that, and besides (IMO at least) a UIS running as me is a UIS running as me, it shouldn't matter who spawned it.

@oliver-sanders
Member

To summarise the above brain dump into some rough pathway:

  1. Investigate authorising to other users' UI servers.

    • Can we do this through the hub?
    • Do we need to configure JupyterHub in some special way?
    • Proof of concept.
  2. Develop a basic POC authorisation config file.

    • This is to allow one user to permit another to access their UIS.
    • We will need to add a permissions model later.
    • For a POC a simple yes/no with authorisation performed at the relevant end points will do.
  3. Develop fine grained authorisation.

@oliver-sanders
Member

Update:

Tagged against 0.6.0.

@hjoliver
Member Author

hjoliver commented Oct 20, 2021

cross-user access with two (real) user accounts:

[Update: this is while running the hub as root]

I have to be a [JupyterHub] admin user in order to spawn a UIS on another user's account:

  • c.Authenticator.admin_users = {'oliverh'}
  • c.JupyterHub.admin_access = True

Otherwise I can connect to an existing UIS, but I can't spawn one.

Further, even as an admin user I have to add the target user to the list of allowed users, either via the hub admin interface or the c.Authenticator.allowed_users config (otherwise: 404 Not Found, and "no such user" in the log).

This suggests we might not be able to use vanilla JupyterHub for cross-user access unless we configure all users to be admins. (That might be OK in our context, but it doesn't sound good, and it would allow users to see a list of other users' servers and kill them as well as spawn them.)

@oliver-sanders
Member

I think this is as expected?

To achieve the holy grail of being able to spawn another user's UIS (when you don't have permissions to do so manually) you may need to run the hub as a privileged account / use something like the sudo spawner.

There is ongoing work in JupyterHub to introduce authorisation, which may help us.

I think JupyterHub services may be an option too. I think services run as the hub user; there's an example service which culls idle servers, and that approach could be used to keep them active too.

Will come back to this; we should be able to find a workable solution without having to modify JupyterHub.

@hjoliver
Member Author

hjoliver commented Oct 20, 2021

Sorry, forgot to mention I was running the hub as root. (And, "admin user" is an additional Jupyter thing).

@oliver-sanders
Member

Ok, this might be one for the Jupyter discourse.

@oliver-sanders
Member

With the Jupyter Server migration complete and the new authorisation framework in place this issue is mostly done.

Bumping the remainder of this issue into:
