
Add HTTP API to scheduler #6270

Merged: 17 commits, May 17, 2022
Conversation

@Matt711 (Member) commented May 4, 2022

Closes #5935

This PR exposes some of the scheduler endpoints we need to replace the Dask RPC in the Operator.

  • Tests added / passed
  • Passes pre-commit run --all-files

@GPUtester (Collaborator)

Can one of the admins verify this patch?

@quasiben (Member) commented May 4, 2022

add to allowlist

@github-actions bot (Contributor) commented May 5, 2022

Unit Test Results

15 files ±0 · 15 suites ±0 · 6h 55m 22s ⏱️ +18s
2,792 tests +4 · 2,713 passed +4 · 78 skipped ±0 · 1 failed ±0
20,706 runs +28 · 19,795 passed +25 · 910 skipped +3 · 1 failed ±0

For more details on these failures, see this check.

Results for commit 15ad992. ± Comparison against base commit af3b93e.

♻️ This comment has been updated with latest results.

@Matt711 Matt711 marked this pull request as ready for review May 12, 2022 19:42
@jacobtomlinson (Member) left a comment

This is looking great! It's exactly what we need for external resource managers to interact with the scheduler.

We should definitely add a documentation page about this API too.

Resolved review threads: distributed/http/scheduler/api.py (×2), distributed/http/scheduler/tests/test_scheduler_http.py
@Matt711 (Member, Author) commented May 13, 2022

> This is looking great! It's exactly what we need for external resource managers to interact with the scheduler.
>
> We should definitely add a documentation page about this API too.

Great! I can create a follow-up PR for more documentation.

@jacobtomlinson (Member) left a comment

This looks good to me.

@fjetter would you mind reviewing here? Given you were involved in the early conversations about this it would be good to check you are happy with this approach.

@jacobtomlinson (Member) left a comment

Actually, thinking a bit more, this probably needs a couple more changes.

We don't have an implementation here for workers_to_close. We have retire_workers, which allows you to retire by name, but we don't have a way of retiring n workers. Exposing workers_to_close would allow that.

(embedded permalink: `def workers_to_close(` in distributed/scheduler.py)
I also think we need a little more documentation than this to get this PR in. Perhaps we could add another section to that page with a title like Scheduler API with these methods listed and an example of the request body they are expecting.
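To make the suggestion concrete, a request to one of these endpoints could look like the following standard-library sketch. The endpoint path and JSON body follow the docs snippet quoted later in this thread; the host, port, and worker addresses are placeholders for illustration.

```python
import json
from urllib import request

def retire_workers_request(scheduler_host, workers):
    """Build a POST request for the scheduler's retire_workers endpoint.

    `scheduler_host` (host:port of the scheduler's HTTP server) and the
    worker addresses below are assumed values, not defaults from this PR.
    """
    body = json.dumps({"workers": workers}).encode()
    return request.Request(
        f"http://{scheduler_host}/api/v1/retire_workers",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = retire_workers_request(
    "127.0.0.1:8787", ["tcp://127.0.0.1:53741", "tcp://127.0.0.1:53669"]
)
# Sending it would be: request.urlopen(req)  -- requires a running scheduler.
```

Splitting request construction from sending keeps the example runnable without a live cluster.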

@jacobtomlinson (Member) left a comment

Thanks for the quick turnaround. I'm happy with this as a good foundation we can build on.

Resolved review threads: docs/source/http_services.rst (×2)
@jacobtomlinson (Member)

I intend to merge this in 24 hours if there are no further comments.

@fjetter (Member) left a comment

I think there are a few cosmetic questions around how we want to deal with exceptions, but this should not block the PR. The more fundamental question is how we want to maintain API stability. Merely passing through kwargs is not what I had in mind when we first discussed this API.

Review threads on distributed/http/scheduler/api.py (×5, mostly resolved)
@fjetter (Member) left a comment

Just a few nits about documentation (this is intended to be a public API for users, so I think we should invest a bit in documentation even though the API is simple).

@jacobtomlinson I would trust you with this. If you think the additional docs are not necessary, feel free to merge. These comments should not necessarily block anybody.

Comment on lines +55 to +66
Scheduler methods exposed by the API with an example of the request body they take

- ``/api/v1/retire_workers`` : retire certain workers on the scheduler

  .. code-block:: json

     {
         "workers": ["tcp://127.0.0.1:53741", "tcp://127.0.0.1:53669"]
     }

- ``/api/v1/get_workers`` : get all workers on the scheduler
- ``/api/v1/adaptive_target`` : get the target number of workers based on the scheduler's load
A reviewer (Member) commented:

I think it would be helpful to document an example response.


> Scheduler methods exposed by the API with an example of the request body they take
>
> - ``/api/v1/retire_workers`` : retire certain workers on the scheduler
A reviewer (Member) commented:

It would be nice if this mentioned specifically that addresses are expected, not names. Sometimes they can be interchangeable, but this API expects addresses. The example below might still be misleading, since we set names to the addresses by default.

@jacobtomlinson (Member)

We are blocked on this PR for some work we want to finish this week, so I think I'll just ask @Matt711 to make a follow-up to improve the docs, if that's ok. Totally agree with the comments though.

@jacobtomlinson jacobtomlinson merged commit 63cdddd into dask:main May 17, 2022
@psontag commented May 17, 2022

Hey, I just saw this PR.
My current understanding is that there are two modes for the /api/v1/retire_workers route:

  1. Retire n workers based on the workers_to_close method of the scheduler
  2. Retire the provided workers based on their address

But currently there is no API that exposes the workers_to_close method directly. At least in our use case, that would be really helpful, since our graceful shutdown implementation has two phases:

  1. Retrieve the workers to close from the scheduler and run some custom graceful shutdown logic
  2. Actually make the retire_workers call to the scheduler

@jacobtomlinson/@Matt711 Do you think that this might also be interesting for you?
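The two-phase flow described in this comment could be sketched roughly as below. Note that `/api/v1/workers_to_close` is hypothetical here (it is the endpoint being requested, not one this PR adds), and the scheduler address and shutdown hook are placeholders.

```python
import json
from urllib import request

SCHEDULER = "http://127.0.0.1:8787"  # assumed scheduler HTTP address

def post_json(path, payload):
    """POST a JSON payload to the scheduler's HTTP API and decode the reply."""
    req = request.Request(
        f"{SCHEDULER}{path}",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())

def graceful_shutdown(n, post=post_json):
    # Phase 1: ask the scheduler which n workers it would close.
    # NOTE: this endpoint is hypothetical -- it is what the comment proposes.
    workers = post("/api/v1/workers_to_close", {"n": n})

    # ... custom graceful-shutdown logic would run here (placeholder) ...

    # Phase 2: retire exactly those workers via the endpoint added by this PR.
    post("/api/v1/retire_workers", {"workers": workers})
    return workers
```

The `post` parameter is injectable so the flow can be exercised without a live scheduler.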

@jakirkham (Member)

@philipp-sontag-by would recommend raising a new issue to track this. In fact, we might want to do that with any of the meatier tasks left in this PR (if any).

@Matt711 (Member, Author) commented May 17, 2022

@philipp-sontag-by I think we can do this, especially since the point of the API is to cover most of the scheduler's methods. I'll create a follow-up PR to add this if @jacobtomlinson is okay with it.

@jacobtomlinson (Member)

Yeah, let's discuss this in a new issue.

I'm curious what shutdown logic you want to do that retire_workers isn't suitable for? As I see it, the best way to shut down gracefully is to let the scheduler end the worker process and then clean up the completed pods afterwards.

@gjoseph92 (Collaborator)

I'm concerned about a security regression here. By default, this is opening up an API that allows anyone to change cluster state (via retire_workers currently, but I imagine other things might be added someday too).

Prior to this, the only way to do things that affected cluster state was through the client. All the HTTP routes were effectively read-only. (Whether there is a vulnerability in the bokeh dashboard is another topic; it's pretty possible there is, but I'm just talking here in principle.)

I think it's rather common to expose the HTTP routes to the public internet. For example, I believe dask-cloudprovider does this:

> By default a Dask security group will be created with ports 8786 and 8787 exposed to the internet

(https://cloudprovider.dask.org/en/latest/aws.html#dask_cloudprovider.aws.EC2Cluster)

You want those ports exposed for convenience, so you can connect to them. But you don't want anyone to be able to do stuff to the cluster, so you set up TLS using temporary credentials. dask-cloudprovider does this for you as well:

> When a cluster is launched with any of these cluster managers a set of temporary keys will be generated and distributed to the cluster nodes via their startup script. All communication between the client, scheduler and workers will then be encrypted and only clients and workers with valid certificates will be able to connect to the scheduler.

(https://cloudprovider.dask.org/en/latest/security.html#authentication-and-encryption)

Currently, if you set up TLS for your cluster, this is mTLS, meaning the scheduler verifies the client's certificate (docs, code). This serves as a form of authentication and authorization: if you've set up cluster security, you can only tell the scheduler to do things if you hold a valid certificate.

However, the HTTP routes have no authentication (they use standard TLS, not mTLS, because mTLS would be very inconvenient when you want to look at the dashboard with a web browser).

So after this change, someone who had gone to the trouble to set up mTLS for their cluster (or was using the defaults of their cluster deployment system) would, by default, have an unauthenticated endpoint running that allowed anyone with access to :8787 (aka the dashboard) to affect cluster state.


I think we should do two things:

  1. Short-term: disable the HTTP API if TLS is specified on the scheduler. This is a reasonable default.
  2. Long-term: figure out a security posture and authentication for the HTTP API that's consistent with the security posture of other things that can affect cluster state (aka the client).
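The short-term proposal amounts to a guard when the scheduler assembles its HTTP routes. The sketch below shows the idea only; the function name, config keys, and route table are illustrative, not Distributed's actual internals.

```python
def api_routes_enabled(security_config):
    """Return False when TLS is configured, so the mutable HTTP API
    stays off by default on secured clusters (the short-term proposal).

    `security_config` is an illustrative mapping, not Distributed's
    real Security object.
    """
    tls_configured = any(
        security_config.get(key)
        for key in ("tls_ca_file", "tls_scheduler_cert", "tls_scheduler_key")
    )
    return not tls_configured

# Read-only routes (dashboard-style) are always registered.
routes = [("/health", "read-only")]

# Mutating routes are registered only when no TLS is configured.
if api_routes_enabled({"tls_ca_file": "ca.pem"}):
    routes.append(("/api/v1/retire_workers", "mutating"))
```

With TLS configured, as in the example call, only the read-only route survives; an unsecured cluster would also get the mutating route.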

@jakirkham (Member)

@gjoseph92 can you please file this as a new issue to make it easier to track?

Successfully merging this pull request may close these issues.

Add REST API to scheduler
8 participants