
Idempotent semaphore acquire with retries #3690

Merged
fjetter merged 1 commit into dask:master on Apr 16, 2020

Conversation

@fjetter (Member) commented Apr 9, 2020

This adds a lot of logging information to the semaphore but most importantly refactors the internal structures to make the acquire call idempotent.

This removes the need to use the Client itself as a lifetime anchor for the leases and replaces that scheme with an explicit refresh of the leases and lease-specific timeouts.

This gives the following benefits:

  • Each acquire/lease gets a unique ID, which can be used to trace the behaviour in the logs when debugging
  • The acquire requests themselves are idempotent, i.e. they can simply be retried after connection failures
  • The requests are simpler since we no longer need to handle the client ID, which was somewhat redundant in the original implementation anyway
  • If any weird "proxy did not get ACK" issues arise (e.g. the scheduler registered the lease but the OK never reached the client), the system will eventually self-heal instead of deadlocking
  • The internal structure is subjectively simpler since we have only one dict to maintain, which now holds timestamps

at the cost of additional complexity:

  • The client will spawn a new periodic callback to refresh the leases. This replaces the implicit coupling to the client heartbeat. I think this is complexity worth taking on since it decouples the implementation a bit from assumptions about the lifetime management of the client (a rough sketch of such a refresh loop is shown after this list).
  • Another configuration option (I would argue this makes it more transparent to the users, though)
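
A rough sketch of such a dedicated refresh loop, assuming tornado's PeriodicCallback and using hypothetical method and RPC names (this is not the actual semaphore implementation):

```python
# Illustrative sketch only -- `_refresh_leases` and `semaphore_refresh_leases`
# are hypothetical names, not the distributed.semaphore API.
from tornado.ioloop import PeriodicCallback


class _LeaseRefresher:
    def __init__(self, scheduler_rpc, name, refresh_interval_ms=2000):
        self.scheduler = scheduler_rpc   # comm handle to the scheduler
        self.name = name                 # semaphore name
        self.lease_ids = set()           # leases currently held locally
        # Periodically tell the scheduler that our leases are still alive,
        # instead of piggybacking on the client heartbeat.
        self._pc = PeriodicCallback(self._refresh_leases, refresh_interval_ms)
        self._pc.start()

    async def _refresh_leases(self):
        if self.lease_ids:
            await self.scheduler.semaphore_refresh_leases(
                name=self.name, lease_ids=list(self.lease_ids)
            )
```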

I sneaked in another change regarding the configurability of the retry options. I will open another PR for it, but I needed it here for the tests to succeed.
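
To make the idempotency idea concrete, here is an illustrative sketch of the scheduler-side bookkeeping described above; the class and method names are invented for the example and do not mirror the actual distributed.semaphore code:

```python
import time
import uuid


class _SemaphoreState:
    """Illustrative only: one dict per semaphore, keyed by lease ID."""

    def __init__(self, max_leases):
        self.max_leases = max_leases
        self.leases = {}  # lease_id -> timestamp of the last refresh

    def acquire(self, lease_id):
        """Idempotent: retrying with the same lease_id never double-books."""
        now = time.time()
        if lease_id in self.leases:
            # Retry of a request whose ACK was lost: just bump the timestamp.
            self.leases[lease_id] = now
            return True
        if len(self.leases) < self.max_leases:
            self.leases[lease_id] = now
            return True
        return False

    def release(self, lease_id):
        # Releasing twice is harmless as well.
        self.leases.pop(lease_id, None)


# Client side: one unique ID per acquire, safe to resend after comm failures.
lease_id = uuid.uuid4().hex
```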

@martindurant (Member) commented

@fjetter , thanks for submitting this!

@quasiben @jakirkham , any chance for a review here?

@quasiben (Member) commented

Looks like we need that PR (or to add the changes here) for the retry_count config option

@marco-neumann-by (Contributor) commented

So if I understand this correctly, the lease will now be periodically refreshed by the holding worker. How does this interact with:

  • long running user payloads? (at which point do they NOT share the same thread anymore)
  • GIL blocking?

@fjetter (Member, Author) commented Apr 14, 2020

So if I understand this correctly, the lease will now be periodically refreshed by the holding worker. How does this interact with:

long-running user payloads? (at which point do they NOT share the same thread anymore)
GIL blocking?

We will face the same issues as with the old implementation. In the old implementation everything was coupled to the heartbeat of the client. Instead, we'll now have a dedicated semaphore heartbeat/refresh.

long running user payloads

User payloads are always in another thread and don't impact the event loop of the worker unless...

GIL blocking?

If the GIL is held, we're out of luck and need to counteract this with longer timeouts.
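
Lengthening the timeouts is a configuration change on the user's side. A minimal sketch, assuming the lease timeout and validation interval are exposed under the distributed.scheduler.locks config namespace (the exact key names may differ):

```python
import dask

# Assumed key names -- check distributed.yaml for the exact spelling.
dask.config.set({
    "distributed.scheduler.locks.lease-timeout": "60s",
    "distributed.scheduler.locks.lease-validation-interval": "20s",
})
```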

@marco-neumann-by (Contributor) commented

So if the user schedules a task that is guarded by a semaphore but is prone to GIL blocking, it might run into refresh timeouts and systematically overbook the semaphore?

@fjetter (Member, Author) commented Apr 14, 2020

For the retry configurations, see #3705
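
A hedged sketch of how such retry options might be set through the dask config; the key names below are assumptions based on #3705 and may differ from what was actually merged:

```python
import dask

# Assumed key names from #3705 -- verify against the merged config schema.
dask.config.set({
    "distributed.comm.retry.count": 3,          # retries for failed comm calls
    "distributed.comm.retry.delay.min": "1s",   # backoff lower bound
    "distributed.comm.retry.delay.max": "20s",  # backoff upper bound
})
```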

@fjetter force-pushed the semaphore/idempotent_acquire branch from b640f5f to baab3e1 on April 14, 2020 07:57
@lr4d (Contributor) left a comment
First pass. Looks good; the log messages could be clearer.

[Four review threads on distributed/semaphore.py, since resolved]
@fjetter (Member, Author) commented Apr 14, 2020

So if the user schedules a task that is guarded by a semaphore but is prone to GIL blocking, it might run into refresh timeouts and systematically overbook the semaphore?

Currently, the timeout would trigger a log message at warning level, and once the GIL is released on the worker again, this would trigger an exception at scheduler level (caught and logged as an error). The tasks/computations would not fail, however, and the semaphore would be overbooked, correct.

@fjetter (Member, Author) commented Apr 14, 2020

@marco-neumann-jdas I believe we can either protect ourselves from resource starvation and deadlocks (the intention of this implementation) or from overbooking. I don't think we can pull off both within this library. If I'm wrong about that, I'd be happy to be educated :)

What I could suggest is that a lease refresh does not simply warn/log/raise on the overbooking but actually registers the lease again, i.e. we'd briefly be in a state where more leases are registered than allowed, which would then block new leases until we're back in a normal state.

For applications where the timeout is ill-configured and every task breaches it, this would at least stop an unlimited avalanche.
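
A rough sketch of that idea, building on the illustrative `_SemaphoreState` above (the names are hypothetical and this is not the merged implementation):

```python
import time


class _OverbookingState(_SemaphoreState):
    """Illustrative extension of the sketch above (hypothetical names)."""

    def refresh(self, lease_id):
        # Re-register a lease that already timed out on the scheduler (e.g. a
        # GIL-blocked worker that missed its refresh window).  This may push
        # us above max_leases for a while ...
        self.leases[lease_id] = time.time()

    def acquire(self, lease_id):
        # ... and while we are oversubscribed, no new leases are handed out.
        if lease_id not in self.leases and len(self.leases) >= self.max_leases:
            return False
        return super().acquire(lease_id)
```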

@fjetter force-pushed the semaphore/idempotent_acquire branch 3 times, most recently from 6e9904f to f94a1ac, on April 14, 2020 12:33
@fjetter (Member, Author) commented Apr 14, 2020

@marco-neumann-jdas I added the treatment for oversubscription. If it is detected, the lease is registered again and further acquisitions are blocked until we fall back to a normal state. See the test test_oversubscribing_leases (I documented it as well as I could, since the logic is not exactly trivial).
This should at least protect scenarios where the GIL makes up only a fraction of a task's runtime. If the entire runtime is one big GIL block, we're completely out of luck, but I would argue that in those cases the users should be able to adjust the configuration, since we log plenty of warnings pointing them to the proper options.

Once we equip the semaphore with some Prometheus metrics, this should also be clearly visible.

@fjetter force-pushed the semaphore/idempotent_acquire branch from 2f5d094 to 3545383 on April 14, 2020 15:36
[Two review threads on distributed/semaphore.py, since resolved]
@fjetter force-pushed the semaphore/idempotent_acquire branch from 2500cde to 679e1da on April 16, 2020 13:52
@fjetter (Member, Author) commented Apr 16, 2020

@martindurant @quasiben any feedback? If not, I'd like to merge this.

@martindurant (Member) commented

I'll defer to @quasiben here, if he has time

@quasiben (Member) commented

This looks great and I definitely appreciate the comments around the tests. @fjetter do you want the honors of hitting the green button?

@fjetter merged commit ee8cff4 into dask:master on Apr 16, 2020
@fjetter deleted the semaphore/idempotent_acquire branch on April 16, 2020 16:05
@mrocklin (Member) commented

Hi folks, this introduced a test failure in master: #3717

Is there any chance that people here can help to resolve this?
