handle thundering herds #3

Merged: isaacs merged 5 commits into main from isaacs/thundering-herd on Nov 28, 2023

@isaacs (Owner) commented on Nov 21, 2023

This changes the startup logic pretty significantly, to provide comprehensive protection against many clients all attempting to start the daemon at the same time.

A lockfile is used to ensure that only a single contender is responsible for either starting the daemon script, or usurping a wedged or misbehaving server.

The lockfile contains the pid of the server attempting to take over. Before confirming, that server checks that the lockfile STILL contains its own pid, which handles the race when multiple contenders all try to delete a stale lockfile at the same time. That is, the lockfile used as a mutex against startup races is itself subject to races, since it can be disregarded and deleted if it's stale, so that has to be protected against.
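
To make the pid re-check concrete, here is a minimal sketch in Python. It is not this project's code: `try_acquire`, `read_pid`, `pid_alive`, and the `settle` delay are hypothetical names chosen for illustration, and it assumes a POSIX system where signal 0 can be used to probe a pid.

```python
import os
import time
from typing import Optional


def pid_alive(pid: int) -> bool:
    """Best-effort check that a process with this pid exists (POSIX only)."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 probes for existence, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def read_pid(path: str) -> Optional[int]:
    """Return the pid stored in the lockfile, or None if missing or unreadable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (FileNotFoundError, ValueError):
        return None


def try_acquire(path: str, settle: float = 0.05) -> bool:
    """Try to become the single contender (hypothetical helper)."""
    me = os.getpid()
    flags = os.O_CREAT | os.O_EXCL | os.O_WRONLY  # creation fails if the file exists
    try:
        fd = os.open(path, flags, 0o644)
    except FileExistsError:
        holder = read_pid(path)
        if holder is not None and pid_alive(holder):
            return holder == me  # a live contender already holds the lock
        # The lock looks stale. Other contenders may be deleting it right now,
        # and one of them may delete the fresh lock we are about to create.
        try:
            os.unlink(path)
        except FileNotFoundError:
            pass
        try:
            fd = os.open(path, flags, 0o644)
        except FileExistsError:
            return False  # someone else re-created it first
    with os.fdopen(fd, "w") as f:
        f.write(str(me))
    # Wait briefly, then confirm the file STILL holds our pid. If another
    # contender replaced our lock in the meantime, we back off.
    time.sleep(settle)
    return read_pid(path) == me
```

The final re-read is the part that matters: exclusive creation alone is not enough, because a contender that judged the old lock stale a moment earlier can still unlink the new one.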

The server now also attempts to listen *first*, and only usurps a wedged process if the listen fails. This is more efficient in the common happy-path case, and involves only minimal waiting in the edge cases.
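
A rough sketch of the listen-first ordering, again in Python and not taken from this project. It assumes a Unix domain socket, uses a plain connection attempt as a stand-in for whatever health check the real server performs, and leaves out the lockfile that would guard the takeover. `listen_or_usurp`, `bind_unix`, and `can_connect` are hypothetical names.

```python
import errno
import os
import socket
from typing import Callable, Optional


def bind_unix(path: str) -> socket.socket:
    """Bind and listen on a Unix socket path, closing the socket on failure."""
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        s.bind(path)
        s.listen()
    except OSError:
        s.close()
        raise
    return s


def can_connect(path: str, timeout: float = 0.2) -> bool:
    """Stand-in health check: does anything accept connections on this path?"""
    c = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    c.settimeout(timeout)
    try:
        c.connect(path)
        return True
    except OSError:
        return False
    finally:
        c.close()


def listen_or_usurp(
    path: str, is_healthy: Callable[[str], bool] = can_connect
) -> Optional[socket.socket]:
    """Listen first; probe and replace the existing server only if that fails."""
    try:
        return bind_unix(path)  # happy path: nothing else owns the address
    except OSError as e:
        if e.errno != errno.EADDRINUSE:
            raise
    if is_healthy(path):
        return None  # an existing server is working, so leave it alone
    os.unlink(path)  # dead or wedged owner: remove its socket and take over
    return bind_unix(path)
```

In the happy path this costs a single bind call. The health probe and its timeout are only paid when the address is already taken.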

@isaacs force-pushed the isaacs/thundering-herd branch 2 times, most recently from 5916b69 to d710a88 (November 22, 2023 18:07)
@isaacs merged commit a7ae54a into main on Nov 28, 2023
20 checks passed