Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race condition when locking Linux processes #30

Closed
acj opened this issue Jul 31, 2021 · 1 comment · Fixed by #31
Closed

Race condition when locking Linux processes #30

acj opened this issue Jul 31, 2021 · 1 comment · Fixed by #31

Comments

@acj
Copy link
Contributor

acj commented Jul 31, 2021

We've had a couple of reports that locking processes on busy production servers can fail, returning an ESRCH error (no such process). My best guess is that it's related to threads exiting between the calls to self.threads() and thread.lock here, resulting in a lock error. The locking code is tolerant of new threads spinning up during the loop but immediately returns an error if a thread fails to lock.

Would it be acceptable for remoteprocess to log (e.g. debug/warn) failed locks but not return an error? Or to count the number of failed thread locks and only return an error if all of them fail? I'd be happy to make a PR if we can agree on a path.

Downstream issue with repro steps: rbspy/rbspy#334

@benfred
Copy link
Owner

benfred commented Jul 31, 2021

We should handle the race condition - we're tolerant of new threads being creating during this lock call, and we should also allow threads to exit .

How about we detect the in ESRCH error in thread.lock - and just ignore it ? I'd like to still fail out on permission denied errors, or if there were no threads we managed to lock though.

I'd be happy to accept a PR for this -

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants