Deadlock scenario in process._remote_check due to use of sub-process stdout/stderr pipes and read methods #62

Closed
cfsnyder opened this issue May 14, 2021 · 0 comments · Fixed by #63

Calling read methods on a subprocess's stdout/stderr pipes can cause deadlocks. The Python docs include the following warning:

Warning Use communicate() rather than .stdin.write, .stdout.read or .stderr.read to avoid deadlocks due to any of the other OS pipe buffers filling up and blocking the child process.

This can happen in _remote_check in process.py.

In this particular case, the deadlock occurs when the sub-process fills the OS stderr pipe buffer and blocks while trying to write more to stderr. Because the sub-process can no longer make progress, it never closes stdout, so stdout never reaches EOF and _remote_check stays blocked reading from stdout.
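A minimal sketch of the pattern the Python docs recommend, for illustration only. It is not necessarily how the fix is implemented, and `run_and_collect` and `CHILD_SOURCE` are hypothetical names that do not come from process.py. The child deliberately writes more to stderr than a typical OS pipe buffer holds (often 64 KiB on Linux) before writing to stdout, which is the situation described above.

```python
import subprocess
import sys

# Hypothetical child process: floods stderr first, then writes a short
# message to stdout. A parent that blocks on stdout.read() before draining
# stderr could hang on this child once the stderr pipe buffer fills.
CHILD_SOURCE = (
    "import sys\n"
    "sys.stderr.write('e' * 200000)\n"
    "sys.stdout.write('done')\n"
)


def run_and_collect(argv):
    """Run argv and return (returncode, stdout, stderr) as bytes."""
    proc = subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    # communicate() drains both pipes concurrently and then waits for the
    # child to exit, so neither pipe buffer can fill up and block the child.
    out, err = proc.communicate()
    return proc.returncode, out, err


returncode, out, err = run_and_collect([sys.executable, "-c", CHILD_SOURCE])
```

Here `communicate()` buffers all output in memory, which is fine for short check commands but may not suit a child that produces very large output.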

We discovered this issue when it caused a deadlock during an upgrade of a Ceph cluster from 15.2.10 -> 16.2.3. The deadlock brought our whole upgrade process to a halt.
