[BUG] salt.util.cloud connections hang indefinitely. #60216
Labels
Bug
broken, incorrect, or confusing behavior
bugfix-bckport
will be back-ported to an older release branch by creating a PR against that branch
Salt-Cloud
severity-high
2nd top severity, seen by most users, causes major problems
Description
If a host or network goes away during a cloud deployment, SSH connections to the box can be left in a hung state indefinitely.
Steps to Reproduce the behavior
Run 50+ deployments using the saltify cloud provider when some of the hosts might go away before the deployment finishes.
Expected behavior
We should add the ServerAliveInterval and ServerAliveCountMax options to all connections in salt.utils.cloud. Add these options anywhere we are setting the StrictHostKeyChecking option.
Values of these options can be something like ServerAliveInterval=10 and ServerAliveCountMax=3, which will detect network failures and time out after 30 seconds.
We can make these values configurable, but we should at least have some sane defaults for them. Making them configurable is not necessary to close this issue.
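A minimal sketch of the idea, not the actual salt.utils.cloud code: the helper names below (build_ssh_options, keepalive_timeout, build_ssh_command) are hypothetical, and the StrictHostKeyChecking=no value and the example host are assumptions for illustration only. It shows the keepalive options being added in the same place the StrictHostKeyChecking option is set, with the suggested defaults.

```python
import shlex

# Defaults suggested in this issue; the constant names are hypothetical.
DEFAULT_SERVER_ALIVE_INTERVAL = 10  # seconds between keepalive probes
DEFAULT_SERVER_ALIVE_COUNT_MAX = 3  # unanswered probes before ssh gives up


def build_ssh_options(server_alive_interval=DEFAULT_SERVER_ALIVE_INTERVAL,
                      server_alive_count_max=DEFAULT_SERVER_ALIVE_COUNT_MAX):
    """Return ssh -o options, adding keepalives next to StrictHostKeyChecking."""
    return [
        # Assumed value, shown only to mark where the new options would go.
        "-oStrictHostKeyChecking=no",
        "-oServerAliveInterval={}".format(server_alive_interval),
        "-oServerAliveCountMax={}".format(server_alive_count_max),
    ]


def keepalive_timeout(server_alive_interval=DEFAULT_SERVER_ALIVE_INTERVAL,
                      server_alive_count_max=DEFAULT_SERVER_ALIVE_COUNT_MAX):
    """Approximate seconds before ssh drops an unresponsive connection."""
    return server_alive_interval * server_alive_count_max


def build_ssh_command(host, remote_command, user="root", **keepalive_kwargs):
    """Assemble an ssh argument list; nothing is executed here."""
    return (
        ["ssh"]
        + build_ssh_options(**keepalive_kwargs)
        + ["{}@{}".format(user, host), remote_command]
    )


if __name__ == "__main__":
    # 203.0.113.10 is a documentation-only address used as a placeholder.
    cmd = build_ssh_command("203.0.113.10", "uptime")
    print(" ".join(shlex.quote(part) for part in cmd))
    print("dead connection detected after ~{}s".format(keepalive_timeout()))
```

With the defaults, ssh sends a probe every 10 seconds and disconnects after 3 unanswered probes, which gives the roughly 30-second timeout described above. Keeping the two values as keyword arguments leaves room to make them configurable later without changing call sites.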
Versions Report
Observed on 3002.5
This issue came from debugging #59903