Transport TCP minions don't reconnect/recover #39463
Comments
Looks like I'm able to replicate this. I used your exact steps, but after stopping the master I had to wait 5-10 minutes before starting it again to see this behavior. We will definitely need to get this fixed.
Thanks, this issue has cost me some hair ;)
I noticed a similar exception after upgrading to 2016.3.5 on Debian Jessie minions. A few times a week I get this:
I can't say that the minions lose their connections, though; they seem to be accessible without restarting. The master keeps running 24/7.
@githubcdr we have a pending PR that should fix this, if you want to take a look: #40614
I will give it a shot this week, work's kinda busy
Will close since this is resolved. Please let us know if we need to re-open if there is an issue with the fix.
Will this be making its way into 2016.11.4?
Yes
Could you tell me whether the above problem is solved? I use 2016.11.3 and also encountered the same problem.
It is in 2016.11.4
Can you confirm it is solved in version 2016.11.4?
I fixed it in 2016.11.4
Description of Issue/Question
When using "tcp" as the transport option, I noticed that minions don't always reconnect when the connection to the master is lost. Simply restart the salt-master and the minions are lost... forever :(
Setup
Salt master running on CentOS 6.8, with 250+ minions on CentOS 5, 6 and 7. The network is stable, and this issue seems to be related to 2016.11.2 only.
Steps to Reproduce Issue
Sometimes minions reconnect, but most of the time I see errors like this:
Failing minion
Working minion
I'm sure the salt-master has enough resources and there is no IO wait etc. I even disabled reactors and increased the worker threads, but no dice. The result seems to depend on "luck": when restarting the salt-master, sometimes minions reconnect.
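To put a number on the "luck", something like the sketch below could be run on the master after each restart. It is only a rough helper, not part of Salt: `find_lost_minions` and `ping_all` are names made up for this example, and the expected minion list has to come from your own inventory. The only Salt call it relies on is `salt.client.LocalClient().cmd()` with `test.ping`.

```python
def find_lost_minions(expected, responded):
    """Return the sorted minion IDs that were expected but did not respond."""
    return sorted(set(expected) - set(responded))


def ping_all(timeout=30):
    """Run test.ping against all minions and return the IDs that answered.

    Assumption: this runs on the master, as a user that may read the
    master configuration (typically root).
    """
    import salt.client  # imported lazily so the helper above works without Salt

    local = salt.client.LocalClient()
    result = local.cmd("*", "test.ping", timeout=timeout)
    # Only minions that explicitly returned True count as reconnected.
    return [minion for minion, ok in result.items() if ok is True]


# Example use on the master (minion IDs are hypothetical):
#   expected = ["web01", "web02", "db01"]
#   print(find_lost_minions(expected, ping_all()))
```

Running this a few minutes after every salt-master restart and keeping the output would show whether the same minions are lost each time or a random subset.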
Versions Report