Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Let minions try to connect to master indefinitly #50855

Merged
merged 2 commits into from Dec 17, 2018

Conversation

Projects
None yet
6 participants
@dwoz
Copy link
Contributor

commented Dec 13, 2018

What does this PR do?

Do not exit when the minion can't connect to a master.

What issues does this PR fix or reference?

#50854

Tests written?

No

Commits signed with GPG?

Yes

@dwoz

This comment has been minimized.

Copy link
Contributor Author

commented Dec 13, 2018

@Ch3LL @DmitryKuzmenko @garethgreenaway

I've tested that this also resolves #50791. Pun intended ;)

Scenario's tested:

  • Single master not started. The minion connects when master is started.
  • Multiple masters with 1 not being able to resolve. The minion fails over to second master.
  • Multiple master with 1 not started. The minion fails over to the second master.
  • Multiple masters with none started. The minion connects a master is started.
@damon-atkins

This comment has been minimized.

Copy link
Member

commented Dec 14, 2018

Random sleep time adjustment? i.e. to prevent all of the minions trying at about the same time to connect, and causing a wave effect against the master when it returns. It not I suggest its added to the code. + or - 10% of the interval time. interval - (interval * 0.05) + random(interval* 0.1)

@Ch3LL

This comment has been minimized.

Copy link
Contributor

commented Dec 14, 2018

@dwoz i believe when @garethgreenaway was trying to fix this initially it was to fix multi-master with dns (dns names as the master in the minion config). Did you happen to test with dns or ip addresses?

@Ch3LL

This comment has been minimized.

Copy link
Contributor

commented Dec 14, 2018

nevermind. I looked it up here: #49764 and it seems it should only work when using the new retry_dns_count configuratino

@dwoz

This comment has been minimized.

Copy link
Contributor Author

commented Dec 14, 2018

@Ch3LL I my testing I used DNS with resolvable and un-resolvable hosts.

@damon-atkins That is a good idea but this is a bugfix. Changes should go in a feature.

@cachedout

This comment has been minimized.

Copy link
Collaborator

commented Dec 14, 2018

@damon-atkins That is a good idea but this is a bugfix. Changes should go in a feature.

Given that this is still pre-release, this might still be feasible, IMHO.

@dwoz

This comment has been minimized.

Copy link
Contributor Author

commented Dec 15, 2018

@damon-atkins That is a good idea but this is a bugfix. Changes should go in a feature.

Given that this is still pre-release, this might still be feasible, IMHO.

I'd still like to see that go in a different PR to keep added functionality separate from the bugfix found here.

@cachedout

This comment has been minimized.

Copy link
Collaborator

commented Dec 17, 2018

We need to ensure this does not regress this fix, IMHO:

e66dc18

@dwoz

This comment has been minimized.

Copy link
Contributor Author

commented Dec 17, 2018

@cachedout We discussed this on slack and @garethgreenaway agreed that this change is what we want. Just waiting for @garethgreenaway to sign off by approving this PR.

@garethgreenaway garethgreenaway merged commit 80197bc into saltstack:fluorine Dec 17, 2018

3 of 10 checks passed

jenkins/pr/docs Testing docs...
Details
jenkins/pr/py2-centos-7 running py2-centos-7...
Details
jenkins/pr/py2-ubuntu-1604 running py2-ubuntu-1604...
Details
jenkins/pr/py2-windows-2016 running py2-windows-2016...
Details
jenkins/pr/py3-centos-7 running py3-centos-7...
Details
jenkins/pr/py3-ubuntu-1604 running py3-ubuntu-1604...
Details
jenkins/pr/py3-windows-2016 running py3-windows-2016...
Details
WIP Ready for review
Details
continuous-integration/jenkins/pr-merge This commit looks good
Details
jenkins/pr/lint Python lint test has passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.