nodegroups compound change between 2015.5.3 and 2015.8.10/2016.3.0 #33553
Comments
@MelkorLord, thanks for the report.
Hi,

I've taken some time to thoroughly investigate this issue from a user perspective. Here are my findings.

I used one of the systems I manage: a single host running Ubuntu Server 14.04 with 6 LXC instances, all of them Ubuntu 14.04 built by hand (debootstrap plus a few post-install scripts). One of the LXC instances is a salt-master, and every other LXC instance and the host run a salt-minion. The host and LXC instances had been running Salt 2015.5.3 without trouble for a long time.

I only worked on the salt-master (dedicated LXC instance) to see where the problem lies. I decided to upgrade Salt gradually, one step at a time.

1/ Change the APT repo and key to point to the new SaltStack repo (I was using the old PPA repo) and upgrade to the latest release of the 2015.5 branch, which is 2015.5.10. The upgrade went well and, surprise: the "nodegroups" issue described above does not show up, everything seems to work fine. This is strange but OK, proceed.

2/ Upgrade to the 2015.8 branch, which is 2015.8.10. Same as above! This is strange: on my other system, the "nodegroups" issue is clearly showing!

3/ Upgrade to the 2016.3 branch, which is 2016.3.0. A lot more packages were pulled in to upgrade Salt. The logs showed a complaint (the log is on one line; I broke it at dashes for readability):
at this point, if I

Anyway, the "nodegroups" issue is still not showing up, which is really confusing now!

4/ Then I remembered: I read something about cleaning up

Obviously, something that was "cached" in some form allowed the salt-master to behave correctly even after major upgrades, but the salt-master "nodegroups" handling breaks once that cache is cleared.

I hope this helps pinpoint the source of the problem. Sorry for being so lengthy, but I think more is better when trying to debug something :-)
Hi,

I'm getting really annoyed with Salt's behaviour; it is unpredictable at best in the current situation. I took some more actions to see what happened.

1/ Downgrade 2016.3.0 to 2015.8.10. Whether I keep /var/cache/salt/master or delete it, the result is the same: the "nodegroups" issue shows up.

2/ Downgrade 2015.8.10 to 2015.5.10. Same as above.

3/ Downgrade 2015.5.10 to 2015.5.3 (from PPA). Same as above.

This is a problem: even getting back to the original version does not fix the situation :-(

Fortunately, I backed up the LXC instance before playing with it. I stopped salt-master and salt-minion and restored only "/var/cache/salt/master". Restarting the salt-master (and minion) gave me back the "nodegroups" functionality as I want it to work.

Obviously, there's something in the way "/var/cache/salt/master" is handled that makes Salt behave erratically. Something got broken at least before 2015.5.3. I never had to clean up /var/cache/salt/master until I upgraded to 2016.3, which recommended it because of the "hash_type" option (deprecated md5 default).

I started using Salt with 0.17 (Ubuntu 14.04 official package), then used the PPA to upgrade to the 2014.x branch and then to 2015.5.3, where I stayed until I wanted to use repo.saltstack.com.

I hope this helps.
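For anyone who wants to repeat the backup-and-restore step from this comment without snapshotting a whole container, here is a minimal sketch. It assumes the cache lives at /var/cache/salt/master as in this thread. The archive path and the function names `backup_cache` and `restore_cache` are hypothetical, not something Salt provides.

```shell
#!/bin/sh
# Hypothetical helpers to snapshot and restore the salt-master cache directory.
# CACHE_DIR defaults to the path named in this thread; BACKUP_TAR is an assumed location.
CACHE_DIR="${CACHE_DIR:-/var/cache/salt/master}"
BACKUP_TAR="${BACKUP_TAR:-/root/salt-master-cache.tar.gz}"

backup_cache() {
    # Archive the cache relative to its parent directory so it restores in place.
    tar -czf "$BACKUP_TAR" -C "$(dirname "$CACHE_DIR")" "$(basename "$CACHE_DIR")"
}

restore_cache() {
    # Drop the current cache, then unpack the archived copy in its place.
    rm -rf "$CACHE_DIR"
    tar -xzf "$BACKUP_TAR" -C "$(dirname "$CACHE_DIR")"
}
```

Stop salt-master before calling either function and start it again afterwards (on Ubuntu 14.04 that would presumably be `service salt-master stop` and `service salt-master start`), so the daemon does not write to the directory while it is being archived or replaced.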
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. If this issue is closed prematurely, please leave a comment and we will gladly reopen the issue.
Description of Issue/Question
I manage different sets of servers. By "set" I mean one master plus several minions; the sets are independent of each other. On one set, I updated Salt from 2015.5.3 to the latest version (2016.3.0) to check for any functional regressions.
The nodegroups configuration now produces different results after the update.
Setup
salt-master relevant configuration part
This allows me to target the physical servers (Linux) and LXC containers.
Steps to Reproduce Issue
With version 2015.5.3 (and all versions I worked with before that)
salt '*' test.ping
salt -N Linux test.ping
salt -N LXC test.ping
all work exactly as expected.

Starting with version 2015.8.10, including 2016.3.0, we have:

salt '*' test.ping
=> Works as expected: OK

salt -N LXC test.ping
=> Returns "True" from all LXC targets, then stalls for some time and returns "Minion did not return. [Not connected]" from all physical servers, which should NOT have been targeted in the first place!

salt -N Linux test.ping
=> Returns
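To compare behaviour across versions, the reproduction commands above can be wrapped in a small script and run after each upgrade or downgrade. This is only a sketch: the function name `repro_nodegroups` and the overridable `SALT` variable are hypothetical. The nodegroup names Linux and LXC are the ones used in this report.

```shell
#!/bin/sh
# SALT can be overridden (e.g. SALT=echo) for a dry run; by default the real salt CLI is used.
SALT="${SALT:-salt}"

repro_nodegroups() {
    # Record the installed versions first, since the behaviour differs between releases.
    "$SALT" --versions-report
    # Glob targeting, reported to work on every version.
    "$SALT" '*' test.ping
    # Nodegroup targeting, using the group names from this report.
    for group in Linux LXC; do
        echo "== nodegroup: $group =="
        "$SALT" -N "$group" test.ping
    done
}
```

Saving the output of `repro_nodegroups` to a file per version makes it easy to diff which minions answered for each nodegroup.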