Description
All,
We use nodegroups in our monthly patching cycle, building the final list of hosts to patch from several "not" groups ANDed together. These groups are in turn grain- and list-based. As a result, 4 hosts consistently yield "Minion did not return" errors. Those minions should not have received any command at all, so this is a false error. [Sorry, this is hard to describe.]
Setup
All affected systems are running 3005.3.
All systems are directly connected to salt-master03.
Note: upgrade to 3006 is scheduled, but we're govt and can't just push the patch out.
Nodegroups.conf contains the following relevant lines (other comments and unrelated nodegroups elided):
nodegroups:
  patch-excluded: '' # systems that are not patched on an existing schedule, or are excluded this month
  patch-foundation-q: '( N@backup-servers or L@distro-master,salt-master03 ) and not N@patch-excluded'
  not-hpc-internal: 'G@hpc_internal:False'
  # has bug
  patch-normal: ' N@not-hpc-internal and not N@patch-excluded and not N@patch-foundation-q'
  # does not have bug
  #patch-normal: 'N@not-hpc-internal and not N@patch-excluded'
  # has bug
  #patch-normal: 'not N@patch-foundation and N@not-hpc-internal and not N@patch-excluded'
  backup-servers: 'L@backup-slave,backup-master'

I've tried moving the "backup-servers" nodegroup before patch-normal, but the problem is not order-dependent.
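To make the nesting easier to reason about, here is a minimal Python sketch that expands every `N@` reference in the active `patch-normal` definition into one flat compound expression. This is a simplified model written for this report, not Salt's own expansion code; the `expand` helper is hypothetical, and whether Salt produces the same string internally is an assumption I have not verified.

```python
# Nodegroup definitions copied from the config above (active patch-normal variant).
NODEGROUPS = {
    "patch-excluded": "",
    "patch-foundation-q": "( N@backup-servers or L@distro-master,salt-master03 ) and not N@patch-excluded",
    "not-hpc-internal": "G@hpc_internal:False",
    "patch-normal": " N@not-hpc-internal and not N@patch-excluded and not N@patch-foundation-q",
    "backup-servers": "L@backup-slave,backup-master",
}


def expand(name, groups, seen=()):
    """Hypothetical helper: replace each N@<group> token with that group's
    parenthesised body, recursively. Not Salt's implementation."""
    if name in seen:
        raise ValueError(f"circular nodegroup reference: {name}")
    words = ["("]
    for word in groups[name].split():
        if word.startswith("N@"):
            # Nested nodegroup: expand it in place.
            words.append(expand(word[2:], groups, seen + (name,)))
        else:
            words.append(word)
    words.append(")")
    return " ".join(words)


expanded = expand("patch-normal", NODEGROUPS)
print(expanded)
```

In this model the empty `patch-excluded` group expands to an empty pair of parentheses, and it appears twice in the final expression (once directly and once via `patch-foundation-q`). I am not claiming that is the cause, only that it is one of the things that differs between the nested and non-nested variants.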
Please be as specific as possible and give set-up details.
- on-prem machine
- VM (Virtualbox, KVM, etc. please specify): some of these are VMs, some are physical hardware.
- VM running on a cloud service, please be explicit and add details
- container (Kubernetes, Docker, containerd, etc. please specify)
- or a combination, please be explicit
- jails if it is FreeBSD
- classic packaging
- onedir packaging
- used bootstrap to install
Steps to Reproduce the behavior
This nodegroup setup should yield a list of ALL of our systems that are not hpc-internal (not-hpc-internal), EXCEPT the hosts specifically listed in N@backup-servers and N@patch-foundation-q.
The explicitly excluded hosts are distro-master, salt-master03, backup-slave, and backup-master.
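The intended result can be modelled with plain set arithmetic. The sketch below uses a made-up inventory: `system1` to `system3` and `hpc-node1` are placeholder names, and I am assuming the four excluded hosts have `hpc_internal: False`. It only illustrates what we expect the target list to be, not how Salt computes it.

```python
# Hypothetical inventory: host -> value of the hpc_internal grain.
# Only the four named hosts come from the report; the rest are placeholders.
HPC_INTERNAL = {
    "system1": False,
    "system2": False,
    "system3": False,
    "hpc-node1": True,
    "distro-master": False,   # assumed grain value
    "salt-master03": False,   # assumed grain value
    "backup-slave": False,    # assumed grain value
    "backup-master": False,   # assumed grain value
}

# G@hpc_internal:False
not_hpc_internal = {host for host, flag in HPC_INTERNAL.items() if flag is False}

# patch-excluded is currently the empty string, so it matches nothing.
patch_excluded = set()

# L@backup-slave,backup-master
backup_servers = {"backup-slave", "backup-master"}

# ( N@backup-servers or L@distro-master,salt-master03 ) and not N@patch-excluded
patch_foundation_q = (backup_servers | {"distro-master", "salt-master03"}) - patch_excluded

# N@not-hpc-internal and not N@patch-excluded and not N@patch-foundation-q
patch_normal = not_hpc_internal - patch_excluded - patch_foundation_q

print(sorted(patch_normal))
```

With this inventory, none of the four excluded hosts ends up in `patch_normal`, which is the behaviour we expect from the real nodegroup.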
These 4 hosts yield the error referenced:
salt -N patch-normal test.ping
system1:
True
system2:
True
system3:
AND MANY OTHERS
salt-master03:
Minion did not return. [No response]
The minions may not have all finished running and any remaining minions will return upon completion. To look up the return data for this job later, run the following command:
salt-run jobs.lookup_jid 20231108154033009594
backup-slave:
Minion did not return. [No response]
The minions may not have all finished running and any remaining minions will return upon completion. To look up the return data for this job later, run the following command:
salt-run jobs.lookup_jid 20231108154033009594
backup-master:
Minion did not return. [No response]
The minions may not have all finished running and any remaining minions will return upon completion. To look up the return data for this job later, run the following command:
salt-run jobs.lookup_jid 20231108154033009594
distro-master:
Minion did not return. [No response]
The minions may not have all finished running and any remaining minions will return upon completion. To look up the return data for this job later, run the following command:
salt-run jobs.lookup_jid 20231108154033009594
ERROR: Minions returned with non-zero exit code

This bug report is specifically about why these 4 nodes report this error; everything else is working as intended.
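For anyone trying to reproduce this, a small Python sketch like the one below can pull the non-responding minions out of the CLI text and compare them with the hosts that should have been excluded. The sample text is an abbreviated copy of the output above; the `non_responders` helper is hypothetical and assumes the default text output format shown here.

```python
# Abbreviated copy of the CLI output from this report.
SAMPLE_OUTPUT = """\
system1:
    True
system2:
    True
salt-master03:
    Minion did not return. [No response]
    The minions may not have all finished running and any remaining minions will return upon completion. To look up the return data for this job later, run the following command:
    salt-run jobs.lookup_jid 20231108154033009594
backup-slave:
    Minion did not return. [No response]
backup-master:
    Minion did not return. [No response]
distro-master:
    Minion did not return. [No response]
ERROR: Minions returned with non-zero exit code
"""

# Hosts that the nodegroup is supposed to exclude (from the report).
EXCLUDED = {"distro-master", "salt-master03", "backup-slave", "backup-master"}


def non_responders(text):
    """Hypothetical helper: return minion IDs whose block says 'Minion did not return'."""
    missing = []
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        # A minion header is a single token ending in ':' (no spaces).
        if stripped.endswith(":") and " " not in stripped:
            current = stripped[:-1]
        elif stripped.startswith("Minion did not return") and current:
            missing.append(current)
    return missing


missing = non_responders(SAMPLE_OUTPUT)
# Hosts that were reported as missing even though they should never have been targeted.
false_errors = sorted(set(missing) & EXCLUDED)
print(false_errors)
```

In our case every non-responding minion is in the excluded set, which is why we consider these false errors rather than real minion failures.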
Expected behavior
We expect the command to not generate errors for the 4 systems specifically excluded.
Versions Report
salt --versions-report
```shell
Salt Version:
  Salt: 3005.3

Dependency Versions:
  cffi: 1.14.6
  cherrypy: unknown
  dateutil: 2.8.1
  docker-py: Not Installed
  gitdb: 4.0.10
  gitpython: 3.1.37
  Jinja2: 3.1.0
  libgit2: Not Installed
  M2Crypto: Not Installed
  Mako: Not Installed
  msgpack: 1.0.2
  msgpack-pure: Not Installed
  mysql-python: Not Installed
  pycparser: 2.21
  pycrypto: Not Installed
  pycryptodome: 3.9.8
  pygit2: Not Installed
  Python: 3.9.18 (main, Nov 1 2022, 00:00:00)
  python-gnupg: 0.4.8
  PyYAML: 6.0.1
  PyZMQ: 23.2.0
  smmap: 5.0.1
  timelib: 0.2.4
  Tornado: 4.5.3
  ZMQ: 4.3.4

System Versions:
  dist: centos 9
  locale: utf-8
  machine: x86_64
  release: 5.14.0-370.el9.x86_64
  system: Linux
  version: CentOS Stream 9
```
I can pretty easily add/modify these nodegroups for testing - please let me know.