Reactor periodically can't find files in gitfs #47206
Could you add this logging so we can get some more insight:
Sure, I have that in place now and restarted the master. Will update when I get results.
Got a couple of hits:
Files are there, according to fileserver.file_list:
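For reference, that check can be run from the master with the `fileserver` runner; the `backend` and `saltenv` arguments are optional filters (a sketch, not the exact commands used in this issue):

```
# List files known to the master's fileserver (all backends):
salt-run fileserver.file_list

# Restrict to the gitfs backend and a specific environment:
salt-run fileserver.file_list backend=git saltenv=base
```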
Thanks for adding that information. Can you share your sanitized reactor master config? And when this occurs, can you include more sanitized debug output from before and after, to show more context?
Reactor config:
I've turned up logging to debug level, will post that when I see the issue again.
Got some more hits; unfortunately, it doesn't look like there's much more info even with debugging. I've included all the lines that I think might be relevant.
Yeah, doesn't look like much more information. Thanks for doing that. Ping @terminalmage, any other ideas here?
I have a similar problem on version 2017.7.5, with local files though. Reactor:
Master config:
Master logfile:

The reactor works about 50% of the time. The minion-created.sls file does indeed exist. I also have several other reactors configured, including beacons with 120+ minions reporting in, and a reactor on each job return (to check for highstate results). I removed the job-return reactor earlier today and things have been working better since then, but I still get errors occasionally.
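For context, a reactor mapping of the kind described above lives in the master config; the event tags and the second SLS path below are hypothetical (only `minion-created.sls` is named in this comment):

```yaml
# Master config: map event tags to reactor SLS files.
reactor:
  - 'salt/minion/*/start':       # fires when any minion connects
    - /srv/reactor/minion-created.sls
  - 'salt/job/*/ret/*':          # fires on every job return
    - /srv/reactor/check-highstate.sls
```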
I have the exact same issue on 2018.3.0 also with local files. The reactor basically stopped working, with the following errors in the master log file:
versions report:
I know it stopped working today, because I have successful reactor events from yesterday. I tried clearing the minion cache (which also runs on the master), but that didn't help. The reactor files themselves are located in … Restarting the …
@anitakrueger, @johje349 - are you running masters in a failover configuration by any chance? That's what we're doing and it just occurred to me that it might have some impact on this.
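For anyone comparing setups: failover is configured on the minion side, roughly like this (hostnames and the interval are illustrative, not taken from this issue):

```yaml
# Minion config for a failover multi-master setup.
master_type: failover
master:
  - master1.example.com
  - master2.example.com
master_alive_interval: 30   # seconds between master liveness checks
```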
@clallen no, no multiple masters here. Just a single master with about 100 minions. |
This is possibly a duplicate of #47539, which is assigned to a core engineer.
#47539 is marked fixed "and it will be in the 2018.3.4 release" |
I haven't seen the issue after upgrading masters to 2018.3.3 about 6 months ago, so I'll close this. |
Description of Issue/Question
Every so often (sometimes days, sometimes minutes) the reactor will throw these errors:
However, files are always accessible via states, and fileserver.file_list shows all files available.
Restarting the master fixes it for a while.
I tried turning on granular debugging for salt.utils.reactor and salt.loaded.int.module.cp, but didn't see anything that looked useful.
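The granular debugging mentioned above is set via `log_granular_levels` in the master config; the module names below are the ones from this issue:

```yaml
# Master config: raise log verbosity only for specific modules.
log_granular_levels:
  'salt.utils.reactor': 'debug'
  'salt.loaded.int.module.cp': 'debug'
```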
I have also tried shutting down the master, deleting all the gitfs cache files, and restarting it to rebuild them. The issue still comes back a while later.
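For reference, the cache-clearing procedure described above looks roughly like this, assuming the default cache location of `/var/cache/salt/master`; paths and service manager may differ on your install:

```
systemctl stop salt-master
rm -rf /var/cache/salt/master/gitfs
rm -rf /var/cache/salt/master/file_lists/gitfs
systemctl start salt-master
```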
I realize we're using a pretty old version and will be upgrading this year; I am mainly looking for clues about where to look for the cause. If I can patch it temporarily, that will be fine with me.
Setup
Master config settings:
Steps to Reproduce Issue
None that I know of, it works for a while and then starts failing.
Versions Report
Salt Version:
Salt: 2016.11.6
Dependency Versions:
cffi: 1.10.0
cherrypy: unknown
dateutil: 2.6.1
docker-py: Not Installed
gitdb: 2.0.2
gitpython: 2.1.5
ioflo: Not Installed
Jinja2: 2.9.6
libgit2: Not Installed
libnacl: 1.5.1
M2Crypto: Not Installed
Mako: Not Installed
msgpack-pure: Not Installed
msgpack-python: 0.4.8
mysql-python: Not Installed
pycparser: 2.18
pycrypto: 2.7a1
pycryptodome: Not Installed
pygit2: Not Installed
Python: 2.7.11 (default, Jul 18 2017, 12:45:26)
python-gnupg: Not Installed
PyYAML: 3.12
PyZMQ: 16.0.2
RAET: Not Installed
smmap: 2.0.3
timelib: 0.2.4
Tornado: 4.5.1
ZMQ: 4.1.6
System Versions:
dist: redhat 7.4 Maipo
machine: x86_64
release: 4.1.12-112.14.10.el7uek.x86_64
system: Linux
version: Red Hat Enterprise Linux Server 7.4 Maipo