[BUG] Pillar fetch timeouts on proxy minion after ~60 seconds #63824
Comments
Please update the issue title to something descriptive, e.g. "[BUG] Cannot increase timeout for custom pillar". Though note that a pillar that takes that long is going to cripple your entire system, as
In my case it takes about 2-3 minutes to fetch the data from an external database. My plan was to use the pillar cache to speed things up after the first fetch, but for the pillar cache to kick in, the data has to be fetched at least once.
That is incredibly slow for a database query. I'm going to guess this is also more than just secret information? If it is that slow, and you really can't fix it, at least fetch it as part of the state template, not in pillar.
It is what it is; I can only speed up the DB query to a certain extent. But in the general case, forcing a 60-second window for the master to run through all external pillars, render the result, and send it back to the minion sounds a bit concerning. What I am trying to say is that there might be legitimate cases where a pillar fetch takes longer than 60 seconds, and allowing the user to adjust the system's behaviour to accommodate those cases is desirable.
Again, if pillar does take that long, the whole Salt installation will be almost useless. Pillar needs to be fast. Do not use it to store arbitrary data, especially data that takes minutes(!) to build. And a database query that also takes minutes to run is a big indicator of a massive design failure. I agree that the timeout should be configurable, but with a view to making it shorter, not longer.
OK, what is a reasonable amount of time for pillar to finish its work in that case? Also, is there anything else we can use instead of pillar or sourcing data on the fly during state execution? The idea was to fetch the data into pillar once, use the pillar cache, and so have all the necessary data available to the proxy minion for fast state execution and template rendering, while also lowering the load on the DB and making the Salt master the only entity that has to be whitelisted on the DB side.
I'm running into this same timeout. Modifying the call to We use gpg encrypted pillar data and have about 140 entries to decode. Gpg-agent is single threaded and on an Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz that seems to take about 80s. There doesn't seem to be any way to speed this up - gpg-agent is the bottleneck. We'd gladly convert to something faster. For now we use the pillar cache and run |
Description
Pillar fetch on a proxy minion times out after ~60 seconds.
Setup
SaltStack 3005.1
Master and proxy minion running in containers on a Rocky Linux VM in VirtualBox
Steps to Reproduce the behavior
Create an external pillar and make it sit doing nothing for longer than 60 seconds.
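A minimal sketch of such a slow external pillar, for reproduction purposes only. The module name `slow_pillar` and the `delay` parameter are made up for illustration; the `ext_pillar(minion_id, pillar, ...)` entry point is the standard interface for custom external pillar modules. This assumes the file is saved as `slow_pillar.py` in a `pillar` subdirectory of the master's extension modules directory.

```python
import time


def ext_pillar(minion_id, pillar, delay=90):
    """Simulate a slow data source by sleeping before returning pillar data.

    `delay` is a hypothetical keyword argument taken from the master config,
    for example (assumed layout):

        ext_pillar:
          - slow_pillar:
              delay: 90
    """
    # Stand-in for a slow database query or decryption step.
    time.sleep(delay)
    # Return a small dict so the result is easy to spot in pillar.items.
    return {"slow_pillar": {"minion": minion_id, "delay": delay}}
```

With `delay` set above 60, a pillar refresh on the proxy minion should hit the timeout described in this report.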
Expected behavior
Being able to configure the external pillar timeout as a parameter.
Versions Report
salt --versions-report
(Provided by running salt --versions-report. Please also mention any differences in master/minion versions.)
Additional context
I created a custom external pillar, but it takes longer than 60 seconds to fetch its data. While the master is still retrieving that data, the proxy minion errors out with a timeout error and keeps sending new pillar refresh requests to the master.
Getting this traceback on proxy-minion:
I tried setting the master and proxy minion timeout parameter to a large value, as recommended here, but that does not seem to have any effect.
I looked at the traceback and found this call - it seems the timeout value is hardcoded to 60 seconds here, and I was not able to figure out a way to influence that value through proxy minion or master settings or command-line parameters.
If the pillar fetch timeout is not configurable at the moment, a setting to control that timeout value would be a very useful feature to implement as a workaround for this behaviour.
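As a rough sketch of what is being requested, the timeout could be looked up from the minion options and fall back to the current 60 seconds. The option name `pillar_timeout` and the helper `resolve_pillar_timeout` are hypothetical and are not existing Salt settings or functions; this only illustrates the desired behaviour.

```python
DEFAULT_PILLAR_TIMEOUT = 60  # seconds; the value that appears to be hardcoded


def resolve_pillar_timeout(opts, default=DEFAULT_PILLAR_TIMEOUT):
    """Return the pillar fetch timeout in seconds.

    `opts` is a plain dict of minion options; `pillar_timeout` is a
    hypothetical key used here for illustration only.
    """
    value = opts.get("pillar_timeout", default)
    try:
        timeout = float(value)
    except (TypeError, ValueError):
        # Ignore values that are not numbers and keep the default.
        return default
    # Reject zero or negative timeouts.
    return timeout if timeout > 0 else default
```

A proxy minion config with `pillar_timeout: 300` would then wait up to five minutes for the master, while configs without the key keep the existing behaviour.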