Can't get batch mode and --failhard to work as expected #29643
Comments
I've also tried following the events by running
As a test I tried adding some logging to
Also, if I remove
Also this seems to indicate a bug in how
@jacksontj thoughts on this?
To summarize, I think there are two issues:
Does https://docs.saltstack.com/en/latest/ref/cli/salt-call.html#cmdoption-salt-call--retcode-passthrough
Not that I've seen. I think that's only supported by
thanks for the ping, @matthayes. This looks like something we need to get fixed.
I can help diagnose the issue on my side but I need some pointers for what code to check. I've been adding logging to dig deeper but I've reached the limit of my familiarity with the Salt code. Can you point me to where in the code the minion returns the retcode to the master? Or let me know if there is something I can do to help find the problem. I was working on automating some deployment but now I'm stuck until this is fixed so I am motivated to identify the problem :)
If I'm not mistaken, modules can set their retcode in the special If I'm not mistaken, the minion pulls that retcode here: Lines 1166 to 1169 in bbc1034
Thanks, I've added some more logging. At the end of the
So, it is setting
Actually, correction: this is the content of
I thought it had no value because I was using
So, what is the relationship between
I feel like they should have the same content but I haven't spent enough time digging through the context around
What should I investigate next then? What is the purpose of this pack object anyways? Can
The loader inserts the globals in the pack dictionary into functions, at call time I think? I haven't had time today to look around more for more leads to give you, unfortunately. I'll see if I can find some time this afternoon.
Who knows the most about how this pack/lazy-loader system is supposed to work? I logged the object ID of the context objects to see if they were the same object and the IDs don't match. That is, the object ID of
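The identity check described above can be illustrated generically. The names `pack`, `shared`, and `copied` below are hypothetical stand-ins, not Salt's actual objects; the sketch only shows that when two names refer to different dict objects, a value written through one is not visible through the other.

```python
# Hypothetical stand-in for a loader "pack" holding a context dict.
pack = {"__context__": {}}

shared = pack["__context__"]        # another name for the same dict object
copied = dict(pack["__context__"])  # a separate dict with the same contents

# Write a retcode through the shared reference only.
shared["retcode"] = 2

print(shared is pack["__context__"])  # True: same object, same id()
print(id(shared) == id(copied))       # False: the copy is a different object
print(pack["__context__"].get("retcode"))  # 2: visible through the pack
print(copied.get("retcode"))               # None: the copy never saw the write
```

Mismatched IDs on their own only prove that two objects are distinct; whether that explains a missing `retcode` depends on which object each side reads from.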
Okay, well here's something else interesting. I changed
At the beginning of I'm assuming the purpose of |
I got it working with a very small change. I just don't know if this is correct so would like someone with knowledge of I figured in
This gives me back the expected value of 2. When I update the code in
I can prepare a pull request and test it, but I'd like to know: is this the right solution? One thing that concerns me in general is whether this
Interesting! I guess my main question is why the
@matthayes what version of salt are you running?
salt 2015.8.3 (Beryllium)
Thanks.
Alright, next piece of debugging: can you log the actual object address of
Ok @basepi, I was wrong and you were right! I think I found the issue. As you said, 'retcode' is indeed in the minion response. The issue is exactly here: The retcode key is outside the data['ret'] dict. The return batch.run() is giving to get_retcode is this:
After changing the mentioned line to
The return batch.run() is giving to get_retcode is this:
And now the --failhard is working as expected!
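The mismatch described in this comment can be sketched as follows. The sample data and both helper functions are hypothetical (the actual return printed in the thread is not reproduced here, and this is not Salt's `get_retcode`); the sketch only illustrates why looking for `retcode` inside the `ret` dict misses a key that sits beside it.

```python
def retcode_from_inner(data):
    """Look for 'retcode' inside data['ret'] (misses a sibling key)."""
    ret = data.get("ret")
    if isinstance(ret, dict):
        return ret.get("retcode", 0)
    return 0


def retcode_from_sibling(data):
    """Look for 'retcode' next to 'ret', at the top level of the return."""
    return data.get("retcode", 0)


# Made-up per-minion return: 'retcode' is a sibling of 'ret', not a member of it.
minion_return = {
    "ret": {"some_state_id": {"result": False}},
    "retcode": 2,
}

print(retcode_from_inner(minion_return))    # 0: the failure is missed
print(retcode_from_sibling(minion_return))  # 2: the failure is seen
```

With the first lookup every minion appears to succeed, so a fail-hard check built on it would never trigger.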
Hey @basepi I'm noticing another strange issue with failhard that affects test mode. I'm not sure if this is related. I can open another issue if you think that's best. I'm executing
@matthayes That's a weird one. Do any of the states it does run fail? It's possible for things to fail even in test mode.
(But to answer your question, probably a separate issue.)
Nope, no failures for the states that ran.
@matthayes I've just fixed this. All details provided in the PR #31164
This may fix an issue I've opened for a few months already. #24996
@danlsgiga you're welcome!
@DmitryKuzmenko thanks for fixing this!
I'm using a command like this below to execute `state.highstate` on one machine at a time, with the intention that it stop on the first failure. But, it's not stopping even though there is a failure.

The cause of the failure is a `http_test.ping` custom module function I wrote that checks that an HTTP call is succeeding. I gave it a bogus path to force it to fail. I'm testing this against two minions. Looking at the minion logs I can see the calls are failing. But, the output I get from the salt command above is essentially:

So, you can see that the `Ping the web app` call is failing and has `Result: False`. This happens for both machines. But, the `retcode` for both machines is apparently 0. I ran the same command again with `--out=yaml` appended, which showed me that `retcode` is 0 for both machines.

Maybe this isn't a problem with the batch mode but instead a problem with how `retcode` is determined. Am I missing something? Shouldn't the `retcode` be non-zero since the state has `Result: False`?
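To make the expectation in the question concrete, here is a minimal sketch of deriving an exit code from a highstate-style return. The dictionary shape (state ID mapped to a dict with a `result` key), the sample state IDs, and the helper name `retcode_from_states` are assumptions for illustration, not Salt's actual implementation; the non-zero value 2 is chosen to match the expected value mentioned in the comments.

```python
def retcode_from_states(state_returns):
    """Return 0 if no state failed, 2 otherwise (hypothetical helper)."""
    for data in state_returns.values():
        # Only an explicit False counts as a failure here; None is not treated as one.
        if data.get("result") is False:
            return 2
    return 0


# Made-up sample resembling one minion's highstate return.
sample = {
    "ping_the_web_app": {"result": False, "comment": "HTTP call failed"},
    "install_package": {"result": True, "comment": "already installed"},
}

print(retcode_from_states(sample))
```

Under this rule a single failed state is enough to make the whole run report a non-zero code, which is the behavior the question expects.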