Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

minion retcode is incorrect #30914

Closed
plastikos opened this issue Feb 4, 2016 · 8 comments
Closed

minion retcode is incorrect #30914

plastikos opened this issue Feb 4, 2016 · 8 comments
Labels
Bug broken, incorrect, or confusing behavior Core relates to code central or existential to Salt P1 Priority 1 severity-medium 3rd level, incorrect or bad functionality, confusing and lacks a work around stale State-Compiler
Milestone

Comments

@plastikos
Copy link
Contributor

Certain state failures do not set retcode properly. Consequently the master does not enumerate the minions as failed and summary information is incorrect. This appears to be related to states that cause additional salt calls that instantiate new __context__. The roll-up of child __context__ to the parent __context__ fails and then __context__['retcode'] is incorrect.

It is notable that when the master parses the return information for colorization that the failed states are correctly colorized due to state return structures correctly indicate the failure. Unfortunately the overall success/failure is determined by reading retcode which is incorrect. Consequently summary information provided by -v --summary is incorrect. This can go unnoticed when visually scanning the output from a few minions, but when there are thousands of minions the summary information must be reliable.

Put the following contents in junk/foo.sls:

#!jinja|yaml

foo:
  cmd.run:
    - name: 'true'
    - check_cmd:
      - 'false'

The following is expected output:

[root@orc1.ut1 ~]# salt -v --summary minion-a state.apply junk.foo
Executing job with jid 20160204153220765528
-------------------------------------------

minion-a:
----------
          ID: foo
    Function: cmd.run
        Name: true
      Result: False
     Comment: check_cmd determined the state failed
     Started: 15:32:22.346851
    Duration: 785.631 ms
     Changes:
              ----------
              pid:
                  13114
              retcode:
                  0
              stderr:
              stdout:

Summary for minion-a
------------
Succeeded: 0 (changed=1)
Failed:    1
------------
Total states run:     1
Total run time: 785.631 ms


-------------------------------------------
Summary
-------------------------------------------
# of minions targeted: 1
# of minions returned: 1
# of minions that did not return: 0
# of minions with errors: 1
Minions with failures: minion-a
-------------------------------------------
ERROR: Minions returned with non-zero exit code

While minion-a shows a Failed: 1 it does not show as a minion with errors.

@jfindlay jfindlay added Bug broken, incorrect, or confusing behavior severity-medium 3rd level, incorrect or bad functionality, confusing and lacks a work around P2 Priority 2 P1 Priority 1 State-Compiler Core relates to code central or existential to Salt and removed P2 Priority 2 labels Feb 5, 2016
@jfindlay jfindlay added this to the Approved milestone Feb 5, 2016
@jfindlay
Copy link
Contributor

jfindlay commented Feb 5, 2016

@plastikos, thanks for reporting. Similar to #18510.

@plastikos
Copy link
Contributor Author

@jfindlay, I don't think these are the same. I'm not concerned (in this ticket) about the overall exit code of salt.

@basepi
Copy link
Contributor

basepi commented Feb 22, 2016

Possibly refs #29643, and might be fixed by #31164

@plastikos
Copy link
Contributor Author

@DmitryKuzmenko states this is broken with transport: tcp.

This is a show-stopper for any upgrades at our site.

@DmitryKuzmenko
Copy link
Contributor

@plastikos have you checked this with #29643 applied? The fix is merged into 2016.3 and not released yet.

@plastikos
Copy link
Contributor Author

I'll be getting back to this on Wednesday.

Here's a big question: Isn't it obvious what this integration test does? Why not just merge it since it should succeed?

@plastikos
Copy link
Contributor Author

I have checked and it's still broken with #29643. My guess is that #29643 fixes a broader case that is necessary, but there's an additional situation that also causes __context__ to be reset for the specific case I have.

@stale
Copy link

stale bot commented Apr 16, 2018

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

If this issue is closed prematurely, please leave a comment and we will gladly reopen the issue.

@stale stale bot added the stale label Apr 16, 2018
@stale stale bot closed this as completed Apr 23, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug broken, incorrect, or confusing behavior Core relates to code central or existential to Salt P1 Priority 1 severity-medium 3rd level, incorrect or bad functionality, confusing and lacks a work around stale State-Compiler
Projects
None yet
Development

No branches or pull requests

4 participants