Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

flux-exec: add two helper outputs #2005

Merged
merged 6 commits into from Feb 12, 2019

Conversation

@chu11
Copy link
Contributor

commented Feb 12, 2019

On normal completion, output to stderr the tasks that exited with non-zero exit status. Here's example output:

>flux exec -n sh -c ~/test.sh
[1,3]: Exit 1
2: Exit 2

and when I ctrl+c

>flux exec -n sh -c "sleep 10"
^C[0-3]: Exit 130

Also output a list of tasks that were still running when you do a force-exit (multiple ctrl+c in under a second). Example output:

>flux exec -n sh -c ~/ignore_sigint.sh
^C^Cinterrupt (Ctrl+C) one more time within 1.000000 sec to exit
^C[0-3]: command still running at exit

and one dumb commit to remove a struct var that is no longer used.

and one dumb commit to limit the decimal places in a message output.

@grondo

This comment has been minimized.

Copy link
Contributor

commented Feb 12, 2019

Thanks, seems to work great!

Would it be too much trouble to use strsignal(3) on exit status where WIFSIGNALED is true, and also report if there was a coredump (mimicking the shell), e.g. instead of

$ flux exec ./segf
[0-3]: Exit 139

something like

$ flux exec ./segf
[0-3]: Segmentation fault (core dumped)

(This is probably low priority)

We maybe need the convenience function back suprocess_exit_tostring (p) or something...

@chu11 chu11 force-pushed the chu11:issue1998 branch from 3c6b8c6 to 9d8c2eb Feb 12, 2019
@chu11

This comment has been minimized.

Copy link
Contributor Author

commented Feb 12, 2019

oops

flux-exec.c:447:9: error: implicit declaration of function 'zhashx_foreach'; did you mean 'zhashx_freefn'? [-Werror=implicit-function-declaration]
         zhashx_foreach (exitsets, output_exitsets, NULL);
         ^~~~~~~~~~~~~~
         zhashx_freefn

Looks like deprecated zhashx_foreach() was removed in newer versions. Repushed with updated version using zhashx_first/next

@chu11

This comment has been minimized.

Copy link
Contributor Author

commented Feb 12, 2019

Would it be too much trouble to use strsignal(3) on exit status where WIFSIGNALED is true

Not a problem. In hindsight, it's obvious I should have done that :-)

chu11 added 5 commits Feb 12, 2019
Remove double call to flux_subprocess_exit_code().
Signals can only be > 0, can't be >= 0.
Output idsets of all tasks that failed with a non-zero exit
code.

Fixes #1998
On forced exit, output a convenience message to stderr indicating
the ranks that commands are still running on.
@chu11 chu11 force-pushed the chu11:issue1998 branch from 9d8c2eb to dc53703 Feb 12, 2019
@chu11

This comment has been minimized.

Copy link
Contributor Author

commented Feb 12, 2019

re-pushed with a couple stupid fixes I found, and using strsignal()'s output in place of "Exit num" for processes died via signal.

@grondo
grondo approved these changes Feb 12, 2019
Copy link
Contributor

left a comment

LGTM!

@grondo

This comment has been minimized.

Copy link
Contributor

commented Feb 12, 2019

I applied the merge-when-passing label and gave a review, so we'll see if this get auto-merged.

If you're not ready for merge yet, just remove the label before travis finishes, but this looks good to me and I ran some tests by hand on fluke.

@chu11

This comment has been minimized.

Copy link
Contributor Author

commented Feb 12, 2019

should it have auto-merged by now?

@grondo

This comment has been minimized.

Copy link
Contributor

commented Feb 12, 2019

Well that was kind of a failure. Codecov never updated the status of your last push to success. As an experiment I restarted the coverage tester so see if that forces a status update.

Hopefully it isn't a case where codecov only updates status on the first push...

Edit: BTW, you can see the "status" of the mergify conditions by clicking the "Status" tab above. In this case it showed missing codecov/project status check.

@codecov-io

This comment has been minimized.

Copy link

commented Feb 12, 2019

Codecov Report

Merging #2005 into master will increase coverage by 0.04%.
The diff coverage is 68.88%.

@@            Coverage Diff             @@
##           master    #2005      +/-   ##
==========================================
+ Coverage   80.14%   80.18%   +0.04%     
==========================================
  Files         195      195              
  Lines       35150    35191      +41     
==========================================
+ Hits        28171    28219      +48     
+ Misses       6979     6972       -7
Impacted Files Coverage Δ
src/cmd/flux-exec.c 75.78% <68.88%> (-1.69%) ⬇️
src/modules/kvs/kvs.c 66.47% <0%> (+0.14%) ⬆️
src/common/libflux/message.c 81.64% <0%> (+0.36%) ⬆️
src/modules/connector-local/local.c 74.81% <0%> (+1.03%) ⬆️
src/common/libflux/mrpc.c 88.93% <0%> (+1.18%) ⬆️
src/common/libflux/response.c 82.71% <0%> (+1.23%) ⬆️
src/modules/barrier/barrier.c 78.62% <0%> (+2.06%) ⬆️
@mergify mergify bot merged commit 590129c into flux-framework:master Feb 12, 2019
3 of 4 checks passed
3 of 4 checks passed
codecov/patch 68.88% of diff hit (target 80.14%)
Details
Mergify — Summary 1 rule matches
Details
codecov/project 80.18% (+0.04%) compared to 790562e
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@grondo

This comment has been minimized.

Copy link
Contributor

commented Feb 12, 2019

yay, it worked. 😒
Maybe we'll rethink making codecov/project one of the conditions. Reviewers just won't approve PRs or set merge-when-passing if coverage doesn't look right...

@garlick

This comment has been minimized.

Copy link
Member

commented Feb 12, 2019

That sounds reasonable to me.

@chu11

This comment has been minimized.

Copy link
Contributor Author

commented Feb 12, 2019

hmmm, issue #1998 wasn't automatically closed. I doubt that would be due to mergify? Nothing in the github docs suggests that a owner of the repo must do the merge to auto-close issues?

@grondo

This comment has been minimized.

Copy link
Contributor

commented Feb 12, 2019

Oh, that's a head scratcher...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.