@artpol84
Contributor

The fact that an application proc called Abort (read failed) doesn't
mean that the ORTE subsystem has failed. On the contrary, it is doing
its job of gracefully exiting the whole application.

An orted exiting with non-zero status creates a problem, at least in
plm/slurm environments where orteds are launched via `srun` with the
"--kill-on-bad-exit" flag. If one of the orteds exits with non-zero
status, Slurm immediately kills all the other orteds. As a result,
we see a lot of leftovers in the `/tmp` directory.

(ported from 4af7a08)

Signed-off-by: Artem Polyakov <artpol84@gmail.com>
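
To illustrate the failure mode described above, here is a toy model in Python. It is not Open MPI or Slurm code: the function name `surviving_cleanups` and its parameters are hypothetical, and it assumes that a daemon which exits on its own has finished its cleanup, while a daemon killed by the launcher has not.

```python
# Toy model only: all names here are hypothetical, not Open MPI or Slurm APIs.

def surviving_cleanups(exit_statuses, kill_on_bad_exit=True):
    """Return the indices of daemons that get to finish their own cleanup.

    exit_statuses lists each daemon's exit status in the order the
    daemons finish. With kill_on_bad_exit set, the first non-zero status
    makes the launcher kill every daemon that has not finished yet, so
    those daemons never reach their cleanup code.
    """
    cleaned = []
    for index, status in enumerate(exit_statuses):
        cleaned.append(index)  # this daemon exited on its own, so its cleanup ran
        if kill_on_bad_exit and status != 0:
            break  # the launcher kills all remaining daemons
    return cleaned


# A daemon reports the application's abort as its own failure:
# the other three daemons are killed before they can clean up.
print(surviving_cleanups([1, 0, 0, 0]))  # [0]

# Daemons exit with 0 even though the application aborted:
# every daemon finishes its cleanup.
print(surviving_cleanups([0, 0, 0, 0]))  # [0, 1, 2, 3]
```

In this model the only difference between the two calls is the exit status of the first daemon, which is what this change targets.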

@artpol84 artpol84 added the bug label Apr 14, 2017
@artpol84 artpol84 added this to the v2.0.3 milestone Apr 14, 2017
@artpol84 artpol84 requested a review from rhc54 April 14, 2017 21:09
@hppritcha
Member

bot:lanl:retest

Contributor

@rhc54 rhc54 left a comment

Does this need to go up to the 2.x series as well?

@artpol84
Contributor Author

artpol84 commented May 5, 2017

We need this in 2.x!

@artpol84
Contributor Author

artpol84 commented May 5, 2017

I think it was already merged: #3359

@rhc54
Contributor

rhc54 commented May 5, 2017

Okay, good - I couldn't remember.

@hppritcha hppritcha merged commit 43b6f00 into open-mpi:v2.0.x May 12, 2017