No error output when using mpirun with hpx #1221

Closed
pagrubel opened this Issue Aug 12, 2014 · 7 comments

@pagrubel
Member

pagrubel commented Aug 12, 2014

When running HPX with mpirun, errors are not reported. To reproduce, run an application that was not compiled with the MPI parcelport: it hangs. Other errors also cause a hang, but I can't tell you what they are because no error output is shown.

@parsa

Contributor

parsa commented Aug 12, 2014

I think it's a problem with SLURM. It doesn't happen only for MPI applications run with mpirun; it happens unless you open an interactive bash session with srun. If you try to build something that has compilation errors, you will not see those errors and the SLURM job just hangs.

hkaiser added this to the 0.9.9 milestone Aug 12, 2014

@hkaiser

Member

hkaiser commented Aug 12, 2014

What do I need to do to reproduce the issue?

@sithhell

Member

sithhell commented Aug 12, 2014

This is a non-issue. There is not much we can do here. It's perfectly
valid to run non-MPI applications with mpirun. Try "mpirun echo hello" for
example. The problem is that some mpirun implementations don't forward the
environment properly. As a result, every HPX process waits for someone
else to connect. It's a configuration problem on the user side that is not
really detectable.
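
One way to check on the user side whether the launcher's environment actually reaches each process is a small probe script, started the same way as the "mpirun echo hello" example. This is only a sketch: the variable names are the rank/size variables exported by Open MPI, PMI-based launchers, and SLURM, and whether HPX looks at exactly these variables is an assumption here. The function name detect_launcher_env is hypothetical.

```python
import os

# Rank/size variables exported by common launchers (Open MPI, PMI-based
# launchers such as MPICH's Hydra, and SLURM). Assumption: these are the
# variables of interest; HPX may consult a different or larger set.
LAUNCHER_VARS = {
    "Open MPI": ("OMPI_COMM_WORLD_RANK", "OMPI_COMM_WORLD_SIZE"),
    "PMI": ("PMI_RANK", "PMI_SIZE"),
    "SLURM": ("SLURM_PROCID", "SLURM_NTASKS"),
}


def detect_launcher_env(environ=None):
    """Return {launcher: (rank, size)} for launchers whose variables are both set."""
    environ = os.environ if environ is None else environ
    found = {}
    for launcher, (rank_var, size_var) in LAUNCHER_VARS.items():
        if rank_var in environ and size_var in environ:
            found[launcher] = (int(environ[rank_var]), int(environ[size_var]))
    return found


if __name__ == "__main__":
    info = detect_launcher_env()
    if not info:
        # Nothing was forwarded: every process would look like a lone console.
        print("no launcher rank/size variables visible to this process")
    for launcher, (rank, size) in info.items():
        print(f"{launcher}: rank {rank} of {size}")
```

If every process started by mpirun reports that no variables are visible, the environment is probably not being forwarded, which would match the hang described above.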

@sithhell

Member

sithhell commented Aug 13, 2014

Closing as won't fix, as I think there is nothing we can do here.

@hkaiser

Member

hkaiser commented Aug 13, 2014

Well, I have not even understood what the problem was... I was waiting for some explanations.
Why have you closed this?

@sithhell

Member

sithhell commented Aug 13, 2014

I closed it because I got no reply to my explanation above... So here it is again:
mpirun a.out has the effect of launching a.out. This might launch N processes or just one, possibly on different machines, depending on the environment. Now it might happen that mpirun doesn't pass the environment correctly. So what Pat sees is the equivalent of running N processes like this:
./a.out --hpx:localities=N --hpx:console
There is absolutely nothing we can do about that, IMHO. As such, I closed the ticket as "won't fix".
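
On the user side, a silent hang like this can at least be turned into a visible failure by running the launch command under a timeout. The following is a minimal sketch, not something HPX provides: run_with_timeout is a hypothetical helper, the 60-second limit is arbitrary, and killing mpirun this way may leave remote processes behind, depending on the launcher.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return (status, stdout, stderr).

    status is the exit code, or the string "timeout" if the command
    did not finish within timeout_s seconds.
    """
    try:
        done = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the direct child before raising.
        return "timeout", "", ""
    return done.returncode, done.stdout, done.stderr


if __name__ == "__main__":
    if len(sys.argv) < 2:
        sys.exit("usage: run_with_timeout.py COMMAND [ARGS...]")
    status, out, err = run_with_timeout(sys.argv[1:], timeout_s=60)
    sys.stdout.write(out)
    sys.stderr.write(err)
    if status == "timeout":
        sys.exit("command timed out; it may be waiting for other localities to connect")
    sys.exit(status)
```

It could be invoked as, for example, python run_with_timeout.py mpirun ./a.out, so that a job which never connects ends with a message instead of hanging indefinitely.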

@pagrubel

Member

pagrubel commented Aug 13, 2014

Well, that's what I suspected before I posted it.
