--hostfile option not working as expected #12574

Closed
rhl-bthr opened this issue May 24, 2024 · 4 comments

Comments

@rhl-bthr

rhl-bthr commented May 24, 2024

When I run mpirun --hostfile hosts -np 2 hostname, both processes are executed on the same host and none on the other. I did verify that the hostfile is being detected (mpirun crashed when I deleted the file).

My hosts file looks like this:

10.4.0.1
10.4.0.2

Weirdly enough, if I duplicate the 2 entries in the hostfile, making a total of 4, then running mpirun -np 4 runs 2 processes per host.

I'm running Ubuntu 22.04 on both machines, and they're accessible to each other via ssh (without any credentials).

@rhc54
Contributor

rhc54 commented May 24, 2024

Yes, that is exactly what it should do. I'd suggest reading the hostfile documentation - it explains all this and provides examples.
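
For readers landing here: as far as the Open MPI 4.x documentation describes it, a hostfile entry with no slots value is given as many slots as there are cores detected on that node, and the default mapping fills the slots of the first node before moving on to the next. That would explain why both processes land on 10.4.0.1. A minimal sketch of a hostfile that advertises one slot per host instead, assuming the same two addresses and the same file name (hosts) as in the report:

```shell
# Write a hostfile that advertises exactly one slot per host.
# The addresses and the file name "hosts" are taken from the report above.
cat > hosts <<'EOF'
10.4.0.1 slots=1
10.4.0.2 slots=1
EOF

# Then launch as in the original report:
#   mpirun --hostfile hosts -np 2 hostname
```

With one slot per host, -np 2 should place one process on each machine, and asking for more processes than the total slot count should be rejected unless oversubscription is explicitly allowed.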

@rhl-bthr
Author

rhl-bthr commented May 24, 2024

Sorry, maybe I should have been clearer. Running mpirun with 2 hosts in the hostfile, and setting -np 2, still runs both processes on the same host, and none on the other. My expectation was that it would run one process per host.

Is this behaviour expected? If so, is there a way to use -hostfile and still enforce one process per host? That is what I get when I run mpirun -host 10.4.0.1,10.4.0.2 -np 2 hostname.

@rhc54
Contributor

rhc54 commented May 24, 2024

> Sorry, maybe I should have been clearer. Running mpirun with 2 hosts in the hostfile, and setting -np 2, still runs both processes on the same host, and none on the other. My expectation was that it would run one process per host.

No, that expectation is not correct. You really should read the hostfile doc - we spent quite a bit of time on it to ensure it clearly explained all this.

> If so, is there a way to use -hostfile and still enforce one process per host? That is what I get when I run mpirun -host 10.4.0.1,10.4.0.2 -np 2 hostname.

Sure - but the precise syntax depends on which OMPI version you are using (which you haven't told us). You might want to try mpirun --help or check the mpirun documentation for placement options.

@rhl-bthr
Author

Ah thanks, found it.

In case someone else runs into this, I was able to solve it in two ways: by adding either -N 1 or --map-by ppr:1:node. I was using OMPI version 4.0.3.
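
The two fixes above can be sketched as full command lines. This is only an illustration, assuming the hosts file from the original report and Open MPI 4.x option syntax (rhc54 notes that the syntax varies between versions). The run helper and the DRY_RUN switch are hypothetical names introduced for this sketch; by default the commands are printed rather than executed.

```shell
#!/bin/sh
# DRY_RUN is a hypothetical switch for this sketch: 1 (default) only prints
# each command, 0 prints it and then executes it.
DRY_RUN="${DRY_RUN:-1}"

# run: echo the command line, then execute it unless in dry-run mode.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# Option 1: -N 1 launches one process per node.
run mpirun --hostfile hosts -N 1 -np 2 hostname

# Option 2: ppr means "processes per resource"; here, 1 process per node.
run mpirun --hostfile hosts --map-by ppr:1:node -np 2 hostname
```

Setting DRY_RUN=0 on a machine that has mpirun installed and can reach both hosts runs the commands for real; each should report one hostname per machine.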
