--hpx:bind range specifier restrictions are overly restrictive #2312

Closed
brycelelbach opened this issue Aug 29, 2016 · 2 comments · Fixed by #2318

@brycelelbach

Suppose I'm trying to write an --hpx:bind specifier asking for 2 threads per core on a system with 4 threads per core in total.


For reference, if I want 4 out of 4 threads per core, this works:

$ bin/simplest_hello_world --hpx:bind thread:0-63=core:0-63.pu:all --hpx:print-bind -t64

Also, if I want 1 out of 4 threads per core, this works:

$ bin/simplest_hello_world --hpx:bind thread:0-63=core:0-63.pu:0 --hpx:print-bind -t64

Given that the above two examples work, I'd expect the following to work for 2 out of 4 threads per core:

$ bin/simplest_hello_world --hpx:bind thread:0-63=core:0-31.pu:0-1 --hpx:print-bind -t64

When I run this on top of trunk, I get:

terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<hpx::exception> >'
  what():  index ranges can be specified only for one node type (socket/numanode, core, or pu): HPX(bad_parameter)
Aborted

I also get some odd behavior when I try to leave out the 'core:N' part of the spec:


If I want 4 out of 4 threads per core, I'd do this:

$ bin/simplest_hello_world --hpx:bind thread:0-63=pu:all --hpx:print-bind -t64

It works, and uses 4 threads per core. This is what I would expect.


If I want 1 out of 4 threads per core, I'd do this:

$ bin/simplest_hello_world --hpx:bind thread:0-63=pu:0 --hpx:print-bind -t64

This runs, but does not work as intended. It binds all threads to PU 0.


If I want 2 out of 4 threads per core, I'd do this:

$ bin/simplest_hello_world --hpx:bind thread:0-63=pu:0-1 --hpx:print-bind -t64

This does not work, giving this error:

terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<hpx::exception> >'
  what():  The number of OS threads requested (64) does not match the number of threads to bind (2): HPX(bad_parameter)
Aborted

Note that I am using --hpx:bind instead of --hpx:pu-step because --hpx:print-bind does not work with --hpx:pu-step. That leaves me with no diagnostic to confirm that --hpx:pu-step is working as intended, which I want to do because this is being run on you-know-which-new-hardware.

hkaiser commented Aug 29, 2016

I believe those are genuine bugs in the code that analyzes the bind expressions.

hkaiser commented Sep 1, 2016

PR #2318 will fix the first of your issues (thread:0-63=core:0-31.pu:0-1).

The second of your issues is actually handled correctly. thread:0-63=pu:0 will bind all threads to the first available PU. If you need to bind to the first PU of each core, use thread:0-63=core:0-63.pu:0.

The third of your issues is correctly reported as an error; use thread:0-63=core:0-31.pu:0-1 instead.
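
For builds that do not yet include the fix, one possible workaround is to spell out every thread-to-PU mapping instead of relying on range expansion. The sketch below is not part of HPX: build_bind_spec is a hypothetical helper, and it assumes that --hpx:bind accepts several mappings separated by ';' and that the pu index is counted relative to its core. It places consecutive threads on the same core, which is an assumption about the intended layout rather than a statement of what the range form expands to.

```shell
#!/bin/sh
# Hypothetical helper (not part of HPX): emit one explicit mapping per
# worker thread, placing `per_core` consecutive threads on each core.
# Assumes --hpx:bind accepts ';'-separated mappings.
build_bind_spec() {
    threads=$1    # total number of worker threads
    per_core=$2   # number of threads to place on each core
    spec=""
    i=0
    while [ "$i" -lt "$threads" ]; do
        core=$((i / per_core))   # core index for thread i
        pu=$((i % per_core))     # PU index within that core
        # Append the mapping, adding ';' only between entries.
        spec="${spec}${spec:+;}thread:${i}=core:${core}.pu:${pu}"
        i=$((i + 1))
    done
    printf '%s\n' "$spec"
}

# Small demonstration: 4 threads, 2 per core.
build_bind_spec 4 2
```

For the case in this issue the call would be build_bind_spec 64 2, passed quoted so the shell does not interpret the semicolons: bin/simplest_hello_world --hpx:bind "$(build_bind_spec 64 2)" --hpx:print-bind -t64. The --hpx:print-bind output can then be used to check whether the resulting placement is the one intended.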
