
Adding flags to scheduler allowing to control thread stealing and idle back-off #3768

Merged: 8 commits merged into master on Apr 8, 2019

Conversation

@hkaiser (Member) commented Apr 3, 2019

This adds a couple of things:

  • added a mark_end_of_scheduling hook to the execution parameters, allowing a callback to be invoked after the scheduling phase of the parallel algorithm
  • added invocation of the mark_begin/mark_end scheduling parameters to the sequential for_each and for_loop
  • added flags to the scheduling loop that allow controlling thread stealing and idle back-off (effective only for schedulers that support it)
  • changed the implementation of static_priority_scheduler to fully rely on local_priority_scheduler, except that it starts off with stealing disabled
  • made the fast_idle_mode scheduler flag more useful: it now speeds up idle back-off in case of unsuccessful thread stealing (no work available through thread stealing)
  • set HPX_WITH_THREAD_MANAGER_IDLE_BACKOFF=ON by default (it can now be disabled at runtime)
  • added add_scheduler_mode/remove_scheduler_mode/add_remove_scheduler_mode APIs, allowing the scheduler mode to be controlled dynamically
  • flyby: removed unused arguments from get_next_thread, added an argument controlling enable_stealing for wait_or_add_new
  • modified the for_each_scaling test to not measure data initialization; also added the command line options --disable_stealing and --fast_idle_mode
  • duplicated the scheduler mode across all cores (with cache alignment) to reduce false sharing

This relies on #3745 being merged first; it subsumes #3745.

Fixes #3744
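The add_scheduler_mode/remove_scheduler_mode/add_remove_scheduler_mode APIs mentioned above amount to atomically setting and clearing bits in a scheduler-mode bitmask. The following is only an illustrative sketch of those semantics, not the actual HPX implementation; the flag names and values are assumptions modeled on the PR description:

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical mirror of a scheduler-mode bitmask; values are illustrative.
enum scheduler_mode : std::uint32_t
{
    nothing_special = 0x0,
    enable_stealing = 0x1,
    enable_idle_backoff = 0x2,
    fast_idle_mode = 0x4
};

struct scheduler
{
    std::atomic<std::uint32_t> mode_{nothing_special};

    // set the given flag(s) without disturbing the others
    void add_scheduler_mode(scheduler_mode m)
    {
        mode_.fetch_or(m, std::memory_order_release);
    }

    // clear the given flag(s)
    void remove_scheduler_mode(scheduler_mode m)
    {
        mode_.fetch_and(~std::uint32_t(m), std::memory_order_release);
    }

    // set one set of flags and clear another in a single atomic update
    void add_remove_scheduler_mode(
        scheduler_mode to_add, scheduler_mode to_remove)
    {
        std::uint32_t expected = mode_.load(std::memory_order_relaxed);
        std::uint32_t desired;
        do
        {
            desired = (expected | to_add) & ~std::uint32_t(to_remove);
        } while (!mode_.compare_exchange_weak(expected, desired));
    }

    bool has_scheduler_mode(scheduler_mode m) const
    {
        return (mode_.load(std::memory_order_acquire) & m) != 0;
    }
};
```

The combined add_remove variant uses a compare-exchange loop so that concurrent readers never observe a state where one half of the update has been applied but not the other.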

hkaiser added some commits Mar 17, 2019

Introduced cache_aligned_data and cache_line_data helper structures
- reduce false sharing in local priority scheduler and thread_queue
- reduce false sharing in latch and condition_variable
- disable stealing counters by default

More false sharing reductions
- disable exponential idle backoff by default
- simplify wait_or_add_new functionality

Fixing local_latch test to conform to preconditions required by latch::reset
- flyby: add warning suppression for MSVC

Making wait-count for exponential backoff core-specific
- flyby: don't use alignment with NVCC
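The cache_aligned_data helper named in the first commit pads per-core data out to a full cache line so that neighboring cores' counters never share a line. This is a minimal sketch of the idea only, not HPX's actual implementation; the 64-byte line size is an assumption (it is platform dependent):

```cpp
#include <cstddef>

// Assumed cache line size; real code should query the platform
// (e.g. std::hardware_destructive_interference_size in C++17).
constexpr std::size_t cache_line_size = 64;

// Illustrative stand-in for HPX's cache_aligned_data: alignas forces both
// the alignment and (via padding) the size up to a full cache line, so two
// adjacent elements in an array can never share a line.
template <typename T>
struct alignas(cache_line_size) cache_aligned_data
{
    T data_{};
};

static_assert(alignof(cache_aligned_data<int>) == cache_line_size,
    "element must start on a cache line boundary");
static_assert(sizeof(cache_aligned_data<int>) == cache_line_size,
    "element must occupy a full cache line");
```

An array of cache_aligned_data<counter>, indexed by core number, is the pattern the PR uses to duplicate the scheduler mode across cores.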
```diff
@@ -822,10 +846,11 @@ namespace hpx { namespace threads { namespace detail
             // if nothing else has to be done either wait or terminate
             else
             {
-                ++idle_loop_count;
+                --idle_loop_count;
```

@msimberg (Contributor) commented Apr 4, 2019

Does this need to be reversed? Just thinking about naming. Right now it's more like idle_loop_countdown ;) and it's different from the other counts that actually grow.

@hkaiser (Author, Member) commented Apr 4, 2019

I reversed this as it helped me not to carry around the end value to the point where it needed to be reset.
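The point of the reversal is that a countdown resets by reassigning the starting budget, so the terminal value (zero) never has to be carried to the reset site. A small sketch with hypothetical names, not the actual scheduler code:

```cpp
#include <cstdint>

// Illustrative countdown-style idle counter (names hypothetical): start
// from a configurable budget; reset() only needs the budget, never the
// "end value", which is always zero by construction.
struct idle_backoff
{
    std::int64_t max_idle_loop_count;
    std::int64_t idle_loop_count;

    explicit idle_backoff(std::int64_t budget)
      : max_idle_loop_count(budget)
      , idle_loop_count(budget)
    {
    }

    // returns true while the core should keep spinning before backing off
    bool tick()
    {
        return --idle_loop_count > 0;
    }

    // called e.g. after work was found or successfully stolen
    void reset()
    {
        idle_loop_count = max_idle_loop_count;
    }
};
```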

```diff
@@ -247,7 +247,7 @@ namespace hpx { namespace threads { namespace policies
         num_pending_misses += high_priority_queues_[num_thread].data_->
             get_num_pending_misses(reset);
     }
-    if (num_thread == 0)
+    if (num_thread == num_queues_-1)
```

@msimberg (Contributor) commented Apr 4, 2019

Does this make a difference?

@hkaiser (Author, Member) commented Apr 4, 2019

There was some inconsistency in that we were accessing the low-priority queue from different cores: sometimes it was core zero, sometimes the last one. I made it consistent to reduce false sharing on that queue.

@hkaiser hkaiser force-pushed the control_thread_stealing branch 2 times, most recently from 044e0e9 to b1b9e7a Apr 4, 2019

@hkaiser hkaiser marked this pull request as ready for review Apr 5, 2019

@hkaiser (Member, Author) commented Apr 5, 2019

@msimberg, @biddisco: this is ready to go from my end, please review.

Adding flags to scheduler allowing to control thread stealing and idle back-off
@hkaiser hkaiser force-pushed the control_thread_stealing branch 2 times, most recently from ba55402 to 01b43d2 Apr 6, 2019

Moved scheduler_base implementation to source file
- flyby: remove obsolete periodic_maintenance

@hkaiser hkaiser force-pushed the control_thread_stealing branch from 01b43d2 to 89b45d3 Apr 6, 2019

@hkaiser hkaiser merged commit 825c3ec into master Apr 8, 2019

11 of 16 checks passed

Failed:
- pycicle daint-clang-3.8-boost-1.58.0-c++11-Release: Test errors 628
- pycicle daint-gcc-7.3.0-boost-1.68.0-c++17-Release: Build errors 11
- pycicle daint-gcc-7.3.0-boost-1.68.0-c++17-Release: Test errors 1
- pycicle daint-gcc-7.3.0-cuda-9.2.148_3.19-6.0.7.1_2.1__g3d9acc8-boost-1.68.0-c++11-Release: Build errors 48
- pycicle daint-gcc-7.3.0-cuda-9.2.148_3.19-6.0.7.1_2.1__g3d9acc8-boost-1.68.0-c++11-Release: Test errors 8

Passed:
- build-and-test: Workflow build-and-test
- pycicle daint-clang-3.8-boost-1.58.0-c++11-Release: Build errors 0, Config errors 0
- pycicle daint-clang-7.0-boost-1.68.0-c++17-nonetworking-Debug: Build errors 0, Config errors 0, Test errors 0
- pycicle daint-gcc-4.9.3-boost-1.58.0-c++11-Debug: Build errors 0, Config errors 0, Test errors 0
- pycicle daint-gcc-7.3.0-boost-1.68.0-c++17-Release: Config errors 0
- pycicle daint-gcc-7.3.0-cuda-9.2.148_3.19-6.0.7.1_2.1__g3d9acc8-boost-1.68.0-c++11-Release: Config errors 0

@hkaiser hkaiser deleted the control_thread_stealing branch Apr 8, 2019

@msimberg (Contributor) commented Apr 10, 2019

What was the reason the shared_priority_queue_scheduler was timing out?

@hkaiser (Member, Author) commented Apr 10, 2019

@msimberg the wait_or_add_new was not properly setting its return value, forcing the scheduler loop to wait forever for it to signal its termination status; see for instance here: https://github.com/STEllAR-GROUP/hpx/pull/3768/files#diff-c730f66e3d702c09217b1bbcc6b570f2R1317.
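The failure mode described here can be sketched in a few lines: the scheduling loop's exit condition is the boolean reported by wait_or_add_new, so a scheduler that never sets it to true leaves the loop waiting for termination indefinitely. Everything below is a hypothetical skeleton with invented names, not HPX's actual scheduling loop:

```cpp
#include <functional>

// Hypothetical skeleton of the dependency described above. The callback
// stands in for a scheduler's wait_or_add_new; returning true means
// "nothing left to schedule, termination may proceed". The spin count is
// bounded here only so the sketch itself cannot hang; the real loop waits
// indefinitely, which is the observed timeout.
bool scheduling_loop(
    std::function<bool(bool /*enable_stealing*/)> wait_or_add_new,
    bool enable_stealing)
{
    for (int spins = 0; spins < 1000; ++spins)
    {
        if (wait_or_add_new(enable_stealing))
            return true;    // termination status was signaled; exit cleanly
    }
    return false;    // models the hang: the flag was never set
}
```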
