Improve spinlock implementation to perform better in high-contention situations #4557

Merged · 2 commits · Apr 27, 2020

Conversation

@hkaiser (Member) commented Apr 26, 2020

This reduces cache pressure while acquiring a highly contended spinlock instance by mostly reading from it instead of constantly trying to write to it.

As a flyby, this removes the dependency on boost/smart_ptr/detail/spinlock.hpp and related Boost facilities.
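The approach described above is commonly known as test-and-test-and-set (TTAS). Below is a minimal, self-contained sketch of the idea, illustrative only and not the actual HPX spinlock, using a simplified std::atomic-based lock:

```cpp
#include <atomic>
#include <thread>

class spinlock
{
    std::atomic<bool> locked{false};

public:
    void lock()
    {
        for (;;)
        {
            // Attempt to acquire: a single read-modify-write. This is
            // the only operation that needs exclusive cache-line access.
            if (!locked.exchange(true, std::memory_order_acquire))
                return;

            // Contended: spin on read-only loads until the lock looks
            // free. Plain loads keep the cache line in the shared state
            // instead of bouncing it between cores on every iteration.
            while (locked.load(std::memory_order_relaxed))
                std::this_thread::yield();
        }
    }

    void unlock()
    {
        locked.store(false, std::memory_order_release);
    }
};
```

The relaxed load in the wait loop is sufficient because the subsequent exchange with memory_order_acquire establishes the needed synchronization once the lock is actually taken.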

@msimberg (Contributor) previously approved these changes on Apr 27, 2020 and commented:

Nice, thanks! The comments are mostly subjective style issues. Feel free to do as you please with them.

Two review comments on hpx/runtime/naming/name.hpp (outdated, resolved).
Commit added: flyby: make sure yield_while(pred, "", false) actually doesn't suspend
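For context, yield_while repeatedly evaluates a predicate and yields the calling HPX thread while the predicate returns true; judging by the commit message, the third argument controls whether the spinning thread may additionally be put into timed suspension. A minimal sketch of that contract, using std::thread in place of HPX threads and a hypothetical simplified signature (not the actual HPX implementation):

```cpp
#include <chrono>
#include <thread>

template <typename Predicate>
void yield_while(Predicate&& pred, char const* /*description*/,
    bool allow_timed_suspension)
{
    // Re-evaluate the predicate until it returns false, giving up the
    // CPU between evaluations so other work can make progress.
    while (pred())
    {
        if (allow_timed_suspension)
        {
            // May put the thread to sleep for a short interval.
            std::this_thread::sleep_for(std::chrono::microseconds(1));
        }
        else
        {
            // yield_while(pred, "", false): only yield, never suspend,
            // which is the behavior the flyby commit ensures.
            std::this_thread::yield();
        }
    }
}
```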
@hkaiser merged commit 4551ce8 into master on Apr 27, 2020
@hkaiser deleted the better_spinlock branch on Apr 27, 2020 at 17:59
@biddisco (Contributor) commented:

I ran a benchmark to see how much this helps, as I wanted to test it before it was merged.

Something must be wrong with my testing, because I see a significant drop in performance.
[benchmark graph showing the performance drop]
Unfortunately, the previous runs of the benchmark were in December, and so many things have changed since then that it is hard to be certain what might be responsible; I'm pasting this image just for reference.

This may be a false reading, so do not be alarmed yet; I will double-check my scripts and report back. I thought I'd mention it now just in case anything else shows problems.

@hkaiser (Member, Author) commented Apr 27, 2020

Uhh, not good. I wouldn't think that this particular change is causing this. So I guess the only thing we can do is to bisect...

@biddisco (Contributor) commented:

I did another build this morning, but this time I turned off everything that might cause overhead:

  cmake \
      -DCMAKE_BUILD_TYPE=Release \
      -DBOOST_ROOT=$BOOST_ROOT \
      -DHWLOC_ROOT=$INSTALL_ROOT/hwloc/$HWLOC_VER \
      -DHPX_WITH_MALLOC=JEMALLOC \
      -DJEMALLOC_ROOT=$INSTALL_ROOT/jemalloc/$JEMALLOC_VER \
      -DHPX_WITH_NETWORKING:BOOL=OFF \
      -DHPX_WITH_LOGGING:BOOL=OFF \
      -DHPX_WITH_IO_COUNTERS:BOOL=OFF \
      -DHPX_WITH_PARCELPORT_ACTION_COUNTERS:BOOL=OFF \
      -DHPX_WITH_PARCEL_PROFILING:BOOL=OFF \
      -DHPX_WITH_STACKTRACES:BOOL=OFF \
      -DHPX_WITH_THREAD_CUMULATIVE_COUNTS:BOOL=OFF \
      -DHPX_WITH_THREAD_IDLE_RATES:BOOL=OFF \
      -DHPX_WITH_THREAD_MANAGER_IDLE_BACKOFF:BOOL=OFF \
      -DHPX_WITH_THREAD_STEALING_COUNTS:BOOL=OFF \
      -DHPX_WITH_VERIFY_LOCKS:BOOL=OFF \
      -DHPX_WITH_TIMER_POOL:BOOL=OFF \
      -DHPX_WITH_IO_POOL:BOOL=OFF \
      ~/src/hpx

This build seems to perform much more in line with expectations, and looks a little better than December.

[benchmark graph: performance in line with expectations, slightly better than the December baseline]

Please disregard my previous graph as there must have been something enabled that shouldn't have been.

@hkaiser (Member, Author) commented Apr 28, 2020

@biddisco thanks! This looks more like what I would have expected. Brilliant!

@biddisco (Contributor) commented:

For completeness, included is a daint test.
[benchmark graph: daint test results]
