New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prefer platform specific memory barriers #708
Conversation
Thanks @ianks. I cannot knowledgeable judge if this is a good change. @thedarkone maybe? |
GCC emits If this is to be merged the PR should also clean up concurrent-ruby/ext/concurrent/extconf.rb Line 46 in b461ffc
HAVE_GCC_SYNC doesn't seem to be used anywhere else.
I would consider merging this if and only if there was a measurable difference on MRI. |
Following the wisdom of postgres, we opt to use platform specific memory barriers when available. These are generally more performant. In this PR, we add specific cases for i386, x86_64. In the future, we could look at using pg's atomics library directly: https://github.com/postgres/postgres/tree/9d4649ca49416111aee2c84b7e4441a0b7aa2fac/src/include/port/atomics
@thedarkone Just did some benchmarking on x86 (do not have another other machines at the moment). AtomicBoolean#true?Pessimistic results (taking slowest result from this branch, fastest from master), show 7.89314% improvement. The optimistic (this branch best, master worst) was 14.8174% improvement AtomicBoolean#make_true/make_falsePessimistic results (taking slowest result from this branch, fastest from master), show 7.19176% improvement. The optimistic (this branch best, master worst) was 11.5249%% improvement require 'benchmark/ips'
require 'concurrent'
bool = Concurrent::AtomicBoolean.new(false)
`bundle exec rake compile`
# master
# ==========================================================================
# #true? 9.922M (± 2.7%) i/s - 49.768M in 5.020095s
# #true? 9.675M (± 1.2%) i/s - 48.484M in 5.011882s
# #true? 9.550M (± 4.4%) i/s - 47.841M in 5.021202s
# #make_true/#make_false 7.469M (± 2.7%) i/s - 37.386M in 5.009572s
# #make_true/#make_false 7.386M (± 2.5%) i/s - 37.123M in 5.029803s
# #make_true/#make_false 7.398M (± 2.2%) i/s - 37.123M in 5.021120s
# update-memory-barriers
# ==========================================================================
# #true? 11.098M (± 1.9%) i/s - 55.497M in 5.002488s
# #true? 10.891M (± 1.4%) i/s - 54.673M in 5.021049s
# #true? 10.766M (± 5.6%) i/s - 53.849M in 5.021705s
# #make_true/#make_false 8.289M (± 2.2%) i/s - 41.663M in 5.029083s
# #make_true/#make_false 8.208M (± 2.2%) i/s - 41.167M in 5.018527s
# #make_true/#make_false 8.047M (± 4.7%) i/s - 40.175M in 5.006055s
Benchmark.ips do |x|
x.config(iterations: 3)
x.report('#true?') do
bool.true?
end
x.report('#make_true/#make_false') do
bool.make_true
bool.make_false
end
end |
7895248
to
5afd748
Compare
@pitr-ch Are there any other changes you would like to see here? |
@ianks Thanks for reminding me about this. Looks good. Could you put a comment in the code explaining when it comes from, as you did int the PR's description. Then we can merge. |
@thedarkone thank you for your input on the PR! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @ianks.
Following the wisdom of postgres, we opt to use platform specific memory barriers when available. These are generally more performant. In this PR, we add specific cases for i386, x86_64.
In the future, we could look at using pg's atomics library directly: https://github.com/postgres/postgres/tree/9d4649ca49416111aee2c84b7e4441a0b7aa2fac/src/include/port/atomics