New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added the PaddedAtomicLong #264
Conversation
Thanks for the contribution! Let me know when it can be reviewed :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the contribution. Please consider the comments. As mentioned this needs some consideration from a compatibility standpoint, or might need to be handling similar to MPSC linked queue CAS method.
jctools-core/src/main/java/org/jctools/util/PaddedAtomicLong.java
Outdated
Show resolved
Hide resolved
* @see AtomicLong#getAndIncrement() | ||
*/ | ||
public final long getAndIncrement() { | ||
return UNSAFE.getAndAddLong(this, VALUE_OFFSET, 1L); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This means no pre-8 compatibility, which is not necessarily bad, but needs consideration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a look at the Java 6 code
public final long getAndAdd(long delta) {
while (true) {
long current = get();
long next = current + delta;
if (compareAndSet(current, next))
return current;
}
}
So Unsafe.getAndSetLong has been replaced by something I expect to be a lot slower.
I made quick JMH benchmark:
@Fork(value = 2)
@Warmup(iterations = 1)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@State(Scope.Benchmark)
@Threads(4)
public class GetAndAddBenchmark {
public AtomicLong a;
@Setup
public void setup(){
a=new AtomicLong();
}
@Benchmark
public long slow() {
long delta = 10;
while (true) {
long current = a.get();
long next = current + delta;
if (a.compareAndSet(current, next))
return current;
}
}
@Benchmark
public long fast() {
long delta = 10;
return a.getAndAdd(delta);
}
}
And there is a significant performance loss with the slow approach:
Benchmark Mode Cnt Score Error Units
GetAndAddBenchmark.fast avgt 10 118.111 ± 9.630 ns/op
GetAndAddBenchmark.slow avgt 10 231.833 ± 30.669 ns/op
I had a peek at the generated Assembly (perfasm) and key difference is that the slow approach does a volatile read and then a 'lock cmpxchg' and the fast approach only needs a 'lock xadd'. The issue is that the former approach can run into a cas failure, while this can't happen with the lock xadd (hence latter approach is more performant).
In other words: there is a very significant performance loss if I would do it in the Java 6 way. Do you see any way to do it in a Java 6 compatible way but without the performance impact? One option would be to add a switch that chooses the appropriate logic based on the existence of Unsafe.getAndSetLong
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nitsanw could you have another look?
jctools-core/src/main/java/org/jctools/util/PaddedAtomicLong.java
Outdated
Show resolved
Hide resolved
jctools-core/src/main/java/org/jctools/util/PaddedAtomicLong.java
Outdated
Show resolved
Hide resolved
Pull Request Test Coverage Report for Build 717
💛 - Coveralls |
@pveentjer Probably for java 6 we don't have other options if not to accept it to be slower :) |
0c7ebb0
to
3011dbf
Compare
@pveentjer I've pushed to your branch, please have a look, if it's all good please submit a unit test and rebase + squash commits to clean merge. |
@nitsanw I'll get my act together today and finish up the work. Once that is done I'll also add an PaddedAtomicReference. |
3011dbf
to
f17d6af
Compare
I have added a unit tests and squashed the commits. |
f17d6af
to
d49fc58
Compare
@pveentjer There is a build failure on the CI
|
|
@pveentjer currently the build is failing, looks like you need to rebase on top of latest code and adjust. If you are quick we might be able to ship this with the 3.0 release this week. |
d49fc58
to
f7fade5
Compare
f7fade5
to
87344cb
Compare
Could someone have another look at this PR? I'm not sure of copying the content of the methods is the right approach because it increases code duplication and it isn't trivial code. |
abstract class PaddedAtomicLongL1Pad extends Number implements java.io.Serializable { | ||
private static final long serialVersionUID = 1; | ||
|
||
transient long p01, p02, p03, p04, p05, p06, p07; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please use byte padding as per the rest of JCTools
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. Out of curiosity, is there any difference compared to using longs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from JDK 15 the JVM started reordering fields to fill in gaps between object header and fields, resulting in long
based padding fails.
87344cb
to
abefafc
Compare
The PR fits in with other work, I'm going to merge and work out the remaining bits. Thanks for the help @pveentjer ! |
A padded version of the java.util.atomic.AtomicLong.