Port Striped64, LongAdder from JSR-166 #3342
final private[atomic] class Cell private[atomic] (var value: Long) {
  @volatile private[concurrent] var _value = value
Why do we need `value` and `_value`? If they need to be separate, we should remove `var` from `value`; otherwise we have two fields instead of one field and a ctor argument.
Also, I think the original Java version has some padding here? I'm not sure if that trick works in Scala Native, but if it does, we should port it as well. If it doesn't, we should figure out a way to support that kind of memory layout, because it's important for concurrent code.
For a discussion about padding and how it prevents "false-sharing" on the JVM, see typelevel/cats-effect#2543 (comment).
That would create false sharing, i.e. two independent fields residing on the same cache line while different threads independently write to each variable. Each write to either field in this case invalidates the cache of the other thread, slowing everyone down.
Sorry, just realized I was looking at an old version :)
However, I see the latest code uses `@jdk.internal.vm.annotation.Contended`. This is an important annotation that relates to the false-sharing issue. @WojciechMazur I wonder if we can port its semantics to Scala Native?
At the VM’s discretion, those fields annotated as contended are given extra padding so they do not share cache lines with other fields that are likely to be independently accessed. This helps to avoid cache contention across multiple threads.
Oh, I missed that point, thank you! (And the discussion you cited from cats-effect is very intriguing to me.)
Yes, it would be nice to have Scala Native's own implementation of this annotation that adds padding to the fields of a class.
@armanbilge I've stumbled on the Contended annotation before, but I haven't yet taken any steps to support it. On most of the target architectures (x64, arm64, x86) the cache line size is 64 bytes, so for now we could use a fixed padding. The most important question is how to preserve this information between the frontend (scalac) and the backend (scala-native-codegen). We should probably introduce a new NIR attribute to carry this information, which would eventually allow platform-specific padding.
We would definitely need to support both single contended fields and contended field groups, as the JVM does, to keep it memory-efficient.
Until we implement the Contended annotation, it would be worth leaving the annotation in place but commented out, to allow for a smooth adjustment later.
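To make the fixed-padding idea concrete, here is a minimal sketch of manual padding around a hot field, assuming a 64-byte cache line as mentioned above. The names `LeftPadding`, `HotValue` and `PaddedCell` are hypothetical and not part of this PR. Whether a given runtime (the JVM or Scala Native) actually keeps the fields in this order is not guaranteed, so treat it as a best-effort hint rather than a replacement for real `@Contended` support.

```scala
// Sketch only: field layout is ultimately decided by the runtime, so this
// padding is a best-effort hint and not a guarantee.
abstract class LeftPadding {
  // 7 * 8 = 56 bytes of padding intended to sit before the hot field
  protected var p0, p1, p2, p3, p4, p5, p6: Long = 0L
}

abstract class HotValue(initial: Long) extends LeftPadding {
  // @jdk.internal.vm.annotation.Contended  // re-enable once supported
  @volatile var value: Long = initial
}

final class PaddedCell(initial: Long) extends HotValue(initial) {
  // 7 * 8 = 56 bytes of padding intended to sit after the hot field
  protected var q0, q1, q2, q3, q4, q5, q6: Long = 0L
}
```

Splitting the padding across a small class hierarchy is a common trick because superclass fields are usually laid out before subclass fields. The commented-out annotation marks the spot to switch over once the annotation is supported.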
I opened an issue to track this.
final private[atomic] def cas(cmp: Long, `val`: Long) =
  valueAtomic().compareExchangeWeak(
    cmp,
    `val`,
    memory_order.memory_order_release
  )
Dumb question: in JSR166 I see the following comment. Is that what you've done here with `memory_order_release`?
* JVM intrinsics note: It would be possible to use a release-only
* form of CAS here, if it were provided.
Ah, sorry I am looking at an old version. This matches the latest code :) woops, I see even the newest code includes that comment. But if I understand correctly, that's exactly what you've done here.
Yeah, I referred to `VarHandle.weakCompareAndSetRelease` and implemented it for Scala Native.
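For readers less familiar with weak, release-only CAS, the sketch below shows the JVM-side semantics using `AtomicLong.weakCompareAndSetRelease` (available since Java 9). The `WeakCasDemo` object and its retry loop are illustrative assumptions, not the PR's code. Striped64 itself does not simply retry on failure; it treats a failed CAS as a contention signal.

```scala
import java.util.concurrent.atomic.AtomicLong

object WeakCasDemo {
  /** Adds `delta` to `cell` with a weak, release-ordered CAS.
    * A weak CAS may fail spuriously even when the expected value matches,
    * so the caller has to be prepared to retry.
    */
  def addWithWeakCas(cell: AtomicLong, delta: Long): Long = {
    var done = false
    var next = 0L
    while (!done) {
      val current = cell.get()
      next = current + delta
      // Returns false on a lost race or a spurious failure
      done = cell.weakCompareAndSetRelease(current, next)
    }
    next
  }
}
```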
Thank you for this contribution, I've added a few comments.
 * additional information regarding copyright ownership.
 */
// Ported from JSR 166 revision 1.23
Could we also include the original header from the JSR166 sources, mentioning Doug Lea and the JSR-166 team?
final private[atomic] class Cell private[atomic] (var value: Long) {
  @volatile private[concurrent] var _value = value
As previously mentioned, we should keep only one of these fields. To stay as close as possible to JSR-166, it should probably be:
- final private[atomic] class Cell private[atomic] (var value: Long) {
-   @volatile private[concurrent] var _value = value
+ final private[atomic] class Cell private[atomic] (@volatile private[atomic] var value: Long) {
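The difference between the two shapes can be checked with a small reflection-based sketch on the JVM. The classes `TwoFieldCell` and `OneFieldCell` and the helper `valueFields` are hypothetical stand-ins for the PR's `Cell`, with the access modifiers dropped for brevity.

```scala
object CellFieldsDemo {
  // Current shape: a `var` constructor parameter plus a second volatile field
  final class TwoFieldCell(var value: Long) {
    @volatile var _value: Long = value
  }

  // Suggested shape: the constructor parameter is the single volatile field
  final class OneFieldCell(@volatile var value: Long)

  /** Names of the declared fields that hold the cell's value. */
  def valueFields(c: Class[_]): List[String] =
    c.getDeclaredFields.toList
      .map(_.getName)
      .filter(_.toLowerCase.contains("value"))
      .sorted
}
```

Comparing `valueFields(classOf[CellFieldsDemo.TwoFieldCell])` with `valueFields(classOf[CellFieldsDemo.OneFieldCell])` shows two backing fields versus one.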
@alwaysinline private def threadProbeAtomic() = new CAtomicInt(
  fromRawPtr(
    Intrinsics.classFieldRawPtr(
      Thread.currentThread(),
      "threadLocalRandomProbe"
    )
  )
)
This can be moved to the companion object, since it does not reference `this`. Thanks to that, we would also be able to remove the unnecessary Striped64 arguments passed to {get,advance}Probe.
var _index = index
var _wasUncontended = wasUncontended
Typically I would recommend using the underscored variant for the parameter names instead. That way you lower the risk of referring to the original value when you meant the current var value. When working with javalib we typically don't use named function arguments anyway.
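A small sketch of that naming convention, using a hypothetical `bitLength` helper that is not part of the PR: the parameter takes the underscored name, and the mutable local copy takes the plain name that the loop body naturally reaches for.

```scala
object ParamNamingDemo {
  /** Counts how many right shifts are needed until the value reaches zero. */
  def bitLength(_index: Int): Int = {
    var index = _index // mutable copy gets the plain name
    var steps = 0
    while (index > 0) {
      index >>>= 1 // the "obvious" name always refers to the current value
      steps += 1
    }
    steps
  }
}
```

With the names the other way round, writing `index` by accident would silently read the original argument instead of the updated value.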
final def add(x: Long): Unit =
  value = value + x
@SerialVersionUID(7249069246863182397L)
private class SerializationProxy(val a: LongAdder) extends Serializable {
Should `a: LongAdder` be only a constructor argument and not a field? Otherwise it looks like a cyclic dependency to me, because we would again want to serialize a field of LongAdder type in its serializer.
- private class SerializationProxy(val a: LongAdder) extends Serializable {
+ private class SerializationProxy(a: LongAdder) extends Serializable {
Also, you could add a note on why it's needed (I guess it's done to hide the implementation detail of Striped64's transient fields).
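A minimal sketch of the serialization proxy pattern with a constructor-only argument, assuming a toy `Counter` class as a stand-in for LongAdder. `Counter`, `SerializationDemo` and `roundTrip` are hypothetical names, and the toy counter is not thread-safe. The proxy stores only the summed value, so the adder itself is never serialized again.

```scala
import java.io._

final class Counter(initial: Long) extends Serializable {
  // Internal state is transient; only the proxy's snapshot is written out
  @transient private var current: Long = initial
  def add(x: Long): Unit = current += x
  def sum(): Long = current
  // Java serialization replaces this instance with the proxy when writing
  protected def writeReplace(): AnyRef = new Counter.SerializationProxy(this)
}

object Counter {
  @SerialVersionUID(1L)
  private class SerializationProxy(a: Counter) extends Serializable {
    // `a` is used only during construction, so it is not kept as a field
    private val value: Long = a.sum()
    // On read, the proxy is swapped for a fresh Counter holding the snapshot
    protected def readResolve(): AnyRef = new Counter(value)
  }
}

object SerializationDemo {
  def roundTrip(c: Counter): Counter = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(c)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    try in.readObject().asInstanceOf[Counter]
    finally in.close()
  }
}
```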
This PR ports Striped64 and LongAdder, which are needed for ConcurrentSkipListMap.