-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TrieMap method getOrElseUpdate is not thread-safe #7943
Comments
Imported From: https://issues.scala-lang.org/browse/SI-7943?orig=1 |
@retronym said: |
@axel22 said: This bug report complains about exactly that - it expects There are 4 solutions:
Thoughts? Btw, the proposed workaround with synchronized will fix the example in this bug report, but won't solve the issues, as there might exist other concurrent calls to e.g. |
@retronym said: I've always used Guava CacheBuilder for concurrent maps to implement memoization. I'm not sure what the right approach is, let's discuss it in person next week. |
@axel22 said (edited on Oct 31, 2013 4:02:19 PM UTC): |
@axel22 said (edited on Oct 31, 2013 4:04:35 PM UTC): |
@axel22 said (edited on Oct 31, 2013 4:41:24 PM UTC): import collection._
class Cache[K, V] {
class LazyCell(compute: =>V) {
lazy val value = compute
}
private val tm = concurrent.TrieMap[K, LazyCell]()
def getOrElseUpdate(k: K, fresh: =>V): V = {
val proposedCell = new LazyCell(fresh)
val existingCellOpt = tm.putIfAbsent(k, proposedCell)
if (existingCellOpt.nonEmpty) existingCellOpt.get.value
else proposedCell.value
}
}
object Main {
def main(args: Array[String]) {
val cache = new Cache[String, String]
val v1 = cache.getOrElseUpdate("key1", { println("val1"); "val1" })
val v1p = cache.getOrElseUpdate("key1", { println("val1 generation again"); "val1" })
val v2 = cache.getOrElseUpdate("key2", { println("val2"); "val2" })
}
} Note that there's no way of avoiding the thunk creation above, and we're already in object-creation-land, so an additional This means that any concurrent map that supports |
Matthew Roberts (mattroberts297) said: Thanks for looking at this. As a user, I'd prefer to have a smaller tightly defined interface (solution 2). Especially when dealing with concurrency. Whilst ideal, I can see solution 4 becoming a headache if the underlying implementation changes or the number of concurrent collections increases. Thanks again, |
@retronym said: |
@axel22 said (edited on Oct 31, 2013 5:19:06 PM UTC): As for easiness - seems to me it should work. You could also do a fast path: def getOrElseUpdate(k: K, fresh: =>V): V = {
val cell = tm.lookup(k)
if (cell != null) cell.value
else {
val proposedCell = new LazyCell(fresh)
val existingCellOpt = tm.putIfAbsent(k, proposedCell)
if (existingCellOpt.nonEmpty) existingCellOpt.get.value
else proposedCell.value
}
} |
@axel22 said: |
@retronym said: class LazyCell[V](var compute: () =>V) {
lazy val value = { val temp = compute; compute = null.asInstanceOf[V]; temp }
}
def getOrElseUpdate(k: K, fresh: =>V): V = {
val proposedCell = new LazyCell(() => fresh)
... |
@axel22 said: |
@retronym said (edited on Oct 31, 2013 8:08:32 PM UTC):
|
@axel22 said: |
Matthew Roberts (mattroberts297) said: |
@axel22 said: |
@axel22 said: |
@adriaanm said: |
@gkossakowski said: |
Mikael Ståldal (mikaelstaldal) said: If the contract is relaxed, I guess that this could be done in scala.collection.concurrent.Map instead, like this: override def getOrElseUpdate(key: A, op: => B): B =
get(key) match {
case Some(v) => v
case None => val d = op; putIfAbsent(key, d).getOrElse(d)
} |
@axel22 said (edited on Sep 24, 2014 1:33:43 PM UTC): The basic problem here is that the semantics of the Once this is fixed in the Note: invoking this method may potentially evaluate the expression `op` multiple times without storing it in the map. a fix similar to what you suggest can be added to the |
@retronym said: |
Morten Andersen-Gott (magott) said (edited on Dec 12, 2014 10:43:23 AM UTC): Which also does not guarantee that the mapping function is called only once under racy conditions |
Andrei Pozolotin (andrei.pozolotin) said: |
@axel22 said: |
Jeff Olson (jdolson) said: |
@axel22 said: Eventually, we want to do the same in |
@axel22 said: |
Andrei Pozolotin (andrei.pozolotin) said: |
@axel22 said (edited on Feb 10, 2015 9:21:22 PM UTC): Specifically, in the case of the |
Morten Andersen-Gott (magott) said: |
Andrei Pozolotin (andrei.pozolotin) said: |
@axel22 said: In
For these reasons, I'd be inclined to push back and say that it's not worth it. |
Andrei Pozolotin (andrei.pozolotin) said: probably you could relax the "purity" of |
@axel22 said (edited on Feb 11, 2015 7:12:31 PM UTC): I do, however, care about purity and simplicity of any implementation, and I have an emotional attachment to the maintainers. :) Why do I think extra, unwarranted complexity is the case here? First, the test case in this bug report always fails at least at the latest in the second iteration of the loop - that's because the test case is checking the wrong thing - one should replace Second, let's see what the angry mob has to say:
Third, how would you even implement this correctly?
class LazyNode(key: K, thunk: () => V) {
lazy val value = thunk()
} Instead, we must write a custom lock around So, you see, it's possible, it's just that it's somewhat complex. Fourth, after thinking about it in the background mode for some time today, the execute-mapping-only-if-inserted semantics is less useful than you think. Here's why - assume that thread A calls def getOrElseUpdate(k: K, op: =>V): V = {
get(k) match {
case Some(v) => v
case None =>
val v = op
putIfAbsent(k, v) // mutable.MapLike implementation -> update(k, v)
v
} (note: this is different from the current How could a thread B that calls
There is one other situation where a program could observe this - I challenge you to find it. So, I feel convinced that an execute-mapping-only-if-inserted semantics are not worth it here. In any case, I would not run blindly into adding the possibly unnecessary complexity. |
Andrei Pozolotin (andrei.pozolotin) said: |
@axel22 said (edited on Feb 11, 2015 11:03:21 PM UTC): Unless carefully designed tricks are employed (such as the locked Lazy node described earlier), introducing a single lock to a lock-free data structure operation, usually pollutes all other operations with the need for locking :/ |
Jeff Olson (jdolson) said: override def getOrElseUpdate(key: A, op: => B): B = get(key) match {
case Some(v) => v
case None =>
val v = op
putIfAbsent(key, v).getOrElse(v)
} This should be the default implementation in Aside: def adjust(key: A, op: Option[B] => Option[B]): Unit function added to the |
@axel22 said:
|
@adriaanm said: |
Andrei Pozolotin (andrei.pozolotin) said: |
- For BitMasks use TrieMap and a more careful get or update scheme - Reference: [github.com/scala/bug/issues/7943](scala/bug#7943)
- For BitMasks use TrieMap and a more careful get or update scheme - Reference: [github.com/scala/bug/issues/7943](scala/bug#7943)
The ScalaDoc states that a scala.collection.concurrent.TrieMap is thread-safe:
But the method
getOrElseUpdate
of scala.collection.mutable.MapLike is not overridden and is therefore not thread safe:This issue can be recreated with the following code:
Please let me know if you need more information.
As a workaround you can write:
The text was updated successfully, but these errors were encountered: