Lazy vals initialization shouldn't lock on owner #4798
Comments
Imported From: https://issues.scala-lang.org/browse/SUGGEST-11?orig=1
@jrudolph said:
At https://gist.github.com/1076016 I posted an example file to experiment with various solutions.
A solution is still needed for primitive values: those are currently boxed, so access to them is slower than with the existing implementation. One could either accept that or introduce another field per primitive lazy val to hold the lock; I'm not sure it can be done more cheaply than that. Some may argue that this change makes lazy val initialization slower. One possible answer would be to provide different implementations of lazy vals (e.g. selected by annotations) for different situations. One should certainly benchmark before deciding on a solution, but my take is that betting on better parallelizability will be more beneficial than trying to squeeze the last bits out of an implementation that is (or will be) fairly well optimized by the JVM.
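To make the "another field per primitive lazy val" option concrete, here is a hand-written sketch of what such an encoding might look like. It is not what the compiler emits and not necessarily what the gist does; the names `Owner`, `sizeLock`, `sizeReady`, `sizeValue` and `computeSize` are made up for illustration.

```scala
// Hypothetical hand-written encoding of one primitive lazy value
// with its own lock field, so the Int is never boxed.
class Owner {
  // The extra field: a dedicated lock for this one lazy value.
  private val sizeLock = new AnyRef
  @volatile private var sizeReady = false
  private var sizeValue: Int = 0            // stays a primitive field

  def size: Int = {
    if (!sizeReady) sizeLock.synchronized {
      if (!sizeReady) {                     // re-check under the lock
        sizeValue = computeSize()
        sizeReady = true                    // volatile write publishes sizeValue
      }
    }
    sizeValue
  }

  // Stand-in for the real initializer.
  private def computeSize(): Int = 42
}
```

The cost is visible here: one additional reference field per lazy val, plus the flag, in exchange for not holding the owner's monitor during initialization.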
@SethTisue said: Available mechanisms for making suggestions include:
We were recently reminded that initialization of a lazy val locks on the object holding the lazy val. This makes lazy vals awkward to use in a concurrent setting when calculating a value is expensive, because the lock may then be held for a long time. It is especially costly when an (initializing) access to a cheaply calculated lazy val in the same object is blocked by the long-held lock for another lazy val's calculation. This would suggest a rule to use lazy vals only where performance requires them for long-running calculations, and not for smaller ones. That, however, isn't possible, because lazy vals are commonly needed to get around initialization order issues.
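The situation described above can be reproduced with a small experiment. This is only a sketch: `Services`, `LazyValContention` and the 500 ms sleep are invented for illustration, and the observed delay depends on how the compiler in use encodes lazy vals.

```scala
import java.util.concurrent.{CountDownLatch, TimeUnit}

// Hypothetical class: two unrelated lazy vals on the same owner.
class Services {
  val started = new CountDownLatch(1)
  // Stands in for a long-running calculation; signals once it has begun.
  lazy val expensive: Int = { started.countDown(); Thread.sleep(500); 42 }
  lazy val cheap: Int = 1
}

object LazyValContention {
  // Measures how long the first access to `cheap` takes (in milliseconds)
  // while another thread is still initializing `expensive`.
  def cheapAccessMillis(): Long = {
    val s = new Services
    val t = new Thread(new Runnable {
      def run(): Unit = { s.expensive; () }
    })
    t.start()
    s.started.await()                 // the initializer of `expensive` is running now
    val begin = System.nanoTime()
    val c = s.cheap                   // first (initializing) access
    val elapsed = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - begin)
    t.join()
    assert(c == 1)
    elapsed
  }

  def main(args: Array[String]): Unit =
    println(s"first access to cheap took ${cheapAccessMillis()} ms")
}
```

With an encoding that synchronizes on the owner, as this report describes, the measured time should be close to the remaining sleep of `expensive`. A compiler that uses a different scheme may not show the delay at all.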
So lazy vals, while idiomatic, are an obstacle to parallelization. This is a pity: in the common cake pattern you will likely compose several unrelated traits that contain lazy vals, and these can't be initialized in parallel, which slows down the startup of applications that build their central object with the cake pattern. In an application of a certain size, debugging these problems is no fun, because the central object is called from everywhere, and to parallelize properly you have to restart your program repeatedly to find all accesses to lazy vals in the call path (or use a profiler).
One workaround I see right now is to manually touch, up front, the cheap lazy vals that will be used concurrently, to be sure their calculation won't run into a lock. The other is to use a special LazyVal class, as I used to do in Java, and lock on it for the long-running calculations. Neither workaround is nearly as usable as the built-in keyword.
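A minimal sketch of the second workaround might look like the following. The original post does not show its `LazyVal` class, so everything here (the `get` method, the companion `apply`, the `Registry` example) is an assumption about one plausible shape, not the author's actual code.

```scala
// Hypothetical helper, not part of the standard library: a lazily computed
// value that synchronizes on itself rather than on the enclosing object.
final class LazyVal[A](compute: () => A) {
  // None until computed. Note that primitives are boxed in this sketch.
  @volatile private var cached: Option[A] = None

  def get: A = cached match {
    case Some(a) => a                     // fast path, no locking
    case None =>
      this.synchronized {                 // lock on this holder only
        cached match {
          case Some(a) => a               // another thread finished first
          case None =>
            val a = compute()
            cached = Some(a)
            a
        }
      }
  }
}

object LazyVal {
  def apply[A](body: => A): LazyVal[A] = new LazyVal(() => body)
}

// Hypothetical owner: the long-running value gets its own lock,
// while the cheap one remains an ordinary lazy val.
class Registry {
  private val expensiveHolder = LazyVal { Thread.sleep(500); 42 }
  def expensive: Int = expensiveHolder.get
  lazy val cheap: Int = 1
}
```

The first workaround needs no helper at all: it amounts to reading `cheap` once on the startup thread before any worker threads are spawned. The extra holder field and the `expensiveHolder.get` indirection show why neither approach is as convenient as the keyword.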
This may seem like a special case, but IMO it will become more of a trap as more software has to run concurrently. Therefore, lazy vals should be improved to have a lock per lazy val in the default case. A final solution would have to balance memory use (the per-lazy-val lock has to be stored somewhere) against ease of use. David MacIver once conceived of a way to do so.