SI-9824 SI-9814 proper locking scope for lazy vals in lambdas #133

Closed
adriaanm opened this Issue Apr 20, 2016 · 21 comments

Projects

None yet

3 participants

@adriaanm
Member
adriaanm commented Apr 20, 2016 edited

Consolidating comments from #161 and #177:

  • SI-9824 deadlock using parallel collections with delambdafy:method
  • SI-9814 Function0 doesn't synchronize properly

solution

Let users opt-in to old encoding + feature in release notes.

a look at the byte code

lazy val in closure has different locking scope due to delambdafy:method. It used to be the anonymous function class:

Welcome to Scala 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_71).
Type in expressions for evaluation. Or try :help.

scala> :paste -raw
// Entering paste mode (ctrl-D to finish)

package p; class C { def foo = () => { lazy val x = 42; x } }

// Exiting paste mode, now interpreting.


scala> new p.C().foo
res0: () => Int = <function0>

scala> new p.C().foo.getClass
res1: Class[_ <: () => Int] = class p.C$$anonfun$foo$1

scala> :javap -private -c p.C$$anonfun$foo$1
Compiled from "<pastie>"
public final class p.C$$anonfun$foo$1 extends scala.runtime.AbstractFunction0$mcI$sp implements scala.Serializable {
  public static final long serialVersionUID;

  public final int apply();
    Code:
       0: aload_0
       1: invokevirtual #21                 // Method apply$mcI$sp:()I
       4: ireturn

  public int apply$mcI$sp();
    Code:
       0: invokestatic  #29                 // Method scala/runtime/IntRef.zero:()Lscala/runtime/IntRef;
       3: astore_1
       4: iconst_0
       5: invokestatic  #35                 // Method scala/runtime/VolatileByteRef.create:(B)Lscala/runtime/VolatileByteRef;
       8: astore_2
       9: aload_0
      10: aload_1
      11: aload_2
      12: invokespecial #39                 // Method x$1:(Lscala/runtime/IntRef;Lscala/runtime/VolatileByteRef;)I
      15: ireturn

  public final java.lang.Object apply();
    Code:
       0: aload_0
       1: invokevirtual #46                 // Method apply:()I
       4: invokestatic  #52                 // Method scala/runtime/BoxesRunTime.boxToInteger:(I)Ljava/lang/Integer;
       7: areturn

  private final int x$lzycompute$1(scala.runtime.IntRef, scala.runtime.VolatileByteRef);
    Code:
       0: aload_0
       1: dup
       2: astore_3
       3: monitorenter
       4: aload_2
       5: getfield      #57                 // Field scala/runtime/VolatileByteRef.elem:B
       8: iconst_1
       9: iand
      10: i2b
      11: iconst_0
      12: if_icmpne     32
      15: aload_1
      16: bipush        42
      18: putfield      #60                 // Field scala/runtime/IntRef.elem:I
      21: aload_2
      22: aload_2
      23: getfield      #57                 // Field scala/runtime/VolatileByteRef.elem:B
      26: iconst_1
      27: ior
      28: i2b
      29: putfield      #57                 // Field scala/runtime/VolatileByteRef.elem:B
      32: getstatic     #66                 // Field scala/runtime/BoxedUnit.UNIT:Lscala/runtime/BoxedUnit;
      35: pop
      36: aload_3
      37: monitorexit
      38: aload_1
      39: getfield      #60                 // Field scala/runtime/IntRef.elem:I
      42: ireturn
      43: aload_3
      44: monitorexit
      45: athrow
    Exception table:
       from    to  target type
           4    38    43   any

But is now the enclosing class of the lambda:

⚡ ~/scala/2.12/bin/scala
Welcome to Scala 2.12.0-M4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_71).
Type in expressions for evaluation. Or try :help.

scala> :paste -raw
// Entering paste mode (ctrl-D to finish)

package p; class C { def foo = () => { lazy val x = 42; x } }

// Exiting paste mode, now interpreting.


scala> :javap -private -c p.C
Compiled from "<pastie>"
public class p.C {
  public scala.Function0<java.lang.Object> foo();
    Code:
       0: aload_0
       1: invokedynamic #37,  0             // InvokeDynamic #0:apply$mcI$sp:(Lp/C;)Lscala/runtime/java8/JFunction0$mcI$sp;
       6: areturn

  private final int x$lzycompute$1(scala.runtime.IntRef, scala.runtime.VolatileByteRef);
    Code:
       0: aload_0
       1: dup
       2: astore_3
       3: monitorenter
       4: aload_2
       5: getfield      #51                 // Field scala/runtime/VolatileByteRef.elem:B
       8: iconst_1
       9: iand
      10: i2b
      11: iconst_0
      12: if_icmpne     32
      15: aload_1
      16: bipush        42
      18: putfield      #56                 // Field scala/runtime/IntRef.elem:I
      21: aload_2
      22: aload_2
      23: getfield      #51                 // Field scala/runtime/VolatileByteRef.elem:B
      26: iconst_1
      27: ior
      28: i2b
      29: putfield      #51                 // Field scala/runtime/VolatileByteRef.elem:B
      32: getstatic     #62                 // Field scala/runtime/BoxedUnit.UNIT:Lscala/runtime/BoxedUnit;
      35: pop
      36: aload_3
      37: monitorexit
      38: goto          44
      41: aload_3
      42: monitorexit
      43: athrow
      44: aload_1
      45: getfield      #56                 // Field scala/runtime/IntRef.elem:I
      48: ireturn
    Exception table:
       from    to  target type
           4    36    41   Class java/lang/Throwable
@adriaanm adriaanm modified the milestone: 2.12.0-RC1 May 2, 2016
@adriaanm adriaanm self-assigned this Jul 12, 2016
@adriaanm
Member

probably all we can do is document the workaround + expose option to use old encoding

@adriaanm adriaanm changed the title from lazy val in closure has different locking scope due to delambdafy:method to sams and locks Jul 18, 2016
@DarkDimius
Member
DarkDimius commented Aug 3, 2016 edited

@adriaanm @retronym note that issue with a lazy val inside a lambda is fixed in Dotty.

@DarkDimius
Member

It is both quite easy to fix and removes one more indirection present in code.
Instead of having a pair of (IntRef, VolatileByteRef) dotty has a single dotty.runtime.LazyInt class that it synchronizes on. In your case, you can synchronize over VolatileByteRef.

import dotty.runtime.LazyInt;
import scala.Function0;
import scala.runtime.BoxesRunTime;

// version in dotty
public class C {
    public Function0 foo() {
        return () -> this.$anonfun$foo$1();
    }

    private int x$lzyINIT1$1(LazyInt x$lzy1$1) {
        Integer n;
        LazyInt lazyInt = x$lzy1$1;
        synchronized (lazyInt) {
            int n2;
            if (x$lzy1$1.initialized()) {
                n2 = x$lzy1$1.value();
            } else {
                x$lzy1$1.initialized_$eq(true);
                x$lzy1$1.value_$eq(42);
                n2 = x$lzy1$1.value();
            }
            n = BoxesRunTime.boxToInteger((int)n2);
        }
        return BoxesRunTime.unboxToInt((Object)n);
    }

    private int x$1(LazyInt x$lzy1$2) {
        return x$lzy1$2.initialized() ? x$lzy1$2.value() : this.x$lzyINIT1$1(x$lzy1$2);
    }

    private int $anonfun$foo$1() {
        LazyInt lazyInt = new LazyInt();
        return this.x$1(lazyInt);
    }
}
@adriaanm adriaanm changed the title from sams and locks to proper locking scope for lazy vals in lambdas Aug 4, 2016
@lrytz
Member
lrytz commented Aug 12, 2016

Implementing it is not trivial: when generating the lzy$compute method (which synchronizes on C.this) in lazyvals the VolatileByteRef does not exist yet. It is only added later when lambdalift moves the method out of the def $anonfun$f nest.

How is this handled in dotty? Does the lazy val translation know if a lazy val is captured and needs boxing?

@DarkDimius
Member

@lrytz,
Dotty has multiple lazy vals schemes.
For local lazy vals, such as one in example - it uses holder classes https://github.com/lampepfl/dotty/blob/master/src/dotty/runtime/LazyHolders.scala. This saves one object allocation and allows to have finer-grained synchronization.

@lrytz
Member
lrytz commented Aug 12, 2016 edited

How does the translation decide whether a holder is needed or not?

Scalac always generates a local variable for the bitmap. In the example, the bitmap is transformed into a VolatileByteRef in lambdalift, which comes later in the pipeline.

@DarkDimius
Member

@lrytz, for local vals, it's generated unconditionally.

@lrytz
Member
lrytz commented Aug 12, 2016

So you rely on the optimizer / linker / hotspot to inline the box?

@DarkDimius
Member

@lrytz, yes, similarly to how it is done in scalac. But in scalac you rely on two boxes being inlined at once(the VolatileByteRef and the actual value). Here, only one box needs to be inlined.

@lrytz
Member
lrytz commented Aug 12, 2016

Oh, I see now. I was going to say that in Scalac we don't create any boxes if the lazy val doesn't escape, but it always does because of the accessor and the compute methods.

@adriaanm adriaanm changed the title from proper locking scope for lazy vals in lambdas to SI-9824 SI-9814 proper locking scope for lazy vals in lambdas Aug 12, 2016
@lrytz
Member
lrytz commented Aug 15, 2016

@DarkDimius in dotc's translation (your example), is there a reason to create the Integer box?

@DarkDimius
Member

@lrytz, no, that's an artifact of how we encode synchronized blocks.
It can be removed by a phase that would special-case this late in the pipeline, or by a rewrite-rule.

@lrytz
Member
lrytz commented Aug 15, 2016

@DarkDimius thanks. BTW, LazyFloat is missing in LazyHolders.scala :)

@lrytz lrytz referenced this issue in scala/scala Aug 15, 2016
Merged

Fields: expand lazy vals during fields, like modules #5294

25 of 25 tasks complete
@lrytz lrytz added a commit to lrytz/scala that referenced this issue Aug 16, 2016
@lrytz lrytz Desugar local lazy vals into a box with a volatile boolean init field
Inspired by dotc, desugar a local `lazy val x = rhs` into

    val x$lzy = new scala.runtime.LazyInt()
    def x(): Int = {
      if (!x$lzy.initialized) {
        x$lzy.synchronized {
          if (!x$lzy.initialized) {
            x$lzy.initialized = true
            x$lzy.value = rhs
          }
        }
      }
      x$lzy.value
    }

Note that the 2.11 decoding (into a local variable and a bitmap) also
creates boxes for local lazy vals, in fact two for each lazy val:

    def f = {
      lazy val x = 0
      x
    }

desugars to

    public int f() {
        IntRef x$lzy = IntRef.zero();
        VolatileByteRef bitmap$0 = VolatileByteRef.create((byte)0);
        return this.x$1(x$lzy, bitmap$0);
    }

    private final int x$lzycompute$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
        C c = this;
        synchronized (c) {
            if ((byte)(bitmap$0$1.elem & 1) == 0) {
                x$lzy$1.elem = 0;
                bitmap$0$1.elem = (byte)(bitmap$0$1.elem | 1);
            }
            return x$lzy$1.elem;
        }
    }

    private final int x$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
        return (byte)(bitmap$0$1.elem & 1) == 0 ? this.x$lzycompute$1(x$lzy$1, bitmap$0$1) : x$lzy$1.elem;
    }

An additionl problem with the above encoding is that the `lzycompute`
method synchronizes on `this`. In connection with the new lambda
encoding that no longer generates anonymous classes, captured lazy vals
no longer synchronize on the lambda object.

The new encoding solves this problem by synchronizing on the lazy
holder.

Resolves scala/scala-dev#133
0a95d45
@lrytz lrytz added a commit to lrytz/scala that referenced this issue Aug 16, 2016
@lrytz lrytz Desugar local lazy vals into a box with a volatile boolean init field
Inspired by dotc, desugar a local `lazy val x = rhs` into

    val x$lzy = new scala.runtime.LazyInt()
    def x(): Int = {
      if (!x$lzy.initialized) {
        x$lzy.synchronized {
          if (!x$lzy.initialized) {
            x$lzy.initialized = true
            x$lzy.value = rhs
          }
        }
      }
      x$lzy.value
    }

Note that the 2.11 decoding (into a local variable and a bitmap) also
creates boxes for local lazy vals, in fact two for each lazy val:

    def f = {
      lazy val x = 0
      x
    }

desugars to

    public int f() {
        IntRef x$lzy = IntRef.zero();
        VolatileByteRef bitmap$0 = VolatileByteRef.create((byte)0);
        return this.x$1(x$lzy, bitmap$0);
    }

    private final int x$lzycompute$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
        C c = this;
        synchronized (c) {
            if ((byte)(bitmap$0$1.elem & 1) == 0) {
                x$lzy$1.elem = 0;
                bitmap$0$1.elem = (byte)(bitmap$0$1.elem | 1);
            }
            return x$lzy$1.elem;
        }
    }

    private final int x$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
        return (byte)(bitmap$0$1.elem & 1) == 0 ? this.x$lzycompute$1(x$lzy$1, bitmap$0$1) : x$lzy$1.elem;
    }

An additionl problem with the above encoding is that the `lzycompute`
method synchronizes on `this`. In connection with the new lambda
encoding that no longer generates anonymous classes, captured lazy vals
no longer synchronize on the lambda object.

The new encoding solves this problem by synchronizing on the lazy
holder.

Resolves scala/scala-dev#133
26fbaf6
@lrytz lrytz added a commit to lrytz/scala that referenced this issue Aug 16, 2016
@lrytz lrytz Desugar local lazy vals into a box with a volatile boolean init field
Inspired by dotc, desugar a local `lazy val x = rhs` into

    val x$lzy = new scala.runtime.LazyInt()
    def x(): Int = {
      if (!x$lzy.initialized) {
        x$lzy.synchronized {
          if (!x$lzy.initialized) {
            x$lzy.initialized = true
            x$lzy.value = rhs
          }
        }
      }
      x$lzy.value
    }

Note that the 2.11 decoding (into a local variable and a bitmap) also
creates boxes for local lazy vals, in fact two for each lazy val:

    def f = {
      lazy val x = 0
      x
    }

desugars to

    public int f() {
        IntRef x$lzy = IntRef.zero();
        VolatileByteRef bitmap$0 = VolatileByteRef.create((byte)0);
        return this.x$1(x$lzy, bitmap$0);
    }

    private final int x$lzycompute$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
        C c = this;
        synchronized (c) {
            if ((byte)(bitmap$0$1.elem & 1) == 0) {
                x$lzy$1.elem = 0;
                bitmap$0$1.elem = (byte)(bitmap$0$1.elem | 1);
            }
            return x$lzy$1.elem;
        }
    }

    private final int x$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
        return (byte)(bitmap$0$1.elem & 1) == 0 ? this.x$lzycompute$1(x$lzy$1, bitmap$0$1) : x$lzy$1.elem;
    }

An additionl problem with the above encoding is that the `lzycompute`
method synchronizes on `this`. In connection with the new lambda
encoding that no longer generates anonymous classes, captured lazy vals
no longer synchronize on the lambda object.

The new encoding solves this problem by synchronizing on the lazy
holder.

Resolves scala/scala-dev#133
3d56b1f
@lrytz lrytz added a commit to lrytz/scala that referenced this issue Aug 16, 2016
@lrytz lrytz Desugar local lazy vals into a box with a volatile boolean init field
Inspired by dotc, desugar a local `lazy val x = rhs` into

    val x$lzy = new scala.runtime.LazyInt()
    def x(): Int = {
      if (!x$lzy.initialized) {
        x$lzy.synchronized {
          if (!x$lzy.initialized) {
            x$lzy.initialized = true
            x$lzy.value = rhs
          }
        }
      }
      x$lzy.value
    }

Note that the 2.11 decoding (into a local variable and a bitmap) also
creates boxes for local lazy vals, in fact two for each lazy val:

    def f = {
      lazy val x = 0
      x
    }

desugars to

    public int f() {
        IntRef x$lzy = IntRef.zero();
        VolatileByteRef bitmap$0 = VolatileByteRef.create((byte)0);
        return this.x$1(x$lzy, bitmap$0);
    }

    private final int x$lzycompute$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
        C c = this;
        synchronized (c) {
            if ((byte)(bitmap$0$1.elem & 1) == 0) {
                x$lzy$1.elem = 0;
                bitmap$0$1.elem = (byte)(bitmap$0$1.elem | 1);
            }
            return x$lzy$1.elem;
        }
    }

    private final int x$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
        return (byte)(bitmap$0$1.elem & 1) == 0 ? this.x$lzycompute$1(x$lzy$1, bitmap$0$1) : x$lzy$1.elem;
    }

An additionl problem with the above encoding is that the `lzycompute`
method synchronizes on `this`. In connection with the new lambda
encoding that no longer generates anonymous classes, captured lazy vals
no longer synchronize on the lambda object.

The new encoding solves this problem by synchronizing on the lazy
holder.

Resolves scala/scala-dev#133
81198e2
@lrytz lrytz added a commit to lrytz/scala that referenced this issue Aug 17, 2016
@lrytz lrytz Desugar local lazy vals into a box with a volatile boolean init field
Inspired by dotc, desugar a local `lazy val x = rhs` into

    val x$lzy = new scala.runtime.LazyInt()
    def x(): Int = {
      if (!x$lzy.initialized) {
        x$lzy.synchronized {
          if (!x$lzy.initialized) {
            x$lzy.initialized = true
            x$lzy.value = rhs
          }
        }
      }
      x$lzy.value
    }

Note that the 2.11 decoding (into a local variable and a bitmap) also
creates boxes for local lazy vals, in fact two for each lazy val:

    def f = {
      lazy val x = 0
      x
    }

desugars to

    public int f() {
        IntRef x$lzy = IntRef.zero();
        VolatileByteRef bitmap$0 = VolatileByteRef.create((byte)0);
        return this.x$1(x$lzy, bitmap$0);
    }

    private final int x$lzycompute$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
        C c = this;
        synchronized (c) {
            if ((byte)(bitmap$0$1.elem & 1) == 0) {
                x$lzy$1.elem = 0;
                bitmap$0$1.elem = (byte)(bitmap$0$1.elem | 1);
            }
            return x$lzy$1.elem;
        }
    }

    private final int x$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
        return (byte)(bitmap$0$1.elem & 1) == 0 ? this.x$lzycompute$1(x$lzy$1, bitmap$0$1) : x$lzy$1.elem;
    }

An additionl problem with the above encoding is that the `lzycompute`
method synchronizes on `this`. In connection with the new lambda
encoding that no longer generates anonymous classes, captured lazy vals
no longer synchronize on the lambda object.

The new encoding solves this problem by synchronizing on the lazy
holder.

Resolves scala/scala-dev#133
efd5425
@adriaanm adriaanm added a commit to adriaanm/scala that referenced this issue Aug 17, 2016
@lrytz @adriaanm lrytz + adriaanm Desugar local lazy vals into a box with a volatile boolean init field
Inspired by dotc, desugar a local `lazy val x = rhs` into

    val x$lzy = new scala.runtime.LazyInt()
    def x(): Int = {
      if (!x$lzy.initialized) {
        x$lzy.synchronized {
          if (!x$lzy.initialized) {
            x$lzy.initialized = true
            x$lzy.value = rhs
          }
        }
      }
      x$lzy.value
    }

Note that the 2.11 decoding (into a local variable and a bitmap) also
creates boxes for local lazy vals, in fact two for each lazy val:

    def f = {
      lazy val x = 0
      x
    }

desugars to

    public int f() {
        IntRef x$lzy = IntRef.zero();
        VolatileByteRef bitmap$0 = VolatileByteRef.create((byte)0);
        return this.x$1(x$lzy, bitmap$0);
    }

    private final int x$lzycompute$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
        C c = this;
        synchronized (c) {
            if ((byte)(bitmap$0$1.elem & 1) == 0) {
                x$lzy$1.elem = 0;
                bitmap$0$1.elem = (byte)(bitmap$0$1.elem | 1);
            }
            return x$lzy$1.elem;
        }
    }

    private final int x$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
        return (byte)(bitmap$0$1.elem & 1) == 0 ? this.x$lzycompute$1(x$lzy$1, bitmap$0$1) : x$lzy$1.elem;
    }

An additional problem with the above encoding is that the `lzycompute`
method synchronizes on `this`. In connection with the new lambda
encoding that no longer generates anonymous classes, captured lazy vals
no longer synchronize on the lambda object.

The new encoding solves this problem by synchronizing on the lazy
holder.

Resolves scala/scala-dev#133
b71e98d
@adriaanm adriaanm added a commit to adriaanm/scala that referenced this issue Aug 18, 2016
@lrytz @adriaanm lrytz + adriaanm Desugar local lazy vals into a box with a volatile boolean init field
Inspired by dotc, desugar a local `lazy val x = rhs` into

    val x$lzy = new scala.runtime.LazyInt()
    def x(): Int = {
      if (!x$lzy.initialized) {
        x$lzy.synchronized {
          if (!x$lzy.initialized) {
            x$lzy.initialized = true
            x$lzy.value = rhs
          }
        }
      }
      x$lzy.value
    }

Note that the 2.11 decoding (into a local variable and a bitmap) also
creates boxes for local lazy vals, in fact two for each lazy val:

    def f = {
      lazy val x = 0
      x
    }

desugars to

    public int f() {
        IntRef x$lzy = IntRef.zero();
        VolatileByteRef bitmap$0 = VolatileByteRef.create((byte)0);
        return this.x$1(x$lzy, bitmap$0);
    }

    private final int x$lzycompute$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
        C c = this;
        synchronized (c) {
            if ((byte)(bitmap$0$1.elem & 1) == 0) {
                x$lzy$1.elem = 0;
                bitmap$0$1.elem = (byte)(bitmap$0$1.elem | 1);
            }
            return x$lzy$1.elem;
        }
    }

    private final int x$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
        return (byte)(bitmap$0$1.elem & 1) == 0 ? this.x$lzycompute$1(x$lzy$1, bitmap$0$1) : x$lzy$1.elem;
    }

An additional problem with the above encoding is that the `lzycompute`
method synchronizes on `this`. In connection with the new lambda
encoding that no longer generates anonymous classes, captured lazy vals
no longer synchronize on the lambda object.

The new encoding solves this problem by synchronizing on the lazy
holder.

Resolves scala/scala-dev#133
2fca7a0
@adriaanm
Member
adriaanm commented Aug 18, 2016 edited

@DarkDimius, how about shortening this to the following (ignoring the Integer box n):

synchronized (lazyInt) {
  if (!x$lzy1$1.initialized()) {
    x$lzy1$1.initialized_$eq(true);
    x$lzy1$1.value_$eq(42);
  }
  x$lzy1$1.value();
}

As long as value() is read under the monitor, it's fine right? Any thoughts on other publication mechanisms? Why does initialized need to be @volatile, as it's all in the synchronized block anyway? There seems to be no benefit to it, since reading initialized does not give any guarantees about what you'd read in value.

@adriaanm adriaanm added a commit to adriaanm/scala that referenced this issue Aug 19, 2016
@adriaanm adriaanm Fields does bitmaps & synch for lazy vals & modules
Essentially, we fuse mixin and lazyvals into the fields phase.
With fields mixing in trait members into subclasses, we
have all info needed to compute bitmaps, and thus we can
synthesize the synchronisation logic as well.

By doing this before erasure we get better signatures,
and before specialized means specialized lazy vals work now.
Mixins is now almost reduced to its essence: implementing
super accessors and forwarders. It still synthesizes
accessors for param accessors and early init trait vals.

Concretely, trait lazy vals are mixed into subclasses
with the needed synchronization logic in place, as do
lazy vals in classes and methods. Similarly, modules
are initialized using double checked locking.
Since the code to initialize a module is short,
we do not emit compute methods for modules (anymore).

For simplicity, local lazy vals do not get a compute method either.

The strange corner case of constant-typed final lazy vals
is resolved in favor of laziness, by no longer assigning
a constant type to a lazy val (see widenIfNecessary in namers).
If you explicitly ask for something lazy, you get laziness;
with the constant-typedness implicit, it yields to the
conflicting `lazy` modifier because it is explicit.

  Co-Authored-By: Lukas Rytz <lukas@lightbend.com>
  Fixes scala/scala-dev#133

Inspired by dotc, desugar a local `lazy val x = rhs` into

```
val x$lzy = new scala.runtime.LazyInt()
def x(): Int = {
    x$lzy.synchronized {
      if (!x$lzy.initialized) {
        x$lzy.initialized = true
        x$lzy.value = rhs
      }
      x$lzy.value
    }
}
```

Note that the 2.11 decoding (into a local variable and a bitmap) also
creates boxes for local lazy vals, in fact two for each lazy val:

```
def f = {
  lazy val x = 0
  x
}
```

desugars to

```
public int f() {
    IntRef x$lzy = IntRef.zero();
    VolatileByteRef bitmap$0 = VolatileByteRef.create((byte)0);
    return this.x$1(x$lzy, bitmap$0);
}

private final int x$lzycompute$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
    C c = this;
    synchronized (c) {
        if ((byte)(bitmap$0$1.elem & 1) == 0) {
            x$lzy$1.elem = 0;
            bitmap$0$1.elem = (byte)(bitmap$0$1.elem | 1);
        }
        return x$lzy$1.elem;
    }
}

private final int x$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
    return (byte)(bitmap$0$1.elem & 1) == 0 ?
      this.x$lzycompute$1(x$lzy$1, bitmap$0$1) : x$lzy$1.elem;
}
```

An additional problem with the above encoding is that the `lzycompute`
method synchronizes on `this`. In connection with the new lambda
encoding that no longer generates anonymous classes, captured lazy vals
no longer synchronize on the lambda object.

The new encoding solves this problem (scala/scala-dev#133)
by synchronizing on the lazy holder.

Currently, we don't exploit the fact that the initialized field
is `@volatile`, because it's not clear the performance is needed
for local lazy vals (as they are not contended, and as soon as
the VM warms up, biased locking should deal with that)

Note, be very very careful when moving to double-checked locking,
as this needs a different variation than the one we use for
class-member lazy vals. A read of a volatile field of a class
does not necessarily impart any knowledge about a "subsequent" read
of another non-volatile field of the same object. A pair of
volatile reads and write can be used to implement a lock, but it's
not clear if the complexity is worth an unproven performance gain.
(Once the performance gain is proven, let's change the encoding.)

  - don't explicitly init bitmap in bytecode
  - must apply method to () explicitly after uncurry
f52047f
@adriaanm adriaanm added a commit to adriaanm/scala that referenced this issue Aug 19, 2016
@adriaanm @adriaanm adriaanm + adriaanm Fields does bitmaps & synch for lazy vals & modules
Essentially, we fuse mixin and lazyvals into the fields phase.
With fields mixing in trait members into subclasses, we
have all info needed to compute bitmaps, and thus we can
synthesize the synchronisation logic as well.

By doing this before erasure we get better signatures,
and before specialized means specialized lazy vals work now.
Mixins is now almost reduced to its essence: implementing
super accessors and forwarders. It still synthesizes
accessors for param accessors and early init trait vals.

Concretely, trait lazy vals are mixed into subclasses
with the needed synchronization logic in place, as do
lazy vals in classes and methods. Similarly, modules
are initialized using double checked locking.
Since the code to initialize a module is short,
we do not emit compute methods for modules (anymore).

For simplicity, local lazy vals do not get a compute method either.

The strange corner case of constant-typed final lazy vals
is resolved in favor of laziness, by no longer assigning
a constant type to a lazy val (see widenIfNecessary in namers).
If you explicitly ask for something lazy, you get laziness;
with the constant-typedness implicit, it yields to the
conflicting `lazy` modifier because it is explicit.

  Co-Authored-By: Lukas Rytz <lukas@lightbend.com>
  Fixes scala/scala-dev#133

Inspired by dotc, desugar a local `lazy val x = rhs` into

```
val x$lzy = new scala.runtime.LazyInt()
def x(): Int = {
    x$lzy.synchronized {
      if (!x$lzy.initialized) {
        x$lzy.initialized = true
        x$lzy.value = rhs
      }
      x$lzy.value
    }
}
```

Note that the 2.11 decoding (into a local variable and a bitmap) also
creates boxes for local lazy vals, in fact two for each lazy val:

```
def f = {
  lazy val x = 0
  x
}
```

desugars to

```
public int f() {
    IntRef x$lzy = IntRef.zero();
    VolatileByteRef bitmap$0 = VolatileByteRef.create((byte)0);
    return this.x$1(x$lzy, bitmap$0);
}

private final int x$lzycompute$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
    C c = this;
    synchronized (c) {
        if ((byte)(bitmap$0$1.elem & 1) == 0) {
            x$lzy$1.elem = 0;
            bitmap$0$1.elem = (byte)(bitmap$0$1.elem | 1);
        }
        return x$lzy$1.elem;
    }
}

private final int x$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
    return (byte)(bitmap$0$1.elem & 1) == 0 ?
      this.x$lzycompute$1(x$lzy$1, bitmap$0$1) : x$lzy$1.elem;
}
```

An additional problem with the above encoding is that the `lzycompute`
method synchronizes on `this`. In connection with the new lambda
encoding that no longer generates anonymous classes, captured lazy vals
no longer synchronize on the lambda object.

The new encoding solves this problem (scala/scala-dev#133)
by synchronizing on the lazy holder.

Currently, we don't exploit the fact that the initialized field
is `@volatile`, because it's not clear the performance is needed
for local lazy vals (as they are not contended, and as soon as
the VM warms up, biased locking should deal with that)

Note, be very very careful when moving to double-checked locking,
as this needs a different variation than the one we use for
class-member lazy vals. A read of a volatile field of a class
does not necessarily impart any knowledge about a "subsequent" read
of another non-volatile field of the same object. A pair of
volatile reads and write can be used to implement a lock, but it's
not clear if the complexity is worth an unproven performance gain.
(Once the performance gain is proven, let's change the encoding.)

  - don't explicitly init bitmap in bytecode
  - must apply method to () explicitly after uncurry
26bf2fc
@lrytz
Member
lrytz commented Aug 22, 2016

@adriaanm IUC Aleksey's post correctly, the problem with this approach is that it synchronizes on every access of the lazy val, which is slow. The idea of double-checked locking (again, IUC wikipedia correctly) is to return an initialized lazy val without acquiring the lock. Since the initialized field is checked outside the synchronized block in double-checked locking, it needs to be volatile.

@DarkDimius
Member

@adriaanm, I've benchmarked the version you propose as part of https://github.com/DarkDimius/lazy-val-bench and It was slower even in case of absence of contention. Though such VM behavior in this case may have changed over last 2 years, I'd assume it's still the case.

@lrytz
Member
lrytz commented Aug 22, 2016

@adriaanm i see now what you're saying about no guarantees reading non-volatile value after seeing volatile initialized == true; need to look/think at it a bit more.

@adriaanm
Member
adriaanm commented Aug 22, 2016 edited

@DarkDimius that's surprising to me. Could you share any thoughts on why

synchronized (lazyInt) {
  int n2;
  if (x$lzy1$1.initialized()) {
    n2 = x$lzy1$1.value();
  } else {
    x$lzy1$1.initialized_$eq(true);
    x$lzy1$1.value_$eq(42);
    n2 = x$lzy1$1.value();
  }
  n = BoxesRunTime.boxToInteger((int)n2);
}

is faster than:

synchronized (lazyInt) {
  if (!x$lzy1$1.initialized()) {
    x$lzy1$1.initialized_$eq(true);
    x$lzy1$1.value_$eq(42);
  }
  x$lzy1$1.value();
}

?

@adriaanm adriaanm added a commit to adriaanm/scala that referenced this issue Aug 22, 2016
@adriaanm @adriaanm adriaanm + adriaanm Fields does bitmaps & synch for lazy vals & modules
Essentially, we fuse mixin and lazyvals into the fields phase.
With fields mixing in trait members into subclasses, we
have all info needed to compute bitmaps, and thus we can
synthesize the synchronisation logic as well.

By doing this before erasure we get better signatures,
and before specialized means specialized lazy vals work now.
Mixins is now almost reduced to its essence: implementing
super accessors and forwarders. It still synthesizes
accessors for param accessors and early init trait vals.

Concretely, trait lazy vals are mixed into subclasses
with the needed synchronization logic in place, as do
lazy vals in classes and methods. Similarly, modules
are initialized using double checked locking.
Since the code to initialize a module is short,
we do not emit compute methods for modules (anymore).

For simplicity, local lazy vals do not get a compute method either.

The strange corner case of constant-typed final lazy vals
is resolved in favor of laziness, by no longer assigning
a constant type to a lazy val (see widenIfNecessary in namers).
If you explicitly ask for something lazy, you get laziness;
with the constant-typedness implicit, it yields to the
conflicting `lazy` modifier because it is explicit.

  Co-Authored-By: Lukas Rytz <lukas@lightbend.com>
  Fixes scala/scala-dev#133

Inspired by dotc, desugar a local `lazy val x = rhs` into

```
val x$lzy = new scala.runtime.LazyInt()
def x(): Int = {
    x$lzy.synchronized {
      if (!x$lzy.initialized) {
        x$lzy.initialized = true
        x$lzy.value = rhs
      }
      x$lzy.value
    }
}
```

Note that the 2.11 decoding (into a local variable and a bitmap) also
creates boxes for local lazy vals, in fact two for each lazy val:

```
def f = {
  lazy val x = 0
  x
}
```

desugars to

```
public int f() {
    IntRef x$lzy = IntRef.zero();
    VolatileByteRef bitmap$0 = VolatileByteRef.create((byte)0);
    return this.x$1(x$lzy, bitmap$0);
}

private final int x$lzycompute$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
    C c = this;
    synchronized (c) {
        if ((byte)(bitmap$0$1.elem & 1) == 0) {
            x$lzy$1.elem = 0;
            bitmap$0$1.elem = (byte)(bitmap$0$1.elem | 1);
        }
        return x$lzy$1.elem;
    }
}

private final int x$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
    return (byte)(bitmap$0$1.elem & 1) == 0 ?
      this.x$lzycompute$1(x$lzy$1, bitmap$0$1) : x$lzy$1.elem;
}
```

An additional problem with the above encoding is that the `lzycompute`
method synchronizes on `this`. In connection with the new lambda
encoding that no longer generates anonymous classes, captured lazy vals
no longer synchronize on the lambda object.

The new encoding solves this problem (scala/scala-dev#133)
by synchronizing on the lazy holder.

Currently, we don't exploit the fact that the initialized field
is `@volatile`, because it's not clear the performance is needed
for local lazy vals (as they are not contended, and as soon as
the VM warms up, biased locking should deal with that)

Note, be very very careful when moving to double-checked locking,
as this needs a different variation than the one we use for
class-member lazy vals. A read of a volatile field of a class
does not necessarily impart any knowledge about a "subsequent" read
of another non-volatile field of the same object. A pair of
volatile reads and write can be used to implement a lock, but it's
not clear if the complexity is worth an unproven performance gain.
(Once the performance gain is proven, let's change the encoding.)

  - don't explicitly init bitmap in bytecode
  - must apply method to () explicitly after uncurry
0196c71
@adriaanm
Member

Ah ok, Lukas pointed out I was missing the fast path:

    private int x$1(LazyInt x$lzy1$2) {
        return x$lzy1$2.initialized() ? x$lzy1$2.value() : this.x$lzyINIT1$1(x$lzy1$2);
    }

duh... :-)

@adriaanm
Member

My question then extends to our lazy val impl on both locals and fields: why is a volatile read/write combo on the init bit enough to establish happens-before for the actual field that has the lazily computed value? I understand how it works for SafeDCLFactory in https://shipilev.net/blog/2014/safe-public-construction/#_safe_publication, but there all volatile reads and writes are on the same field.

@adriaanm
Member

Ok, I think I convinced myself after reading https://shipilev.net/blog/2014/jmm-pragmatics/#_happens_before_publication

@retronym retronym added a commit to retronym/scala that referenced this issue Aug 29, 2016
@adriaanm @retronym adriaanm + retronym Fields does bitmaps & synch for lazy vals & modules
Essentially, we fuse mixin and lazyvals into the fields phase.
With fields mixing in trait members into subclasses, we
have all info needed to compute bitmaps, and thus we can
synthesize the synchronisation logic as well.

By doing this before erasure we get better signatures,
and before specialized means specialized lazy vals work now.
Mixins is now almost reduced to its essence: implementing
super accessors and forwarders. It still synthesizes
accessors for param accessors and early init trait vals.

Concretely, trait lazy vals are mixed into subclasses
with the needed synchronization logic in place, as do
lazy vals in classes and methods. Similarly, modules
are initialized using double checked locking.
Since the code to initialize a module is short,
we do not emit compute methods for modules (anymore).

For simplicity, local lazy vals do not get a compute method either.

The strange corner case of constant-typed final lazy vals
is resolved in favor of laziness, by no longer assigning
a constant type to a lazy val (see widenIfNecessary in namers).
If you explicitly ask for something lazy, you get laziness;
with the constant-typedness implicit, it yields to the
conflicting `lazy` modifier because it is explicit.

  Co-Authored-By: Lukas Rytz <lukas@lightbend.com>
  Fixes scala/scala-dev#133

Inspired by dotc, desugar a local `lazy val x = rhs` into

```
val x$lzy = new scala.runtime.LazyInt()
def x(): Int = {
    x$lzy.synchronized {
      if (!x$lzy.initialized) {
        x$lzy.initialized = true
        x$lzy.value = rhs
      }
      x$lzy.value
    }
}
```

Note that the 2.11 decoding (into a local variable and a bitmap) also
creates boxes for local lazy vals, in fact two for each lazy val:

```
def f = {
  lazy val x = 0
  x
}
```

desugars to

```
public int f() {
    IntRef x$lzy = IntRef.zero();
    VolatileByteRef bitmap$0 = VolatileByteRef.create((byte)0);
    return this.x$1(x$lzy, bitmap$0);
}

private final int x$lzycompute$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
    C c = this;
    synchronized (c) {
        if ((byte)(bitmap$0$1.elem & 1) == 0) {
            x$lzy$1.elem = 0;
            bitmap$0$1.elem = (byte)(bitmap$0$1.elem | 1);
        }
        return x$lzy$1.elem;
    }
}

private final int x$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
    return (byte)(bitmap$0$1.elem & 1) == 0 ?
      this.x$lzycompute$1(x$lzy$1, bitmap$0$1) : x$lzy$1.elem;
}
```

An additional problem with the above encoding is that the `lzycompute`
method synchronizes on `this`. In connection with the new lambda
encoding that no longer generates anonymous classes, captured lazy vals
no longer synchronize on the lambda object.

The new encoding solves this problem (scala/scala-dev#133)
by synchronizing on the lazy holder.

Currently, we don't exploit the fact that the initialized field
is `@volatile`, because it's not clear the performance is needed
for local lazy vals (as they are not contended, and as soon as
the VM warms up, biased locking should deal with that)

Note, be very very careful when moving to double-checked locking,
as this needs a different variation than the one we use for
class-member lazy vals. A read of a volatile field of a class
does not necessarily impart any knowledge about a "subsequent" read
of another non-volatile field of the same object. A pair of
volatile reads and write can be used to implement a lock, but it's
not clear if the complexity is worth an unproven performance gain.
(Once the performance gain is proven, let's change the encoding.)

  - don't explicitly init bitmap in bytecode
  - must apply method to () explicitly after uncurry
5b7c909
@adriaanm adriaanm added a commit to scala/scala that closed this issue Sep 1, 2016
@adriaanm adriaanm Fields does bitmaps & synch for lazy vals & modules
Essentially, we fuse mixin and lazyvals into the fields phase.
With fields mixing in trait members into subclasses, we
have all info needed to compute bitmaps, and thus we can
synthesize the synchronisation logic as well.

By doing this before erasure we get better signatures,
and before specialized means specialized lazy vals work now.
Mixins is now almost reduced to its essence: implementing
super accessors and forwarders. It still synthesizes
accessors for param accessors and early init trait vals.

Concretely, trait lazy vals are mixed into subclasses
with the needed synchronization logic in place, as do
lazy vals in classes and methods. Similarly, modules
are initialized using double checked locking.
Since the code to initialize a module is short,
we do not emit compute methods for modules (anymore).

For simplicity, local lazy vals do not get a compute method either.

The strange corner case of constant-typed final lazy vals
is resolved in favor of laziness, by no longer assigning
a constant type to a lazy val (see widenIfNecessary in namers).
If you explicitly ask for something lazy, you get laziness;
with the constant-typedness implicit, it yields to the
conflicting `lazy` modifier because it is explicit.

  Co-Authored-By: Lukas Rytz <lukas@lightbend.com>
  Fixes scala/scala-dev#133

Inspired by dotc, desugar a local `lazy val x = rhs` into

```
val x$lzy = new scala.runtime.LazyInt()
def x(): Int = {
    x$lzy.synchronized {
      if (!x$lzy.initialized) {
        x$lzy.initialized = true
        x$lzy.value = rhs
      }
      x$lzy.value
    }
}
```

Note that the 2.11 decoding (into a local variable and a bitmap) also
creates boxes for local lazy vals, in fact two for each lazy val:

```
def f = {
  lazy val x = 0
  x
}
```

desugars to

```
public int f() {
    IntRef x$lzy = IntRef.zero();
    VolatileByteRef bitmap$0 = VolatileByteRef.create((byte)0);
    return this.x$1(x$lzy, bitmap$0);
}

private final int x$lzycompute$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
    C c = this;
    synchronized (c) {
        if ((byte)(bitmap$0$1.elem & 1) == 0) {
            x$lzy$1.elem = 0;
            bitmap$0$1.elem = (byte)(bitmap$0$1.elem | 1);
        }
        return x$lzy$1.elem;
    }
}

private final int x$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
    return (byte)(bitmap$0$1.elem & 1) == 0 ?
      this.x$lzycompute$1(x$lzy$1, bitmap$0$1) : x$lzy$1.elem;
}
```

An additional problem with the above encoding is that the `lzycompute`
method synchronizes on `this`. In connection with the new lambda
encoding that no longer generates anonymous classes, captured lazy vals
no longer synchronize on the lambda object.

The new encoding solves this problem (scala/scala-dev#133)
by synchronizing on the lazy holder.

Currently, we don't exploit the fact that the initialized field
is `@volatile`, because it's not clear the performance is needed
for local lazy vals (as they are not contended, and as soon as
the VM warms up, biased locking should deal with that)

Note, be very very careful when moving to double-checked locking,
as this needs a different variation than the one we use for
class-member lazy vals. A read of a volatile field of a class
does not necessarily impart any knowledge about a "subsequent" read
of another non-volatile field of the same object. A pair of
volatile reads and write can be used to implement a lock, but it's
not clear if the complexity is worth an unproven performance gain.
(Once the performance gain is proven, let's change the encoding.)

  - don't explicitly init bitmap in bytecode
  - must apply method to () explicitly after uncurry
b2e0911
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment