Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SI-9824 SI-9814 proper locking scope for lazy vals in lambdas #133

Closed
adriaanm opened this issue Apr 20, 2016 · 21 comments
Closed

SI-9824 SI-9814 proper locking scope for lazy vals in lambdas #133

adriaanm opened this issue Apr 20, 2016 · 21 comments
Assignees
Milestone

Comments

@adriaanm
Copy link
Member

@adriaanm adriaanm commented Apr 20, 2016

Consolidating comments from #161 and #177:

  • SI-9824 deadlock using parallel collections with delambdafy:method
  • SI-9814 Function0 doesn't synchronize properly

solution

Let users opt-in to old encoding + feature in release notes.

a look at the byte code

lazy val in closure has different locking scope due to delambdafy:method. It used to be the anonymous function class:

Welcome to Scala 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_71).
Type in expressions for evaluation. Or try :help.

scala> :paste -raw
// Entering paste mode (ctrl-D to finish)

package p; class C { def foo = () => { lazy val x = 42; x } }

// Exiting paste mode, now interpreting.


scala> new p.C().foo
res0: () => Int = <function0>

scala> new p.C().foo.getClass
res1: Class[_ <: () => Int] = class p.C$$anonfun$foo$1

scala> :javap -private -c p.C$$anonfun$foo$1
Compiled from "<pastie>"
public final class p.C$$anonfun$foo$1 extends scala.runtime.AbstractFunction0$mcI$sp implements scala.Serializable {
  public static final long serialVersionUID;

  public final int apply();
    Code:
       0: aload_0
       1: invokevirtual #21                 // Method apply$mcI$sp:()I
       4: ireturn

  public int apply$mcI$sp();
    Code:
       0: invokestatic  #29                 // Method scala/runtime/IntRef.zero:()Lscala/runtime/IntRef;
       3: astore_1
       4: iconst_0
       5: invokestatic  #35                 // Method scala/runtime/VolatileByteRef.create:(B)Lscala/runtime/VolatileByteRef;
       8: astore_2
       9: aload_0
      10: aload_1
      11: aload_2
      12: invokespecial #39                 // Method x$1:(Lscala/runtime/IntRef;Lscala/runtime/VolatileByteRef;)I
      15: ireturn

  public final java.lang.Object apply();
    Code:
       0: aload_0
       1: invokevirtual #46                 // Method apply:()I
       4: invokestatic  #52                 // Method scala/runtime/BoxesRunTime.boxToInteger:(I)Ljava/lang/Integer;
       7: areturn

  private final int x$lzycompute$1(scala.runtime.IntRef, scala.runtime.VolatileByteRef);
    Code:
       0: aload_0
       1: dup
       2: astore_3
       3: monitorenter
       4: aload_2
       5: getfield      #57                 // Field scala/runtime/VolatileByteRef.elem:B
       8: iconst_1
       9: iand
      10: i2b
      11: iconst_0
      12: if_icmpne     32
      15: aload_1
      16: bipush        42
      18: putfield      #60                 // Field scala/runtime/IntRef.elem:I
      21: aload_2
      22: aload_2
      23: getfield      #57                 // Field scala/runtime/VolatileByteRef.elem:B
      26: iconst_1
      27: ior
      28: i2b
      29: putfield      #57                 // Field scala/runtime/VolatileByteRef.elem:B
      32: getstatic     #66                 // Field scala/runtime/BoxedUnit.UNIT:Lscala/runtime/BoxedUnit;
      35: pop
      36: aload_3
      37: monitorexit
      38: aload_1
      39: getfield      #60                 // Field scala/runtime/IntRef.elem:I
      42: ireturn
      43: aload_3
      44: monitorexit
      45: athrow
    Exception table:
       from    to  target type
           4    38    43   any

But is now the enclosing class of the lambda:

⚡ ~/scala/2.12/bin/scala
Welcome to Scala 2.12.0-M4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_71).
Type in expressions for evaluation. Or try :help.

scala> :paste -raw
// Entering paste mode (ctrl-D to finish)

package p; class C { def foo = () => { lazy val x = 42; x } }

// Exiting paste mode, now interpreting.


scala> :javap -private -c p.C
Compiled from "<pastie>"
public class p.C {
  public scala.Function0<java.lang.Object> foo();
    Code:
       0: aload_0
       1: invokedynamic #37,  0             // InvokeDynamic #0:apply$mcI$sp:(Lp/C;)Lscala/runtime/java8/JFunction0$mcI$sp;
       6: areturn

  private final int x$lzycompute$1(scala.runtime.IntRef, scala.runtime.VolatileByteRef);
    Code:
       0: aload_0
       1: dup
       2: astore_3
       3: monitorenter
       4: aload_2
       5: getfield      #51                 // Field scala/runtime/VolatileByteRef.elem:B
       8: iconst_1
       9: iand
      10: i2b
      11: iconst_0
      12: if_icmpne     32
      15: aload_1
      16: bipush        42
      18: putfield      #56                 // Field scala/runtime/IntRef.elem:I
      21: aload_2
      22: aload_2
      23: getfield      #51                 // Field scala/runtime/VolatileByteRef.elem:B
      26: iconst_1
      27: ior
      28: i2b
      29: putfield      #51                 // Field scala/runtime/VolatileByteRef.elem:B
      32: getstatic     #62                 // Field scala/runtime/BoxedUnit.UNIT:Lscala/runtime/BoxedUnit;
      35: pop
      36: aload_3
      37: monitorexit
      38: goto          44
      41: aload_3
      42: monitorexit
      43: athrow
      44: aload_1
      45: getfield      #56                 // Field scala/runtime/IntRef.elem:I
      48: ireturn
    Exception table:
       from    to  target type
           4    36    41   Class java/lang/Throwable
@adriaanm adriaanm modified the milestone: 2.12.0-RC1 May 2, 2016
@adriaanm adriaanm self-assigned this Jul 12, 2016
@adriaanm
Copy link
Member Author

@adriaanm adriaanm commented Jul 12, 2016

probably all we can do is document the workaround + expose option to use old encoding

@adriaanm adriaanm changed the title lazy val in closure has different locking scope due to delambdafy:method sams and locks Jul 18, 2016
@DarkDimius
Copy link
Member

@DarkDimius DarkDimius commented Aug 3, 2016

@adriaanm @retronym note that issue with a lazy val inside a lambda is fixed in Dotty.

@DarkDimius
Copy link
Member

@DarkDimius DarkDimius commented Aug 3, 2016

It is both quite easy to fix and removes one more indirection present in code.
Instead of having a pair of (IntRef, VolatileByteRef) dotty has a single dotty.runtime.LazyInt class that it synchronizes on. In your case, you can synchronize over VolatileByteRef.

import dotty.runtime.LazyInt;
import scala.Function0;
import scala.runtime.BoxesRunTime;

// version in dotty
public class C {
    public Function0 foo() {
        return () -> this.$anonfun$foo$1();
    }

    private int x$lzyINIT1$1(LazyInt x$lzy1$1) {
        Integer n;
        LazyInt lazyInt = x$lzy1$1;
        synchronized (lazyInt) {
            int n2;
            if (x$lzy1$1.initialized()) {
                n2 = x$lzy1$1.value();
            } else {
                x$lzy1$1.initialized_$eq(true);
                x$lzy1$1.value_$eq(42);
                n2 = x$lzy1$1.value();
            }
            n = BoxesRunTime.boxToInteger((int)n2);
        }
        return BoxesRunTime.unboxToInt((Object)n);
    }

    private int x$1(LazyInt x$lzy1$2) {
        return x$lzy1$2.initialized() ? x$lzy1$2.value() : this.x$lzyINIT1$1(x$lzy1$2);
    }

    private int $anonfun$foo$1() {
        LazyInt lazyInt = new LazyInt();
        return this.x$1(lazyInt);
    }
}
@adriaanm adriaanm changed the title sams and locks proper locking scope for lazy vals in lambdas Aug 4, 2016
@lrytz
Copy link
Member

@lrytz lrytz commented Aug 12, 2016

Implementing it is not trivial: when generating the lzy$compute method (which synchronizes on C.this) in lazyvals the VolatileByteRef does not exist yet. It is only added later when lambdalift moves the method out of the def $anonfun$f nest.

How is this handled in dotty? Does the lazy val translation know if a lazy val is captured and needs boxing?

@DarkDimius
Copy link
Member

@DarkDimius DarkDimius commented Aug 12, 2016

@lrytz,
Dotty has multiple lazy vals schemes.
For local lazy vals, such as one in example - it uses holder classes https://github.com/lampepfl/dotty/blob/master/src/dotty/runtime/LazyHolders.scala. This saves one object allocation and allows to have finer-grained synchronization.

@lrytz
Copy link
Member

@lrytz lrytz commented Aug 12, 2016

How does the translation decide whether a holder is needed or not?

Scalac always generates a local variable for the bitmap. In the example, the bitmap is transformed into a VolatileByteRef in lambdalift, which comes later in the pipeline.

@DarkDimius
Copy link
Member

@DarkDimius DarkDimius commented Aug 12, 2016

@lrytz, for local vals, it's generated unconditionally.

@lrytz
Copy link
Member

@lrytz lrytz commented Aug 12, 2016

So you rely on the optimizer / linker / hotspot to inline the box?

@DarkDimius
Copy link
Member

@DarkDimius DarkDimius commented Aug 12, 2016

@lrytz, yes, similarly to how it is done in scalac. But in scalac you rely on two boxes being inlined at once(the VolatileByteRef and the actual value). Here, only one box needs to be inlined.

@lrytz
Copy link
Member

@lrytz lrytz commented Aug 12, 2016

Oh, I see now. I was going to say that in Scalac we don't create any boxes if the lazy val doesn't escape, but it always does because of the accessor and the compute methods.

@adriaanm adriaanm changed the title proper locking scope for lazy vals in lambdas SI-9824 SI-9814 proper locking scope for lazy vals in lambdas Aug 12, 2016
@lrytz
Copy link
Member

@lrytz lrytz commented Aug 15, 2016

@DarkDimius in dotc's translation (your example), is there a reason to create the Integer box?

@DarkDimius
Copy link
Member

@DarkDimius DarkDimius commented Aug 15, 2016

@lrytz, no, that's an artifact of how we encode synchronized blocks.
It can be removed by a phase that would special-case this late in the pipeline, or by a rewrite-rule.

@lrytz
Copy link
Member

@lrytz lrytz commented Aug 15, 2016

@DarkDimius thanks. BTW, LazyFloat is missing in LazyHolders.scala :)

lrytz added a commit to lrytz/scala that referenced this issue Aug 16, 2016
Inspired by dotc, desugar a local `lazy val x = rhs` into

    val x$lzy = new scala.runtime.LazyInt()
    def x(): Int = {
      if (!x$lzy.initialized) {
        x$lzy.synchronized {
          if (!x$lzy.initialized) {
            x$lzy.initialized = true
            x$lzy.value = rhs
          }
        }
      }
      x$lzy.value
    }

Note that the 2.11 decoding (into a local variable and a bitmap) also
creates boxes for local lazy vals, in fact two for each lazy val:

    def f = {
      lazy val x = 0
      x
    }

desugars to

    public int f() {
        IntRef x$lzy = IntRef.zero();
        VolatileByteRef bitmap$0 = VolatileByteRef.create((byte)0);
        return this.x$1(x$lzy, bitmap$0);
    }

    private final int x$lzycompute$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
        C c = this;
        synchronized (c) {
            if ((byte)(bitmap$0$1.elem & 1) == 0) {
                x$lzy$1.elem = 0;
                bitmap$0$1.elem = (byte)(bitmap$0$1.elem | 1);
            }
            return x$lzy$1.elem;
        }
    }

    private final int x$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
        return (byte)(bitmap$0$1.elem & 1) == 0 ? this.x$lzycompute$1(x$lzy$1, bitmap$0$1) : x$lzy$1.elem;
    }

An additionl problem with the above encoding is that the `lzycompute`
method synchronizes on `this`. In connection with the new lambda
encoding that no longer generates anonymous classes, captured lazy vals
no longer synchronize on the lambda object.

The new encoding solves this problem by synchronizing on the lazy
holder.

Resolves scala/scala-dev#133
lrytz added a commit to lrytz/scala that referenced this issue Aug 16, 2016
Inspired by dotc, desugar a local `lazy val x = rhs` into

    val x$lzy = new scala.runtime.LazyInt()
    def x(): Int = {
      if (!x$lzy.initialized) {
        x$lzy.synchronized {
          if (!x$lzy.initialized) {
            x$lzy.initialized = true
            x$lzy.value = rhs
          }
        }
      }
      x$lzy.value
    }

Note that the 2.11 decoding (into a local variable and a bitmap) also
creates boxes for local lazy vals, in fact two for each lazy val:

    def f = {
      lazy val x = 0
      x
    }

desugars to

    public int f() {
        IntRef x$lzy = IntRef.zero();
        VolatileByteRef bitmap$0 = VolatileByteRef.create((byte)0);
        return this.x$1(x$lzy, bitmap$0);
    }

    private final int x$lzycompute$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
        C c = this;
        synchronized (c) {
            if ((byte)(bitmap$0$1.elem & 1) == 0) {
                x$lzy$1.elem = 0;
                bitmap$0$1.elem = (byte)(bitmap$0$1.elem | 1);
            }
            return x$lzy$1.elem;
        }
    }

    private final int x$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
        return (byte)(bitmap$0$1.elem & 1) == 0 ? this.x$lzycompute$1(x$lzy$1, bitmap$0$1) : x$lzy$1.elem;
    }

An additionl problem with the above encoding is that the `lzycompute`
method synchronizes on `this`. In connection with the new lambda
encoding that no longer generates anonymous classes, captured lazy vals
no longer synchronize on the lambda object.

The new encoding solves this problem by synchronizing on the lazy
holder.

Resolves scala/scala-dev#133
lrytz added a commit to lrytz/scala that referenced this issue Aug 16, 2016
Inspired by dotc, desugar a local `lazy val x = rhs` into

    val x$lzy = new scala.runtime.LazyInt()
    def x(): Int = {
      if (!x$lzy.initialized) {
        x$lzy.synchronized {
          if (!x$lzy.initialized) {
            x$lzy.initialized = true
            x$lzy.value = rhs
          }
        }
      }
      x$lzy.value
    }

Note that the 2.11 decoding (into a local variable and a bitmap) also
creates boxes for local lazy vals, in fact two for each lazy val:

    def f = {
      lazy val x = 0
      x
    }

desugars to

    public int f() {
        IntRef x$lzy = IntRef.zero();
        VolatileByteRef bitmap$0 = VolatileByteRef.create((byte)0);
        return this.x$1(x$lzy, bitmap$0);
    }

    private final int x$lzycompute$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
        C c = this;
        synchronized (c) {
            if ((byte)(bitmap$0$1.elem & 1) == 0) {
                x$lzy$1.elem = 0;
                bitmap$0$1.elem = (byte)(bitmap$0$1.elem | 1);
            }
            return x$lzy$1.elem;
        }
    }

    private final int x$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
        return (byte)(bitmap$0$1.elem & 1) == 0 ? this.x$lzycompute$1(x$lzy$1, bitmap$0$1) : x$lzy$1.elem;
    }

An additionl problem with the above encoding is that the `lzycompute`
method synchronizes on `this`. In connection with the new lambda
encoding that no longer generates anonymous classes, captured lazy vals
no longer synchronize on the lambda object.

The new encoding solves this problem by synchronizing on the lazy
holder.

Resolves scala/scala-dev#133
lrytz added a commit to lrytz/scala that referenced this issue Aug 16, 2016
Inspired by dotc, desugar a local `lazy val x = rhs` into

    val x$lzy = new scala.runtime.LazyInt()
    def x(): Int = {
      if (!x$lzy.initialized) {
        x$lzy.synchronized {
          if (!x$lzy.initialized) {
            x$lzy.initialized = true
            x$lzy.value = rhs
          }
        }
      }
      x$lzy.value
    }

Note that the 2.11 decoding (into a local variable and a bitmap) also
creates boxes for local lazy vals, in fact two for each lazy val:

    def f = {
      lazy val x = 0
      x
    }

desugars to

    public int f() {
        IntRef x$lzy = IntRef.zero();
        VolatileByteRef bitmap$0 = VolatileByteRef.create((byte)0);
        return this.x$1(x$lzy, bitmap$0);
    }

    private final int x$lzycompute$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
        C c = this;
        synchronized (c) {
            if ((byte)(bitmap$0$1.elem & 1) == 0) {
                x$lzy$1.elem = 0;
                bitmap$0$1.elem = (byte)(bitmap$0$1.elem | 1);
            }
            return x$lzy$1.elem;
        }
    }

    private final int x$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
        return (byte)(bitmap$0$1.elem & 1) == 0 ? this.x$lzycompute$1(x$lzy$1, bitmap$0$1) : x$lzy$1.elem;
    }

An additionl problem with the above encoding is that the `lzycompute`
method synchronizes on `this`. In connection with the new lambda
encoding that no longer generates anonymous classes, captured lazy vals
no longer synchronize on the lambda object.

The new encoding solves this problem by synchronizing on the lazy
holder.

Resolves scala/scala-dev#133
lrytz added a commit to lrytz/scala that referenced this issue Aug 17, 2016
Inspired by dotc, desugar a local `lazy val x = rhs` into

    val x$lzy = new scala.runtime.LazyInt()
    def x(): Int = {
      if (!x$lzy.initialized) {
        x$lzy.synchronized {
          if (!x$lzy.initialized) {
            x$lzy.initialized = true
            x$lzy.value = rhs
          }
        }
      }
      x$lzy.value
    }

Note that the 2.11 decoding (into a local variable and a bitmap) also
creates boxes for local lazy vals, in fact two for each lazy val:

    def f = {
      lazy val x = 0
      x
    }

desugars to

    public int f() {
        IntRef x$lzy = IntRef.zero();
        VolatileByteRef bitmap$0 = VolatileByteRef.create((byte)0);
        return this.x$1(x$lzy, bitmap$0);
    }

    private final int x$lzycompute$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
        C c = this;
        synchronized (c) {
            if ((byte)(bitmap$0$1.elem & 1) == 0) {
                x$lzy$1.elem = 0;
                bitmap$0$1.elem = (byte)(bitmap$0$1.elem | 1);
            }
            return x$lzy$1.elem;
        }
    }

    private final int x$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
        return (byte)(bitmap$0$1.elem & 1) == 0 ? this.x$lzycompute$1(x$lzy$1, bitmap$0$1) : x$lzy$1.elem;
    }

An additionl problem with the above encoding is that the `lzycompute`
method synchronizes on `this`. In connection with the new lambda
encoding that no longer generates anonymous classes, captured lazy vals
no longer synchronize on the lambda object.

The new encoding solves this problem by synchronizing on the lazy
holder.

Resolves scala/scala-dev#133
adriaanm added a commit to adriaanm/scala that referenced this issue Aug 17, 2016
Inspired by dotc, desugar a local `lazy val x = rhs` into

    val x$lzy = new scala.runtime.LazyInt()
    def x(): Int = {
      if (!x$lzy.initialized) {
        x$lzy.synchronized {
          if (!x$lzy.initialized) {
            x$lzy.initialized = true
            x$lzy.value = rhs
          }
        }
      }
      x$lzy.value
    }

Note that the 2.11 decoding (into a local variable and a bitmap) also
creates boxes for local lazy vals, in fact two for each lazy val:

    def f = {
      lazy val x = 0
      x
    }

desugars to

    public int f() {
        IntRef x$lzy = IntRef.zero();
        VolatileByteRef bitmap$0 = VolatileByteRef.create((byte)0);
        return this.x$1(x$lzy, bitmap$0);
    }

    private final int x$lzycompute$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
        C c = this;
        synchronized (c) {
            if ((byte)(bitmap$0$1.elem & 1) == 0) {
                x$lzy$1.elem = 0;
                bitmap$0$1.elem = (byte)(bitmap$0$1.elem | 1);
            }
            return x$lzy$1.elem;
        }
    }

    private final int x$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
        return (byte)(bitmap$0$1.elem & 1) == 0 ? this.x$lzycompute$1(x$lzy$1, bitmap$0$1) : x$lzy$1.elem;
    }

An additional problem with the above encoding is that the `lzycompute`
method synchronizes on `this`. In connection with the new lambda
encoding that no longer generates anonymous classes, captured lazy vals
no longer synchronize on the lambda object.

The new encoding solves this problem by synchronizing on the lazy
holder.

Resolves scala/scala-dev#133
adriaanm added a commit to adriaanm/scala that referenced this issue Aug 18, 2016
Inspired by dotc, desugar a local `lazy val x = rhs` into

    val x$lzy = new scala.runtime.LazyInt()
    def x(): Int = {
      if (!x$lzy.initialized) {
        x$lzy.synchronized {
          if (!x$lzy.initialized) {
            x$lzy.initialized = true
            x$lzy.value = rhs
          }
        }
      }
      x$lzy.value
    }

Note that the 2.11 decoding (into a local variable and a bitmap) also
creates boxes for local lazy vals, in fact two for each lazy val:

    def f = {
      lazy val x = 0
      x
    }

desugars to

    public int f() {
        IntRef x$lzy = IntRef.zero();
        VolatileByteRef bitmap$0 = VolatileByteRef.create((byte)0);
        return this.x$1(x$lzy, bitmap$0);
    }

    private final int x$lzycompute$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
        C c = this;
        synchronized (c) {
            if ((byte)(bitmap$0$1.elem & 1) == 0) {
                x$lzy$1.elem = 0;
                bitmap$0$1.elem = (byte)(bitmap$0$1.elem | 1);
            }
            return x$lzy$1.elem;
        }
    }

    private final int x$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
        return (byte)(bitmap$0$1.elem & 1) == 0 ? this.x$lzycompute$1(x$lzy$1, bitmap$0$1) : x$lzy$1.elem;
    }

An additional problem with the above encoding is that the `lzycompute`
method synchronizes on `this`. In connection with the new lambda
encoding that no longer generates anonymous classes, captured lazy vals
no longer synchronize on the lambda object.

The new encoding solves this problem by synchronizing on the lazy
holder.

Resolves scala/scala-dev#133
@adriaanm
Copy link
Member Author

@adriaanm adriaanm commented Aug 18, 2016

@DarkDimius, how about shortening this to the following (ignoring the Integer box n):

synchronized (lazyInt) {
  if (!x$lzy1$1.initialized()) {
    x$lzy1$1.initialized_$eq(true);
    x$lzy1$1.value_$eq(42);
  }
  x$lzy1$1.value();
}

As long as value() is read under the monitor, it's fine right? Any thoughts on other publication mechanisms? Why does initialized need to be @volatile, as it's all in the synchronized block anyway? There seems to be no benefit to it, since reading initialized does not give any guarantees about what you'd read in value.

adriaanm added a commit to adriaanm/scala that referenced this issue Aug 19, 2016
Essentially, we fuse mixin and lazyvals into the fields phase.
With fields mixing in trait members into subclasses, we
have all info needed to compute bitmaps, and thus we can
synthesize the synchronisation logic as well.

By doing this before erasure we get better signatures,
and before specialized means specialized lazy vals work now.
Mixins is now almost reduced to its essence: implementing
super accessors and forwarders. It still synthesizes
accessors for param accessors and early init trait vals.

Concretely, trait lazy vals are mixed into subclasses
with the needed synchronization logic in place, as do
lazy vals in classes and methods. Similarly, modules
are initialized using double checked locking.
Since the code to initialize a module is short,
we do not emit compute methods for modules (anymore).

For simplicity, local lazy vals do not get a compute method either.

The strange corner case of constant-typed final lazy vals
is resolved in favor of laziness, by no longer assigning
a constant type to a lazy val (see widenIfNecessary in namers).
If you explicitly ask for something lazy, you get laziness;
with the constant-typedness implicit, it yields to the
conflicting `lazy` modifier because it is explicit.

  Co-Authored-By: Lukas Rytz <lukas@lightbend.com>
  Fixes scala/scala-dev#133

Inspired by dotc, desugar a local `lazy val x = rhs` into

```
val x$lzy = new scala.runtime.LazyInt()
def x(): Int = {
    x$lzy.synchronized {
      if (!x$lzy.initialized) {
        x$lzy.initialized = true
        x$lzy.value = rhs
      }
      x$lzy.value
    }
}
```

Note that the 2.11 decoding (into a local variable and a bitmap) also
creates boxes for local lazy vals, in fact two for each lazy val:

```
def f = {
  lazy val x = 0
  x
}
```

desugars to

```
public int f() {
    IntRef x$lzy = IntRef.zero();
    VolatileByteRef bitmap$0 = VolatileByteRef.create((byte)0);
    return this.x$1(x$lzy, bitmap$0);
}

private final int x$lzycompute$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
    C c = this;
    synchronized (c) {
        if ((byte)(bitmap$0$1.elem & 1) == 0) {
            x$lzy$1.elem = 0;
            bitmap$0$1.elem = (byte)(bitmap$0$1.elem | 1);
        }
        return x$lzy$1.elem;
    }
}

private final int x$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
    return (byte)(bitmap$0$1.elem & 1) == 0 ?
      this.x$lzycompute$1(x$lzy$1, bitmap$0$1) : x$lzy$1.elem;
}
```

An additional problem with the above encoding is that the `lzycompute`
method synchronizes on `this`. In connection with the new lambda
encoding that no longer generates anonymous classes, captured lazy vals
no longer synchronize on the lambda object.

The new encoding solves this problem (scala/scala-dev#133)
by synchronizing on the lazy holder.

Currently, we don't exploit the fact that the initialized field
is `@volatile`, because it's not clear the performance is needed
for local lazy vals (as they are not contended, and as soon as
the VM warms up, biased locking should deal with that)

Note, be very very careful when moving to double-checked locking,
as this needs a different variation than the one we use for
class-member lazy vals. A read of a volatile field of a class
does not necessarily impart any knowledge about a "subsequent" read
of another non-volatile field of the same object. A pair of
volatile reads and write can be used to implement a lock, but it's
not clear if the complexity is worth an unproven performance gain.
(Once the performance gain is proven, let's change the encoding.)

  - don't explicitly init bitmap in bytecode
  - must apply method to () explicitly after uncurry
adriaanm added a commit to adriaanm/scala that referenced this issue Aug 19, 2016
Essentially, we fuse mixin and lazyvals into the fields phase.
With fields mixing in trait members into subclasses, we
have all info needed to compute bitmaps, and thus we can
synthesize the synchronisation logic as well.

By doing this before erasure we get better signatures,
and before specialized means specialized lazy vals work now.
Mixins is now almost reduced to its essence: implementing
super accessors and forwarders. It still synthesizes
accessors for param accessors and early init trait vals.

Concretely, trait lazy vals are mixed into subclasses
with the needed synchronization logic in place, as do
lazy vals in classes and methods. Similarly, modules
are initialized using double checked locking.
Since the code to initialize a module is short,
we do not emit compute methods for modules (anymore).

For simplicity, local lazy vals do not get a compute method either.

The strange corner case of constant-typed final lazy vals
is resolved in favor of laziness, by no longer assigning
a constant type to a lazy val (see widenIfNecessary in namers).
If you explicitly ask for something lazy, you get laziness;
with the constant-typedness implicit, it yields to the
conflicting `lazy` modifier because it is explicit.

  Co-Authored-By: Lukas Rytz <lukas@lightbend.com>
  Fixes scala/scala-dev#133

Inspired by dotc, desugar a local `lazy val x = rhs` into

```
val x$lzy = new scala.runtime.LazyInt()
def x(): Int = {
    x$lzy.synchronized {
      if (!x$lzy.initialized) {
        x$lzy.initialized = true
        x$lzy.value = rhs
      }
      x$lzy.value
    }
}
```

Note that the 2.11 decoding (into a local variable and a bitmap) also
creates boxes for local lazy vals, in fact two for each lazy val:

```
def f = {
  lazy val x = 0
  x
}
```

desugars to

```
public int f() {
    IntRef x$lzy = IntRef.zero();
    VolatileByteRef bitmap$0 = VolatileByteRef.create((byte)0);
    return this.x$1(x$lzy, bitmap$0);
}

private final int x$lzycompute$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
    C c = this;
    synchronized (c) {
        if ((byte)(bitmap$0$1.elem & 1) == 0) {
            x$lzy$1.elem = 0;
            bitmap$0$1.elem = (byte)(bitmap$0$1.elem | 1);
        }
        return x$lzy$1.elem;
    }
}

private final int x$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
    return (byte)(bitmap$0$1.elem & 1) == 0 ?
      this.x$lzycompute$1(x$lzy$1, bitmap$0$1) : x$lzy$1.elem;
}
```

An additional problem with the above encoding is that the `lzycompute`
method synchronizes on `this`. In connection with the new lambda
encoding that no longer generates anonymous classes, captured lazy vals
no longer synchronize on the lambda object.

The new encoding solves this problem (scala/scala-dev#133)
by synchronizing on the lazy holder.

Currently, we don't exploit the fact that the initialized field
is `@volatile`, because it's not clear the performance is needed
for local lazy vals (as they are not contended, and as soon as
the VM warms up, biased locking should deal with that)

Note, be very very careful when moving to double-checked locking,
as this needs a different variation than the one we use for
class-member lazy vals. A read of a volatile field of a class
does not necessarily impart any knowledge about a "subsequent" read
of another non-volatile field of the same object. A pair of
volatile reads and write can be used to implement a lock, but it's
not clear if the complexity is worth an unproven performance gain.
(Once the performance gain is proven, let's change the encoding.)

  - don't explicitly init bitmap in bytecode
  - must apply method to () explicitly after uncurry
@lrytz
Copy link
Member

@lrytz lrytz commented Aug 22, 2016

@adriaanm IUC Aleksey's post correctly, the problem with this approach is that it synchronizes on every access of the lazy val, which is slow. The idea of double-checked locking (again, IUC wikipedia correctly) is to return an initialized lazy val without acquiring the lock. Since the initialized field is checked outside the synchronized block in double-checked locking, it needs to be volatile.

@DarkDimius
Copy link
Member

@DarkDimius DarkDimius commented Aug 22, 2016

@adriaanm, I've benchmarked the version you propose as part of https://github.com/DarkDimius/lazy-val-bench and It was slower even in case of absence of contention. Though such VM behavior in this case may have changed over last 2 years, I'd assume it's still the case.

@lrytz
Copy link
Member

@lrytz lrytz commented Aug 22, 2016

@adriaanm i see now what you're saying about no guarantees reading non-volatile value after seeing volatile initialized == true; need to look/think at it a bit more.

@adriaanm
Copy link
Member Author

@adriaanm adriaanm commented Aug 22, 2016

@DarkDimius that's surprising to me. Could you share any thoughts on why

synchronized (lazyInt) {
  int n2;
  if (x$lzy1$1.initialized()) {
    n2 = x$lzy1$1.value();
  } else {
    x$lzy1$1.initialized_$eq(true);
    x$lzy1$1.value_$eq(42);
    n2 = x$lzy1$1.value();
  }
  n = BoxesRunTime.boxToInteger((int)n2);
}

is faster than:

synchronized (lazyInt) {
  if (!x$lzy1$1.initialized()) {
    x$lzy1$1.initialized_$eq(true);
    x$lzy1$1.value_$eq(42);
  }
  x$lzy1$1.value();
}

?

adriaanm added a commit to adriaanm/scala that referenced this issue Aug 22, 2016
Essentially, we fuse mixin and lazyvals into the fields phase.
With fields mixing in trait members into subclasses, we
have all info needed to compute bitmaps, and thus we can
synthesize the synchronisation logic as well.

By doing this before erasure we get better signatures,
and before specialized means specialized lazy vals work now.
Mixins is now almost reduced to its essence: implementing
super accessors and forwarders. It still synthesizes
accessors for param accessors and early init trait vals.

Concretely, trait lazy vals are mixed into subclasses
with the needed synchronization logic in place, as do
lazy vals in classes and methods. Similarly, modules
are initialized using double checked locking.
Since the code to initialize a module is short,
we do not emit compute methods for modules (anymore).

For simplicity, local lazy vals do not get a compute method either.

The strange corner case of constant-typed final lazy vals
is resolved in favor of laziness, by no longer assigning
a constant type to a lazy val (see widenIfNecessary in namers).
If you explicitly ask for something lazy, you get laziness;
with the constant-typedness implicit, it yields to the
conflicting `lazy` modifier because it is explicit.

  Co-Authored-By: Lukas Rytz <lukas@lightbend.com>
  Fixes scala/scala-dev#133

Inspired by dotc, desugar a local `lazy val x = rhs` into

```
val x$lzy = new scala.runtime.LazyInt()
def x(): Int = {
    x$lzy.synchronized {
      if (!x$lzy.initialized) {
        x$lzy.initialized = true
        x$lzy.value = rhs
      }
      x$lzy.value
    }
}
```

Note that the 2.11 decoding (into a local variable and a bitmap) also
creates boxes for local lazy vals, in fact two for each lazy val:

```
def f = {
  lazy val x = 0
  x
}
```

desugars to

```
public int f() {
    IntRef x$lzy = IntRef.zero();
    VolatileByteRef bitmap$0 = VolatileByteRef.create((byte)0);
    return this.x$1(x$lzy, bitmap$0);
}

private final int x$lzycompute$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
    C c = this;
    synchronized (c) {
        if ((byte)(bitmap$0$1.elem & 1) == 0) {
            x$lzy$1.elem = 0;
            bitmap$0$1.elem = (byte)(bitmap$0$1.elem | 1);
        }
        return x$lzy$1.elem;
    }
}

private final int x$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
    return (byte)(bitmap$0$1.elem & 1) == 0 ?
      this.x$lzycompute$1(x$lzy$1, bitmap$0$1) : x$lzy$1.elem;
}
```

An additional problem with the above encoding is that the `lzycompute`
method synchronizes on `this`. In connection with the new lambda
encoding that no longer generates anonymous classes, captured lazy vals
no longer synchronize on the lambda object.

The new encoding solves this problem (scala/scala-dev#133)
by synchronizing on the lazy holder.

Currently, we don't exploit the fact that the initialized field
is `@volatile`, because it's not clear the performance is needed
for local lazy vals (as they are not contended, and as soon as
the VM warms up, biased locking should deal with that)

Note, be very very careful when moving to double-checked locking,
as this needs a different variation than the one we use for
class-member lazy vals. A read of a volatile field of a class
does not necessarily impart any knowledge about a "subsequent" read
of another non-volatile field of the same object. A pair of
volatile reads and write can be used to implement a lock, but it's
not clear if the complexity is worth an unproven performance gain.
(Once the performance gain is proven, let's change the encoding.)

  - don't explicitly init bitmap in bytecode
  - must apply method to () explicitly after uncurry
@adriaanm
Copy link
Member Author

@adriaanm adriaanm commented Aug 22, 2016

Ah ok, Lukas pointed out I was missing the fast path:

    private int x$1(LazyInt x$lzy1$2) {
        return x$lzy1$2.initialized() ? x$lzy1$2.value() : this.x$lzyINIT1$1(x$lzy1$2);
    }

duh... :-)

@adriaanm
Copy link
Member Author

@adriaanm adriaanm commented Aug 22, 2016

My question then extends to our lazy val impl on both locals and fields: why is a volatile read/write combo on the init bit enough to establish happens-before for the actual field that has the lazily computed value? I understand how it works for SafeDCLFactory in https://shipilev.net/blog/2014/safe-public-construction/#_safe_publication, but there all volatile reads and writes are on the same field.

@adriaanm
Copy link
Member Author

@adriaanm adriaanm commented Aug 22, 2016

Ok, I think I convinced myself after reading https://shipilev.net/blog/2014/jmm-pragmatics/#_happens_before_publication

retronym added a commit to retronym/scala that referenced this issue Aug 29, 2016
Essentially, we fuse mixin and lazyvals into the fields phase.
With fields mixing in trait members into subclasses, we
have all info needed to compute bitmaps, and thus we can
synthesize the synchronisation logic as well.

By doing this before erasure we get better signatures,
and before specialized means specialized lazy vals work now.
Mixins is now almost reduced to its essence: implementing
super accessors and forwarders. It still synthesizes
accessors for param accessors and early init trait vals.

Concretely, trait lazy vals are mixed into subclasses
with the needed synchronization logic in place, as do
lazy vals in classes and methods. Similarly, modules
are initialized using double checked locking.
Since the code to initialize a module is short,
we do not emit compute methods for modules (anymore).

For simplicity, local lazy vals do not get a compute method either.

The strange corner case of constant-typed final lazy vals
is resolved in favor of laziness, by no longer assigning
a constant type to a lazy val (see widenIfNecessary in namers).
If you explicitly ask for something lazy, you get laziness;
with the constant-typedness implicit, it yields to the
conflicting `lazy` modifier because it is explicit.

  Co-Authored-By: Lukas Rytz <lukas@lightbend.com>
  Fixes scala/scala-dev#133

Inspired by dotc, desugar a local `lazy val x = rhs` into

```
val x$lzy = new scala.runtime.LazyInt()
def x(): Int = {
    x$lzy.synchronized {
      if (!x$lzy.initialized) {
        x$lzy.initialized = true
        x$lzy.value = rhs
      }
      x$lzy.value
    }
}
```

Note that the 2.11 decoding (into a local variable and a bitmap) also
creates boxes for local lazy vals, in fact two for each lazy val:

```
def f = {
  lazy val x = 0
  x
}
```

desugars to

```
public int f() {
    IntRef x$lzy = IntRef.zero();
    VolatileByteRef bitmap$0 = VolatileByteRef.create((byte)0);
    return this.x$1(x$lzy, bitmap$0);
}

private final int x$lzycompute$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
    C c = this;
    synchronized (c) {
        if ((byte)(bitmap$0$1.elem & 1) == 0) {
            x$lzy$1.elem = 0;
            bitmap$0$1.elem = (byte)(bitmap$0$1.elem | 1);
        }
        return x$lzy$1.elem;
    }
}

private final int x$1(IntRef x$lzy$1, VolatileByteRef bitmap$0$1) {
    return (byte)(bitmap$0$1.elem & 1) == 0 ?
      this.x$lzycompute$1(x$lzy$1, bitmap$0$1) : x$lzy$1.elem;
}
```

An additional problem with the above encoding is that the `lzycompute`
method synchronizes on `this`. In connection with the new lambda
encoding that no longer generates anonymous classes, captured lazy vals
no longer synchronize on the lambda object.

The new encoding solves this problem (scala/scala-dev#133)
by synchronizing on the lazy holder.

Currently, we don't exploit the fact that the initialized field
is `@volatile`, because it's not clear the performance is needed
for local lazy vals (as they are not contended, and as soon as
the VM warms up, biased locking should deal with that)

Note, be very very careful when moving to double-checked locking,
as this needs a different variation than the one we use for
class-member lazy vals. A read of a volatile field of a class
does not necessarily impart any knowledge about a "subsequent" read
of another non-volatile field of the same object. A pair of
volatile reads and write can be used to implement a lock, but it's
not clear if the complexity is worth an unproven performance gain.
(Once the performance gain is proven, let's change the encoding.)

  - don't explicitly init bitmap in bytecode
  - must apply method to () explicitly after uncurry
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
3 participants