New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Calls to Enum.values() don't optimise away as expected #574

Closed
chrisseaton opened this Issue Jul 28, 2018 · 4 comments

Comments

5 participants
@chrisseaton
Member

chrisseaton commented Jul 28, 2018

@nezihyigitbasi asked on Twitter about why calls to Enum.values() were producing garbage when called in an inner loop. Why isn't Graal able to compile the object away, he wanted to know.

For background context, there is no method Enum.values(). Enumerations in Java define this method themselves. The code is generated by the javac compiler.

It's the Java language spec that gives us semantics for enumerations. As far as I know the JVM spec doesn't know anything about them - they're just normal Java classes by the time the JVM gets them.

The minimal reproduction that @nezihyigitbasi gave us was this loop. He actually put the loop inside main, but I've factored it out to make some tooling easier to use.

import java.util.concurrent.TimeUnit;

public class TestEnum
{
    public static void main(String[] args)
            throws InterruptedException
    {
        while (true) {
          methodUnderTest();
        }
    }
    
    private static void methodUnderTest() {
        for (TimeUnit unit : TimeUnit.values()) {
        }
    }
}

If we use GraalVM, and Ideal Graph Visualiser and the -Dgraal.Dump flag we can see what how the method comes into the compiler from the bytecode parser.

a

That ObjectClone node at the top is the starting point of the explanation. It comes an object, and the input object is the little grey box above it, which if we look at the value of it we can see is the array of values.

b

If we look at the compiler graph after the mid tier, we can see the object clone hasn't gone away - it's been lowered to a more specific ArrayCopyCall, and the inputs are still constants.

c

So we can see that indeed Graal is not optimising away this loop - it's always cloning the values() array. The values() array comes from the enum's class definition, so let's make looking at that our next step.

The example code we were given used an enum from the standard library. Let's write our own simple enumeration to look at instead.

public enum MyEnum {
  A, B, C;
}

If we compile with javac, and the disassemble with javap -p -c MyEnum.class, we'll see this:

public final class MyEnum extends java.lang.Enum<MyEnum> {
  public static final MyEnum A;

  public static final MyEnum B;

  public static final MyEnum C;

  private static final MyEnum[] $VALUES;

  public static MyEnum[] values();
    Code:
       0: getstatic     #1                  // Field $VALUES:[LMyEnum;
       3: invokevirtual #2                  // Method "[LMyEnum;".clone:()Ljava/lang/Object;
       6: checkcast     #3                  // class "[LMyEnum;"
       9: areturn

  public static MyEnum valueOf(java.lang.String);
    Code:
       0: ldc           #4                  // class MyEnum
       2: aload_0
       3: invokestatic  #5                  // Method java/lang/Enum.valueOf:(Ljava/lang/Class;Ljava/lang/String;)Ljava/lang/Enum;
       6: checkcast     #4                  // class MyEnum
       9: areturn

  private MyEnum();
    Code:
       0: aload_0
       1: aload_1
       2: iload_2
       3: invokespecial #6                  // Method java/lang/Enum."<init>":(Ljava/lang/String;I)V
       6: return

  static {};
    Code:
       0: new           #4                  // class MyEnum
       3: dup
       4: ldc           #7                  // String A
       6: iconst_0
       7: invokespecial #8                  // Method "<init>":(Ljava/lang/String;I)V
      10: putstatic     #9                  // Field A:LMyEnum;
      13: new           #4                  // class MyEnum
      16: dup
      17: ldc           #10                 // String B
      19: iconst_1
      20: invokespecial #8                  // Method "<init>":(Ljava/lang/String;I)V
      23: putstatic     #11                 // Field B:LMyEnum;
      26: new           #4                  // class MyEnum
      29: dup
      30: ldc           #12                 // String C
      32: iconst_2
      33: invokespecial #8                  // Method "<init>":(Ljava/lang/String;I)V
      36: putstatic     #13                 // Field C:LMyEnum;
      39: iconst_3
      40: anewarray     #4                  // class MyEnum
      43: dup
      44: iconst_0
      45: getstatic     #9                  // Field A:LMyEnum;
      48: aastore
      49: dup
      50: iconst_1
      51: getstatic     #11                 // Field B:LMyEnum;
      54: aastore
      55: dup
      56: iconst_2
      57: getstatic     #13                 // Field C:LMyEnum;
      60: aastore
      61: putstatic     #1                  // Field $VALUES:[LMyEnum;
      64: return
}

We can see that values() is a getter for a field $VALUES, which also clones the value before returning it. That's where our ObjectClone comes from. We can see the static initialiser (the last method) creates this array as a simple object array, and writing it to the $VALUES field.

$VALUES is static final. That means that when you get the value of $VALUES, it will never change, so you would think that the compiler could get the value and compile it into the program.

I think it may be doing that - the ObjectClone could be referencing the constant object that the compiler read from $VALUES. This actually surprises me - I thought that final fields in the JVM spec were not actually final, as they could be changed by reflection. I'm not sure that is a written rule, but I was fairly certain it was the case.

However, even if the reference is constant because it's final - the elements of the array are not final. This is presumably why the object array is cloned - because they don't want a caller mutating it. So maybe the reference to the array in ObjectClone is constant, but the values are not known to be constant. I'm not totally certain why this would mean a copy of the object still has to be read - if someone else modified the array while you were making the copy that wouldn't be any different from the array being modified as you read it directly in this loop wouldn't it? And the length is constant if you have a constant reference to it, so why isn't the loop unrolled?

A couple of other experiments I tried were skipping the javac code generation step and writing our own enumeration, but one which works in the same way, like this:

public class MyEnum2 {
  
  public static final MyEnum2 A = new MyEnum2();
  public static final MyEnum2 B = new MyEnum2();
  public static final MyEnum2 C = new MyEnum2();
  
  public static final MyEnum2[] VALUES = new MyEnum2[]{A, B, C};
  
  public static MyEnum2[] values() {
    return VALUES.clone();
  }
  
}

public class TestEnum2
{
    public static void main(String[] args)
            throws InterruptedException
    {
        while (true) {
          methodUnderTest();
        }
    }
    
    private static void methodUnderTest() {
        for (MyEnum2 unit : MyEnum2.values()) {
        }
    }
}

That generates exactly the same graph in Graal.

I also tried a version that, instead of cloning an existing object which anyone could have access to, creates a new array each time.

public class MyEnum3 {
  
  public static final MyEnum3 A = new MyEnum3();
  public static final MyEnum3 B = new MyEnum3();
  public static final MyEnum3 C = new MyEnum3();
  
  public static MyEnum3[] values() {
    return new MyEnum3[]{A, B, C};
  }
  
}

public class TestEnum3
{
    public static void main(String[] args)
            throws InterruptedException
    {
        while (true) {
          methodUnderTest();
        }
    }
    
    private static void methodUnderTest() {
        for (MyEnum3 unit : MyEnum3.values()) {
        }
    }
}

This is the result we want! It comes out of the bytecode parser with an invoke to values() rather than the direct clone operation, which makes me think Graal is doing some special inlining for values() on proper enumerations?

d

But then it comes out the other end with everything totally removed, and we see that graph that anyone working on a benchmark wants to see!

e

There's a classic problem here.

In the interpreter and older JVMs, it makes sense to cache object allocations. In a more sophisticated JVM, it doesn't always make sense to cache object allocations, becuase the cached version is public.

You can see the JRuby people, like @headius, dealing with the same problem with the boxed Integer cache. They want the cache in the interpreted code, but now we have more sophisticated JVMs they don't want the caching as it defeats the escape analysis and scalar replacement of aggregates that you were looking for in your Tweet. And in this case it's the javac compiler that has made the decision for us, so it's too late to do anything at runtime.

So that still leaves some questions for my colleagues who know more than I:

Does Graal treat final as final or not?

Why are proper enumeration classes values() methods inlined by the time I see After parsing, but not my own custom enumeration, when the JVM doesn't know anything about enumerations?

Why does the ObjectClone appear to have a constant reference to all the elements in the array when it's final and so could be modified by reflection?

Why isn't the constant nature of the array length enough to unroll the loop? Or is it available but not considered a good idea to unroll for some reason?

What can we do about the escaped $VALUES object that javac gives us? Since the object is cloned each time, why doesn't javac create a new array directly each time like I did in MyEnum3? Wouldn't this generate the same garbage in the interpreter but also compile well?

What else could we do to make values() constant-fold?

A use-case you didn't mention in your Tweet that we've seen (someone else in the team mentioned it to me) was doing MyEnum.values()[ord]. That's a case we could be interested in as well.

@dougxc

This comment has been minimized.

Show comment
Hide comment
@dougxc

dougxc Jul 29, 2018

Member

Does Graal treat final as final or not?

Yes, Graal definitely treats final static fields as compilation final. This is shown in your first graph.

Why are proper enumeration classes values() methods inlined by the time I see After parsing, but not my own custom enumeration, when the JVM doesn't know anything about enumerations?

This is probably because Enum3.values() doesn't satisfy the trivial inlining size where as the javac generated Enum.values() does. You might be able to confirm this with the -Dgraal.TraceInlining=true option.

Why does the ObjectClone appear to have a constant reference to all the elements in the array when it's final and so could be modified by reflection?

Graal disregards uses of reflection to modify a static final field (apart from static fields in java.lang.System).

In general, even if we could doing something in Graal about Enum.values() creating a new object each time, you still have the interpreter and first level compiler (i.e. C1) to worry about. For this reason, it's best to manually make a private local copy of Enum.values() where necessary like we recently did in Graal. A better solution would be for Enum to have a List<T> valuesAsList() method that returns an unmodifiable list.

Member

dougxc commented Jul 29, 2018

Does Graal treat final as final or not?

Yes, Graal definitely treats final static fields as compilation final. This is shown in your first graph.

Why are proper enumeration classes values() methods inlined by the time I see After parsing, but not my own custom enumeration, when the JVM doesn't know anything about enumerations?

This is probably because Enum3.values() doesn't satisfy the trivial inlining size where as the javac generated Enum.values() does. You might be able to confirm this with the -Dgraal.TraceInlining=true option.

Why does the ObjectClone appear to have a constant reference to all the elements in the array when it's final and so could be modified by reflection?

Graal disregards uses of reflection to modify a static final field (apart from static fields in java.lang.System).

In general, even if we could doing something in Graal about Enum.values() creating a new object each time, you still have the interpreter and first level compiler (i.e. C1) to worry about. For this reason, it's best to manually make a private local copy of Enum.values() where necessary like we recently did in Graal. A better solution would be for Enum to have a List<T> valuesAsList() method that returns an unmodifiable list.

@eregon

This comment has been minimized.

Show comment
Hide comment
@eregon

eregon Jul 29, 2018

Member

Isn't this an instance where Graal should see the original array elements are constants or can be treated as such since they are javac-emitted Enum $VALUES? Somewhat similar to the offsets of a switch.
If it did see that I'd guess the new array allocation would be compiled away.

Member

eregon commented Jul 29, 2018

Isn't this an instance where Graal should see the original array elements are constants or can be treated as such since they are javac-emitted Enum $VALUES? Somewhat similar to the offsets of a switch.
If it did see that I'd guess the new array allocation would be compiled away.

@thomaswue thomaswue self-assigned this Jul 29, 2018

@gilles-duboscq

This comment has been minimized.

Show comment
Hide comment
@gilles-duboscq

gilles-duboscq Jul 30, 2018

Member

Isn't this an instance where Graal should see the original array elements are constants or can be treated as such since they are javac-emitted Enum $VALUES?

This is already the case. However that doesn't really help with the new array allocation. The new array allocation is the thing making sure these values stay constant.
I think the reason we don't get what we expect here is that BasicObjectCloneNode#virtualize is very shy with arrays.

Member

gilles-duboscq commented Jul 30, 2018

Isn't this an instance where Graal should see the original array elements are constants or can be treated as such since they are javac-emitted Enum $VALUES?

This is already the case. However that doesn't really help with the new array allocation. The new array allocation is the thing making sure these values stay constant.
I think the reason we don't get what we expect here is that BasicObjectCloneNode#virtualize is very shy with arrays.

@gilles-duboscq gilles-duboscq self-assigned this Jul 30, 2018

@gilles-duboscq

This comment has been minimized.

Show comment
Hide comment
@gilles-duboscq

gilles-duboscq Aug 3, 2018

Member

After eae0566, the simple patterns mentioned here should fold as expected. I'm still considering further improvements for more complex patterns.

Member

gilles-duboscq commented Aug 3, 2018

After eae0566, the simple patterns mentioned here should fold as expected. I'm still considering further improvements for more complex patterns.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment