Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Image files are not displayed properly on SwingSet2 demo program #4193

Closed
takiguc opened this issue Jan 8, 2019 · 38 comments
Closed

Image files are not displayed properly on SwingSet2 demo program #4193

takiguc opened this issue Jan 8, 2019 · 38 comments

Comments

@takiguc
Copy link

takiguc commented Jan 8, 2019

Hello.

I used OpenJDK JDK11 + OpenJ9 for Linux amd64.
When I executed SwingSet2 demo, GIF images were not displayed properly.
And the situation was random, 4 GIF images were displayed on JInternalFrame Demo.
2 or 3 images were broken.

Version Information: RHEL 7.5

$ ~/OpenJ9/jdk-11.0.1+13/bin/java -version
openjdk version "11.0.1" 2018-10-16
OpenJDK Runtime Environment AdoptOpenJDK (build 11.0.1+13)
Eclipse OpenJ9 VM AdoptOpenJDK (build openj9-0.11.0, JRE 11 Linux amd64-64-Bit Compressed References 20181115_18 (JIT enabled, AOT enabled)
OpenJ9   - 090ff9dcd
OMR      - ea548a66
JCL      - d4455071ce based on jdk-11.0.1+13)

Test instruction:

  1. Run following command
$ ~/OpenJ9/jdk-11.0.1+13/bin/java -jar ~/OpenJ9/jdk-11.0.1+13/demo/jfc/SwingSet2/SwingSet2.jar
  1. Check Frame 0 - 4 windows. some of images are not displayed properly

This issue only happens on Linux amd64.
Linux ppc64le and s390x builds were OK.
It worked fine on following AdoptOpenJDK build.

$ ~/Hotspot/jdk-11.0.1+13/bin/java -version
openjdk version "11.0.1" 2018-10-16
OpenJDK Runtime Environment AdoptOpenJDK (build 11.0.1+13)
OpenJDK 64-Bit Server VM AdoptOpenJDK (build 11.0.1+13, mixed mode)

It seemed GIF image treated as interlace GIF.
But I checked interlace flag on GIF.

$ identify -verbose resources/images/misc/duke.gif | grep Interlace:
  Interlace: None

I checked the latest IBM Java8 for Linux amd64, but I could not recreate this issue.
I assume IBM Java8 applied a fix against class library on GIF image decoder related code.
But it seems register or memory area was not clean up properly.

internalframe

@ChengJin01
Copy link
Contributor

Will investigate to see what happens.

@ChengJin01
Copy link
Contributor

ChengJin01 commented Jan 11, 2019

I already reproduced the issue with OpenJDK11/OpenJ9 on LinuxLite (downloaded from https://github.com/AdoptOpenJDK/openjdk11-binaries/releases/download/jdk-11.0.1%2B13/OpenJDK11U-jdk_x64_linux_openj9_jdk-11.0.1_13_openj9-0.11.0_11.0.1_13.tar.gz)
jdk-11.0.1_13/bin/java -jar jdk-11.0.1_13/demo/jfc/SwingSet2/SwingSet2.jar

image

and It works good with OpenJDK11/Hotspot on LinuxLite (downloaded from https://github.com/AdoptOpenJDK/openjdk11-binaries/releases/download/jdk-11.0.1%2B13/OpenJDK11U-jdk_x64_linux_hotspot_11.0.1_13.tar.gz)
jdk-11.0.1_13_hotspot/bin/java -jar jdk-11.0.1_13/demo/jfc/SwingSet2/SwingSet2.jar

image

So it looks like there is something wrong happening in our code which needs further analysis.

@pshipton
Copy link
Member

pshipton commented Jan 11, 2019

Does it still occur with -Xint?

@ChengJin01
Copy link
Contributor

Yes, it ended up with pretty much the same result with JIT off.
jdk-11.0.1_13/bin/java -Xint -jar jdk-11.0.1_13/demo/jfc/SwingSet2/SwingSet2.jar

image

@ChengJin01
Copy link
Contributor

I tried to isolate the problem by writing two test cases to with different APIs to display the GIF.

[1] Test1:

        File picture = new File("resources/images/misc/toast.gif");
        Image image=ImageIO.read(picture);  <---------------
        ImageIcon icon = new ImageIcon(image);

The picture shows up correctly when loading with ImageIO.read()
image

[2] Test2:

  ImageIcon icon = new ImageIcon("resources/images/misc/toast.gif");

The test ends up with an interlaced picture.
image

Considering that the loading mechanism ImageIcon(fileName) (loading image data on the background) is different from ImageIO.read() (get all data loaded with one thread), need to investigate what happened to ImageIcon(fileName) to figure out whether the image data got messed up during loading.

@ChengJin01
Copy link
Contributor

ChengJin01 commented Jan 18, 2019

Investigation shows the interlaceflag in Test2 (the original code in SwingSet2) was set to false during reading the image via readImage() at src/java.desktop/share/classes/sun/awt/image/ImageDecoder.java (OpenJDK11)

    private native boolean parseImage(int x, int y, int width, int height,
                                      boolean interlace, int initCodeSize,
                                      byte block[], byte rasline[],
                                      IndexColorModel model);
    /**
     * Read Image data
     */
    private boolean readImage(boolean first, int disposal_method, int delay)
        throws IOException
    {
        boolean interlace = (block[8] & INTERLACEMASK) != 0;
    	System.out.println("GifImageDecoder.readImage: interlace = " + interlace);
...
        boolean ret = parseImage(x, y, width, height, interlace, initCodeSize, block, rasline, model);
...

The debugging messages are as follows:

java.lang.Exception
	at java.desktop/sun.awt.image.GifImageDecoder.readImage(GifImageDecoder.java:503)
	at java.desktop/sun.awt.image.GifImageDecoder.produceImage(GifImageDecoder.java:213)
	at java.desktop/sun.awt.image.InputStreamImageSource.doFetch(InputStreamImageSource.java:269)
	at java.desktop/sun.awt.image.ImageFetcher.fetchloop(ImageFetcher.java:212)
	at java.desktop/sun.awt.image.ImageFetcher.run(ImageFetcher.java:176)
Reading a 310 by 192 non-interlaced image...
GifImageDecoder.readImage: interlace = false

3XMTHREADINFO3           Java callstack:
4XESTACKTRACE                at sun/awt/image/GifImageDecoder.parseImage(Native Method)
4XESTACKTRACE                at sun/awt/image/GifImageDecoder.readImage(GifImageDecoder.java:600)
4XESTACKTRACE                at sun/awt/image/GifImageDecoder.produceImage(GifImageDecoder.java:213)
4XESTACKTRACE                at sun/awt/image/InputStreamImageSource.doFetch(InputStreamImageSource.java:269)
4XESTACKTRACE                at sun/awt/image/ImageFetcher.fetchloop(ImageFetcher.java:212)
4XESTACKTRACE                at sun/awt/image/ImageFetcher.run(ImageFetcher.java:176)

However, the value of the passed-in interlace flag was messed up when calling the native code GifImageDecoder.parseImage() at src/java.desktop/share/native/libawt/awt/image/gif/gifdecoder.c

JNIEXPORT jboolean JNICALL
Java_sun_awt_image_GifImageDecoder_parseImage(JNIEnv *env,
                                              jobject this,
                                              jint relx, jint rely,
                                              jint width, jint height,
                                              jint interlace, <-------------- the passed-in interlace flag
                                              jint initCodeSize,
                                              jbyteArray blockh,
                                              jbyteArray raslineh,
                                              jobject cmh)
...
    int passinc = interlace ? 8 : 1;
    int passht = passinc;
    int len;

    printf("\nJava_sun_awt_image_GifImageDecoder_parseImage: interlace = %d\n", interlace);
    printf("\nJava_sun_awt_image_GifImageDecoder_parseImage: passinc = %d\n", passinc);
...
Java_sun_awt_image_GifImageDecoder_parseImage: interlace = 1476422656 <---------------------
Java_sun_awt_image_GifImageDecoder_parseImage: passinc = 8

So I tried to hack the native code Java_sun_awt_image_GifImageDecoder_parseImage() by forcing passinc to 1 (assuming interlace is false) and picture was correctly generated as follows:
image

Based on the investigation above, It seems there is some issue with manipulation on boolean value in JNI & bytecode interpreter in our code which needs to be fixed (Specifically the problem happens when converting from boolean (Java level) to jint (native) via JNI)

FYI: @DanHeidinga, @tajila

@ChengJin01
Copy link
Contributor

ChengJin01 commented Jan 18, 2019

Just talked to Tobi, it seems there was no change with the passed-in arguments that got merged.

Given that the problem happened in the JNI native, need to double-check the calling path from java to native (runJNINative, callCFunction, e.g) to figure out how the passe-in arguments were handled.

@ChengJin01
Copy link
Contributor

ChengJin01 commented Jan 23, 2019

The root cause of the issue is due to the inconsistency of the argument type between the declaration in java code and the corresponding implementation in the native code in OpenJDK11 as follows:

src/java.desktop/share/classes/sun/awt/image/GifImageDecoder.java

    private native boolean parseImage(
                                      int x, int y, 
                                      int width, int height,
                                      boolean interlace,  <-------- the boolean type in java code
                                      int initCodeSize,
                                      byte block[], byte rasline[],
                                      IndexColorModel model);

src/java.desktop/share/native/libawt/awt/image/gif/gifdecoder.c

JNIEXPORT jboolean JNICALL
Java_sun_awt_image_GifImageDecoder_parseImage(JNIEnv *env,  jobject this,
                                              jint relx, jint rely,
                                              jint width, jint height,
                                              jint interlace,   <---  wrongly cast to the int type in the native code
                                              jint initCodeSize,
                                              jbyteArray blockh,
                                              jbyteArray raslineh,
                                              jobject cmh)
{

Technically, this issue has nothing to do with VM (from cJNICallout) or FFI in JNI operation as the FFI code depends on the size of type in the declaration to copy the value from the java stack to the native.

Thus, changing "jint interlace" to "jboolean interlace" fixes the issue and the GIF image will be generated correctly as as result.

@DanHeidinga , I wondering whether we can request Oracle/OpenJDK to update the incorrect argument type for Java_sun_awt_image_GifImageDecoder_parseImage() in Java11 as this makes more sense from JNI perspective.

@DanHeidinga
Copy link
Member

@andrew-m-leonard can you propose a patch at OpenJDK addressing the mismatch between the java & native signatures? Bonus points if it can be backported to JDK11

@takiguc
Copy link
Author

takiguc commented Jan 23, 2019

@ChengJin01 , could you explain why Linux ppc64le, s390x and AIX build don't have this issue ?
I assume the situation is same as Linux amd64 build.

@ChengJin01
Copy link
Contributor

ChengJin01 commented Jan 23, 2019

@takiguc, this entirely depends on how the target platform extends or casts from boolean (1byte) to int (e.g. 4bytes) on the native stack (this also depends on whether it is big-endian, in which case we will help to extend in VM). Technically the argument types should be identical on both java and native from JNI perspective. Relying on the automatic cast/conversion on platforms is dangerous in such case.

@ChengJin01
Copy link
Contributor

ChengJin01 commented Jan 23, 2019

I am still double-checking the related VM & FFI code on X86 (UNIX-based) to see whether we can figure out a way around the issue against the VM Spec as to the manipulation on the boolean type.

@andrew-m-leonard
Copy link
Contributor

Created openjdk bug : https://bugs.openjdk.java.net/browse/JDK-8217735

@ChengJin01
Copy link
Contributor

ChengJin01 commented Jan 24, 2019

Further investigation on the FFI code (X86) shows the problem was caused by the aligned memory without initialization before copying arguments the native stack as follows:

/runtime/libffi/x86/ffi64.c
void
ffi_call (ffi_cif *cif, void (*fn)(void), void *rvalue, void **avalue)
{
...
  stack = alloca (sizeof (struct register_args) + cif->bytes + 4*8);
  reg_args = (struct register_args *) stack;
  argp = stack + sizeof (struct register_args); <------ argp points to the allocate native stack for arguments.
...
  for (i = 0; i < avn; ++i)
    {
    ...
      n = examine_argument (arg_types[i], classes, 0, &ngpr, &nsse);
      if (n == 0
	  || gprcount + ngpr > MAX_GPR_REGS
	  || ssecount + nsse > MAX_SSE_REGS)
	{
	  long align = arg_types[i]->alignment;

	  /* Stack arguments are *always* at least 8 byte aligned.  */
	  if (align < 8)
	    align = 8;

	  /* Pass this argument in memory.  */
	  argp = (void *) ALIGN (argp, align); <---------- mostly aligned by 8 bytes
	  memcpy (argp, avalue[i], size);
	  argp += size; <---- the next address still moves forward by 8bytes even though it skips the current type size.
	}
      else
	{
	  /* The argument is passed entirely in registers.  */
          ... ...

So it means all arguments on the stack are aligned at 8 bytes (already verify the address of argp)

e.g. if the current stack pointer argp stores the boolean value (1 byte) at 0x28117640, the next value to be copied starts at 0x28117648.

0x28117640 | argument : boolean ..... | <--- 8 bytes  (1 byte value + 7 bytes garbage data after memory allocation.
0x28117648 | argument: int      ..... |  <---- 8 bytes  (4 bytes value + 4 bytes garbage data)
...

In such case, casting the boolean type (1 byte ) to the int type (4 bytes) inevitably introduces the garbage data (random value) existing in the allocated memory.

@ChengJin01
Copy link
Contributor

ChengJin01 commented Jan 24, 2019

Technically there are two solutions to the issue on our side.
[1] initialize the stack memory before copying the arguments

/runtime/libffi/x86/ffi64.c
void
ffi_call (ffi_cif *cif, void (*fn)(void), void *rvalue, void **avalue)
{
size_t stack_size;
 ...
 stack_size = sizeof (struct register_args) + cif->bytes + 4*8;
 stack = alloca(stack_size);
 memset(stack, 0, stack_size); <------- initialized with 0 for the allocated memory
argp = stack + sizeof (struct register_args);

The idea is to ensure there is no garbage data introduced in casting short-size types to long-size types (e.g. from boolean to int)

@ChengJin01
Copy link
Contributor

ChengJin01 commented Jan 24, 2019

[2] replace the FFI type of boolean (1 byte) to with the FFI type of int (4 bytes)

The idea is based on the VM Spec which says:

2.3.4 The boolean Type
Although the Java Virtual Machine defines a boolean type, 
it only provides very limited support for it. 
There are no Java Virtual Machine instructions solely dedicated
to operations on boolean values. Instead, expressions in the Java 
programming language that operate on boolean values are 
compiled to use values of the Java Virtual Machine int data type.

From VM Spec perspective, the int type for boolean in the native is acceptable at this point.
If so, we can change the FFI type of boolean as follows to ensure it can be safely & correctly casted to both jint (4types) and jboolean (1byte), whatever the target platform is.

/runtime/vm/BytecodeInterpreter.hpp
	static const VMINLINE ffi_type *
	getFFIType(U_8 j9ntc) {
		static const ffi_type * const J9NtcToFFI[] = {
 &ffi_type_void,		/* J9NtcVoid */
&ffi_type_sint32,	/* J9NtcBoolean */ <------- replace &ffi_type_uint8 with &ffi_type_sint32
 ....
 &ffi_type_sint32,	/* J9NtcInt */
};

In such case, the boolean type always takes 4 bytes on the java stack for arguments (actually the initialization has been done by its own assignment rather than in FFI code)

@ChengJin01
Copy link
Contributor

ChengJin01 commented Jan 24, 2019

The only difference is that solution [1] is specific to X86 on which the issue was originally detected and has no impact on other platform, while solution [2] affects all platforms (theoretically solution [2] should work on all platforms)

So we can choose either of these two solutions above to fix the issue on our side if Oracle insists on the jint type.

@DanHeidinga, any preference on the solutions above?

@DanHeidinga
Copy link
Member

@ChengJin01 there's been recent changes to return bytecodes, putfield, Unsafe, and JNI returns to ensure that booleans are only returned as 1 or 0. Can you write some tests to validate whether booleans should also be converted to 0 / 1 when calling jni?

@ChengJin01
Copy link
Contributor

ChengJin01 commented Jan 25, 2019

I've created a simple test to verify the assumption above to see how it goes with a passed-in int value in the case of the boolean argument in the native (the .class file was generated via ASM as javac refuses to convert int to boolean in the code level)
jnibooltest.zip
snippet of JniBoolArgTest.class

  public static native void printBoolMethod(boolean);
    descriptor: (Z)V
    flags: (0x0109) ACC_PUBLIC, ACC_STATIC, ACC_NATIVE

  public static void main(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)V
    flags: (0x0009) ACC_PUBLIC, ACC_STATIC
    Code:
      stack=1, locals=2, args_size=1
         0: bipush        126  <---------- 0x7e
         2: istore_1
         3: iload_1   <---------- load 0x7e for the JNI native printBoolMethod()
         4: invokestatic  #15                 // Method printBoolMethod:(Z)V
         7: return

JniBoolArgTest.c

JNIEXPORT void JNICALL Java_JniBoolArgTest_printBoolMethod(JNIEnv * env, jclass clazz, jboolean arg)
{
   printf("\nJava_JniBoolArgTest_printBoolMethod: arg = 0x%x\n", arg);
   return;
}

[1] result on OpenJ9

root@JCHTORLNXVM041:~/jchau/temp/#jdk11_openj9/bin/java -Djava.library.path=".../jnitest"   JniBoolArgTest
Java_JniBoolArgTest_printBoolMethod: arg = 0x7e

[2] result on Hotspot

root@JCHTORLNXVM041:~/jchau/temp/#jdk_11.0.1_13_hotspot/bin/java -Djava.library.path=".../jnitest"   JniBoolArgTest
Java_JniBoolArgTest_printBoolMethod: arg = 0x7e

So Both OpenJ9 and Hotspot ended up with the same incorrect result as the conversion won't happen automatically in the case of boolean.

@ChengJin01
Copy link
Contributor

ChengJin01 commented Jan 25, 2019

The conversion can be done on the java stack before copying the arguments to the native
e.g.

	VMINLINE FFI_Return
	cJNICallout(REGISTER_ARGS_LIST, ...)
	{
           ...
			for (U_8 i = 0; i < argCount; i++) {
               ...
                         	if (J9NtcBoolean == argTypes[i]) {
					javaArgs[offset] = javaArgs[offset] & 1;
				}

The only concern is the fix changes the original passed-in value on the java stack; otherwise we have to address the conversion after the copy in the FFI code specific to all platforms, which seems complicated.

@DanHeidinga
Copy link
Member

https://github.com/eclipse/openj9/blob/c6fc7baec315198dd5f31501c6c9a0de453aa3a9/runtime/include/jni.h#L71

indicates that JNI code processing booleans should only ever read a char's worth of data so uint8 is correct.

Given this bug though, I'm leaning towards the solution in #4193 (comment) but using ffi_type_uint32rather than a signed one.

@ChengJin01 Can you confirm that passes the test?

@ChengJin01
Copy link
Contributor

ChengJin01 commented Jan 25, 2019

@DanHeidinga , Changing to ffi_type_uint32 for boolean only fixes the case of int type in the native (the issue with GIF image) but can't address the case of a random int value (1 byte) to be passed as boolean to the native. The reason is that it assumes the original value is 1 type (the same as solution [1]) , in which case there is no conversion from 1 byte int value (e.g. 0x7e) to 0 or 1.

@ChengJin01
Copy link
Contributor

ChengJin01 commented Jan 25, 2019

Already confirmed ffi_type_uint32 fixes the issue with GIF images
../jdk/bin/java -jar SwingSet2.jar
image

But it still failed at the JniBoolArgTest as follows:

root@JCHTORLNXVM041:~/jchau/temp# ../jdk/bin/java -Djava.library.path=".../jnitest"   JniBoolArgTest
Java_JniBoolArgTest_printBoolMethod: arg = 0x7e

The results above are correct as the issue with GIF image (whether a boolean value can be safely cast to 1 or 4 bytes without considering whether the value is valid) and the boolean conversion in JniBoolArgTest (whether the passed-in 1 byte value is 0 or 1 for boolean) belong to two different situations in the native which need to be addressed separately.

@ChengJin01
Copy link
Contributor

Some JNI native tests (detected in personal build) with boolean return type shows changing to ffi_type_uint32 affects the return value for boolean. Need to check what happened in such case.

@DanHeidinga
Copy link
Member

Created openjdk bug : https://bugs.openjdk.java.net/browse/JDK-8217735

@andrew-m-leonard Can you put the fix for this into the Extensions repos? I'd like to ensure we have the right behaviour while we finish reviewing this PR

@pshipton
Copy link
Member

pshipton commented Mar 1, 2019

Tagged against the 0.13 milestone as a reminder to get the extensions change into this release.

@andrew-m-leonard
Copy link
Contributor

I've raised the question on the awt-dev maillist to get Oracle's position... https://mail.openjdk.java.net/pipermail/awt-dev/2019-March/015029.html

@pshipton
Copy link
Member

pshipton commented Mar 6, 2019

A fix to the JNI declaration has been merged to jdk12.
ibmruntimes/openj9-openjdk-jdk12#20
ibmruntimes/openj9-openjdk-jdk12#22

@pshipton
Copy link
Member

pshipton commented Mar 6, 2019

ChengJin01 added a commit to ChengJin01/openj9 that referenced this issue Mar 21, 2019
The change is to ensure booleans can be handled
appropriately as the int type when calling the
JNI natives.

Close: eclipse-openj9#4193

Signed-off-by: CHENGJin <jincheng@ca.ibm.com>
@andrew-m-leonard
Copy link
Contributor

Merged into openjdk http://hg.openjdk.java.net/jdk/client/rev/62171da145f9
Should get merged into jdk"head" from client within a week.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants