Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace mingw with clang in compilation #2979

Merged
merged 1 commit into from Nov 21, 2018

Conversation

ChengJin01
Copy link
Contributor

@ChengJin01 ChengJin01 commented Sep 21, 2018

The change is to modify all scripts/code
involved to ensure we use Clang in the place
of mingw to compile the bytecode interpreter
on Windows.

depends on #2980

Signed-off-by: CHENGJin jincheng@ca.ibm.com

@ChengJin01
Copy link
Contributor Author

ChengJin01 commented Sep 21, 2018

The changes mainly include:

  1. updated document on Java 8 & 11 to support Clang
  2. changed all names in scripts/code prefixed with "MINGW" to "CLANG" as clang supports most of predefined macros used in mingw
  3. removed the following flags specific to mingw which is not recognized/used/supported on clang
    -mwin32(mingw option/no equivalent on clang)
    -mdll(unused)
    -mthreads(unused)
    -fno-use-linker-plugin (not supported)
    please refer to https://www.systutorials.com/docs/linux/man/1-x86_64-w64-mingw32-g++/ for details of these flags

Reviewer: @pshipton , @keithc-ca

FYI @DanHeidinga

@pshipton
Copy link
Member

Have you tested building Windows 32-bit?

@ChengJin01
Copy link
Contributor Author

Nope, all builds were compiled on Windows 64-bit.

@pshipton pshipton changed the title Replace mingw with clang in compilation WIP Replace mingw with clang in compilation Sep 21, 2018
@ChengJin01
Copy link
Contributor Author

I will try compiling a 32-bit Java 8 build on a Fyre machine to ensure everything works fine.

@ChengJin01
Copy link
Contributor Author

ChengJin01 commented Sep 26, 2018

sanity_test_results.zip

I've already complid Java 8 32 bit with MSVC + Clang (32bit). There is pretty much no change with compiler flags except that the compiler needs to be Clang 32bit (http://releases.llvm.org/6.0.1/LLVM-6.0.1-win32.exe)

[1] Sanity test(functional) 64bit:
No test failure was detected on both msvc + mingw and msvc + clang:
TOTAL: 163 EXECUTED: 87 PASSED: 87 FAILED: 0 SKIPPED: 76
ALL TESTS PASSED

[2] Sanity test (functional) 32bit:
No problem was detected on msvc + mingw and 5 test failures (crash?) were detected on msvc + clang
(1) msvc + mingw
TOTAL: 163 EXECUTED: 94 PASSED: 94 FAILED: 0 SKIPPED: 69
ALL TESTS PASSED

(2) msvc + clang
FAILED test targets:
cmdLineTester_jvmtitests_hcr_SE80_3
JCL_Test_SE80_0
JCL_Test_SE80_1
jit_jitt_0
jit_jitt_3

TOTAL: 163 EXECUTED: 94 PASSED: 89 FAILED: 5 SKIPPED: 69

I manually ran one of failing tests cmdLineTester_jvmtitests_hcr_SE80_3 and it stopped working in testBCIUsingASM_InjectTryCatch_Level2_injectCatchAndThrowNewAIOBE_catchAIOBEAndThrowNewAE

C:/temp_jdk8/perf_win/clang_jdk8_32bit/windows-x86-normal-server-release/images/j2sdk-image/jre/bin/java  -Xgcpolicy:gencon -Xjit:count=0 -Xnocompressedrefs  -Xdump    -agentlib:jvmtitest=test:ta001 -cp "C:/temp_jdk8/openj9_v2/jvmtest/functional/cmdLineTests/jvmtitests/jvmtitest.jar;C:/temp_jdk8/openj9_v2/jvmtest/TestConfig/lib/asm-all.jar" com.ibm.jvmti.tests.util.TestRunner

*** Testing [4/13]:     testBCIUsingASM_InjectTryCatch_Level2_injectCatchAndThrowNewAIOBE_catchAIOBEAndThrowNewAE
Expected NullPointerException received
(stopped working.....)

Not yet figured out what happened to the failing tests on the build compiled with msvc + clang

@ChengJin01
Copy link
Contributor Author

ChengJin01 commented Sep 27, 2018

I tried replacing __GNUC__ with __clang__ in oti/mingw/stddef.h and oti/mingw_comp.h as Clang already discarded ___GNUC__ in 6.0.1 (the predefined macro remains in older versions but has been removed now) to get Clang accept all definitions previously used by mingw. However the recompiled build still ended up with same failure in testBCIUsingASM_InjectTryCatch_Level2_injectCatchAndThrowNewAIOBE_catchAIOBEAndThrowNewAE, which means the replacement for __GNUC__ is logically incorrect and has nothing to do with the test failures.

Still need to get back to the failing tests for root cause.

@ChengJin01
Copy link
Contributor Author

ChengJin01 commented Sep 28, 2018

Investigation shows these test failures were triggered by 4 test classes in which the methods share the pretty much the same nested try-catch structure. That being said, these failures belong to the same issue.

  1. cmdLineTester_jvmtitests_hcr_SE80_3 (executed with -Xjit:count=0)
    Running command: "C:/temp_jdk8/perf_win/clang_jdk8_32bit/windows-x86-normal-server-release/images/j2sdk-image/jre/bin\java" -Xgcpolicy:gencon -Xjit:count=0 -Xnocompressedrefs -Xdump -agentlib:jvmtitest=test:ta001 -cp "C:/temp_jdk8/openj9_v2/test///..//jvmtest\functional\cmdLineTests\jvmtitests\jvmtitest.jar;C:/temp_jdk8/openj9_v2/test///..//jvmtest\TestConfig\lib\asm-all.jar" com.ibm.jvmti.tests.util.TestRunner

/test/functional/cmdLineTests/jvmtitests/src/com/ibm/jvmti/tests/BCIWithASM/ta001.java

testBCIUsingASM_InjectTryCatch_Level2_injectCatchAndThrowNewAIOBE_catchAIOBEAndThrowNewAE {
...
		byte[] trasnformedClassBytes = ASMTransformer.trasnform_injectCatchAndThrowNewAIOBE_catchAIOBEAndThrowNewAE( Source4.class );
		boolean transformed = redefineClass( Source4.class, trasnformedClassBytes.length, trasnformedClassBytes );
		Source4 postState = new Source4();
		try { 
			int postStateResult = postState.returnOne(); <------------ crashed here

/test/functional/cmdLineTests/jvmtitests/src/com/ibm/jvmti/tests/BCIWithASM/Source4.java

public class Source4 {
	public int returnOne() {
		//try {                            <---------------------------- nested try-catch structure
			String str = null;  
			str.equals("a");	
			return 1;
		//} catch ( NullPointerException e1 ) {
		//	try {												<--injected code 
		//		int intArray [] = new int [] {1};   
		//		return intArray[1];	
		//	} catch ( ArrayIndexOutOfBoundsException e2 ) {
		//		return 1/0;
		//	}
		//}
	}
}

The crash issue disappeared by removing count=0 (only with JIT on/off)

  1. cfgTest in jit_jitt_0/jit_jitt_3 (executed with -Xjit:count=0)
    test with -Xjit:noJitUntilMain,count=0,assumeStrictFP, .....
  <test name="cfgTest">
    <classes>
      <class name="jit.test.jitt.cfg.nestedExceptions"/>
    </classes>

/test/functional/JIT_Test/src/jit/test/jitt/cfg/nestedExceptions.java

public class nestedExceptions extends jit.test.jitt.Test {
   int tstExcp(int i) {
      try {
         if (i > 10)
            throw new Exception();
      }
      catch(Exception e1) {
         try {
            if (i > 11)
               throw new Exception();
         }
         catch(Exception e2) {
            try {
               if (i > 12)
                  throw new Exception();
            }
            catch(Exception e3) {  
               i = 99;
            }
         }
      }
      return i;
   };

The crash issue disappeared by removing count=0 (only with JIT on/off)

  1. JCL_Test_SE80_0/JCL_Test_SE80_1 (executed with JIT on by default)
    /test/functional/Java8andUp/playlist.xml
		<testCaseName>JCL_Test_SE80</testCaseName>
       -testnames \
         ...
	JCL_TEST_Java-Lang,\

/test/functional/Java8andUp/testng.xml

	<test name="JCL_TEST_Java-Lang">
		<classes>
			<class name="org.openj9.test.java.lang.Test_Class"/>
			<class name="org.openj9.test.java.lang.Test_ClassCastException"/> <------
			<class name="org.openj9.test.java.lang.Test_Compiler"/>
			<class name="org.openj9.test.java.lang.Test_InstantiationError"/>
			<class name="org.openj9.test.java.lang.Test_InstantiationException"/> <------

/test/functional/Java8andUp/src/org/openj9/test/java/lang/Test_ClassCastException.java

	public void test_Constructor() {
		// Test for method java.lang.ClassCastException()
		try {      <------------------------------------------ nested try-catch structure
			try {
				String z = (String)new Object();
				if (z == null)
					; //use z to avoid a warning msg
			} catch (ClassCastException e) {
				return;
			}
			AssertJUnit.assertTrue("Failed to generate Exception", false);
		} catch (Exception e) {
			AssertJUnit.assertTrue("Exception during test", false);
		}
	}

/test/functional/Java8andUp/src/org/openj9/test/java/lang/Test_InstantiationException.java

	public void test_Constructor() {
		try {          <------------------------------------------ nested try-catch structure
			try {
				Class cl = null;
				cl = Class.forName("java.io.OutputStream");
				cl.newInstance();
			} catch (InstantiationException e) {
				return;
			}
			AssertJUnit.assertTrue("Failed to generate Exception", false);
		} catch (Exception e) {
			AssertJUnit.assertTrue("Exception during test", false);
		}
	}

The crash issue disappeared by specifying -Xint (with JIT off)

So it turns out JIT doesn't work good with the nested exception structure on the Clang compiled build on Windows 32bit.

I suggest to create a separate issue for JIT to address the issue after merging the changes. Meanwhile, these failing test cases should be temporarily excluded unless the crash issue gets revolved by JIT.


@pshipton , any other concern at this point?

@pshipton
Copy link
Member

JIT should take a look at the crashes in advance of merging. We can't just exclude and ignore the problems as they may appear in other ways.
@andrewcraik

@pshipton
Copy link
Member

or maybe @gacholio can help, its strange there would be a JIT issue when only the bytecode interpreter is compiled with clang.

@gacholio
Copy link
Contributor

As with anything that crashes with the JIT and not without, limit the compilations down using -Xjit:verbose,vlog=filename and then -Xjit:limitfile=filename (you can trim the limitfile to get down to the minimal set of compiled methods). Especially effective with count=0.

@gacholio
Copy link
Contributor

Personally, I think making a change which causes tests for fail for unknown reasons and excluding the tests is a terrible idea. Who knows what else out there is going to exhibit the issue?

@ChengJin01
Copy link
Contributor Author

ChengJin01 commented Sep 28, 2018

It's definitely correct to get JIT guys involved in the analysis before merging the changes.
I am wondering how they get started to triage the problem as the changes have not yet been committed. Should we upload the compiled build to somewhere in vm_archive for downloading or they just pull out the changes for compilation on their own?

@pshipton
Copy link
Member

Guess we can check with them and see how they want to proceed.

@ChengJin01
Copy link
Contributor Author

ChengJin01 commented Sep 28, 2018

As with anything that crashes with the JIT and not without, limit the compilations down using -Xjit:verbose,vlog=filename and then -Xjit:limitfile=filename...

To confirm the crash was caused by JIT/count=0, actually I already isolated the failing methods by removing all methods in the test file before compiling the test case.

e.g.
in /test/functional/cmdLineTests/jvmtitests/src/com/ibm/jvmti/tests/BCIWithASM/ta001.java
remove all other methods except testBCIUsingASM_InjectTryCatch_Level2_injectCatchAndThrowNewAIOBE_catchAIOBEAndThrowNewAE.
to ensure the only the failing method is executed:

  1. It crashed when specifying -Xjit:count=0
Administrator@JCHTORWIN101 /cygdrive/c/temp/test
C:/temp_jdk8/perf_win/clang_jdk8_32bit/windows-x86-normal-server-release/images/j2sdk-image/jre/bin/java  -Xgcpolicy:gencon -Xjit:count=0 -Xnocompressedrefs  -Xdump    -agentlib:jvmtitest=test:ta001 -cp "C:/temp_jdk8/openj9_v2/jvmtest/functional/cmdLineTests/jvmtitests/jvmtitest.jar;C:/temp_jdk8/openj9_v2/jvmtest/TestConfig/lib/asm-all.jar" com.ibm.jvmti.tests.util.TestRunner
*** Testing [1/1]:      testBCIUsingASM_InjectTryCatch_Level2_injectCatchAndThrowNewAIOBE_catchAIOBEAndThrowNewAE
Expected NullPointerException received
(crashed)
  1. It passed wen specifying -Xjit or -Xint
Administrator@JCHTORWIN101 /cygdrive/c/temp/test
C:/temp_jdk8/perf_win/clang_jdk8_32bit/windows-x86-normal-server-release/images/j2sdk-image/jre/bin/java  -Xgcpolicy:gencon -Xjit -Xnocompressedrefs  -Xdump    -agentlib:jvmtitest=test:ta001 -cp "C:/temp_jdk8/openj9_v2/jvmtest/functional/cmdLineTests/jvmtitests/jvmtitest.jar;C:/temp_jdk8/openj9_v2/jvmtest/TestConfig/lib/asm-all.jar" com.ibm.jvmti.tests.util.TestRunner
*** Testing [1/1]:      testBCIUsingASM_InjectTryCatch_Level2_injectCatchAndThrowNewAIOBE_catchAIOBEAndThrowNewAE
Expected NullPointerException received
Expected ArithmeticException received
*** Test took 31 milliseconds
OK

The triage on other 2 failing tests followed the same way: the whole test case passed by removing the failing methods by specifying -Xjit(removing count=0) or -Xint (disabling JIT)

@ChengJin01
Copy link
Contributor Author

ChengJin01 commented Sep 28, 2018

To get things simple, directly compile and run the following class (the class was copied from /test/functional/cmdLineTests/jvmtitests/src/com/ibm/jvmti/tests/BCIWithASM/Source4.java which is used in testBCIUsingASM_InjectTryCatch_Level2_injectCatchAndThrowNewAIOBE_catchAIOBEAndThrowNewAE)

public class DivByZero {
	public int returnOne() {
		try {
			String str = null;  
			str.equals("a");	
			return 1;
		} catch ( NullPointerException e1 ) {
			try { 
				int intArray [] = new int [] {1};   
				return intArray[1];	
			} catch ( ArrayIndexOutOfBoundsException e2 ) {
				return 1/0;
			}
		}
	}
	public static void main(String[] args) {
		DivByZero db = new DivByZero();
		db.returnOne();
	   }
}

It crashed when specifying -Xjit:count=0.
It passed with the following exception when specifying -Xjit or -Xint, which is same results as the mingw compiled build on Windows 32bit.

Exception in thread "main" java.lang.ArithmeticException: divide by zero
        at DivByZero.returnOne(DivByZero.java:13)
        at DivByZero.main(DivByZero.java:19)

@fjeremic
Copy link
Contributor

@0dvictor could you help take a look? Seems to fail consistently with -Xjit:count=0. You can work with @ChengJin01 to get the build somehow so you don't have to recompile yourself.

@ChengJin01
Copy link
Contributor Author

ChengJin01 commented Sep 28, 2018

@0dvictor , I already packaged these compiled builds at T drive for your analysis

Administrator@JCHTORWIN101 ../z_jdk11_comp/perf_builds
$ ls *
java8:
build_jdk8_clang.zip build_jdk8_mingw.zip (64bit)

java8_32bit:
clang_jdk8_32bit.zip DivByZero.class DivByZero.java mingw_jdk8_32bit.zip

@0dvictor
Copy link
Contributor

I can run both builds now. However, instead of a crash, I got different failure - no crash but no output at all. Will investigate further.

I suspect it is related to the signal handler, since -Xrs -Xjit:count=0 works to me.

@0dvictor
Copy link
Contributor

The failure can be reproduced with -Xjit:noJitUntilMain,count=0,optlevel=noopt,limit={DivByZero.returnOne*}. Only one method, DivByZero,returnOne, is compiled. JIT trace logs showed the compiled code, stack atlas, and gc maps are identical for runs from the clang package and the mingw package.
mingw.log
clang.log

@ChengJin01
Copy link
Contributor Author

@keithc-ca , does it mean there is problem with signal handler to deal with the nested exceptions on Windows 32bit?

@gacholio
Copy link
Contributor

I suspect the root issue lies in the exception throw entry to the interpreter from the JIT (this path is never taken in pure interpreter mode). If you can run this in the debugger, break at throwCurrentExceptionFromJIT and trace from there. You will get exceptions from the divide by 0, which you should use gh to continue from. I don't believe there's any problem in the exception handlers or JIT code.

@0dvictor
Copy link
Contributor

I ran it in the debugger (Visual Studio), and found that Unhandled exception at 0x100194C0 (j9vm29.dll) in java.exe: 0xC00001A5: An invalid exception handler routine has been detected (parameters: 0x00000002). It seems the exception handler wasn't setup properly. It never reached throwCurrentExceptionFromJIT.

@keithc-ca
Copy link
Contributor

@babsingh Any thoughts about signal handling?

@gacholio
Copy link
Contributor

Is there an MSVC ifdef which should be windows?

@gacholio
Copy link
Contributor

Hmm, looks like OMR believes both _MSC_VER and clang will be defined at the same time.

@ChengJin01
Copy link
Contributor Author

ChengJin01 commented Nov 9, 2018

with the move to clang instead of mingw did we test different permutations of the values ....

We didn't have such tests but we already had perf test to ensure Clang meets our expectation at #2979 (comment)

is Windows still defining USE_COMPUTED_GOTO after switching to Clang? I believe it should...

USE_COMPUTED_GOTO is not defined when compiling with Clang because __GNUC__ is not the predefined macro in Clang (I already verified with test)
clang_mingw_macros.txt

/runtime/vm/BytecodeInterpreter.cpp
#if (defined(WIN32) && defined(__GNUC__)) || (defined(LINUX) && defined(J9HAMMER))
#define USE_COMPUTED_GOTO
#else
#undef USE_COMPUTED_GOTO
#endif

Meanwhile, defining USE_COMPUTED_GOTO will end up with a bunch of errors during the compilation with Clang.
USE_COMPUTED_GOTO_defined_clang_errors.txt

./BytecodeInterpreter.hpp:9619:4: error: cannot jump from this indirect goto statement to one of its possible targets
                        PERFORM_ACTION(i2b(REGISTER_ARGS));
                        ^
./BytecodeInterpreter.hpp:8535:4: note: expanded from macro 'PERFORM_ACTION'
                        EXECUTE_CURRENT_BYTECODE(); \
                        ^
./BytecodeInterpreter.hpp:8106:36: note: expanded from macro 'EXECUTE_CURRENT_BYTECODE'
#define EXECUTE_CURRENT_BYTECODE() EXECUTE_BYTECODE_NUMBER(*_pc)
                                   ^
./BytecodeInterpreter.hpp:8103:4: note: expanded from macro 'EXECUTE_BYTECODE_NUMBER'
                        goto *(bytecodeTable[number]); \
                        ^
./BytecodeInterpreter.hpp:8786:2: note: possible target of indirect goto statement
        JUMP_TARGET(J9_BCLOOP_SEND_TARGET_INITIAL_STATIC):
        ^
./BytecodeInterpreter.hpp:8097:27: note: expanded from macro 'JUMP_TARGET'
#define JUMP_TARGET(name) c##name
                          ^
<scratch space>:43:1: note: expanded from here
cJ9_BCLOOP_SEND_TARGET_INITIAL_STATIC
./BytecodeInterpreter.hpp:8779:8: note: jump bypasses variable initialization
        void *methodRunAddress = _sendMethod->methodRunAddress;

@ChengJin01
Copy link
Contributor Author

ChengJin01 commented Nov 9, 2018

I already added -Wno-ignored-attributes back and removed all changes with varargs functions as confirmed above.

@pshipton
Copy link
Member

pshipton commented Nov 9, 2018

jenkins test all win jdk8,jdk11

@pshipton
Copy link
Member

pshipton commented Nov 9, 2018

jenkins test all win32 jdk8

@pshipton
Copy link
Member

pshipton commented Nov 9, 2018

jenkins line endings check

@pshipton
Copy link
Member

pshipton commented Nov 9, 2018

jenkins copyright check

@pshipton
Copy link
Member

pshipton commented Nov 9, 2018

jenkins test all win jdk11

@pshipton
Copy link
Member

pshipton commented Nov 9, 2018

jenkins test sanity win jdk11

@pshipton
Copy link
Member

pshipton commented Nov 9, 2018

jenkins test extended win jdk11

@gacholio
Copy link
Contributor

gacholio commented Nov 9, 2018

Why are there module.xml changes for util? There don't seem to be any clang-compiled files there.

@pshipton pshipton dismissed gacholio’s stale review November 9, 2018 21:40

JNICALL changes have been removed

@pshipton
Copy link
Member

pshipton commented Nov 9, 2018

@gacholio note the "runtime/util/mingw_comp.c → runtime/util/clang_comp.c"

@ChengJin01
Copy link
Contributor Author

ChengJin01 commented Nov 9, 2018

Why are there module.xml changes for util? There don't seem to be any clang-compiled files there.

Because mingw_comp.h/.c changed to clang_comp.h/.c as we should remove/adjust all mingw related setting in script/code/file naming. So the module.xml has to be modified accordingly.

runtime/oti/mingw_comp.h --> runtime/oti/clang_comp.h (name and code changed)
runtime/util/mingw_comp.c -> runtime/util/clang_comp.c (name changed)

@pshipton pshipton changed the title WIP Replace mingw with clang in compilation Replace mingw with clang in compilation Nov 10, 2018
@pshipton
Copy link
Member

@sxa555 the switch to clang is ready to go. We can coordinate the OpenJ9 change with the Adopt change, maybe on Monday.

@pshipton
Copy link
Member

@ChengJin01 please fix the conflicts

The change is to modify all scripts/code
involved to ensure we use Clang in the place
of mingw to compile the bytecode interpreter
on Windows.

Signed-off-by: CHENGJin <jincheng@ca.ibm.com>
@ChengJin01
Copy link
Contributor Author

Conflicts in changes have been addressed.

@pshipton
Copy link
Member

jenkins compile win jdk8

@pshipton pshipton merged commit 7ebb4d4 into eclipse-openj9:master Nov 21, 2018
pshipton added a commit to pshipton/openj9 that referenced this pull request Jan 24, 2020
OpenJ9 replaced mingw with Clang in eclipse-openj9#2979

[ci skip]

Signed-off-by: Peter Shipton <Peter_Shipton@ca.ibm.com>
@ChengJin01 ChengJin01 deleted the replace_mingw_with_clang branch June 23, 2020 23:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants