Writing to closed socket connection kills the JVM #2871

Closed
pdbain-ibm opened this issue Sep 14, 2018 · 21 comments

Comments

@pdbain-ibm
Contributor

Recent regression. Looks like a blocker. FYI @gacholio. I am investigating.

$ java -version
openjdk version "10.0.2-internal" 2018-07-17
OpenJDK Runtime Environment (build 10.0.2-internal+0-adhoc.jenkins.Build-JDK10-linuxx86-64cmprssptrs)
Eclipse OpenJ9 VM (build master-e60de42, JRE 10 Linux amd64-64-Bit Compressed References 20180914_446 (JIT enabled, AOT enabled)
OpenJ9   - e60de42
OMR      - 28b6fbc
JCL      - a6243ec based on jdk-10.0.2+13)
28a57f78e63e-root:8$ "/java/bin/java"  -Xcompressedrefs  \
> --add-exports=java.base/com.ibm.tools.attach.target=ALL-UNNAMED \
> -cp "/openj9test/TestConfig/scripts/testKitGen/../../../../jvmtest/TestConfig/resources:/openj9test/TestConfig/scripts/testKitGen/../../../../jvmtest/TestConfig/lib/testng.jar:/openj9test/TestConfig/scripts/testKitGen/../../../../jvmtest/TestConfig/lib/jcommander.jar:/openj9test/TestConfig/scripts/testKitGen/../../../../jvmtest/functional/Java8andUp/GeneralTest.jar" \
> -Dcom.ibm.tools.attach.enable=yes \
> -Dcom.ibm.tools.attach.timeout=15000 \
> -Djava.sidecar="--add-exports=java.base/com.ibm.tools.attach.target=ALL-UNNAMED" \
> org.testng.TestNG -d "/openj9test/TestConfig/scripts/testKitGen/../../../TestConfig/test_output_15369295991794/TestAttachErrorHandling_0" "/openj9test/TestConfig/scripts/testKitGen/../../../../jvmtest/functional/Java8andUp/testng.xml" -testnames TestAttachErrorHandling \
> -groups level.extended \
> -excludegroups d.*.linux_x86-64_cmprssptrs,d.*.arch.x86,d.*.os.linux,d.*.bits.64,d.*.generic-all
[IncludeExcludeTestAnnotationTransformer] [INFO] exclude file is null
...
... TestNG 6.14.2 by Cédric Beust (cedric@beust.com)
...

[AttachApiTest] [WARN] /tmp/.com_ibm_tools_attach/2351 not removed because -1 still alive
[AttachApiTest] [WARN] /tmp/.com_ibm_tools_attach/1234_5 not removed because -1 still alive
[AttachApiTest] [WARN] /tmp/.com_ibm_tools_attach/foo not removed because -1 still alive
28a57f78e63e-root:9$ echo $?
141
@pdbain-ibm
Contributor Author

None of the error handling tests are reporting failure. AttachAPI sanity test passes.

@pdbain-ibm
Contributor Author

The TestNG harness is not printing the list of passing and failing tests. The exit status of 141 is suspicious: a normal test failure causes an exit status of 1.
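For readers decoding the status: shells such as bash conventionally report 128 + N when a command is terminated by signal N, and signal 13 is SIGPIPE on Linux. The sketch below only illustrates that arithmetic; the class and method names are hypothetical and are not part of the test suite.

```java
public class ExitStatusDecoder {
    /**
     * Returns the signal number encoded in a shell exit status, following the
     * 128 + N convention, or -1 if the status does not look like a signal exit.
     */
    public static int signalFromExitStatus(int status) {
        return (status > 128 && status < 256) ? status - 128 : -1;
    }

    public static void main(String[] args) {
        int status = 141; // value reported by `echo $?` in this issue
        int signal = signalFromExitStatus(status);
        // On Linux, signal 13 is SIGPIPE.
        System.out.println("exit status " + status + " -> signal " + signal);
    }
}
```

Running it prints `exit status 141 -> signal 13`, which is consistent with the process being killed by SIGPIPE rather than exiting through a normal test failure.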

pdbain-ibm changed the title from "TestAttachErrorHandling_0 fails" to "Writing to closed socket connection kills the JVM" on Sep 14, 2018
@pdbain-ibm
Contributor Author

Test is failing in testVmShutdownWhileAttached:

            info("terminated " + tgtId);
            tgtVm.getSystemProperties();
            info("completed getSystemProperties to " + tgtId);

Console log:

[TargetManager] [DEBUG] waitFor status = 0
[AttachApiTest] [DEBUG] TEST_INFO: terminated 16157
status=141

attach API log:

    1536935911319 16157: 18 [Attach API teardown]: AttachHandler tryObtainMasterLock failed
    1536935911322 16157: 18 [Attach API teardown]: deleting my directory 
    1536935911323 16157: 18 [Attach API teardown]: AttachHandler closed semaphore
1536935911335 16135: 1 [main]: streamSend ATTACH_GETSYSTEMPROPERTIES

at which point the attacher dies.

	public Properties getSystemProperties() throws IOException {
		if (!targetAttached) {
			/*[MSG "K0544", "Target not attached"]*/
			throw new IOException(getString("K0544")); //$NON-NLS-1$
		}
		Properties props = getTargetProperties(true);
		return props;
	}

I instrumented the test to catch Throwable: no exception or other throwable is thrown.

@pdbain-ibm
Contributor Author

pdbain-ibm commented Sep 14, 2018

The test connects to a target Java process, opening a socket in the process. It then causes the target to stop (closing the socket) and tries to write to the socket.

I wrote an ad-hoc test (attached):

			OutputStream commandStream = targetSocket.getOutputStream();
			Thread.sleep(10000);
			while (true) {
				System.out.println("Attacher:sending");
				commandStream.write(99);
				Thread.sleep(1000);
			}

This previously worked (i.e. threw an exception) on OpenJ9:

$ java -version
openjdk version "10.0.2-adoptopenjdk" 2018-07-17
OpenJDK Runtime Environment (build 10.0.2-adoptopenjdk+13)
Eclipse OpenJ9 VM (build openj9-0.9.0, JRE 10 Linux amd64-64-Bit Compressed References 20180813_102 (JIT enabled, AOT enabled)
OpenJ9   - 24e53631
OMR      - fad6bf6e
JCL      - 7db90eda56 based on jdk-10.0.2+13)

$ (java attacher; echo Java exited with status $?) & (sleep 2; java target 12345)
[1] 5758
Attacher: Server port = 12345
Attacher:connected
Target attached
Target read
Attacher:sending
Target result=99
Target read
Target result=99
...
Attacher:sending
Target result=99
Target exit
$ Attacher:sending
Attacher:sending
java.net.SocketException: Broken pipe (Write failed)
	at java.net.SocketOutputStream.socketWrite(java.base@10.0.2-adoptopenjdk/SocketOutputStream.java:111)
	at java.net.SocketOutputStream.write(java.base@10.0.2-adoptopenjdk/SocketOutputStream.java:134)
	at attacher.main(attacher.java:21)
Attacher:exit
Java exited with status 0

... and OpenJDK:

$ java -version
openjdk version "10.0.1-adoptopenjdk" 2018-04-17
OpenJDK Runtime Environment (build 10.0.1-adoptopenjdk+10)
OpenJDK 64-Bit Server VM (build 10.0.1-adoptopenjdk+10, mixed mode)
...
java.net.SocketException: Broken pipe (Write failed)
	at java.base/java.net.SocketOutputStream.socketWrite0(Native Method)
	at java.base/java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:111)
	at java.base/java.net.SocketOutputStream.write(SocketOutputStream.java:134)
	at attacher.main(attacher.java:21)

But not with the latest OpenJ9:

$ java -version
openjdk version "10.0.2-internal" 2018-07-17
OpenJDK Runtime Environment (build 10.0.2-internal+0-adhoc..openj9-openjdk-jdk10)
Eclipse OpenJ9 VM (build master-b04a51b, JRE 10 Linux amd64-64-Bit Compressed References 20180914_000000 (JIT enabled, AOT enabled)
OpenJ9   - b04a51b
OMR      - 28b6fbc
JCL      - a6243ec based on jdk-10.0.2+13)
...
Target read
Attacher:sending
Target result=99
Target read
Attacher:sending
Target result=99
Target exit
$ Attacher:sending
Attacher:sending
Java exited with status 141

The first call to write() returns silently and the second call kills the JVM instead of throwing an exception.
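The attached test uses two separate Java processes. A single-process approximation of the same scenario might look like the sketch below: it is not the attached test, the class and method names are hypothetical, and closing the accepted socket stands in for the target JVM exiting. On a JVM that ignores SIGPIPE, one of the writes should fail with an `IOException`; on a JVM with this regression the process would be expected to die instead.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class ClosedSocketWriteDemo {
    /**
     * Connects to a local server socket, closes the accepted (peer) side, then
     * writes one byte at a time until a write fails. Returns the exception, or
     * null if no write failed within maxWrites attempts.
     */
    public static IOException writeAfterPeerClose(int maxWrites)
            throws IOException, InterruptedException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket attacher = new Socket(loopback, server.getLocalPort())) {
            Socket target = server.accept();
            target.close(); // stands in for the target JVM exiting

            OutputStream commandStream = attacher.getOutputStream();
            for (int i = 0; i < maxWrites; i++) {
                try {
                    commandStream.write(99);
                } catch (IOException e) {
                    return e; // expected behaviour: the failure surfaces as an exception
                }
                // The first write after the close may succeed; give the
                // peer's reset time to arrive before trying again.
                Thread.sleep(100);
            }
            return null;
        }
    }

    public static void main(String[] args) throws Exception {
        IOException e = writeAfterPeerClose(50);
        System.out.println(e == null ? "no write failed" : "write failed: " + e);
    }
}
```

The exact exception message depends on the platform and JDK, so the sketch only distinguishes "a write failed with an exception" from "no write failed".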

@pdbain-ibm
Contributor Author

2871.zip

@pdbain-ibm
Contributor Author

This looks like a class library change. @andrew-m-leonard Has the socket code changed recently?
@pshipton would you please remove the "test failure" label?

@andrew-m-leonard
Contributor

Nothing has changed in jdk10u for 2 months... extensions haven't changed for about a month either.

@pdbain-ibm
Contributor Author

Fails with Java 8:

$ java -version
openjdk version "1.8.0_192-internal"
OpenJDK Runtime Environment (build 1.8.0_192-internal-pdbain_2018_09_14_14_16-b00)
Eclipse OpenJ9 VM (build verifier-ab170ae, JRE 1.8.0 Linux amd64-64-Bit Compressed References 20180914_000000 (JIT enabled, AOT enabled)
OpenJ9 - ab170ae
OMR - 28b6fbc
JCL - df909af based on jdk8u192-b03)

@andrew-m-leonard
Contributor

Java8 extensions were updated to jdk8u192-b03 recently

@pdbain-ibm
Contributor Author

Passing with the latest nightly build from
https://github.com/AdoptOpenJDK/openjdk10-binaries/releases/download/jdk10u-2018-09-17-09-19/OpenJDK10U-jre_x64_linux_openj9_2018-09-17-09-19.tar.gz

openjdk version "10.0.2-adoptopenjdk" 2018-07-17
OpenJDK Runtime Environment (build 10.0.2-adoptopenjdk+13)
Eclipse OpenJ9 VM (build master-669607af, JRE 10 Linux amd64-64-Bit Compressed References 20180916_12 (JIT enabled, AOT enabled)
OpenJ9 - 669607a
OMR - f29d158
JCL - a6243eca06 based on jdk-10.0.2+13)

@pdbain-ibm
Contributor Author

Failed with latest eclipse build:
openjdk version "10.0.2-internal" 2018-07-**

OpenJDK Runtime Environment (build 10.0.2-internal+0-adhoc.jenkins.Build-JDK10-linuxx86-64)
Eclipse OpenJ9 VM (build master-d01f5df, JRE 10 Linux amd64-64-Bit 20180918_409 (JIT enabled, AOT enabled)
OpenJ9 - d01f5df
OMR - 3994cd6
JCL - 7786d58 based on jdk-10.0.2+13)

**

@pdbain-ibm
Contributor Author

Bracketed it to the GCC-7.3 changes:

commit aeba9dec5b0b282c9868541892f144dc6954f7e1   ############## FAILED    #####################
Author: Violeta Sebe <vsebe@ca.ibm.com>
Date:   Fri Sep 14 12:22:24 2018 -0400 

    Enable GCC-7.3 toolchain for Linux PPC LE in Jenkins files 
    
    Add build environment variables for gcc-7.3 for JDK11 and Next. 
    
    [ci skip] 
    
    Signed-off-by: Violeta Sebe <vsebe@ca.ibm.com>

commit 9ef5381dc1e674d675c7929e10065220eda3b617   ############## PASSED    #####################
Author: Jack Lu <Jack.S.Lu@ibm.com>
Date:   Wed Sep 5 16:50:06 2018 -0400 

    Add support for Constant Dynamic entry in classfile writer
    
    - add constant dynamic case during constantpool analyze and writing
    
    Signed-off-by: Jack Lu <Jack.S.Lu@ibm.com>

Those changes should affect only PPCLE:

  build_env:
     vars:
       11: 'CC=gcc-7 CXX=g++-7'
       next: 'CC=gcc-7 CXX=g++-7'

@pshipton
Member

pshipton commented Sep 19, 2018

The gcc 7.3 switch only occurred for Java 11 builds, yet the test is failing on older versions.

@pdbain-ibm
Contributor Author

The log is lying. There were more changes between:

$ git log --no-merges --graph --oneline 9ef538...aeba9dec5b0b282c9868541892f144dc6954f7e1
* aeba9de Enable GCC-7.3 toolchain for Linux PPC LE in Jenkins files
* 6c6a0c3 Real-time clock is not supported on OSX
* f5c6de3 private field '_tokenCount' unused in BarrierSynchronization
* 204428c Fix "Abort trap: 6" on OSX
* da47202 Satisfy runtime linker on OSX for __cxa_pure_virtual's definition
* bb012f5 Implement constant dynamic support in Power codegen
* e67da56 Remove unnecessary synchronization in attach API logging
* 1346615 Check for valid Constant_Dynamic name and signature
* be18ee7 Remove use of deprecated API
* 31f655c Link omrsig to port and vm only if J9VM_PORT_OMRSIG_SUPPORT is enabled
* 2311b33 Fix indentation in hyvm/module.xml
* 45135e8 Enable OMRPORT_OMRSIG_SUPPORT globally in OpenJ9
* 11c7267 Remove OMRPORT_OMRSIG_SUPPORT definitions
* cfe191c Properly link omrsig when building constgen and j9ddrgen
* 50a2d7f Remove duplicate definitions for J9VMSTATE_GC states from OpenJ9
* 2c0ae6d Clean up pushArguments()
* 5914089 Use 64 bit regs on 32 in inlineAtomicOps
* 2dbbc37 Use 64 bit regs on 32 in BCDCHKEvaluator
* a34091f Update ldc to load Constant_Dynamic primitives as I_32 type
* 121dba9 Implement direct-call field resolve helpers
* 4012872 Enable idle tuning only for gencon policy
* ceb4f8e Handle string conversion errors in attach API
* ac51d09 Fix getNestMembers API exception handling
* 33096e2 Reject illegal entries in the bootstrap argument array
* f69c15c Move build instructions to the /doc directory
* 7ffe250 Handle private unresolved invokeinterface
* 0f85e46 Handle private unresolved invokevirtual
* 7ece490 Change .i macro annotations to not use @, pt. 2
* 41c63b9 Add support for gcc-7.3 toolchain to Jenkins Build pipelines
* 4d3f23c JDK11 nestmate virtual private function call
* 2699193 IBM Z Codegen changes to support Condy in jdk11
* 9ef5381 Add support for Constant Dynamic entry in classfile writer

Searching through those changes.

@pshipton
Member

@pdbain-ibm Please exclude TestAttachErrorHandling_SE80_0 (Java 8 ) and TestAttachErrorHandling_0 (later versions) from the OpenJ9 extended testing so all the extended tests stop failing nightly.

@pdbain-ibm
Contributor Author

@pshipton Will do. I am getting close to the root cause.

pdbain-ibm added a commit to pdbain-ibm/openj9 that referenced this issue Sep 20, 2018
Disable these tests while investigating issue eclipse-openj9#2871 "Writing to closed socket
connection kills the JVM".

[ci-skip]

Signed-off-by: Peter Bain <peter_bain@ca.ibm.com>
@pdbain-ibm
Contributor Author

pdbain-ibm commented Sep 20, 2018

Reverting these omrsig commits fixes the problem:

To test, unpack the attached archive, add the Java under test, and run the runt script. If the script prints "Java exited with status 141", the test fails. If it prints

java.net.SocketException: Broken pipe (Write failed)
	at java.net.SocketOutputStream.socketWrite(java.base@10.0.2-internal/SocketOutputStream.java:111)
	at java.net.SocketOutputStream.write(java.base@10.0.2-internal/SocketOutputStream.java:134)
	at attacher.main(attacher.java:21)
Attacher:exit
Java exited with status 0

it passed.

[2871_files.zip](https://github.com/eclipse/openj9/files/2401609/2871_files.zip)

@babsingh would you please investigate?

@pshipton
Member

Not sure Peter's ping worked, since it's quoted.

@babsingh

@pshipton
Member

@babsingh See #2871 (comment)

@babsingh
Contributor

Fix - #2965

babsingh added a commit to babsingh/openj9 that referenced this issue Sep 20, 2018
omrsig.cpp::omrsig_primary_sigaction/omrsig_primary_signal doesn't
support registration with SIG_DFL/SIG_IGN/NULL as the signal handler.

1) SIG_IGN is passed as the signal handler for SIGPIPE.
2) NULL is passed as the new signal handler for SIGILL.
 
In the above cases, omrsig_primary_sigaction/omrsig_primary_signal
shouldn't be used for registering a signal handler. 

From now on, only <signal.h>::sigaction/signal will be used in the above
cases.

SIGPIPE needs to be ignored for the Java Attach API to work properly.

Fixes eclipse-openj9#2871

Signed-off-by: Babneet Singh <sbabneet@ca.ibm.com>
@andrew-m-leonard
Contributor

@pshipton Interesting: some of our TLS 1.3 SSL tests fail with "Java exited with status 141". I will see if this PR has fixed that...
