SIGSEGV on termination #842

Open
ShaheedHaque opened this issue Aug 16, 2020 · 13 comments

Comments

@ShaheedHaque

As per #720, to get 1.0.2 working for us, I delayed JPype initialisation until it is actually needed. On process exit, it seems that if the initialisation code is never called, the process exits with a SIGSEGV:

Fatal Python error: Segmentation fault

Thread 0x00007f84e4194740 (most recent call first):
  File "/usr/local/lib/python3.8/dist-packages/jpype/_core.py", line 321 in _JTerminate

And here is line 321:

# In order to shutdown cleanly we need the reference queue stopped
# otherwise, we can experience a crash if a Java thread is waiting
# for the GIL.
def _JTerminate():
    try:
        _jpype.shutdown()  # <<< line 321
    except RuntimeError:
        pass

The only reference to jpype that this program could have made is an import as part of its transitive fanout:

import jpype

and it won't have invoked startJVM().

@Thrameos
Contributor

Thrameos commented Aug 16, 2020 via email

@ShaheedHaque
Author

ShaheedHaque commented Aug 18, 2020

First, I have noticed that the SIGSEGV is not easy to reproduce: it occurs perhaps once in 200 runs.

Second, I've added a print() of isJVMStarted() and have not seen the failure in the handful of runs since.

Will report back after gathering more data.
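For a failure this rare, a small harness that reruns a command many times and tallies the exit codes can help gather data. The following is only a sketch using the standard library; the helper names `count_exit_codes` and `segfault_runs` are made up for illustration and are not part of JPype.

```python
import signal
import subprocess
import sys
from collections import Counter


def count_exit_codes(cmd, runs):
    """Run `cmd` `runs` times and tally the exit codes (hypothetical helper)."""
    codes = Counter()
    for _ in range(runs):
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        codes[result.returncode] += 1
    return codes


def segfault_runs(codes):
    """Count runs killed by SIGSEGV.

    On POSIX, subprocess reports a child killed by signal N as returncode -N.
    """
    return codes.get(-signal.SIGSEGV, 0)
```

Assuming jpype is importable by the same interpreter, something like `segfault_runs(count_exit_codes([sys.executable, "-c", "import jpype"], 500))` would exercise the import-only case described in the original report. Whether that minimal command is enough to trigger the crash is not established in this thread.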

@ShaheedHaque
Author

OK, it failed just now even with the “isJVMStarted()” in place on my Ubuntu setup:

Fatal Python error: Segmentation fault

Thread 0x00007fbca3605740 (most recent call first):
  File "/usr/local/lib/python3.8/dist-packages/jpype/_core.py", line 322 in _JTerminate  <<< line number changed because of inserted isJVMStarted().

Again, it must have run without issue several hundred times before this point.

Interestingly, my macOS-based colleague is regularly seeing what we think is the same issue, and he was able to extract a crash log: hs_err_pid53983.log. His repro comes from the cycling of Celery workers, with the SIGSEGV at process exit (or at least we assume so, since it has no discernible effect on the operation of the system). Note: he does not have the inserted isJVMStarted() call.

@ShaheedHaque
Author

And here is a curious thing: I just ran my usual test script, and it seemed to exit twice, like this:

...
========== 3 failed, 362 passed, 1261 warnings in 6067.84s (1:41:07) ===========
isJVMStarted=================== True
isJVMStarted=================== False

So _JTerminate() was called twice: once reporting the JVM as started, and once reporting it as not started.
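One way to check whether the two lines come from the same process is to tag the diagnostic with the process ID. This is a minimal sketch; `exit_report` and `install_exit_report` are hypothetical helpers, and the idea that two processes are involved (for example a parent and a forked child) is only an assumption, not something confirmed here.

```python
import atexit
import os
import sys


def exit_report(is_started):
    """Build one diagnostic line tagged with the current process ID.

    `is_started` is any zero-argument callable returning a bool,
    for example jpype.isJVMStarted.
    """
    return f"pid={os.getpid()} isJVMStarted={is_started()}"


def install_exit_report(is_started):
    """Print the report to stderr when the interpreter exits."""
    atexit.register(
        lambda: print(exit_report(is_started), file=sys.stderr, flush=True)
    )
```

Calling `install_exit_report(jpype.isJVMStarted)` once at start-up would then print one line per exiting process. Two lines with different PIDs would point to two processes each running their own exit handlers, while the same PID twice would mean the hook really ran twice in one process.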

@Thrameos
Contributor

Thrameos commented Aug 23, 2020 via email

@ShaheedHaque
Author

I tried before, but there is a lot of stuff, and when I trimmed too far it stopped failing. Now that we have a slightly different problem, I'll try once again. I'll report back with any results.

@mariusvniekerk

What are the ramifications of doing this?

def _JTerminate():
    try:
        if _jpype.isStarted():
            _jpype.shutdown()
    except RuntimeError:
        pass

@Thrameos
Contributor

Thrameos commented Sep 3, 2020 via email

@ShaheedHaque
Author

What are the ramifications of doing this?

def _JTerminate():
    try:
        if _jpype.isStarted():
            _jpype.shutdown()
    except RuntimeError:
        pass

Based on my experiment adding calls to isJVMStarted(), it is not clear to me that this would make any difference, because the SIGSEGV can occur even when the check returns False, as in this example:

=========================== short test summary info ============================
FAILED test/test_suite74gb_franecki.py::TestPeoplesPension::test_100_complete_use_cases[SubmitEnrolmentsAndContributions_]
FAILED test/test_suite90_live.py::TestLiveA::test_400_check_log_files____ - A...
========== 2 failed, 363 passed, 1242 warnings in 6168.39s (1:42:48) ===========
Fatal Python error: Segmentation fault

Thread 0x00007f6260ee6740 (most recent call first):
  File "/usr/local/lib/python3.8/dist-packages/jpype/_core.py", line 322 in _JTerminate
isJVMStarted=================== False

@Thrameos
Contributor

Can you look over #937 to see if an option fixes this issue?

@nayana-prashanth

nayana-prashanth commented May 7, 2021

Hi. I am using JPype 1.2.0 and have been seeing this issue for a while. Both the Jenkins build and local runs fail intermittently with the error below. Any suggestions for resolving this would be appreciated:

2021-05-06 16:32:36.041  
2021-05-06 16:32:36.041  Thread 0x00007fee56e3a100 (most recent call first):
2021-05-06 16:32:36.041    File "/opt/app-root/lib64/python3.8/site-packages/jpype/_core.py", line 340 in _JTerminate
2021-05-06 16:32:36.041  #
2021-05-06 16:32:36.041  # A fatal error has been detected by the Java Runtime Environment:
2021-05-06 16:32:36.042  #
2021-05-06 16:32:36.042  #  SIGSEGV (0xb) at pc=0x00007fee55d6a9bf (sent by kill), pid=270, tid=427
2021-05-06 16:32:36.042  #
2021-05-06 16:32:36.042  # JRE version: OpenJDK Runtime Environment 18.9 (11.0.9.1+1) (build 11.0.9.1+1-LTS)
2021-05-06 16:32:36.042  # Java VM: OpenJDK 64-Bit Server VM 18.9 (11.0.9.1+1-LTS, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
2021-05-06 16:32:36.042  # Problematic frame:
2021-05-06 16:32:36.042  # C  [libpthread.so.0+0x129bf]  raise+0x10f
2021-05-06 16:32:36.042  #
2021-05-06 16:32:36.042  # Core dump will be written. Default location: /home/jenkins/workspace/ne-learning_concord-mono_IS-2243/e2e/tests/step_defs/core.270
2021-05-06 16:32:36.042  #
2021-05-06 16:32:36.042  # An error report file with more information is saved as:
2021-05-06 16:32:36.042  # /home/jenkins/workspace/ne-learning_concord-mono_IS-2243/e2e/tests/step_defs/hs_err_pid270.log
2021-05-06 16:32:36.042  #
2021-05-06 16:32:36.042  # If you would like to submit a bug report, please visit:
2021-05-06 16:32:36.042  #   https://bugzilla.redhat.com/enter_bug.cgi?product=Red%20Hat%20Enterprise%20Linux%208&component=java-11-openjdk
2021-05-06 16:32:36.042  #
2021-05-06 16:32:36.042  Fatal Python error: Aborted
2021-05-06 16:32:36.042  
2021-05-06 16:32:36.042  Thread 0x00007fee56e3a100 (most recent call first):
2021-05-06 16:32:36.042    File "/opt/app-root/lib64/python3.8/site-packages/jpype/_core.py", line 340 in _JTerminate
2021-05-06 16:32:36.042  /home/jenkins/workspace/ne-learning_concord-mono_IS-2243@tmp/durable-0824d76d/script.sh: line 54:   270 Aborted

@noamaviv

Could this be related to what you wrote here in your documentation?

https://jpype.readthedocs.io/en/latest/userguide.html#errors-reported-by-python-fault-handler

@noamaviv

It seems like using the -p no:faulthandler switch on pytest might help avoid these errors.
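For code that does not run under pytest, a rough equivalent is to switch off Python's faulthandler module directly. The "Fatal Python error: Segmentation fault" line followed by a thread dump is the format faulthandler prints, and pytest enables that module by default. The sketch below uses only the standard library; `disable_fault_handler` is a hypothetical helper name, and where to call it (for example before starting the JVM) is an assumption rather than advice confirmed in this thread.

```python
import faulthandler


def disable_fault_handler():
    """Turn off Python's faulthandler if it is active.

    Returns True if it was enabled before the call, False otherwise.
    """
    was_enabled = faulthandler.is_enabled()
    if was_enabled:
        faulthandler.disable()
    return was_enabled
```

This only silences the Python-side report of the signal. It would not help if the process is genuinely crashing, as the JRE fatal error report in an earlier comment suggests can also happen.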

hinerm added a commit to imagej/pyimagej that referenced this issue Nov 18, 2021
This may help avoid the random segfaults, according to
jpype-project/jpype#842 (comment)
normanrz added a commit to scalableminds/webknossos-libs that referenced this issue Jun 14, 2023
wjp added a commit to SubDisc/pySubDisc that referenced this issue Aug 28, 2023