SIGSEGV during imports for Ludwig components #2882

Closed
jimthompson5802 opened this issue Dec 24, 2022 · 4 comments

@jimthompson5802
Collaborator

jimthompson5802 commented Dec 24, 2022

Describe the bug
A SIGSEGV fault occurs while importing Ludwig components.

To Reproduce
Steps to reproduce the behavior:

  1. Create a Python file mwe_sigsegv_import.py with these contents:
from ludwig.schema.model_config import ModelConfig
from tests.integration_tests.utils import number_feature

print("all done")
  2. Run python mwe_sigsegv_import.py

Expected behavior
No errors for module imports

Screenshots
Output from running the above program

python mwe_sigsegv_import.py

# program output
Segmentation fault

Environment (please complete the following information):

  • OS: MacOS 12.6.1, Docker Desktop 4.9.1, Docker Image OS: Debian GNU/Linux 10 (buster)
  • Python version: 3.8.15
  • Ludwig version: ludwig v0.7.dev

Additional context

  • I first encountered this error while developing the data augmentation feature on my development branch. However, the problem also occurs on the master branch.

  • If the order of the from ... statements is reversed, the SIGSEGV fault DOES NOT occur. There is no error when the imports are in this order:

from tests.integration_tests.utils import number_feature
from ludwig.schema.model_config import ModelConfig

print("all done")
  • Here is the output when faulthandler is enabled in the program:
import faulthandler
faulthandler.enable()

from ludwig.schema.model_config import ModelConfig
from tests.integration_tests.utils import number_feature

print("all done")

Output from the above

python mwe_sigsegv_import.py
Fatal Python error: Segmentation fault

Current thread 0x00007fc062499740 (most recent call first):
  File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 1166 in create_module
  File "<frozen importlib._bootstrap>", line 556 in module_from_spec
  File "<frozen importlib._bootstrap>", line 657 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 975 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 991 in _find_and_load
  File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1042 in _handle_fromlist
  File "/usr/local/lib/python3.8/site-packages/google/protobuf/descriptor.py", line 47 in <module>
  File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 843 in exec_module
  File "<frozen importlib._bootstrap>", line 671 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 975 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 991 in _find_and_load
  File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1042 in _handle_fromlist
  File "/usr/local/lib/python3.8/site-packages/ray/core/generated/common_pb2.py", line 6 in <module>
  File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 843 in exec_module
  File "<frozen importlib._bootstrap>", line 671 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 975 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 991 in _find_and_load
  File "/usr/local/lib/python3.8/site-packages/ray/exceptions.py", line 10 in <module>
  File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 843 in exec_module
  File "<frozen importlib._bootstrap>", line 671 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 975 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 991 in _find_and_load
  File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 1174 in exec_module
  File "<frozen importlib._bootstrap>", line 671 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 975 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 991 in _find_and_load
  File "/usr/local/lib/python3.8/site-packages/ray/__init__.py", line 116 in <module>
  File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 843 in exec_module
  File "<frozen importlib._bootstrap>", line 671 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 975 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 991 in _find_and_load
  File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 961 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 991 in _find_and_load
  File "/opt/project/ludwig/progress_bar.py", line 7 in <module>
  File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 843 in exec_module
  File "<frozen importlib._bootstrap>", line 671 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 975 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 991 in _find_and_load
  File "/opt/project/ludwig/models/predictor.py", line 19 in <module>
  File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 843 in exec_module
  File "<frozen importlib._bootstrap>", line 671 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 975 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 991 in _find_and_load
  File "/opt/project/ludwig/api.py", line 68 in <module>
  File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 843 in exec_module
  File "<frozen importlib._bootstrap>", line 671 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 975 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 991 in _find_and_load
  File "/opt/project/tests/integration_tests/utils.py", line 37 in <module>
  File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 843 in exec_module
  File "<frozen importlib._bootstrap>", line 671 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 975 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 991 in _find_and_load
  File "mwe_sigsegv_import.py", line 5 in <module>
Segmentation fault
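
The import-order comparison above can be automated by running each ordering in a fresh interpreter and inspecting the exit status. This is a minimal sketch, not part of the original report: the helper names run_snippet and describe are hypothetical, and it assumes a POSIX system, where a negative return code from subprocess means the child was killed by that signal.

```python
import signal
import subprocess
import sys


def run_snippet(code):
    """Run `code` in a fresh Python interpreter and return its exit status."""
    result = subprocess.run(
        [sys.executable, "-c", code], capture_output=True, text=True
    )
    return result.returncode


def describe(returncode):
    """Turn an exit status into a short human-readable description."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative status means the child was killed by a signal.
        try:
            return "killed by " + signal.Signals(-returncode).name
        except ValueError:
            return "killed by signal %d" % -returncode
    return "exited with status %d" % returncode


# The two import statements from the report, tried in both orders.
IMPORT_A = "from ludwig.schema.model_config import ModelConfig"
IMPORT_B = "from tests.integration_tests.utils import number_feature"

if __name__ == "__main__":
    for first, second in [(IMPORT_A, IMPORT_B), (IMPORT_B, IMPORT_A)]:
        status = run_snippet(first + "\n" + second)
        print(describe(status), "<-", first, "| then |", second)
```

Running each ordering in its own process keeps a crash in one ordering from taking down the checker itself, so both results are reported in a single run.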
@jimthompson5802
Collaborator Author

FWIW...after looking at the fault handler stack trace, I determined that the SEGV occurs in the import processing for ray, specifically when ray imports google.protobuf.

Trimming down the prior example code to this:

import faulthandler
faulthandler.enable()

from ludwig.schema.model_config import ModelConfig
from google.protobuf.pyext import _message

print("all done")

The resulting SEGV fault trace is:

Fatal Python error: Segmentation fault

Current thread 0x00007f8a29c9b740 (most recent call first):
  File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 1166 in create_module
  File "<frozen importlib._bootstrap>", line 556 in module_from_spec
  File "<frozen importlib._bootstrap>", line 657 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 975 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 991 in _find_and_load
  File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1042 in _handle_fromlist
  File "/opt/project/sandbox/vision_models/augmentation/schema_testing/mwe_sigsegv_import.py", line 5 in <module>

As noted earlier, if the order of the from ... import ... statements is reversed, the SEGV fault DOES NOT occur.
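
Since the fault depends on what has already been imported, one possible diagnostic (not performed in this thread) is to list the compiled extension modules that are loaded before the protobuf import runs. The sketch below is only an aid for narrowing things down. The helper name loaded_extension_modules is hypothetical, and the idea that a previously loaded native module is involved is an assumption, not something the trace confirms.

```python
import importlib.machinery
import sys


def loaded_extension_modules():
    """Return sorted names of modules in sys.modules backed by a compiled file."""
    suffixes = tuple(importlib.machinery.EXTENSION_SUFFIXES)
    names = []
    for name, module in list(sys.modules.items()):
        path = getattr(module, "__file__", None)
        if path and path.endswith(suffixes):
            names.append(name)
    return sorted(names)


if __name__ == "__main__":
    # In the failing script this would be called after the ModelConfig import
    # and before the google.protobuf import.
    for name in loaded_extension_modules():
        print(name)
```

Comparing this list between the failing and the working import order would show which native modules are present in one case but not the other.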

@jimthompson5802
Collaborator Author

Searching the google.protobuf issues shows, as of this posting, 4 open issues related to SIGSEGV on Python import.

One workaround mentioned in a related GH issue, export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python, DID NOT RESOLVE the SIGSEGV.
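
When trying this workaround, it can help to confirm that the setting actually took effect, because the variable has to be in the environment before google.protobuf is imported for the first time. A minimal sketch follows. The helper name active_protobuf_backend is hypothetical, and api_implementation is an internal protobuf module, so treat the check as best-effort rather than a supported API.

```python
import os

# Must be set before google.protobuf is first imported in this process.
os.environ.setdefault("PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION", "python")


def active_protobuf_backend():
    """Return the protobuf backend name, or None if protobuf is not installed."""
    try:
        from google.protobuf.internal import api_implementation
    except ImportError:
        return None
    return api_implementation.Type()


if __name__ == "__main__":
    print("protobuf backend:", active_protobuf_backend())
```

If the printed backend is not the pure-Python one, the variable was set too late or overridden, and the workaround was never really exercised.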

@jimthompson5802
Collaborator Author

Another layer of the onion...I noticed that my development environment was running ray 2.1.0. I rebuilt the development environment to use ray 2.0.0, which matches the current CI test environment.

With ray 2.0.0 the segmentation fault NO longer occurs in the sample code provided with this issue.
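
Because the outcome depends on the installed ray version, it is useful to record package versions without importing the packages, since the import itself is what crashes. This is a minimal sketch using importlib.metadata from the standard library (available in Python 3.8). The helper name installed_versions and the package list are illustrative assumptions.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(["ray", "protobuf", "ludwig"]).items():
        print(name, version or "not installed")
```

Running this in both the development and CI environments makes version differences such as ray 2.1.0 versus 2.0.0 visible at a glance.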

@jimthompson5802
Collaborator Author

Closing. For awareness.
