Can not import pytorch with other library (Segmentation Fault) #26405

Closed
ilham-bintang opened this issue Sep 18, 2019 · 1 comment
Labels: high priority; module: binaries (anything related to official binaries that we release to users); module: crash (problem manifests as a hard crash, as opposed to a RuntimeError); triage review

ilham-bintang commented Sep 18, 2019

I am not sure whether this is a PyTorch issue or an NGT issue. I have also reported it to NGT: yahoojapan/NGT#37 (comment)

🐛 Bug

  • I can import PyTorch properly on its own.
    [Screenshot: Screen Shot 2019-09-18 at 21 37 05]

  • After importing another library (ngtpy, flair, or others), the process crashes with a segmentation fault (core dumped).
    [Screenshot: Screen Shot 2019-09-18 at 21 37 57]

  • If I swap the import order (ngtpy first), it raises free(): invalid pointer instead.
    [Screenshot: Screen Shot 2019-09-18 at 21 39 22]

I do not know what is causing this. I am using NGT (https://github.com/yahoojapan/NGT).

To Reproduce

Steps to reproduce the behavior:

  1. Import torch, then the other library (e.g. ngtpy) -> segmentation fault
  2. Import the other library, then torch -> free(): invalid pointer
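
The two import orders can be checked in isolation with a small script. This is a minimal sketch, not part of the original report: the helper names `run_import_order` and `describe` are hypothetical, and it assumes a POSIX system, where a negative exit code from `subprocess` means the child was killed by a signal. Each order runs in a fresh interpreter so that a crash in one does not take down the other.

```python
import signal
import subprocess
import sys


def run_import_order(modules):
    """Import `modules` in the given order in a fresh interpreter.

    Returns the child's exit code. On POSIX, a negative value -N means
    the child was terminated by signal N (e.g. -11 for SIGSEGV).
    """
    code = "; ".join(f"import {name}" for name in modules)
    result = subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    return result.returncode


def describe(returncode):
    """Turn an exit code into a short human-readable description."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exited with code {returncode}"


if __name__ == "__main__":
    # Module names taken from the report; substitute the libraries in question.
    for order in (["torch", "ngtpy"], ["ngtpy", "torch"]):
        print(" -> ".join(order), ":", describe(run_import_order(order)))
```

An exit code of 1 usually just means one of the modules is not installed in that interpreter, which is worth ruling out before looking at the crash itself.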

Expected behavior

Both libraries can be imported in either order without crashing.

Environment

PyTorch version: 1.2.0
Is debug build: No
CUDA used to build PyTorch: 10.0.130

OS: Ubuntu 18.04.3 LTS
GCC version: (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
CMake version: version 3.10.2

Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: 10.0.130
GPU models and configuration:
GPU 0: Tesla V100-DGXS-32GB
GPU 1: Tesla V100-DGXS-32GB
GPU 2: Tesla V100-DGXS-32GB
GPU 3: Tesla V100-DGXS-32GB

Nvidia driver version: 410.104
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.5.0

Versions of relevant libraries:
[pip3] numpy==1.17.2
[pip3] pytorch-transformers==1.2.0
[pip3] torch==1.2.0
[conda] blas                      1.0                         mkl
[conda] mkl                       2019.3                      199
[conda] mkl-service               1.1.2            py37he904b0f_5
[conda] mkl_fft                   1.0.10           py37ha843d7b_0
[conda] mkl_random                1.0.2            py37hd81dba3_0

Additional context

I inspected the crash with gdb. The backtrace output is:

#0  0x00007fff1eaf3e3e in pybind11::detail::make_new_python_type (rec=...) at /opt/python/cp36-cp36m/include/python3.6m/pybind11/detail/class.h:565
#1  0x00007fff1eaf938e in pybind11::detail::generic_type::initialize (this=this@entry=0x7fffffffc900, rec=...) at /opt/python/cp36-cp36m/include/python3.6m/pybind11/pybind11.h:902
#2  0x00007fff1eac1a97 in pybind11::class_<Index>::class_<> (name=0x7fff1eb61378 "Index", scope=..., this=0x7fffffffc900)
    at /opt/python/cp36-cp36m/include/python3.6m/pybind11/pybind11.h:1092
#3  pybind11_init_ngtpy (m=...) at src/ngtpy.cpp:436
#4  0x00007fff1eac34d0 in PyInit_ngtpy () at src/ngtpy.cpp:421
#5  0x00000000005e4268 in _PyImport_LoadDynamicModuleWithSpec ()
#6  0x00000000005e4522 in ?? ()
#7  0x000000000056246e in PyCFunction_Call ()
#8  0x00000000004fed26 in _PyEval_EvalFrameDefault ()
#9  0x00000000004f6128 in ?? ()
#10 0x00000000004f7d60 in ?? ()
#11 0x00000000004f876d in ?? ()
#12 0x00000000004f98c7 in _PyEval_EvalFrameDefault ()
#13 0x00000000004f7a28 in ?? ()
#14 0x00000000004f876d in ?? ()
#15 0x00000000004f98c7 in _PyEval_EvalFrameDefault ()
#16 0x00000000004f7a28 in ?? ()
#17 0x00000000004f876d in ?? ()
#18 0x00000000004f98c7 in _PyEval_EvalFrameDefault ()
#19 0x00000000004f7a28 in ?? ()
#20 0x00000000004f876d in ?? ()
#21 0x00000000004f98c7 in _PyEval_EvalFrameDefault ()
#22 0x00000000004f7a28 in ?? ()
#23 0x00000000004f876d in ?? ()
#24 0x00000000004f98c7 in _PyEval_EvalFrameDefault ()
#25 0x00000000004f4065 in _PyFunction_FastCallDict ()
#26 0x000000000057c8f1 in _PyObject_FastCallDict ()
#27 0x000000000057cc5e in _PyObject_CallMethodIdObjArgs ()
#28 0x00000000004cf5dd in PyImport_ImportModuleLevelObject ()
#29 0x00000000004fb864 in _PyEval_EvalFrameDefault ()
#30 0x00000000004f6128 in ?? ()
#31 0x00000000004f9023 in PyEval_EvalCode ()
#32 0x00000000006415b2 in ?? ()
#33 0x000000000064166a in PyRun_FileExFlags ()
#34 0x0000000000643730 in PyRun_SimpleFileExFlags ()
#35 0x000000000062b26e in Py_Main ()
#36 0x00000000004b4cb0 in main ()
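
A Python-level traceback can complement a native backtrace like the one above, since most of the frames here are unnamed interpreter internals. The sketch below uses the standard library's `faulthandler` module, which dumps the Python stack when the process receives a fatal signal such as SIGSEGV. The helper name `enable_crash_traceback` and the log file name are hypothetical choices for illustration.

```python
import faulthandler


def enable_crash_traceback(log_path):
    """Dump Python tracebacks to `log_path` on a fatal signal.

    The file object is returned and must stay open for as long as
    faulthandler is enabled, because the dump is written to its
    file descriptor at crash time.
    """
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file


# Intended usage (assumed, not from the original report):
#
#   log = enable_crash_traceback("import_crash.log")
#   import torch
#   import ngtpy   # if this segfaults, import_crash.log shows the Python stack
```

Running the interpreter with `python -X faulthandler` has the same effect without code changes, writing to stderr instead of a file.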

cc @ezyang @gchanan @zou3519

izdeby added the high priority, module: binaries, module: crash, and triage review labels on Sep 18, 2019

ilham-bintang (Author) commented

Hi @izdeby, thank you for the fast response.
I have already solved it, and this is not a PyTorch issue.

See yahoojapan/NGT#34: the error was caused by the NGT C++ module. I reinstalled NGT from source and found a permission issue.
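
Since the root cause turned out to be on the installation side, it can help to check where a module would be loaded from and whether that file is readable, without importing it (and so without triggering the crash). This is a rough sketch under that assumption: `locate_module` and `report` are hypothetical helper names, and the check only covers the module file itself, not any shared libraries it links against.

```python
import importlib.util
import os


def locate_module(name):
    """Return the file a top-level module would be loaded from, or None.

    find_spec() locates a top-level module without executing it, so a
    broken extension module does not crash the interpreter here.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


def report(name):
    """Summarize the install location and read permission of a module."""
    path = locate_module(name)
    if path is None:
        return f"{name}: not installed"
    # For built-in or frozen modules, origin is not a real file path
    # and os.access() simply reports False.
    readable = os.access(path, os.R_OK)
    return f"{name}: {path} (readable: {readable})"


if __name__ == "__main__":
    # Module names taken from the report.
    for module_name in ("torch", "ngtpy"):
        print(report(module_name))
```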
