My error has been solved: it was caused by the system's memory constraints. The training process is very memory-intensive and uses 32 GB, which far exceeds the allocated 10 GB.
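For anyone who hits a similar crash, here is a minimal sketch for checking whether a Python process is running into a memory cap. It uses only the standard `resource` module on Linux. The helper names `format_limit` and `memory_report` are my own, and this only inspects the per-process address-space limit (`RLIMIT_AS`); I am assuming nothing about how the 10 GB allocation above was enforced, and a limit set by a job scheduler or cgroup would not show up here.

```python
import resource


def format_limit(value):
    """Render an rlimit value in GiB, or 'unlimited' when no cap is set."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value / 2**30:.1f} GiB"


def memory_report():
    """Return the address-space limits and peak resident memory of this process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    # On Linux, ru_maxrss is reported in kilobytes.
    peak_kib = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return {
        "soft_limit": format_limit(soft),
        "hard_limit": format_limit(hard),
        "peak_rss_gib": peak_kib / 2**20,
    }


if __name__ == "__main__":
    print(memory_report())
```

Calling `memory_report()` periodically inside the training loop should show whether peak usage is approaching the limit before the process crashes.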
Hi,
My laboratory updated glibc on our workstation, and since then my PyTorch runs have been failing intermittently. The error seems to be caused by a customized use of glibc that probably depends on a specific version. Has anyone encountered a similar problem?
Thank you!
-----------------------------------update record ---------------
Jul 19 03:19:34 Updated: glibc-common-2.17-157.el7_3.5.x86_64
Jul 19 03:19:35 Updated: glibc-2.17-157.el7_3.5.x86_64
Jul 19 03:19:35 Updated: glibc-headers-2.17-157.el7_3.5.x86_64
Jul 19 03:19:35 Updated: glibc-devel-2.17-157.el7_3.5.x86_64
Jul 19 03:19:36 Updated: glibc-2.17-157.el7_3.5.i686
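
To confirm which glibc the Python interpreter actually loads after an update like the one above, something along these lines could be used. This is only a sketch: the function name `glibc_version` is hypothetical, and it assumes a Linux system where the C library is available as `libc.so.6`. `gnu_get_libc_version` is a glibc-specific function, so the helper returns `None` on systems without it.

```python
import ctypes


def glibc_version():
    """Return the glibc version string of the loaded C library, or None."""
    try:
        libc = ctypes.CDLL("libc.so.6")
        get_version = libc.gnu_get_libc_version
    except (OSError, AttributeError):
        # Not a glibc system, or the library could not be loaded.
        return None
    get_version.restype = ctypes.c_char_p
    return get_version().decode()


if __name__ == "__main__":
    print("glibc version:", glibc_version())
```
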
------------ldd info----------
ldd /vulcan/scratch/jackwang/anaconda3/lib/python3.6/site-packages/torch/lib/libcudnn.so
------------Below is the output:-------
*** Error in `python': double free or corruption (fasttop): 0x00007fe508103130 ***
======= Backtrace: =========
/usr/lib64/libc.so.6(+0x7c503)[0x7fe6a59e1503]
/usr/lib64/nvidia/libcuda.so.1(+0x1aa1df)[0x7fe674bf01df]
/usr/lib64/nvidia/libcuda.so.1(+0xd051b)[0x7fe674b1651b]
/usr/lib64/nvidia/libcuda.so.1(cuStreamCreate+0x5b)[0x7fe674c3629b]
/vulcan/scratch/jackwang/anaconda3/envs/magma/lib/python3.6/site-packages/torch/lib/libcudnn.so.6(+0x864236)[0x7fe690180236]
/vulcan/scratch/jackwang/anaconda3/envs/magma/lib/python3.6/site-packages/torch/lib/libcudnn.so.6(+0x8966a4)[0x7fe6901b26a4]
/vulcan/scratch/jackwang/anaconda3/envs/magma/lib/python3.6/site-packages/torch/lib/libcudnn.so.6(+0x7839bf)[0x7fe69009f9bf]
/vulcan/scratch/jackwang/anaconda3/envs/magma/lib/python3.6/site-packages/torch/lib/libcudnn.so.6(cudnnRNNBackwardData+0xf42)[0x7fe69009dd12]
/vulcan/scratch/jackwang/anaconda3/envs/magma/lib/python3.6/lib-dynload/_ctypes.cpython-36m-x86_64-linux-gnu.so(ffi_call_unix64+0x4c)[0x7fe69d1cf5b0]
/vulcan/scratch/jackwang/anaconda3/envs/magma/lib/python3.6/lib-dynload/_ctypes.cpython-36m-x86_64-linux-gnu.so(ffi_call+0x1f5)[0x7fe69d1ced55]
/vulcan/scratch/jackwang/anaconda3/envs/magma/lib/python3.6/lib-dynload/_ctypes.cpython-36m-x86_64-linux-gnu.so(_ctypes_callproc+0x3dc)[0x7fe69d1c689c]
/vulcan/scratch/jackwang/anaconda3/envs/magma/lib/python3.6/lib-dynload/_ctypes.cpython-36m-x86_64-linux-gnu.so(+0x9df3)[0x7fe69d1bedf3]
/vulcan/scratch/jackwang/anaconda3/envs/magma/bin/../lib/libpython3.6m.so.1.0(_PyObject_FastCallDict+0x9e)[0x7fe6a68bec1e]
/vulcan/scratch/jackwang/anaconda3/envs/magma/bin/../lib/libpython3.6m.so.1.0(+0x14895b)[0x7fe6a699b95b]
/vulcan/scratch/jackwang/anaconda3/envs/magma/bin/../lib/libpython3.6m.so.1.0(_PyEval_EvalFrameDefault+0x2c40)[0x7fe6a699ed40]
/vulcan/scratch/jackwang/anaconda3/envs/magma/bin/../lib/libpython3.6m.so.1.0(+0x146514)[0x7fe6a6999514]
/vulcan/scratch/jackwang/anaconda3/envs/magma/bin/../lib/libpython3.6m.so.1.0(+0x148c88)[0x7fe6a699bc88]
/vulcan/scratch/jackwang/anaconda3/envs/magma/bin/../lib/libpython3.6m.so.1.0(_PyEval_EvalFrameDefault+0x2c40)[0x7fe6a699ed40]
/vulcan/scratch/jackwang/anaconda3/envs/magma/bin/../lib/libpython3.6m.so.1.0(+0x146514)[0x7fe6a6999514]
/vulcan/scratch/jackwang/anaconda3/envs/magma/bin/../lib/libpython3.6m.so.1.0(_PyFunction_FastCallDict+0x285)[0x7fe6a699a515]
/vulcan/scratch/jackwang/anaconda3/envs/magma/bin/../lib/libpython3.6m.so.1.0(_PyObject_FastCallDict+0x166)[0x7fe6a68bece6]
/vulcan/scratch/jackwang/anaconda3/envs/magma/bin/../lib/libpython3.6m.so.1.0(_PyObject_Call_Prepend+0xcc)[0x7fe6a68bef3c]
/vulcan/scratch/jackwang/anaconda3/envs/magma/bin/../lib/libpython3.6m.so.1.0(PyObject_Call+0x56)[0x7fe6a68befd6]
/vulcan/scratch/jackwang/anaconda3/envs/magma/bin/../lib/libpython3.6m.so.1.0(_PyEval_EvalFrameDefault+0x3ec9)[0x7fe6a699ffc9]
/vulcan/scratch/jackwang/anaconda3/envs/magma/bin/../lib/libpython3.6m.so.1.0(+0x147100)[0x7fe6a699a100]
/vulcan/scratch/jackwang/anaconda3/envs/magma/bin/../lib/libpython3.6m.so.1.0(_PyFunction_FastCallDict+0x10c)[0x7fe6a699a39c]
/vulcan/scratch/jackwang/anaconda3/envs/magma/bin/../lib/libpython3.6m.so.1.0(_PyObject_FastCallDict+0x166)[0x7fe6a68bece6]
/vulcan/scratch/jackwang/anaconda3/envs/magma/bin/../lib/libpython3.6m.so.1.0(_PyObject_Call_Prepend+0xcc)[0x7fe6a68bef3c]
/vulcan/scratch/jackwang/anaconda3/envs/magma/bin/../lib/libpython3.6m.so.1.0(PyObject_Call+0x56)[0x7fe6a68befd6]
/vulcan/scratch/jackwang/anaconda3/envs/magma/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so(_Z23THPFunction_do_backwardP11THPFunctionP7_object+0x133)[0x7fe695b0d533]
/vulcan/scratch/jackwang/anaconda3/envs/magma/bin/../lib/libpython3.6m.so.1.0(_PyCFunction_FastCallDict+0x229)[0x7fe6a6916429]
/vulcan/scratch/jackwang/anaconda3/envs/magma/bin/../lib/libpython3.6m.so.1.0(+0x148b8c)[0x7fe6a699bb8c]
/vulcan/scratch/jackwang/anaconda3/envs/magma/bin/../lib/libpython3.6m.so.1.0(_PyEval_EvalFrameDefault+0x2c40)[0x7fe6a699ed40]
/vulcan/scratch/jackwang/anaconda3/envs/magma/bin/../lib/libpython3.6m.so.1.0(+0x147100)[0x7fe6a699a100]
/vulcan/scratch/jackwang/anaconda3/envs/magma/bin/../lib/libpython3.6m.so.1.0(_PyFunction_FastCallDict+0x10c)[0x7fe6a699a39c]
/vulcan/scratch/jackwang/anaconda3/envs/magma/bin/../lib/libpython3.6m.so.1.0(_PyObject_FastCallDict+0x166)[0x7fe6a68bece6]
/vulcan/scratch/jackwang/anaconda3/envs/magma/bin/../lib/libpython3.6m.so.1.0(_PyObject_Call_Prepend+0xcc)[0x7fe6a68bef3c]
/vulcan/scratch/jackwang/anaconda3/envs/magma/bin/../lib/libpython3.6m.so.1.0(PyObject_Call+0x56)[0x7fe6a68befd6]
/vulcan/scratch/jackwang/anaconda3/envs/magma/bin/../lib/libpython3.6m.so.1.0(+0x6c0bd)[0x7fe6a68bf0bd]
/vulcan/scratch/jackwang/anaconda3/envs/magma/bin/../lib/libpython3.6m.so.1.0(+0x6c16d)[0x7fe6a68bf16d]
/vulcan/scratch/jackwang/anaconda3/envs/magma/bin/../lib/libpython3.6m.so.1.0(PyObject_CallMethod+0xe6)[0x7fe6a68c26e6]
/vulcan/scratch/jackwang/anaconda3/envs/magma/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so(_ZN5torch8autograd10PyFunction5applyERKSt6vectorISt10shared_ptrINS0_8VariableEESaIS5_EE+0xe1)[0x7fe695b0dfe1]
/vulcan/scratch/jackwang/anaconda3/envs/magma/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so(_ZN5torch8autograd6Engine17evaluate_functionERNS0_12FunctionTaskE+0x248)[0x7fe695b03be8]
/vulcan/scratch/jackwang/anaconda3/envs/magma/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so(_ZN5torch8autograd6Engine11thread_mainESt10shared_ptrINS0_10ReadyQueueEE+0x3b)[0x7fe695b056eb]
/vulcan/scratch/jackwang/anaconda3/envs/magma/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so(_ZN12PythonEngine11thread_mainESt10shared_ptrIN5torch8autograd10ReadyQueueEE+0x54)[0x7fe695b16284]
/vulcan/scratch/jackwang/anaconda3/envs/magma/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so(_ZNSt6thread5_ImplISt12_Bind_simpleIFSt7_Mem_fnIMN5torch8autograd6EngineEFvSt10shared_ptrINS4_10ReadyQueueEEEEPS5_S8_EEE6_M_runEv+0x48)[0x7fe695b07f18]
/vulcan/scratch/jackwang/anaconda3/envs/magma/lib/python3.6/site-packages/torch/lib/../../../../libstdc++.so.6(+0xb7260)[0x7fe67f0eb260]
/usr/lib64/libpthread.so.0(+0x7dc5)[0x7fe6a663edc5]
/usr/lib64/libc.so.6(clone+0x6d)[0x7fe6a5a5c76d]
======= Memory map: ========