LD_PRELOAD seg fault #86
Some more information. I modified one of the tests, TestSimpleStackTracker, to run under MPI by adding MPI_Init/MPI_Finalize. I then rebuilt MALT with mpicc/mpic++ and Cray MPICH 8.1.25, and tested it on one of our Cray clusters, which uses the Slurm resource manager. I ran:
srun -n 1 /path/to/malt --mpi ./src/lib/tests/TestSimpleStackTracker
and got this error:
TestSimpleStackTracker: /users/rta/workspace/malt/src/lib/common/SimpleAllocator.cpp:179: void MALT::SimpleAllocator::free(void *): Assertion `unusedMemory <= totalMemory' failed.
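Hello, thanks for reporting the issue. I would ask two things to help debugging:
1. Can you extract the values of unusedMemory and totalMemory to see whether one of them is 0 or a totally wrong value?
2. For the segfault, can you get a core dump to see where it occurs, at least with the symbol name or, better, the source line?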
For 1., I see
unusedMemory = 262328, totalMemory = 262144
Let me work on answering 2.
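For reference, the two reported values can be checked against the assertion directly. The sketch below is only arithmetic on the numbers above: `PoolAccounting` and `overshoot` are hypothetical names for illustration and say nothing about how MALT's SimpleAllocator is actually laid out.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical bookkeeping pair, NOT MALT's real SimpleAllocator layout.
struct PoolAccounting {
    std::size_t totalMemory;   // bytes the allocator believes it owns
    std::size_t unusedMemory;  // bytes it believes are currently free
};

// Bytes by which unusedMemory exceeds totalMemory.
// Returns 0 when the invariant `unusedMemory <= totalMemory` holds.
constexpr std::size_t overshoot(const PoolAccounting& a) {
    return a.unusedMemory > a.totalMemory ? a.unusedMemory - a.totalMemory : 0;
}
```

With the reported values, totalMemory is exactly 256 KiB (256 * 1024 = 262144) and unusedMemory is 184 bytes above it. So neither counter is zero or wildly corrupted; the free counter is slightly too large. One possible reading, which is only a guess, is that a block was handed to free() without having been counted when it was allocated.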
On Dec 21, 2023, at 12:16 PM, Sébastien Valat wrote:
Hello, thanks for reporting the issue.
I would ask two things to help debugging:
1. Can you extract the values of unusedMemory and totalMemory to see whether one of them is 0 or a totally wrong value?
2. For the segfault, can you get a core dump to see where it occurs, at least with the symbol name or, better, the source line?
Here is the backtrace I get from a core dump:
#0 0x000014ceba351cbb in raise () from /lib64/libc.so.6
#1 0x000014ceba353355 in abort () from /lib64/libc.so.6
#2 0x000014ceba349cba in __assert_fail_base () from /lib64/libc.so.6
#3 0x000014ceba349d42 in __assert_fail () from /lib64/libc.so.6
#4 0x000014cebc2bad75 in MALT::SimpleAllocator::free(void*) ()
from /usr/projects/perfeng/utils/malt/ro/rta/lib64/libmalt.so
#5 0x000014cebc2b4cf0 in std::_Rb_tree<MALT::StackSTLHashMap<MALT::CallStackInfo>::Key, std::pair<MALT::StackSTLHashMap<MALT::CallStackInfo>::Key const, MALT::CallStackInfo>, std::_Select1st<std::pair<MALT::StackSTLHashMap<MALT::CallStackInfo>::Key const, MALT::CallStackInfo> >, std::less<MALT::StackSTLHashMap<MALT::CallStackInfo>::Key>, MALT::STLInternalAllocator<std::pair<MALT::StackSTLHashMap<MALT::CallStackInfo>::Key const, MALT::CallStackInfo> > >::_M_erase(std::_Rb_tree_node<std::pair<MALT::StackSTLHashMap<MALT::CallStackInfo>::Key const, MALT::CallStackInfo> >*) () from /usr/projects/perfeng/utils/malt/ro/rta/lib64/libmalt.so
#6 0x000014cebc2b4ce1 in std::_Rb_tree<MALT::StackSTLHashMap<MALT::CallStackInfo>::Key, std::pair<MALT::StackSTLHashMap<MALT::CallStackInfo>::Key const, MALT::CallStackInfo>, std::_Select1st<std::pair<MALT::StackSTLHashMap<MALT::CallStackInfo>::Key const, MALT::CallStackInfo> >, std::less<MALT::StackSTLHashMap<MALT::CallStackInfo>::Key>, MALT::STLInternalAllocator<std::pair<MALT::StackSTLHashMap<MALT::CallStackInfo>::Key const, MALT::CallStackInfo> > >::_M_erase(std::_Rb_tree_node<std::pair<MALT::StackSTLHashMap<MALT::CallStackInfo>::Key const, MALT::CallStackInfo> >*) () from /usr/projects/perfeng/utils/malt/ro/rta/lib64/libmalt.so
#7 0x000014cebc2b4ce1 in std::_Rb_tree<MALT::StackSTLHashMap<MALT::CallStackInfo>::Key, std::pair<MALT::StackSTLHashMap<MALT::CallStackInfo>::Key const, MALT::CallStackInfo>, std::_Select1st<std::pair<MALT::StackSTLHashMap<MALT::CallStackInfo>::Key const, MALT::CallStackInfo> >, std::less<MALT::StackSTLHashMap<MALT::CallStackInfo>::Key>, MALT::STLInternalAllocator<std::pair<MALT::StackSTLHashMap<MALT::CallStackInfo>::Key const, MALT::CallStackInfo> > >::_M_erase(std::_Rb_tree_node<std::pair<MALT::StackSTLHashMap<MALT::CallStackInfo>::Key const, MALT::CallStackInfo> >*) () from /usr/projects/perfeng/utils/malt/ro/rta/lib64/libmalt.so
#8 0x000014cebc2b4ce1 in std::_Rb_tree<MALT::StackSTLHashMap<MALT::CallStackInfo>::Key, std::pair<MALT::StackSTLHashMap<MALT::CallStackInfo>::Key const, MALT::CallStackInfo>, std::_Select1st<std::pair<MALT::StackSTLHashMap<MALT::CallStackInfo>::Key const, MALT::CallStackInfo> >, std::less<MALT::StackSTLHashMap<MALT::CallStackInfo>::Key>, MALT::STLInternalAllocator<std::pair<MALT::StackSTLHashMap<MALT::CallStackInfo>::Key const, MALT::CallStackInfo> > >::_M_erase(std::_Rb_tree_node<std::pair<MALT::StackSTLHashMap<MALT::CallStackInfo>::Key const, MALT::CallStackInfo> >*) () from /usr/projects/perfeng/utils/malt/ro/rta/lib64/libmalt.so
#9 0x000014cebc2b4ce1 in std::_Rb_tree<MALT::StackSTLHashMap<MALT::CallStackInfo>::Key, std::pair<MALT::StackSTLHashMap<MALT::CallStackInfo>::Key const, MALT::CallStackInfo>, std::_Select1st<std::pair<MALT::StackSTLHashMap<MALT::CallStackInfo>::Key const, MALT::CallStackInfo> >, std::less<MALT::StackSTLHashMap<MALT::CallStackInfo>::Key>, MALT::STLInternalAllocator<std::pair<MALT::StackSTLHashMap<MALT::CallStackInfo>::Key const, MALT::CallStackInfo> > >::_M_erase(std::_Rb_tree_node<std::pair<MALT::StackSTLHashMap<MALT::CallStackInfo>::Key const, MALT::CallStackInfo> >*) () from /usr/projects/perfeng/utils/malt/ro/rta/lib64/libmalt.so
#10 0x000014cebc2c0972 in AllocWrapperGlobal::onExit() ()
from /usr/projects/perfeng/utils/malt/ro/rta/lib64/libmalt.so
#11 0x000014cebc5c4743 in _dl_fini () from /lib64/ld-linux-x86-64.so.2
#12 0x000014ceba354ae9 in __run_exit_handlers () from /lib64/libc.so.6
#13 0x000014ceba354c7a in exit () from /lib64/libc.so.6
#14 0x000014ceba33c2a4 in __libc_start_main () from /lib64/libc.so.6
#15 0x00000000004175fa in _start () at ../sysdeps/x86_64/start.S:120
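Reading the backtrace: frames #5 to #9 are the recursive `_M_erase` of a `std::map` whose allocator is `MALT::STLInternalAllocator`, called from `AllocWrapperGlobal::onExit()` while `_dl_fini` runs exit handlers. The failing `free` therefore happens while a map is being torn down at process exit. The sketch below shows that mechanism in general terms: destroying a map returns every node through the allocator's `deallocate`. `CountingAllocator`, `counters` and `mapTeardownBalances` are hypothetical names, not MALT code.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <new>
#include <utility>

// Hypothetical global counters standing in for an internal allocator's state.
struct Counters {
    std::size_t allocated = 0;  // bytes handed out
    std::size_t freed = 0;      // bytes given back
};
inline Counters& counters() { static Counters c; return c; }

// Minimal STL-compatible allocator that records every request.
template <class T>
struct CountingAllocator {
    using value_type = T;
    CountingAllocator() = default;
    template <class U> CountingAllocator(const CountingAllocator<U>&) noexcept {}
    T* allocate(std::size_t n) {
        counters().allocated += n * sizeof(T);
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t n) noexcept {
        counters().freed += n * sizeof(T);
        ::operator delete(p);
    }
};
template <class T, class U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) { return true; }
template <class T, class U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) { return false; }

using TrackedMap = std::map<int, long, std::less<int>,
                            CountingAllocator<std::pair<const int, long>>>;

// Fill a map, destroy it, and report whether allocated and freed bytes match.
inline bool mapTeardownBalances(int entries) {
    counters() = Counters{};
    {
        TrackedMap m;
        for (int i = 0; i < entries; ++i) m[i] = i;
    }  // the map destructor releases every tree node via deallocate() here
    return counters().allocated > 0 && counters().allocated == counters().freed;
}
```

When allocation and deallocation go through the same allocator, the two counters balance. If nodes were allocated through one path and released through another, the counters would drift apart, which is the kind of mismatch the `unusedMemory <= totalMemory` assertion would catch. Whether that is what happens here is not established by the backtrace alone.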
Hi, I'm continuing to get this LD_PRELOAD segfault, but not on all apps that I run with malt. Is there anything else I can try?
I think I've found the problem. One of the apps we're profiling is built with Intel compilers. When I built malt with icx/icpx from the Intel 2021 compilers, I no longer got the segfault. For the other apps, I used gcc 10.
Hi, sorry, I haven't had time to investigate yet. But as you pointed out, there could be a problem caused by mixing C++ libraries (Intel / GNU). Have you also tried compiling MALT with icpc so that everything (MALT and the app) is built with Intel?
Yes, I compiled a version using Intel icpx/icx and that worked. Thank you!
Hmm, thanks very much for reporting this; that's good to know. Until now I had the impression that this case caused no issue, but apparently it does.
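When checking for this kind of toolchain mix, it can help to confirm which compiler family and C++ standard library a given build actually used. The helper below is a generic sketch based on predefined compiler macros; the function names are made up for illustration and it is not part of MALT. It does not explain the crash, it only reports how the translation unit was built.

```cpp
#include <cassert>
#include <string>

// Compiler family that built this translation unit, from predefined macros.
// Order matters: Intel compilers also define __GNUC__ (and icx/icpx define __clang__).
inline std::string compilerFamily() {
#if defined(__INTEL_LLVM_COMPILER)
    return "intel-llvm";     // icx / icpx
#elif defined(__INTEL_COMPILER)
    return "intel-classic";  // icc / icpc
#elif defined(__clang__)
    return "clang";
#elif defined(__GNUC__)
    return "gcc";
#else
    return "unknown";
#endif
}

// C++ standard library in use; the macros are set once any standard header is included.
inline std::string standardLibrary() {
#if defined(__GLIBCXX__)
    return "libstdc++";
#elif defined(_LIBCPP_VERSION)
    return "libc++";
#else
    return "unknown";
#endif
}
```

Compiling this once with the toolchain used for MALT and once with the toolchain used for the application, then printing both strings, shows whether the two builds differ.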
I followed the instructions for fixing the LD_PRELOAD problem by building malt with -DLIBUNWIND_PREFIX=/path/to/libunwind. But I still get a segfault, and it occurs after the app has finished running. If I add the malt runtime option '-s libunwind', I get the LD_PRELOAD segfault immediately. Is there something else I can try?