LD_PRELOAD seg fault #86

Open
aulwes opened this issue Dec 19, 2023 · 9 comments
Labels
bug Something isn't working

Comments

@aulwes

aulwes commented Dec 19, 2023

I followed the instructions for fixing the LD_PRELOAD problem by building malt with -DLIBUNWIND_PREFIX=/path/to/libunwind, but I still get a segfault, and it occurs after the app finishes running. If I add the malt runtime option '-s libunwind', I get the LD_PRELOAD segfault immediately. Is there something else I can try?

@aulwes
Author

aulwes commented Dec 21, 2023

Some more information: I modified one of the tests, TestSimpleStackTracker, by parallelizing it using MPI_Init/MPI_Finalize. I then rebuilt Malt using mpicc/mpic++ and Cray MPICH version 8.1.25. I tested this on one of our Cray clusters, which uses the Slurm resource manager. I ran 'srun -n 1 /path/to/malt --mpi ./src/lib/tests/TestSimpleStackTracker' and got this error:

TestSimpleStackTracker: /users/rta/workspace/malt/src/lib/common/SimpleAllocator.cpp:179: void MALT::SimpleAllocator::free(void *): Assertion `unusedMemory <= totalMemory' failed.
/usr/projects/perfeng/utils/malt/ro/rta/bin/malt: line 458: 134989 Aborted (core dumped) LD_PRELOAD="${MPI_WRAPPER_DIR}/libmaltmpi.so:${MALT_LIB}:${LD_PRELOAD}" "$@"

@svalat
Member

svalat commented Dec 21, 2023

Hello, thanks for reporting the issue.

To help with debugging, I would ask for two things:

  1. Can you extract the values of unusedMemory and totalMemory, to see whether one of them is 0 or otherwise clearly wrong?
  2. For the segfault, can you get a core dump to find out where it occurs, at least with the symbol name or, better, the source line?
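
One way to capture the two values asked for in item 1 is to print them just before the assertion fires. The sketch below is only an illustration: `reportAccounting` is a hypothetical helper, not part of MALT, and it assumes both counters are `size_t` values visible at the point of the failing assertion. It formats into a stack buffer and uses `write(2)` so that the reporting itself is unlikely to call back into the allocator.

```cpp
#include <cassert>   // for the assert at the call site
#include <cstddef>
#include <cstdio>
#include <unistd.h>

// Hypothetical helper (not part of MALT): returns whether the invariant
// `unusedMemory <= totalMemory` holds, and prints both values when it does not.
inline bool reportAccounting(std::size_t unusedMemory, std::size_t totalMemory)
{
    if (unusedMemory <= totalMemory)
        return true;

    // Format into a fixed stack buffer to avoid heap allocation here.
    char buffer[128];
    int len = std::snprintf(buffer, sizeof(buffer),
                            "unusedMemory=%zu totalMemory=%zu\n",
                            unusedMemory, totalMemory);
    if (len > 0) {
        std::size_t count = static_cast<std::size_t>(len);
        if (count >= sizeof(buffer))
            count = sizeof(buffer) - 1;
        // Write directly to stderr; the result is deliberately ignored.
        ssize_t ignored = write(STDERR_FILENO, buffer, count);
        (void)ignored;
    }
    return false;
}
```

At the call site this could wrap the existing check, for example `assert(reportAccounting(unusedMemory, totalMemory));`, so the values appear on stderr right before the abort.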

@aulwes
Author

aulwes commented Dec 21, 2023 via email

@aulwes
Author

aulwes commented Dec 21, 2023 via email

svalat added the bug label on Feb 16, 2024

@aulwes
Author

aulwes commented Apr 4, 2024

Hi, I'm continuing to get this LD_PRELOAD segfault, but not on all apps that I run with malt. Is there anything else I can try?

@aulwes
Author

aulwes commented Apr 4, 2024

I think I've found the problem. One of the apps we're profiling is built with Intel compilers. When I build malt with icx/icpx from the Intel 2021 compilers, I don't get the segfault. For the other apps, I used gcc 10.

@svalat
Member

svalat commented Apr 10, 2024

Hi, sorry, I haven't had time to investigate yet.

But as you pointed out, there could be a problem due to a mix of C++ libraries (Intel / GNU).

Have you also tried compiling MALT with icpc, so that everything (malt and the app) is built with Intel?
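
To check which toolchain built a given piece of code, a small helper based on the compilers' predefined macros can be compiled into both the tool and the app. This is only a sketch: `compilerName` and `sameCompiler` are hypothetical names, not MALT functions, and the result only identifies the compiler of the translation unit it is built in, not the C++ runtime library loaded at run time. The Intel macros are tested first because the Intel compilers also define `__GNUC__` (and icx/icpx define `__clang__`).

```cpp
#include <cstring>

// Hypothetical helper: name the compiler that built this translation unit.
inline const char* compilerName()
{
#if defined(__INTEL_LLVM_COMPILER)
    return "Intel oneAPI (icx/icpx)";
#elif defined(__INTEL_COMPILER)
    return "Intel classic (icc/icpc)";
#elif defined(__clang__)
    return "Clang";
#elif defined(__GNUC__)
    return "GCC";
#else
    return "unknown";
#endif
}

// Hypothetical helper: compare two recorded compiler names,
// e.g. one reported by the profiler build and one by the app build.
inline bool sameCompiler(const char* a, const char* b)
{
    return std::strcmp(a, b) == 0;
}
```

Printing `compilerName()` from each build makes a mismatch such as gcc 10 for the tool and icx/icpx for the app visible before running under LD_PRELOAD.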

@aulwes
Author

aulwes commented Apr 10, 2024

Yes, I compiled a version using Intel icpx/icx and that worked. Thank you!

@svalat
Member

svalat commented Apr 10, 2024

Hmm, thanks very much for reporting this; that's good to know.

Until now I had the impression that there was no issue in that case, but apparently there is.
