Segmentation fault with lpf_collectives_init #12

Open
alberto-scolari opened this issue Aug 11, 2023 · 1 comment

Comments

@alberto-scolari
Collaborator

The ALP code here

https://github.com/Algebraic-Programming/ALP/blob/b50fe72e957ed4da10c1cd1c59d924260f051a8d/include/graphblas/bsp1d/exec.hpp#L184

segfaults immediately when sizeof( size_t ) is passed as max_byte_size to lpf_collectives_init; the stack trace is:

Program received signal SIGSEGV, Segmentation fault.
0x000014cd06466274 in lpf::MessageSort::addRegister(unsigned long, char*, unsigned long) () from /home/user/Projects/install/lpf/lib/liblpf_core_univ_mpimsg_Release.so
(gdb) bt
#0  0x000014cd06466274 in lpf::MessageSort::addRegister(unsigned long, char*, unsigned long) () from /home/user/Projects/install/lpf/lib/liblpf_core_univ_mpimsg_Release.so
#1  0x000014cd06445f8f in lpf::MessageQueue::addGlobalReg(void*, unsigned long) () from /home/user/Projects/install/lpf/lib/liblpf_core_univ_mpimsg_Release.so
#2  0x000014cd0645b136 in lpf_register_global () from /home/user/Projects/install/lpf/lib/liblpf_core_univ_mpimsg_Release.so
#3  0x000055c08771e2e3 in lpf_collectives_init ()
#4  0x000055c08771a0ad in _grb_exec_varin_spmd<output, true> (ctx=0x7ffca3145950, s=0, P=2, args=...) at /home/user/Projects/graphblas_fix_bsp1d_exec/include/graphblas/bsp1d/exec.hpp:180

The workaround so far is to set max_byte_size to 0, as in

https://github.com/Algebraic-Programming/ALP/blob/b50fe72e957ed4da10c1cd1c59d924260f051a8d/include/graphblas/bsp1d/exec.hpp#L67

I don't know whether this problem depends on a specific combination of function parameters or is simply a bug. Even in the first case, no error is raised and no error code is returned; the function segfaults during its own execution.

@anyzelman
Member

anyzelman commented Sep 11, 2023

In which situation exactly does it segfault? (Existing ALP smoke tests seem to exercise this use of exec, but they don't segfault?)
