OpenMP fails on A64FX (AArch64 SVE) #6451

Open
AssadHashmi opened this issue Nov 13, 2023 · 1 comment

Raised by user, see https://groups.google.com/g/dynamorio-users/c/S9dHoHjYHns

Describe the bug
Any OpenMP program with more than one thread fails on an A64FX SVE machine.

To Reproduce

$ OMP_NUM_THREADS=2 $DYNAMORIO_DIR/bin64/drrun -debug -- ./bin/is.A.x
<Starting application NPB3.4-OMP/bin/is.A.x (4028)>
<Initial options = -no_dynamic_options -code_api -stack_size 64K -signal_stack_size 64K -max_elide_jmp 0 -max_elide_call 0 -vmm_block_size 64K -initial_heap_unit_size 64K -initial_heap_nonpers_size 64K -initial_global_heap_unit_size 512K -max_heap_unit_size 4M -heap_commit_increment 64K -cache_commit_increment 64K -cache_bb_unit_init 64K -cache_bb_unit_max 64K -cache_bb_unit_quadruple 64K -cache_trace_unit_init 64K -cache_trace_unit_max 64K -cache_trace_unit_quadruple 64K -cache_shared_bb_unit_init 512K -cache_shared_bb_unit_max 512K -cache_shared_bb_unit_quadruple 512K -cache_shared_trace_unit_init 512K -cache_shared_trace_unit_max 512K -cache_shared_trace_unit_quadruple 512K -cache_bb_unit_upgrade 64K -cache_trace_unit_upgrade 64K -cache_shared_bb_unit_upgrade 512K -cache_shared_trace_unit_upgrade 512K -early_inject -emulate_brk -no_inline_ignored_syscalls -no_per_thread_guard_pages -native_exec_default_list '' -no_native_exec_managed_code -no_indcall2direct >

 NAS Parallel Benchmarks (NPB3.4-OMP) - IS Benchmark

 Size:  8388608  (class A)
 Iterations:  10
 Number of available threads:  2

<Application NPB3.4.2/NPB3.4-OMP/bin/is.A.x (4028). Cannot correctly handle received signal 11 in thread 4029: default action in native thread.>

The crash appears to happen at thread creation, either on entry to an OpenMP parallel region or on a call to pthread_create().
Stack trace:

(gdb) bt
#0  0x00004000004574c8 in get_clone_record (xsp=70368753806112) at /home/runner/work/dynamorio/dynamorio/core/unix/signal.c:944
#1  0x000040000043aed8 in new_thread_setup (mc=0x40000092eb20) at /home/runner/work/dynamorio/dynamorio/core/arch/x86_code.c:284
#2  0x0000400200696638 in ?? ()
#3  0x0000ffffffffceb0 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

The user reported a workaround: remove the extra PAGE_SIZE that is added to dstack_base on AArch64 in core/unix/signal.c. That is, change:

#ifdef AARCH64
    dstack_base = (byte *)ALIGN_FORWARD(xsp, PAGE_SIZE) + PAGE_SIZE;
#else
    dstack_base = (byte *)ALIGN_FORWARD(xsp, PAGE_SIZE);
#endif

To:

    dstack_base = (byte *)ALIGN_FORWARD(xsp, PAGE_SIZE);

Additional context
This error was probably introduced by the initial work on SVE support (see #5835).

It should be fixed when #6317 is implemented.

@AssadHashmi

Related to #5365.
