minimal.cc example fails on ARM #25

Closed
slyon opened this issue Dec 13, 2021 · 1 comment · Fixed by #26

slyon commented Dec 13, 2021

Describe the bug
ptl-minimal crashes with SIGFPE (arithmetic exception) on 32-bit ARM. Reverting commit 4e230f6 makes the problem disappear. In particular, the following line seems to trigger the exception:
static intmax_t nincr = std::max<intmax_t>(ncores / ncpus, 1);

See also: https://bugs.debian.org/1001237
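
For illustration, here is a minimal sketch of a call-site guard for the division quoted above. It assumes `ncpus` is the operand that ends up as 0 and that both values fit in `intmax_t`. The helper name `safe_increment` is hypothetical and is not part of PTL or of minimal.cc.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper (not PTL code): computes the increment while clamping
// the divisor to at least 1, so the integer division can never raise SIGFPE.
inline std::intmax_t safe_increment(std::intmax_t ncores, std::intmax_t ncpus)
{
    // Assumption: a non-positive CPU count means "unknown"; treat it as 1.
    const std::intmax_t divisor = std::max<std::intmax_t>(ncpus, 1);
    // Same shape as the original expression: the result is at least 1.
    return std::max<std::intmax_t>(ncores / divisor, 1);
}
```

With this guard, a reported CPU count of 0 yields an increment equal to `ncores` instead of a crash. It only hides the symptom in the example; the zero itself still comes from the library.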

(gdb) run
Starting program: /root/ptl-2.3.0/examples/build/minimal/ptl-minimal 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".


          ##############################
          !!! Backtrace is activated !!!
          ##############################

[ptl-minimal]> Number of threads: 1
[New Thread 0xf7c259a0 (LWP 17519)]

Thread 1 "ptl-minimal" received signal SIGFPE, Arithmetic exception.
__libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:47
47	../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S: No such file or directory.
(gdb) bt
#0  __libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:47
#1  0xf7cac3e0 in __pthread_kill_implementation (threadid=4160618512, signo=signo@entry=8, no_tid=no_tid@entry=0)
    at pthread_kill.c:43
#2  0xf7cac424 in __pthread_kill_internal (signo=<optimized out>, threadid=<optimized out>) at pthread_kill.c:80
#3  0xf7c7b690 in __GI_raise (sig=8) at ../sysdeps/posix/raise.c:26
#4  0xf7d8201e in __aeabi_ldiv0 () from /lib/arm-linux-gnueabihf/libgcc_s.so.1
#5  0x00407e5c in main::{lambda(long long)#3}::operator()(long long) const ()
#6  0x0040d49e in long long std::__invoke_impl<long long, main::{lambda(long long)#3}&, long long>(std::__invoke_other, main::{lambda(long long)#3}&, long long&&) ()
#7  0x0040ca6e in std::enable_if<is_invocable_r_v<long long, main::{lambda(long long)#3}&, long long>, long long>::type std::__invoke_r<long long, main::{lambda(long long)#3}&, long long>(main::{lambda(long long)#3}&, long long&&) ()
#8  0x0040baf6 in std::_Function_handler<long long (long long), main::{lambda(long long)#3}>::_M_invoke(std::_Any_data const&, long long&&) ()
#9  0xf7f9f620 in PTL::ThreadPool::set_affinity(long long, std::thread&) const ()
   from /lib/arm-linux-gnueabihf/libptl.so.2
#10 0xf7fa1234 in PTL::ThreadPool::initialize_threadpool(unsigned int) () from /lib/arm-linux-gnueabihf/libptl.so.2
#11 0xf7fa22d8 in PTL::ThreadPool::ThreadPool(PTL::ThreadPool::Config const&) ()
   from /lib/arm-linux-gnueabihf/libptl.so.2
#12 0x00408ada in main ()

To Reproduce
Steps to reproduce the behavior:

  1. apt install libptl-dev (v2.3.0-1)
  2. cd examples && mkdir build && cd build
  3. cmake .. && make
  4. ./minimal/ptl-minimal (Observe crash)

Expected behavior
The example should run successfully.

Desktop (please complete the following information):

  • OS: Ubuntu Jammy (devel)
  • Version 22.04

Additional context
This only seems to happen on armhf (32-bit ARM), not on other architectures.

slyon commented Dec 13, 2021

On further investigation, I think the problem is in the new Threading::GetNumberOfPhysicalCpus() function.

It reads /proc/cpuinfo and looks for core id lines, which (for some reason) are not present on armhf, so the function returns 0 and the example divides by zero:

$ uname -a
Linux deciding-barnacle 5.4.0-54-generic #60-Ubuntu SMP Fri Nov 6 10:42:16 UTC 2020 armv8l armv8l armv8l GNU/Linux
$ cat /proc/cpuinfo 
processor	: 0
BogoMIPS	: 100.00
Features	: fp asimd evtstrm cpuid
CPU implementer	: 0x50
CPU architecture: 8
CPU variant	: 0x0
CPU part	: 0x000
CPU revision	: 1
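
To make the failure mode concrete, the sketch below counts distinct `core id` values in /proc/cpuinfo-style text. It is an assumption about the general approach, not PTL's actual implementation. The function name `count_core_ids` is hypothetical, and the sketch ignores `physical id`, so it would undercount on multi-socket machines.

```cpp
#include <cstddef>
#include <set>
#include <sstream>
#include <string>

// Hypothetical helper (not PTL code): counts the distinct values found on
// lines that start with "core id" in /proc/cpuinfo-style text.
inline std::size_t count_core_ids(const std::string& cpuinfo_text)
{
    std::set<std::string> ids;
    std::istringstream in(cpuinfo_text);
    std::string line;
    while(std::getline(in, line))
    {
        // Skip every line that does not begin with the "core id" key.
        if(line.rfind("core id", 0) != 0)
            continue;
        const auto colon = line.find(':');
        if(colon == std::string::npos)
            continue;
        // The value starts at the first non-blank character after the colon.
        const auto first = line.find_first_not_of(" \t", colon + 1);
        if(first == std::string::npos)
            continue;
        ids.insert(line.substr(first));
    }
    // Returns 0 when no "core id" line exists, as in the armhf output above.
    return ids.size();
}
```

Fed the armhf output shown above, a counter of this shape finds no matching lines and returns 0, which is the value that later reaches the division.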

slyon added a commit to slyon/PTL that referenced this issue Dec 13, 2021
On armhf this function could return 0, as /proc/cpuinfo does not contain any
"core id" lines:

processor	: 0
BogoMIPS	: 100.00
Features	: fp asimd evtstrm cpuid
CPU implementer	: 0x50
CPU architecture: 8
CPU variant	: 0x0
CPU part	: 0x000
CPU revision	: 1

This leads to a division-by-zero in the minimal.cc example. We should fall back
to GetNumberOfCores() if it would be 0.

Fixes: jrmadsen#25
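
The fallback described in this commit message could look roughly like the sketch below. It is not the code from the commit. `std::thread::hardware_concurrency()` stands in for GetNumberOfCores(), and both function names are hypothetical.

```cpp
#include <thread>

// Hypothetical stand-in for GetNumberOfCores() (not PTL code): the number of
// logical cores, never less than 1.
inline unsigned logical_cores()
{
    // hardware_concurrency() may return 0 when the value cannot be determined.
    const unsigned n = std::thread::hardware_concurrency();
    return n > 0 ? n : 1;
}

// Hypothetical wrapper (not PTL code): returns the detected physical CPU
// count, or falls back to the logical core count when detection yielded 0.
inline unsigned physical_cpus_or_fallback(unsigned detected_physical)
{
    return detected_physical > 0 ? detected_physical : logical_cores();
}
```

Because the result is never 0, callers that divide by it, like the minimal.cc example, no longer hit a division by zero.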
jrmadsen added a commit that referenced this issue Dec 13, 2021

* Threading: Fix GetNumberOfPhysicalCpus, it should never return zero

Fixes: #25

Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>