crash: segfault with threading on arm since HA_ATOMIC_DWCAS #105

Closed
lukastribus opened this issue May 23, 2019 · 3 comments
Labels
severity: medium, status: fixed, subsystem: core, type: bug

Comments

@lukastribus

John Smith reported on discourse that haproxy 1.9 and dev crash on Raspberry Pis (CentOS 7):

https://discourse.haproxy.org/t/segmentation-fault-on-raspberry-pi/3850

kernel, gcc, haproxy version and /proc/cpuinfo

[root@osmc haproxy]# uname -a
Linux osmc 4.14.82-v7.1.el7 #1 SMP Sun Nov 25 22:42:27 UTC 2018 armv7l armv7l armv7l GNU/Linux
[root@osmc haproxy]# gcc --version
gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36)
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

[root@osmc haproxy]# cat /proc/cpuinfo
processor       : 0
model name      : ARMv7 Processor rev 4 (v7l)
BogoMIPS        : 38.40
Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x0
CPU part        : 0xd03
CPU revision    : 4

processor       : 1
model name      : ARMv7 Processor rev 4 (v7l)
BogoMIPS        : 38.40
Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x0
CPU part        : 0xd03
CPU revision    : 4

processor       : 2
model name      : ARMv7 Processor rev 4 (v7l)
BogoMIPS        : 38.40
Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x0
CPU part        : 0xd03
CPU revision    : 4

processor       : 3
model name      : ARMv7 Processor rev 4 (v7l)
BogoMIPS        : 38.40
Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x0
CPU part        : 0xd03
CPU revision    : 4

Hardware        : BCM2835
Revision        : a02082
Serial          : 00000000xxxxxxxx
[root@osmc haproxy]# ./haproxy -vv
HA-Proxy version 2.0-dev4-1713c0-5 2019/05/23 - https://haproxy.org/
Build options :
  TARGET  = linux2628
  CPU     = generic
  CC      = gcc
  CFLAGS  = -O0 -g -fno-strict-aliasing -Wdeclaration-after-statement -fwrapv -Wno-unused-label -Wno-sign-compare -Wno-unused-parameter -Wno-old-style-declaration -Wno-ignored-qualifiers -Wno-clobbered -Wno-missing-field-initializers -Wtype-limits
  OPTIONS = USE_THREAD_DUMP=1

Feature list : +EPOLL -KQUEUE -MY_EPOLL -MY_SPLICE +NETFILTER -PCRE -PCRE_JIT -PCRE2 -PCRE2_JIT +POLL -PRIVATE_CACHE +THREAD -PTHREAD_PSHARED -REGPARM -STATIC_PCRE -STATIC_PCRE2 +TPROXY +LINUX_TPROXY +LINUX_SPLICE +LIBCRYPT +CRYPT_H -VSYSCALL -GETADDRINFO -OPENSSL -LUA +FUTEX +ACCEPT4 -MY_ACCEPT4 -ZLIB -SLZ +CPU_AFFINITY -TFO -NS +DL +RT -DEVICEATLAS -51DEGREES -WURFL -SYSTEMD -OBSOLETE_LINKER +PRCTL +THREAD_DUMP -EVPORTS

Default settings :
  bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with multi-threading support (MAX_THREADS=32, default=4).
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Built without compression support (neither USE_ZLIB nor USE_SLZ are set).
Compression algorithms supported : identity("identity")
Built without PCRE or PCRE2 support (using libc's regex instead)
Encrypted password support via crypt(3): yes

Available polling systems :
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.

Available multiplexer protocols :
(protocols marked as <default> cannot be specified using 'proto' keyword)
              h2 : mode=HTTP       side=FE        mux=H2
              h2 : mode=HTX        side=FE|BE     mux=H2
       <default> : mode=HTX        side=FE|BE     mux=H1
       <default> : mode=TCP|HTTP   side=FE|BE     mux=PASS

Available services : none

Available filters :
        [SPOE] spoe
        [COMP] compression
        [CACHE] cache
        [TRACE] trace

[root@osmc haproxy]#

GDB backtrace

[root@osmc haproxy]# gdb haproxy  core
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-114.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "armv7hl-redhat-linux-gnueabi".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /root/haproxy/haproxy...done.
[New LWP 2350]
[New LWP 2349]
[New LWP 2351]
[New LWP 2348]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/libthread_db.so.1".
Core was generated by `./haproxy -f ../haproxy.cfg -d'.
Program terminated with signal 11, Segmentation fault.
#0  0x001a3730 in __ha_cas_dw (target=0x101d208, compare=0x76439d08, set=0xfffffffe) at include/common/hathreads.h:1080
1080                             : "r" (*(uint64_t *)compare), "r" (*(uint64_t *)set), "r" (target)
(gdb) bt
#0  0x001a3730 in __ha_cas_dw (target=0x101d208, compare=0x76439d08, set=0xfffffffe) at include/common/hathreads.h:1080
#1  0x001a4af0 in fd_rm_from_fd_list (list=0x2ce870 <update_list>, fd=4, off=24) at src/fd.c:275
#2  0x0001635c in done_update_polling (fd=4) at include/proto/fd.h:152
#3  0x0001742c in _do_poll (p=0x2ce844 <cur_poller>, exp=0) at src/ev_epoll.c:139
#4  0x000c7ab8 in run_poll_loop () at src/haproxy.c:2546
#5  0x000c7ccc in run_thread_poll_loop (data=0x2) at src/haproxy.c:2603
#6  0x76ea0c64 in start_thread (arg=0x7643a3a0) at pthread_create.c:309
#7  0x76ddabb8 in ?? () at ../sysdeps/unix/sysv/linux/arm/clone.S:96 from /lib/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) bt full
#0  0x001a3730 in __ha_cas_dw (target=0x101d208, compare=0x76439d08, set=0xfffffffe) at include/common/hathreads.h:1080
        previous = 8571581165952040960
        tmp = 1
#1  0x001a4af0 in fd_rm_from_fd_list (list=0x2ce870 <update_list>, fd=4, off=24) at src/fd.c:275
        cur_list = {next = -1, prev = -1}
        next_list = {next = -2, prev = -2}
        old = 0
        new = -2
        prev = 1
        next = 8
        last = 1984144392
#2  0x0001635c in done_update_polling (fd=4) at include/proto/fd.h:152
        update_mask = 0
#3  0x0001742c in _do_poll (p=0x2ce844 <cur_poller>, exp=0) at src/ev_epoll.c:139
        status = 1984142708
        fd = 4
        count = 0
        updt_idx = 1
        wait_time = 1984144392
        old_fd = 4
#4  0x000c7ab8 in run_poll_loop () at src/haproxy.c:2546
        next = 0
        exp = 0
#5  0x000c7ccc in run_thread_poll_loop (data=0x2) at src/haproxy.c:2603
        ptaf = 0x2432b0 <per_thread_alloc_list>
        ptif = 0x2432b8 <per_thread_init_list>
        ptdf = 0x0
        ptff = 0x0
#6  0x76ea0c64 in start_thread (arg=0x7643a3a0) at pthread_create.c:309
        pd = 0x7643a3a0
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {1984144392, 0, 1995178472, 338, 0, 0, 1995726760, 1984143068, 1984142760, 1995050032, 0 <repeats 54 times>}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
#7  0x76ddabb8 in ?? () at ../sysdeps/unix/sysv/linux/arm/clone.S:96 from /lib/libc.so.6
No locals.
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) quit
[root@osmc haproxy]#

What's the configuration?

global
 maxconn 100
 nbthread 4

defaults
 mode http
 timeout connect 10s
 timeout client 60s
 timeout server 60s

listen www
 mode tcp
 bind :80
 server s1 www.example.org:80

Steps to reproduce the behavior

Start haproxy with threading enabled on a Raspberry Pi. It crashes either immediately or after a few seconds.

Workaround

Disable threading with nbthread 1.

Do you have any idea what may have caused this?

This is a regression caused by commit 6a38b32 (BUILD: threads: fix again the __ha_cas_dw() definition); it also affects 1.9.8. @wtarreau

Note there is also a different threading issue reported on the ML in 1.9.8 (though probably unrelated):
https://www.mail-archive.com/haproxy@formilux.org/msg33854.html

@lukastribus added the labels type: bug, 1.9, dev, severity: medium, status: reviewed, and subsystem: core on May 23, 2019
@wtarreau

wtarreau commented May 24, 2019 via email

@wtarreau

wtarreau commented May 24, 2019 via email

haproxy-mirror pushed a commit that referenced this issue May 27, 2019
…forms

On armv7 haproxy doesn't work because of the fixes on the double-word
CAS. There are two issues. The first one is that the last argument in
case of dwcas is a pointer to the set of value and not a value ; the
second is that it's not enough to cast the data as (void*) since it will
be a single word. Let's fix this by using the pointers as an array of
long. This was tested on i386, armv7, x86_64 and aarch64 and it is now
fine. An alternate approach using a struct was attempted as well but it
used to produce less optimal code.

This fix must be backported to 1.9. This fixes github issue #105.

Cc: Olivier Houchard <ohouchard@haproxy.com>
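
The calling convention the commit message describes can be sketched in C as follows. This is only an illustration, not HAProxy's __ha_cas_dw(): it emulates the double-word compare-and-swap with a spinlock instead of the architecture-specific instructions, and the names dwcas_sketch, struct pair and dwcas_demo are hypothetical. Updating the compare buffer on failure is a choice made for this sketch, borrowed from the C11 compare-exchange convention. The point is that target, compare and set are all pointers to two machine words, accessed as arrays of long. Passing the value itself as the last argument would seem consistent with the backtrace above, where set=0xfffffffe matches new = -2 and is then dereferenced.

```c
#include <stdatomic.h>

/* Hypothetical lock-based emulation of a double-word CAS. It is NOT
 * HAProxy's implementation; it only illustrates the argument layout. */
static atomic_flag dwcas_lock = ATOMIC_FLAG_INIT;

/* target, compare and set each point to TWO machine words (long[2]).
 * Returns 1 if *target matched *compare and was replaced by *set,
 * otherwise copies the current *target into *compare and returns 0. */
static int dwcas_sketch(void *target, void *compare, const void *set)
{
    long *t = target;
    long *c = compare;
    const long *s = set;   /* a pointer to two words, never the value itself */
    int ok;

    /* Spin until the lock is acquired. */
    while (atomic_flag_test_and_set_explicit(&dwcas_lock, memory_order_acquire))
        ;

    ok = (t[0] == c[0] && t[1] == c[1]);
    if (ok) {
        t[0] = s[0];
        t[1] = s[1];
    } else {
        c[0] = t[0];
        c[1] = t[1];
    }

    atomic_flag_clear_explicit(&dwcas_lock, memory_order_release);
    return ok;
}

/* Two-word entry, loosely modelled on the {next, prev} pairs seen in the
 * backtrace (an assumption made for illustration). */
struct pair {
    long next;
    long prev;
};

/* Returns 0 when the sketch behaves as described, non-zero otherwise. */
int dwcas_demo(void)
{
    struct pair slot = { -1, -1 };
    struct pair expected = { -1, -1 };
    struct pair locked = { -2, -2 };

    /* First attempt succeeds: slot matches expected. */
    if (!dwcas_sketch(&slot, &expected, &locked))
        return 1;
    if (slot.next != -2 || slot.prev != -2)
        return 2;

    /* Second attempt fails: slot is now {-2, -2}, not {-1, -1}. */
    if (dwcas_sketch(&slot, &expected, &locked))
        return 3;
    if (expected.next != -2 || expected.prev != -2)
        return 4;

    return 0;
}
```

Calling dwcas_demo() exercises one successful and one failed swap. Note that the last argument is always the address of a two-word object (&locked), never the new value cast to a pointer.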
@TimWolla added the status: fixed label and removed the dev and status: reviewed labels on May 27, 2019
@wtarreau

OK, now merged into dev and backported. Tested on my MIQI boards which used to fail and they're fine now. Closing the issue, thanks!

@wtarreau removed the 1.9 label on May 27, 2019
FireBurn pushed a commit to FireBurn/haproxy that referenced this issue Jan 29, 2020
…forms

(cherry picked from commit c3b5958)
[wt: adjust context, s/_HA/HA/]
Signed-off-by: Willy Tarreau <w@1wt.eu>