Segfault on armv6l (raspberrypi) #925

Closed
pgasiorowski opened this issue Jun 4, 2023 · 12 comments
Comments

@pgasiorowski

Description

ebusd and ebusctl segfault both when installed from the package and when built from source.

$ uname -a
Linux raspberrypi 5.10.92+ #1514 Mon Jan 17 17:35:21 GMT 2022 armv6l GNU/Linux

$ lsb_release -a
No LSB modules are available.
Distributor ID:	Raspbian
Description:	Raspbian GNU/Linux 11 (bullseye)
Release:	11
Codename:	bullseye

$ ldd --version
ldd (Debian GLIBC 2.31-13+rpt2+rpi1+deb11u2) 2.31

$ file /usr/bin/ebusd
/usr/bin/ebusd: ELF 32-bit LSB pie executable, ARM, EABI5 version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-armhf.so.3, BuildID[sha1]=ed81d86462f46f87d327cbf9d3dcdc6775df02b0, for GNU/Linux 3.2.0, stripped

$ dpkg -l | grep ebus
hi  ebusd                                23.1                             armhf        eBUS daemon.

$ cat /etc/apt/sources.list.d/ebusd.list
deb https://repo.ebusd.eu/apt/default/bullseye bullseye main

Result

(gdb) run
Starting program: /usr/bin/ebusd
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x004578e6 in ?? ()
(gdb) bt
#0  0x004578e6 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

When built from source, I get this trace:

(gdb) run
Starting program: /home/woodzu/bin/ebusd
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0xb6c5d744 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&) () from /lib/arm-linux-gnueabihf/libstdc++.so.6

The binary was built on Ubuntu 20.04.4 (GLIBC 2.31-0ubuntu9.9) with

CC=arm-linux-gnueabihf-gcc
CXX=arm-linux-gnueabihf-g++
CXXFLAGS='-std=c++11'
LIBS="-lcrypto -lssl -lmosquitto"
LDFLAGS="-L$OPENSSL_DIR/lib -L$LIBMOSQUITTODIR/lib"
CPPFLAGS="-I$OPENSSL_DIR/include -I$LIBMOSQUITTODIR/include"
CXXFLAGS="-I$OPENSSL_DIR/include -I$LIBMOSQUITTODIR/include"

./autogen.sh --host=arm-linux-gnueabihf --with-ssl --with-mqtt --without-knx

Actual behavior

Segfault

Expected behavior

No segfaults.

ebusd version

current source from git

ebusd arguments

none

Operating system

Debian 11 (Bullseye) / Ubuntu 20-21 / Raspbian 11 / Raspberry Pi OS 11 (including lite)

CPU architecture

other

Dockerized

None

Hardware interface

adapter 3.1 USB

Related integration

other

Logs

(empty)

@Commifreak
Contributor

Commifreak commented Jun 5, 2023

Do you have MQTT enabled? I seem to face the same issue with the current master branch (in Docker), but I do get something in the log:

terminate called after throwing an instance of 'std::logic_error'
  what():  basic_string::_M_construct null not valid

This was thrown during an MQTT publish with the newest hassio.cfg.
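
For context, libstdc++ typically raises this `std::logic_error` when a `std::string` is constructed from a null character pointer; an empty string (`""`) is a valid argument and does not trigger it. A minimal sketch of guarding such a conversion, where `toStringOrEmpty` is a hypothetical helper and not part of ebusd:

```cpp
#include <string>

// Hypothetical helper (not from ebusd): converts a possibly-null C string
// into a std::string, mapping a null pointer to an empty string instead of
// handing it to the std::string constructor.
std::string toStringOrEmpty(const char* s) {
  return s != nullptr ? std::string(s) : std::string();
}
```

A guard like this only helps where a raw pointer enters the code; it does not explain a crash that happens while copying an existing string object.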

@pgasiorowski
Author

pgasiorowski commented Jun 5, 2023

Yes, mqtt support is enabled. I also tried adding -lstdc++ to LIBS as suggested in a few places in GitHub.

Can your error be related to configuration parameters (like empty string somewhere)?

@Commifreak
Contributor

Can't tell. With hassio.cfg disabled, the crash does not occur and ebusd runs OK.

@valeriop

valeriop commented Jun 8, 2023

I can confirm that this issue is MQTT related.
I am building from git and there are no problems until I run it with the --mqtt options:

ebusd -f -c http://cfg.ebusd.eu --scanconfig -d 192.168.0.2:9999 -p 8888 --latency=30000 --mqtthost=127.0.0.1 --mqttjson --mqttint=/etc/ebusd/mqtt-hassio.cfg --mqttuser=homeassistant --mqttport=1883 --mqttlog
[...]
2023-06-08 12:21:07.962 [update notice] sent scan-read scan.0a QQ=31: Vaillant;PMW00;0117;4402
2023-06-08 12:21:07.962 [bus notice] scan 0a: ;Vaillant;PMW00;0117;4402
2023-06-08 12:21:08.120 [update notice] sent unknown MS cmd: 310ab5090124 / 09003231313132353030
2023-06-08 12:21:08.282 [update notice] sent scan-read scan.0a id QQ=31:
2023-06-08 12:21:08.455 [update notice] sent scan-read scan.0a id QQ=31:
2023-06-08 12:21:08.733 [update notice] sent scan-read scan.0a id QQ=31: 21;11;25;0010007267;3110;002833;N8
2023-06-08 12:21:08.733 [bus notice] scan 0a: ;21;11;25;0010007267;3110;002833;N8
terminate called after throwing an instance of 'std::logic_error'
what(): basic_string::_M_construct null not valid
Aborted (core dumped)

ebusd -f -c http://cfg.ebusd.eu --scanconfig -d 192.168.0.2:9999 -p 8888 --latency=30000
[...]
2023-06-08 12:24:59.093 [update notice] sent scan-read scan.0a QQ=31: Vaillant;PMW00;0117;4402
2023-06-08 12:24:59.093 [bus notice] scan 0a: ;Vaillant;PMW00;0117;4402
2023-06-08 12:24:59.250 [update notice] sent unknown MS cmd: 310ab5090124 / 09003231313132353030
2023-06-08 12:24:59.407 [update notice] sent scan-read scan.0a id QQ=31:
2023-06-08 12:24:59.567 [update notice] sent scan-read scan.0a id QQ=31:
2023-06-08 12:24:59.725 [update notice] sent scan-read scan.0a id QQ=31: 21;11;25;0010007267;3110;002833;N8
2023-06-08 12:24:59.725 [bus notice] scan 0a: ;21;11;25;0010007267;3110;002833;N8
2023-06-08 12:24:59.990 [main notice] read scan config file vaillant/0a.pmw.hwc.csv for ID "pmw00", SW0117, HW4402
2023-06-08 12:25:00.012 [bus notice] max. symbols per second: 113
2023-06-08 12:25:00.317 [main notice] found messages: 321 (18 conditional on 36 conditions, 0 poll, 10 update)
2023-06-08 12:25:02.302 [update notice] received unknown MS cmd: 10ecb5040121 / 055200055864
2023-06-08 12:25:03.040 [bus notice] scan 15: ;Vaillant;UI ;0508;6201
2023-06-08 12:25:03.040 [update notice] store 15 ident: done
[...scanning continues..]

@valeriop

valeriop commented Jun 8, 2023

The issue seems to have started a few days ago with commit #699f056.

The current master commit runs flawlessly when built with debug symbols (doh):
CXXFLAGS="-ggdb" ./make_debian.sh
Maybe some optimization flags trigger the issue.

@valeriop

valeriop commented Jun 8, 2023

(gdb) backtrace
#0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=140737317692992) at ./nptl/pthread_kill.c:44
#1 __pthread_kill_internal (signo=6, threadid=140737317692992) at ./nptl/pthread_kill.c:78
#2 __GI___pthread_kill (threadid=140737317692992, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3 0x00007ffff758e476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4 0x00007ffff75747f3 in __GI_abort () at ./stdlib/abort.c:79
#5 0x00007ffff791fbbe in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#6 0x00007ffff792b24c in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#7 0x00007ffff792b2b7 in std::terminate() () from /lib/x86_64-linux-gnu/libstdc++.so.6
#8 0x00007ffff792b518 in __cxa_throw () from /lib/x86_64-linux-gnu/libstdc++.so.6
#9 0x00007ffff792238a in std::__throw_logic_error(char const*) () from /lib/x86_64-linux-gnu/libstdc++.so.6
#10 0x00005555555f3dbb in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*> (
this=this@entry=0x7ffff5d3d350, __beg=0x0, __end=0x7fffec000e60 "ebusd/global/signal") at /usr/include/c++/11/bits/basic_string.tcc:212
#11 0x00005555555f8e56 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct_aux<char*> (
__end=, __beg=, this=0x7ffff5d3d350) at /usr/include/c++/11/bits/basic_string.h:255
#12 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*> (__end=, __beg=,
this=0x7ffff5d3d350) at /usr/include/c++/11/bits/basic_string.h:274
#13 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string (
__str=<error reading variable: Cannot create a lazy string with address 0x0, and a non-zero length.>, this=0x7ffff5d3d350)
at /usr/include/c++/11/bits/basic_string.h:459
#14 ebusd::StringReplacers::reduce (this=this@entry=0x7ffff5d3d6c0, compress=compress@entry=false) at stringhelper.cpp:517
#15 0x00005555555a8a41 in ebusd::MqttHandler::run (this=0x55555565e920) at mqtthandler.cpp:1131
#16 0x00005555555b3e52 in ebusd::Thread::enter (this=0x55555565e988) at thread.cpp:68
#17 0x00005555555b3e6d in ebusd::Thread::runThread (arg=) at thread.cpp:29
#18 0x00007ffff75e0b43 in start_thread (arg=) at ./nptl/pthread_create.c:442
#19 0x00007ffff7672a00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
(gdb) exit

I debugged with a couple of cout in src/lib/ebus/stringhelper.cpp:
void StringReplacers::reduce(bool compress) {
// iterate through variables and reduce as many to constants as possible
[...]
cout << "str = \"" << str << "\"" << endl;
bool restart = set(it->first, str, false);
str2 = it->first;
cout << "it->first = \"" << str2 << "\"" << endl;
it = m_replacers.erase(it);
reduced = true;
if (restart) {
517: string upper = it->first; /* crash here */

And these are its last words:
str = ""post run""
it->first = "entry"
terminate called after throwing an instance of 'std::logic_error'
what(): basic_string::_M_construct null not valid
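
The excerpt above appears consistent with `it` being dereferenced right after `it = m_replacers.erase(it)`: `erase` returns the iterator following the removed element, which is `end()` when nothing is left behind it, and reading `it->first` at that point is undefined behavior. A minimal sketch of an erase-while-iterating loop that avoids this, assuming a plain `std::map<std::string, std::string>` in place of the real `m_replacers` container; `Replacers`, `reduceConstants` and the "no `[` means constant" rule are hypothetical, and this is not the actual ebusd fix:

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the replacer table: variable name -> value.
using Replacers = std::map<std::string, std::string>;

// Erases every entry whose value contains no '[' placeholder (an assumed
// rule for "already constant") and returns the erased keys in map order.
std::vector<std::string> reduceConstants(Replacers& replacers) {
  std::vector<std::string> erased;
  auto it = replacers.begin();
  while (it != replacers.end()) {
    if (it->second.find('[') != std::string::npos) {
      ++it;  // still refers to another variable, keep it
      continue;
    }
    erased.push_back(it->first);  // copy the key while the node is alive
    it = replacers.erase(it);     // next element, or end() if none is left
    // it may equal end() here, so it is not dereferenced before the
    // loop condition has been checked again.
  }
  return erased;
}
```

The point of the sketch is only the ordering: any key needed after the erase is copied beforehand, and the returned iterator is compared against `end()` before it is used.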

@john30
Owner

john30 commented Jun 11, 2023

good spot @valeriop but not related to the mentioned commit.
anyway, @WooDzu could you try again with the current source? and if it fails again, compile with debug infos as stated in the wiki

@piotr-lwks

Thanks for looking into this. I tried the latest source but it is still failing.

Proper backtrace with debug symbols below:

Program received signal SIGSEGV, Segmentation fault.
0x00456690 in ebusd::NumberDataType::readFromRawValue (this=<error reading variable: Cannot access memory at address 0xc>,
    value=<error reading variable: Cannot access memory at address 0x8>, outputFormat=<error reading variable: Cannot access memory at address 0x4>,
    output=<error reading variable: Cannot access memory at address 0x0>) at datatype.cpp:900

pointing at this line:

negative = false;

@valeriop

valeriop commented Jun 14, 2023

> good spot @valeriop but not related to the mentioned commit. anyway, @WooDzu could you try again with the current source? and if it fails again, compile with debug infos as stated in the wiki

Current source c66bd85 works correctly with mqtt options enabled, no more segfault. Thank you!

@john30
Owner

john30 commented Jun 15, 2023

> Thanks for looking into this. I tried the latest but it is still failing.
>
> Proper backtrace with debug symbols below:
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x00456690 in ebusd::NumberDataType::readFromRawValue (this=<error reading variable: Cannot access memory at address 0xc>,
>     value=<error reading variable: Cannot access memory at address 0x8>, outputFormat=<error reading variable: Cannot access memory at address 0x4>,
>     output=<error reading variable: Cannot access memory at address 0x0>) at datatype.cpp:900
>
> pointing at this line:
>
> negative = false;

I'm pretty sure that the run was not with the latest source, as that line of the backtrace can't cause a memory access error.

@john30
Owner

john30 commented Jun 16, 2023

meanwhile I've verified on a Raspi 2 that commit c66bd85 works fine whereas the previous one crashes with MQTT. so I'm closing this as resolved

@john30 john30 closed this as completed Jun 16, 2023
@piotr-lwks

Ok. It is still failing for me, but the accesses to addresses like 0x4, 0x8 and 0xc make me think it's more related to the compiler setup.

Program received signal SIGSEGV, Segmentation fault.
ebusd::NumberDataType::getFloatFromRawValue (this=<error reading variable: Cannot access memory at address 0xc>, value=<error reading variable: Cannot access memory at address 0x8>, 
    output=<error reading variable: Cannot access memory at address 0x4>) at datatype.cpp:845

if (m_divisor < 0) {

I'll try it on another fresh VM and RPi
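
One way to narrow down a compiler-setup suspicion like this is to look at which ARM baseline the cross toolchain targets by default. The sketch below reports a few predefined GCC/Clang target macros (`__ARM_ARCH`, `__thumb2__`, `__thumb__`, `__ARM_PCS_VFP`); `describeArmTarget` is a hypothetical helper, not part of ebusd, and whether a baseline mismatch is the cause here is an assumption that this thread does not confirm:

```cpp
#include <string>

// Hypothetical helper: summarizes the ARM target this translation unit was
// compiled for, based on macros predefined by GCC/Clang. On non-ARM (or
// 64-bit ARM) targets it returns "not ARM".
std::string describeArmTarget() {
#if defined(__arm__)
  std::string out = "ARM";
#if defined(__ARM_ARCH)
  out += " arch=" + std::to_string(__ARM_ARCH);  // e.g. 6 or 7
#endif
#if defined(__thumb2__)
  out += " thumb2";
#elif defined(__thumb__)
  out += " thumb";
#endif
#if defined(__ARM_PCS_VFP)
  out += " hard-float";  // hard-float calling convention (armhf)
#endif
  return out;
#else
  return "not ARM";
#endif
}
```

For an armv6l board such as the one in this report, `__ARM_ARCH` would be expected to be 6; a value of 7 would suggest the toolchain defaults to a newer baseline than the CPU supports. The same macros can be listed without running anything on the target via `arm-linux-gnueabihf-g++ -dM -E -x c++ /dev/null`.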
