Driver API test failure #83

Closed
floppym opened this issue Dec 12, 2017 · 8 comments
@floppym
Contributor

floppym commented Dec 12, 2017

Our tinderbox ran into a test failure on dbus-broker v9.

33/34 dbus-broker / Driver API                FAIL     0.12 s

--- command ---
/var/tmp/portage/sys-apps/dbus-broker-9/work/dbus-broker-9-build/test/dbus/test-driver
--- stderr ---
test-driver: ../dbus-broker-9/test/dbus/util-broker.c:466: util_broker_terminate: Assertion `!value' failed.
-------

A full build log is attached to the Gentoo bug report: https://bugs.gentoo.org/640876

I am unable to reproduce the failure myself, but perhaps you can provide some ideas to further diagnose it.

@dvdhrm
Member

dvdhrm commented Dec 13, 2017

The test-driver suite runs a dummy launcher in a separate thread, controlling an independent dbus-broker instance. The assertion in question tells us the launcher thread finished with a non-zero error code. Furthermore, it tells us no assertion in the thread fired. Hence, it must have completed, and as such its return-code is taken from the sd-event exit-code. The only place that sets a non-zero exit-code is util_event_sigchld(), which just takes the exit-code of the broker.

In summary, the dbus-broker instance exited with a non-zero status (either explicitly or because a fatal signal killed it).
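To illustrate the two ways a child process can end up with a non-zero status, here is a minimal sketch using plain `fork()`/`waitpid()`. It is not the code in `util_event_sigchld()` (which goes through sd-event); `code_from_wait_status()` and `run_child()` are hypothetical names, and mapping a fatal signal to `128 + signal` is just the usual shell convention, assumed here for illustration.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Map a wait(2) status to one code: 0 only for a clean exit(0). */
int code_from_wait_status(int status)
{
    if (WIFEXITED(status))
        return WEXITSTATUS(status);      /* child called exit()/_exit() */
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);   /* child was killed by a signal */
    return -1;                           /* stopped/continued: not expected here */
}

/*
 * Hypothetical helper: fork a child that either exits with `exit_code`
 * or, if `sig` is non-zero, kills itself with that signal. Returns the
 * mapped code, or -1 if fork()/waitpid() failed.
 */
int run_child(int exit_code, int sig)
{
    int status;
    pid_t pid;

    assert(exit_code >= 0 && exit_code <= 255);

    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        if (sig)
            raise(sig);
        _exit(exit_code);
    }

    if (waitpid(pid, &status, 0) != pid)
        return -1;

    return code_from_wait_status(status);
}
```

With this sketch, a parent that only checks for zero cannot tell "the child returned an error" from "the child crashed"; it needs the logs (or `WIFSIGNALED`) to distinguish the two.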

The reason for this is not clear from the logs. If dbus-broker segfaulted, there should be a hint in the system logs, but the report does not include them (at least I didn't find any). And if the broker returned non-zero itself, it must have printed something either to the system log or to stderr.

Is there a way to get access to the journal entries? Or, if no journal is around, then dmesg and/or syslog?

@dvdhrm dvdhrm added the bug label Dec 13, 2017
@toralf

toralf commented Dec 13, 2017

from the host syslogs

zgrep dbus-broker /var/log/*

/var/log/debug:Dec 12 10:19:19 mr-fox kernel: traps: dbus-broker[13439] general protection ip:56470655120c sp:7ffcc545d930 error:0 in dbus-broker[564706541000+2c000]

/var/log/kern.log:Dec 12 10:19:19 mr-fox kernel: [151499.624964] traps: dbus-broker[13439] general protection ip:56470655120c sp:7ffcc545d930 error:0 in dbus-broker[564706541000+2c000]

/var/log/messages:Dec 12 10:19:19 mr-fox kernel: traps: dbus-broker[13439] general protection ip:56470655120c sp:7ffcc545d930 error:0 in dbus-broker[564706541000+2c000]

/var/log/syslog:Dec 12 10:19:19 mr-fox kernel: traps: dbus-broker[13439] general protection ip:56470655120c sp:7ffcc545d930 error:0 in dbus-broker[564706541000+2c000]
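The trap lines above carry enough information to locate the faulting instruction relative to the binary's mapping: subtracting the mapping base (the number before `+`) from `ip` gives 0x56470655120c - 0x564706541000 = 0x1020c. The sketch below does that arithmetic for one such line; `trap_offset()` is a hypothetical helper, and the parsing assumes the exact `ip:… in name[base+size]` layout shown in these logs.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse a kernel "traps:" line and store (ip - mapping base) in *offset.
 * Returns 0 on success, -1 if the line does not have the expected layout
 * or if ip lies outside the reported mapping.
 */
int trap_offset(const char *line, unsigned long long *offset)
{
    unsigned long long ip, base, size;
    const char *p, *q;

    assert(line && offset);

    p = strstr(line, " ip:");   /* faulting instruction pointer */
    q = strrchr(line, '[');     /* last bracket holds "base+size" */
    if (!p || !q)
        return -1;

    if (sscanf(p, " ip:%llx", &ip) != 1)
        return -1;
    if (sscanf(q, "[%llx+%llx]", &base, &size) != 2)
        return -1;
    if (ip < base || ip - base >= size)
        return -1;

    *offset = ip - base;
    return 0;
}
```

Whether that offset can be passed straight to a tool such as `addr2line -e` depends on how the binary is linked and which segment the kernel reports, so treat it as a starting point rather than a guaranteed file address.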

@floppym
Contributor Author

floppym commented Dec 14, 2017

Oddly, I am seeing similar messages in the kernel log, despite the test passing.

I rebuilt with debug symbols and systemd captured a core dump.

Core was generated by `./src/dbus-broker --verbose --controller 23'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00005596e9ea51f7 in peer_flush_matches (peer=0x5596ea9403e0) at ../dbus-broker-9/src/bus/peer.c:523
523                     if (rule->keys.sender && *rule->keys.sender != ':' && strcmp(rule->keys.sender, "org.freedesktop.DBus") != 0)
(gdb) bt full
#0  0x00005596e9ea51f7 in peer_flush_matches (peer=0x5596ea9403e0) at ../dbus-broker-9/src/bus/peer.c:523
        name = 0x0
        rule = 0x5596ea93f060
        node = 0x5596ea93f088
#1  0x00005596e9e9e32a in driver_goodbye (peer=0x5596ea9403e0, silent=false) at ../dbus-broker-9/src/bus/driver.c:1732
        reply = 0x5596e9eadb15 <c_list_is_empty+31>
        reply_safe = 0x0
        rule = 0x7fff67ddd780
        rule_safe = 0x5596e9eae68d <socket_discard_input+47>
        ownership = 0x5596e9eae9e8 <socket_hangup_input+59>
        ownership_safe = 0x5596ea940490
        r = 1742591840
        __func__ = "driver_goodbye"
#2  0x00005596e9ea3b06 in peer_dispatch (file=0x5596ea940d90) at ../dbus-broker-9/src/bus/peer.c:117
        peer = 0x5596ea9403e0
        interest = {17, 4}
        i = 0
        r = 4
        __func__ = "peer_dispatch"
#3  0x00005596e9eb0c20 in dispatch_context_dispatch (ctx=0x5596ea93d370) at ../dbus-broker-9/src/util/dispatch.c:344
        todo = {next = 0x7fff67ddd850, prev = 0x7fff67ddd850}
        file = 0x5596ea940d90
        r = 0
        __func__ = "dispatch_context_dispatch"
        __PRETTY_FUNCTION__ = "dispatch_context_dispatch"
#4  0x00005596e9e9182d in broker_run (broker=0x5596ea93d260) at ../dbus-broker-9/src/broker/broker.c:162
        signew = {__val = {16386, 0 <repeats 15 times>}}
        sigold = {__val = {66048, 0, 0, 0, 0, 0, 0, 0, 0, 0, 36, 7683565942679753216, 0, 94106657819104, 140734935980800, 0}}
        r = 0
        __func__ = "broker_run"
#5  0x00005596e9e96ab9 in run () at ../dbus-broker-9/src/broker/main.c:250
        broker = 0x5596ea93d260
        r = 0
        __func__ = "run"
#6  0x00005596e9e96bf9 in main (argc=4, argv=0x7fff67dddb08) at ../dbus-broker-9/src/broker/main.c:272
        r = 0
        __func__ = "main"

@dvdhrm
Member

dvdhrm commented Dec 15, 2017

Right. The runtime-tests do not correctly capture failures if they happen after the tests succeeded (i.e., when we take down the test-daemon). As it turns out, there is a reference-leak in dbus-broker regarding monitor-matches. Hence, the broker faults, but the tests don't fail.

I found the culprit in a recent c-rbtree change (which disallows moving trees by hand). I am working on a fix.
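As a toy illustration of why an intrusive container may forbid being moved by plain struct copy, consider a tree whose root node keeps a back-pointer to the tree object itself. This is not the c-rbtree implementation; `struct tree`, `struct node`, and all functions below are hypothetical and only show the general hazard.

```c
#include <assert.h>
#include <stddef.h>

struct tree;

struct node {
    struct tree *owner;   /* back-pointer to the tree holding this node as root */
    int value;
};

struct tree {
    struct node *root;
};

void tree_set_root(struct tree *t, struct node *n)
{
    assert(t);
    t->root = n;
    if (n)
        n->owner = t;
}

/* Naive move: copies the struct, but root->owner still points at `from`. */
void tree_move_by_hand(struct tree *to, struct tree *from)
{
    *to = *from;
    from->root = NULL;
}

/* Proper move: transfers the root and fixes up the back-pointer. */
void tree_move(struct tree *to, struct tree *from)
{
    to->root = from->root;
    from->root = NULL;
    if (to->root)
        to->root->owner = to;
}

/* A tree is consistent if its root (if any) points back at it. */
int tree_is_consistent(const struct tree *t)
{
    return !t->root || t->root->owner == t;
}
```

After `tree_move_by_hand()`, any code that follows `owner` works on the old, emptied tree object, which is the kind of stale-pointer state that can surface later as a fault during teardown.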

However, I suspect that this is not related to the initial bug report. I guess we will see once I have pushed the fix.

Thanks a lot!

@dvdhrm
Member

dvdhrm commented Dec 15, 2017

I pushed a fix to -master just now. This resolves all issues that I can reproduce locally.

However, this still does not explain the original bug to me. Is there any chance to get a backtrace?

@floppym
Contributor Author

floppym commented Dec 17, 2017

With latest master, I am no longer seeing those general protection errors.

@toralf - could you test sys-apps/dbus-broker-9999 on that tinderbox image? If you still get a test failure, it would be helpful if you could get a backtrace from the core dump.

@toralf

toralf commented Dec 17, 2017

OK: 34
FAIL: 0
SKIP: 1
TIMEOUT: 0

:-)

@teg
Contributor

teg commented Jan 19, 2018

Thanks for the confirmation!

@teg teg closed this as completed Jan 19, 2018