Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Assertion failure inside ~QUsbDevicePrivate() #91

Closed
petrmanek opened this issue Jan 28, 2023 · 2 comments
Closed

Assertion failure inside ~QUsbDevicePrivate() #91

petrmanek opened this issue Jan 28, 2023 · 2 comments
Assignees
Labels

Comments

@petrmanek
Copy link
Contributor

Describe the bug
Inside the destructor of QtUsbDevicePrivate, the call to libusb_exit() results in the following assertion failure, which leads to a subsequent crash:

os/threads_posix.h:58: usbi_mutex_destroy: Assertion `pthread_mutex_destroy(mutex) == 0' failed.

Platform:

  • OS: Archlinux x86_64, kernel 6.1.8-arch1-1
  • Version: QtUsb ea6d509, libusb 1.0.26

To Reproduce

  1. Compile the SimpleBulkTransfer example on the reported platform
  2. Use a real USB device (e.g. a FTDI serial cable), modify endpoint IDs to match its configuration
  3. Run the example

Expected behavior
The program should finish successfully. Instead of that, it crashes with assertion failure.

Additional context
I took time to do some debugging work. Using a debugger, a custom fork of QtUsb and libusb with logging statements, I traced the error through the following points in code (each of the links below points to a location inside libusb source code):

My first conjecture was that there was an event or a flying transfer that was somehow left behind. This has proven not to be the case. Instead everything suggested that there was something wrong with the state of libusb's ctx->events_lock, which synchronizes event handling across multiple threads. I did some sanity checks, e.g. that in the context of the functions referenced above, the mutex is indeed destroyed from a thread that created it, that it is not destroyed in a locked state etc. Everything seemed to check out. Being stumped, I tried annotating the calls to libusb_try_lock_events, libusb_lock_events and libusb_unlock_events. This pointed me to what I believe to be the actual cause of the crash.

Based on my analysis, it seems that calls to acquire and relinquish events_lock are unbalanced. Specifically, it seems that the lock is relinquished one more time than it is acquired. Immediately after that, it is destroyed in usbi_io_exit, which crashes the program. Since in the reported platform, mutexes are implemented using the pthread, calls to usbi_mutex_unlock translate to pthread_mutex_unlock. Having a brief look at the "Unlock when not owner" section of pthread_mutex_lock(3p) man page, the behavior of such call seems to be undefined. This is exactly in line with what I am seeing: one extra relinquish call silently pollutes the state of events_lock, which later triggers assertion failure from pthread_mutex_destroy.

With this conjecture, I attempted to validate my theory by looking at any cases where events_lock is handled from QtUsb. The list of possible suspects was not long, and soon enough I settled on a possible culprit. This call to libusb_unlock_events is in my humble opinion what taints events_lock by relinquishing it when unowned. Browsing through libusb documentation, I found this page, which contains the following snippet that I suspect inspired the implementation of QUsbEventsThread:

// inside event loop:
if (libusb_try_lock_events(ctx) == 0) {
    // we obtained the event lock: do our own event handling
    while (!completed) {
        if (!libusb_event_handling_ok(ctx)) {
            libusb_unlock_events(ctx);
            goto retry;
        }
        poll(libusb file descriptors, 120*1000);
        if (poll indicates activity)
            libusb_handle_events_locked(ctx, 0);
    }
    libusb_unlock_events(ctx);
}

Putting aside the fact that the referenced documentation page mostly refers to cases where one would wish to implement direct polling of libusb's file descriptors, which I am not aware is something that QtUsb is doing, the relevant part of this snippet (corresponding to lines 505-508 of qusbdevice.cpp) is the if condition that checks libusb_event_handling_ok, and if needed, relinquishes the events_lock on demand. This is reinforced by the following statement in the documentation referring to libusb_event_handling_ok:

If this function returns 0, the correct behaviour is for you to give up the event handling lock, and then to repeat the cycle.

While I would not want to dispute this polling scheme nor the design strategy behind libusb's architecture, I would point out that all of this logic happens after libusb_try_lock_events has acquired events_lock earlier in the code. This means that in the snippet, the lock needs to be released, which justifies the call to libusb_unlock_events. In contrast, QUsbEventsThread does not acquire the lock in the first place, so it stands to reason that it should not be relinquishing it either. If this requires more evidence, note that the referenced snippet calls libusb_handle_events_locked with events_lock acquired, whereas QUsbEventsThread calls libusb_handle_events_timeout_completed, which does not require the lock to be acquired at the time of calling, and in fact attempts to acquire and release events_lock inside its implementation.

For this reason, I would recommend that the call to libusb_unlock_events from inside QUsbEventsThread should be removed in order to fix this crash. I have already prepared a fork of QtUsb, which corrects the assertion failure crash on my system. I intend to submit it as a merge request that will be referenced below. In my modified version of QUsbEventsThread, I still left the call to libusb_event_handling_ok. However, I would strongly urge that you consider removing as well, since the thread never holds events_lock. This is consistent with the following quote taken from the function's docstring:

This function should be called while the events lock is held: you don't need to worry about the results of this function if your thread is not the current event handler.

Leaving call to this function in the loop risks a possible future bug, where event handling is interrupted prematurely without the knowledge of user code that calls QtUsb.

@petrmanek
Copy link
Contributor Author

For completeness, here is a crash log from a simple app that just opens and closes a QUsbDevice (for simplicity,no endpoints are involved).

crash.log

@fpoussin
Copy link
Owner

fpoussin commented Feb 4, 2023

Fixed in #92

@fpoussin fpoussin closed this as completed Feb 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants