Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix GC Adapter breaking and burning a full CPU core after sleep-wake on Linux #12015

Merged
merged 6 commits into from Jul 22, 2023

Conversation

nyanpasu64
Copy link
Contributor

@nyanpasu64 nyanpasu64 commented Jun 30, 2023

https://bugs.dolphin-emu.org/issues/13284

After sleep-wake on Linux, talking to the GC USB adapter fails with LIBUSB_ERROR_IO because the kernel driver regains ownership over the adapter device. Previously Dolphin would continually try talking to the adapter in an infinite loop, burning a full CPU core and failing to receive new input from the adapter.

The minimal solution is to libusb_detach_kernel_driver() the adapter again, but I've been told that the best approach to handling LIBUSB_ERROR_IO is to instead call libusb_reset_device and reinitialize the device. This PR calls libusb_reset_device when the system sleep-wakes and talking to the GC adapter returns LIBUSB_ERROR_IO.

  • In "Fix memory leak in libusb code", should we instead use MakeConfigDescriptor() which wraps the error code and descriptor in an RAII handle?
  • Should I reset the device on some/all other errors besides LIBUSB_ERROR_IO? I have not found any other ways to trigger an endless CPU-burning loop, but it may be a good idea to do so defensively.

Another bug I found:

  • If you launch Dolphin with all GC controllers set to not Adapter, then pick Adapter, Dolphin fails to detect the adapter until you switch away and back. This does not happen if you launch Dolphin with GC Adapter already set.
    • If you enable logs, for whatever reason, "GC Adapter scanning thread started" only starts once you switch away from the GC Adapter as Controller 1. This is not related to the code I touched in this PR, and I could investigate or fix it here or defer it to a later PR.

@nyanpasu64
Copy link
Contributor Author

nyanpasu64 commented Jun 30, 2023

If you launch Dolphin with all GC controllers set to not Adapter, then pick Adapter, Dolphin fails to detect the adapter until you switch away and back. This does not happen if you launch Dolphin with GC Adapter already set.

Every time you change the list of controllers, Qt calls lambda -> GamecubeControllersWidget::SaveSettings():

  {
    Config::ConfigChangeCallbackGuard config_guard;
    ...
    if (GCAdapter::UseAdapter())
      [bool UseAdapter()] { return s_is_adapter_wanted; }
    if ^
      GCAdapter::StartScanThread();
    else
      GCAdapter::StopScanThread();
  }
  [config_guard.~ConfigChangeCallbackGuard()]
    ...
    [static void RefreshConfig()] { initialize s_is_adapter_wanted }

The problem is that SaveSettings() reads from s_is_adapter_wanted to check whether to enable the GC adapter thread, before it calls the function which initializes s_is_adapter_wanted to the new value based on whether any controllers use the GC Adapter.

There are multiple ways to fix the bug, with different possible side effects. You could move GCAdapter::UseAdapter() past ~ConfigChangeCallbackGuard() (changing the order of operations may potentially have side effects), or you could manually call RefreshConfig() to initialize the variable before GCAdapter::UseAdapter() (at the cost of double-computing the value), check all FromGCMenuIndex(m_gc_controller_boxes[i]->currentIndex()) for GC adapters directly, etc.

@@ -636,6 +636,7 @@ static void AddGCAdapter(libusb_device* device)
}
}
}
libusb_free_config_descriptor(config);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you can refactor this to MakeConfigDescriptor() then I think that's the better choice, just for the RAII cleanup.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • I assume I'd write something like const auto [error, config] = LibusbUtils::MakeConfigDescriptor(device) to match the existing calls to this function.
    • I don't really like that you don't see the types of error and config anymore, but there's not really a better way to use MakeConfigDescriptor().
  • Do I need to introduce a new scope to eagerly destroy the ConfigDescriptor (unique_ptr with dynamic destructor) when not in use? If not, should I reset the unique_ptr or leave it initialized until the function returns? And should I reuse the error value as the return value of libusb_interrupt_transfer() below?

LibusbUtils::ErrorWrap(error));
if (error != 0)
{
break;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would the user have to do to re-enable the GC Adapter if this break; is hit? I get how you arrived at this but it's kinda weird how the thread just dies here without informing anyone about this failure.

Copy link
Contributor Author

@nyanpasu64 nyanpasu64 Jul 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oof. Normally Reset() checks if (s_read_adapter_thread_running.TestAndClear()) s_read_adapter_thread.join();, which happens when the USB device is unplugged.

If I ignore libusb_reset_device() return value and always assume it fails, I found that switching away from GC Adapter and back doesn't work (and the controller dialog wrongly states "Adapter Detected"). I think there needs to be some way for the GC thread to communicate to the GUI whether it's stopped on its own or not. Unfortunately std::thread lacks a non-blocking "has terminated" API (and only offers blocking join).

One possible workaround is to add a new atomic boolean variable for "is thread running". I tried setting s_adapter_error = static_cast<libusb_error>(error); s_status = AdapterStatus::Error before breaking out of the loop, but this shows a message on the UI "Error Opening Adapter: Input/Output Error" (suggesting an error on opening the adapter, not recovering from sleep-wake), and switching away from the adapter and back still shows the same error. Unfortunately even unplugging and plugging the adapter does not help for some reason.

I'll have to trace what's going on with the threads further, perhaps I didn't close a device somewhere. (Though does it really matter given that I've only ever seen this case by pretending that libusb_reset_device() fails?)

EDIT: The current code is quite designed around the assumption that Reset() will be the one cleaning up GC adapter state, not the thread terminating itself after CheckDeviceAccess() succeeds and AddGCAdapter() runs. So I don't know if I can correctly implement returning early from ReadThreadFunc() at all, without a substantial rewrite of the GC adapter threading (which may produce a better result, but I'm very much not interested in spending weeks learning libusb's threading properties and wondering what C++ code/variables is and isn't thread-safe).

Perhaps I'll call libusb_reset_device() (which blocks for 430 ms when it completes successfully, both normally and after LIBUSB_ERROR_IO), discard the error code, and if it fails then call libusb_reset_device() again on the next iteration? I don't know how long it blocks if it fails, and if it can even fail to reset the device without LIBUSB_HOTPLUG_EVENT_DEVICE_LEFT being called.

@AdmiralCurtiss
Copy link
Contributor

AdmiralCurtiss commented Jul 3, 2023

re the config callback bug, I would just move the Start/StopScanThread() block outside the callback guard, that makes the most sense to me. I don't think that should have side effects but who knows...

e: In fact, a5dbf6b I seem to have introduced this bug so it's probably fine to just move it outside.

@nyanpasu64
Copy link
Contributor Author

re the config callback bug, I would just move the Start/StopScanThread() block outside the callback guard, that makes the most sense to me. I don't think that should have side effects but who knows...

e: In fact, a5dbf6b I seem to have introduced this bug so it's probably fine to just move it outside.

Should I fix this in this PR or another, since it's a UI-side code fix?

@nyanpasu64
Copy link
Contributor Author

nyanpasu64 commented Jul 5, 2023

poke, i've force pushed the branch and need a CI run

Fixes GC adapter breaking on sleep-wake on Linux and burning a full CPU
core. This is cleaner than alternative approaches.
GCAdapter::UseAdapter() reads s_is_adapter_wanted, which gets
initialized by config_guard.~ConfigChangeCallbackGuard(). So we must
wait until after destroying the config guard to know whether we have any
controllers set to GC Adapter.
@AdmiralCurtiss
Copy link
Contributor

So what's the status here, are you planning to change anything else? This looks reasonable to me as-is, admittedly without understanding the intricacies of libusb.

@nyanpasu64
Copy link
Contributor Author

I don't think I'm planning to change anything else in this branch, it worked properly last time I tested the branch and I'm working on other changes now.

(unrelated) ...though it would be nice if my AMD chipset/GPU on Linux wouldn't randomly fail to resume from sleep, and/or kill all USB devices in a way they don't come back even after soft-resetting my machine until I unplug and replug them.

@AdmiralCurtiss AdmiralCurtiss merged commit fb2b375 into dolphin-emu:master Jul 22, 2023
14 checks passed
@nyanpasu64 nyanpasu64 deleted the gc-adapter-sleep-detach branch July 22, 2023 01:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
2 participants