[Core] Refactor ChibiOS USB endpoints to be fully async #21656
Does this supersede #21537?
aside from already dealt with mouse endpoint issues, works well in testing.
Something about this seems to cause reports to get stuck. Can reproduce on both moonlander (stm32f303) and blackpill based board.
Thanks for testing and reporting. Can you tell me which kind of reports, all of them or only keyboard reports? How does this manifest and is there a way to reproduce? Maybe this can be resolved by increasing the default buffer size per endpoint to something like 12 reports by |
@drashna I can confirm that mouse scrolling was broken with this implementation, as USB reports that were below the size of an endpoint would be buffered and sent in an incomplete manner. I've committed a fix. If you have the time, please try it again.
Yeah, this seems to work properly, and doesn't cause the issue with keys getting stuck.
Thanks for the confirmation! I'll wait for another review before merging though as the changes are rather profound and maybe there is an edge case that neither of us triggered during testing.
unfortunately, the behavior seems to be back. I'm not sure what is causing it specifically. The define above seems to allow it to work a bit longer before going crazy. And specifically, it looks like the last report sent gets stuck (eg, not cleared properly). A key or two will work fine before the issue triggers.
You are using an STM32F4 board as the testing device right?
STM32F411 and STM32F303
I've been using my F303 split keyboard for multiple hours and couldn't trigger the described behavior (mashing buttons and all that) on a host with a Linux OS (Ubuntu 23.04). Could you describe some specific actions (e.g. oneshot mods combined with button mashing) that trigger those hangs?
honestly, it's frustratingly inconsistent. right now, I can get it to trigger consistently on my moonlander (stm32f303), but no longer will occur. And I just tap normal keycodes a couple of times and it triggers. Basic keycodes, too. And I can confirm that it happens on RP2040 (adafruit macropad).
This refactoring unifies the endpoint handling for all IN and OUT endpoints (except for EP0 and IDLE handling) for ChibiOS based keyboards:

- All endpoints are now backed by a configurable buffer that queues received reports, or reports that shall be sent.
- IN endpoints are served by the `send_report` function.
- OUT endpoints are served by the `receive_report` function.

Sending is now asynchronous and doesn't block the main loop when there is space in the sending buffer. If the sending buffer is full, a short timeout is applied to wait for new space in the buffer. If this fails, a "disconnected" endpoint is assumed and the next report will no longer apply a timeout. If sending the report fails, the current queue is reset and the report is enqueued.

This queueing and timeout of reports makes the sending resilient to:

1. Host USB stack implementations that don't support advanced features like NKRO, extra keys or mouse keys.
2. Short hiccups in the host USB stack where an endpoint isn't polled for a short amount of time, e.g. W11/W10 has a resume from suspend quirk where the first report with a key down event is received but a fast subsequent key release event can only be received ~500ms-1s later. The queueing of these events should prevent the stuck key.

Signed-off-by: Stefan Kerkmann <karlk90@pm.me>
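The queue, timeout and reset behavior described in this commit message can be sketched roughly as follows. This is only an illustration and not the code from the PR: `report_queue_t`, `send_report_sketch`, `queue_pop`, `wait_always_times_out`, `REPORT_SIZE` and `QUEUE_DEPTH` are hypothetical names, the timeout is modelled as a callback, and clearing the "disconnected" flag once the host drains a report is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REPORT_SIZE 8 /* hypothetical report size in bytes */
#define QUEUE_DEPTH 4 /* hypothetical number of buffered reports */

typedef struct {
    uint8_t reports[QUEUE_DEPTH][REPORT_SIZE];
    size_t  head;      /* index of the oldest queued report */
    size_t  count;     /* number of queued reports */
    bool    timed_out; /* set once a wait for free space has failed */
} report_queue_t;

/* Stand-in for a bounded wait; returns true if the host freed a slot in time. */
typedef bool (*wait_for_space_fn)(report_queue_t *q);

/* Demo wait that models a host which never polls the endpoint. */
int  wait_calls = 0;
bool wait_always_times_out(report_queue_t *q) {
    (void)q;
    wait_calls++;
    return false;
}

void queue_push(report_queue_t *q, const uint8_t *report) {
    size_t tail = (q->head + q->count) % QUEUE_DEPTH;
    memcpy(q->reports[tail], report, REPORT_SIZE);
    q->count++;
}

/* Models the host polling the endpoint and taking the oldest report. */
bool queue_pop(report_queue_t *q, uint8_t *out) {
    if (q->count == 0) return false;
    memcpy(out, q->reports[q->head], REPORT_SIZE);
    q->head = (q->head + 1) % QUEUE_DEPTH;
    q->count--;
    q->timed_out = false; /* assumption: a drained report means the endpoint is alive */
    return true;
}

/* Returns true if the report was queued without discarding older reports. */
bool send_report_sketch(report_queue_t *q, const uint8_t *report, wait_for_space_fn wait) {
    if (q->count == QUEUE_DEPTH && !q->timed_out) {
        /* Buffer full: wait once; a failed wait marks the endpoint as "disconnected". */
        if (!wait(q)) q->timed_out = true;
    }
    if (q->count < QUEUE_DEPTH) {
        queue_push(q, report);
        return true;
    }
    /* Still full: reset the queue and keep only the newest report. */
    q->timed_out = true;
    q->head      = 0;
    q->count     = 0;
    queue_push(q, report);
    return false;
}
```

With a host that never polls, the first overflow waits once and then resets the queue; later overflows skip the wait entirely, so the main loop is not stalled repeatedly by a dead endpoint.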
Previously this would be done in the SOF callback, which is fired with a frequency of 1 kHz, thus wasting many cycles on context switching overhead and usually doing no work, as the endpoints have no data to send most of the time. Instead we now flush the endpoints in dedicated endpoint tasks, which is a cheap NULL pointer check when no work is to be done. Signed-off-by: Stefan Kerkmann <karlk90@pm.me>
The USB idle rate handling for the non-NKRO keyboard endpoint was implemented via a virtual timer that fired asynchronously at the host requested idle rate. Sending was done in the callback of the virtual timer in an ISR context. This is no longer an option, as the asynchronous endpoints take ownership of all used USB endpoints and thus assume that all reports will be sent via their API. This commit introduces a report storage that holds the last successfully sent USB report for all USB IN **HID** endpoints. This report storage is modelled in an OOP fashion, as the shared endpoint with its multiple report types complicates the handling. With this report storage we can not only fulfill the idle rate handling but also the mandatory get report requests for all USB HID IN endpoints in a generic fashion. Signed-off-by: Stefan Kerkmann <karlk90@pm.me>
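To make the "OOP fashion" report storage more concrete, here is a minimal sketch of a struct that carries its own handlers. It is not the `usb_report_storage_t` from the PR: `report_storage_t`, `make_report_storage`, the `default_*` functions and `MAX_REPORT_SIZE` are hypothetical, and a shared endpoint with several report types would presumably install different handlers. The only outside fact relied on is that the HID SET_IDLE duration is expressed in units of 4 ms, with 0 meaning "indefinite".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_REPORT_SIZE 32 /* hypothetical upper bound for one report */

typedef struct report_storage report_storage_t;

struct report_storage {
    uint8_t  last_report[MAX_REPORT_SIZE]; /* last successfully sent report */
    size_t   last_length;
    uint8_t  idle_rate;    /* HID SET_IDLE value in units of 4 ms, 0 = indefinite */
    uint32_t last_sent_ms; /* timestamp of the last send */
    /* "Methods": per-endpoint behavior is swapped by installing other handlers. */
    void   (*set_report)(report_storage_t *self, const uint8_t *data, size_t length, uint32_t now_ms);
    size_t (*get_report)(const report_storage_t *self, uint8_t *out, size_t capacity);
    bool   (*idle_timer_elapsed)(const report_storage_t *self, uint32_t now_ms);
};

void default_set_report(report_storage_t *self, const uint8_t *data, size_t length, uint32_t now_ms) {
    if (length > MAX_REPORT_SIZE) length = MAX_REPORT_SIZE;
    memcpy(self->last_report, data, length);
    self->last_length  = length;
    self->last_sent_ms = now_ms;
}

/* Answers a GET_REPORT request from the stored copy. */
size_t default_get_report(const report_storage_t *self, uint8_t *out, size_t capacity) {
    size_t n = self->last_length < capacity ? self->last_length : capacity;
    memcpy(out, self->last_report, n);
    return n;
}

/* True when the stored report should be resent because the idle period passed. */
bool default_idle_timer_elapsed(const report_storage_t *self, uint32_t now_ms) {
    if (self->idle_rate == 0) return false; /* indefinite: only send on change */
    return (uint32_t)(now_ms - self->last_sent_ms) >= (uint32_t)self->idle_rate * 4u;
}

report_storage_t make_report_storage(void) {
    report_storage_t s   = {0};
    s.set_report         = default_set_report;
    s.get_report         = default_get_report;
    s.idle_timer_elapsed = default_idle_timer_elapsed;
    return s;
}
```

Because the same stored copy serves both the idle resend and GET_REPORT, neither path needs to touch the endpoint from an ISR context.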
I've rebased onto Tested on RP2040 and STM32F411 again. |
#define QMK_USB_REPORT_STORAGE(_get_report, _set_report, _reset_report, _get_idle, _set_idle, _idle_timer_elasped, _report_count, _reports...) \
    &((usb_report_storage_t){ \
        .reports = (_Alignas(4) usb_fs_report_t *[_report_count]){_reports}, \
It turned out that this kind of usage of `_Alignas` in compound literals is supported only in GCC >= 8.x. Because of that, GNU Guix is no longer suitable for building the QMK firmware (they have only `gcc-arm-none-eabi-7-2018-q2-update`).
This particular usage of `_Alignas` is also questionable: apparently it is applied to an array of pointers, which should be aligned appropriately anyway. Other places where the alignment is applied to byte arrays used for endpoint buffers look reasonable though.
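A rough sketch of the distinction drawn in this comment, assuming a named object would be acceptable in place of the compound literal: the byte buffer keeps its explicit alignment on an ordinary declaration, while the pointer array relies on its natural alignment. All `demo_*` names are hypothetical and the types are simplified stand-ins, not the QMK definitions.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t data[8];
} demo_report_t;

typedef struct {
    demo_report_t **reports;
    size_t          count;
} demo_storage_t;

/* Byte arrays have alignment 1 by default, so an explicit request is meaningful here. */
_Alignas(4) uint8_t demo_ep_buffer[64];

demo_report_t demo_report_a, demo_report_b;

/* A plain named array of pointers: no _Alignas and no compound literal involved. */
demo_report_t *demo_reports[] = {&demo_report_a, &demo_report_b};

/* Holds on typical 32-bit and 64-bit targets; an assumption, not a guarantee for every ABI. */
_Static_assert(_Alignof(demo_report_t *) >= 4, "pointer arrays are expected to be at least 4-byte aligned");

demo_storage_t demo_storage = {
    .reports = demo_reports,
    .count   = sizeof(demo_reports) / sizeof(demo_reports[0]),
};
```

Whether the real macro can be restructured this way depends on how it is expanded in the PR, which this excerpt does not show.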
Description
This PR aims to unify the ChibiOS USB endpoint implementation for all IN and OUT endpoints into fully async operation.
Current ChibiOS USB endpoint implementation
Previously unidirectional sending for all IN endpoints was implemented in a blocking manner. Sending was implemented by invoking the ChibiOS USB API directly, waiting with a fixed timeout of 100ms when the endpoint wasn't ready and discarding any report that couldn't be sent.
Bidirectional IN and OUT endpoints used a driver that was derived from the ChibiOS CDC driver that was already async in operation. This implementation wasn't suitable for unidirectional endpoints though and discarded any new report if the sending queue was full.
Why asynchronous report handling instead of simple blocking?
The basic assumption is that we don't know when an IN endpoint is polled the next time or at all and we can't apply a heuristic for how long to wait. Likewise we don't know if an OUT endpoint will ever receive data. Some examples to back this up:
- The console endpoint is not regularly polled on Windows and Linux if no application, e.g. `qmk console`, is listening to this endpoint. See "All ARM Keyboards do not work on Ubuntu 18.04" #5631.
- Simple USB stack implementations that are found in BIOSes might not support all QMK features like extrakey, CDC or MIDI endpoints, but the user might still attempt to send out a USB report of this kind.
- After resuming from suspend a host might immediately poll an IN endpoint, but the 2nd poll has a gap of multiple seconds. This was happening on Windows hosts, see this AVR example: "[Bug] Teensy 2.0 handwired pad key stuck after Windows resumes from sleep" #19871.
Furthermore USB HID keyboard reports (and other absolute mode reports) are stateful. A pressed key report must be followed by a report that releases the key again or it will remain stuck.
From this I derive the following requirements:
Which leads to my conclusion that:
Which is done in this PR.
As the USB idle rate and get report request handling was tightly coupled to the blocking sending implementation it was refactored as well to not introduce any regression in features. A benefit is that these are now supported for all IN endpoints in a generic manner.
Tests
All tests are run against `handwired/onekey/<board>:default` keyboards with the `console`, `nkro` and `mousekeys` features enabled, so the most used feature set is covered.
Automated USB compliance tests
For compliance tests the USB3CV tool version 3.0.0.0 is used on a W11 host.
Manual suspend and resume tests
These manual tests verified that resume and suspend are working as expected, which means:
Types of Changes
Issues Fixed or Closed by This PR
Checklist