-
Notifications
You must be signed in to change notification settings - Fork 8.3k
Bluetooth: Host: Remove bt_buf_get_cmd_complete
#68008
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The next commit will change the behavior of `bt_buf_get_evt` to no longer use `bt_buf_get_cmd_complete`. We have to remove the test that would prevent this. Signed-off-by: Aleksander Wasaznik <aleksander.wasaznik@nordicsemi.no>
87c846f to
1e33d88
Compare
bt_buf_get_cmd_completebt_buf_get_cmd_complete
bt_buf_get_cmd_completebt_buf_get_cmd_complete
8d52cf4 to
5a00a19
Compare
`bt_buf_get_cmd_complete` is broken due to zephyrproject-rtos#64158, and fixing it would require changing its signature and put even more complexity into the HCI drivers, as it would require the drivers to perform an even deeper peek into the event in order to observe the opcode. Instead of the above, this patch removes the use of `bt_buf_get_cmd_complete` and adds logic to allow the host to accept command complete events in normal event buffers. The above means performing a copy into the destination buffer, which is the original command buffer. This is a small inefficiency for now, but we should strive to redesign the host into a streaming architecture as much as possible and handle events immediately instead of retaining buffers. This fixes zephyrproject-rtos#64158: Like all command completed events, the completion event for `BT_HCI_OP_HOST_NUM_COMPLETED_PACKETS` is now placed in normal event buffers. The the logic where the host discards this event is already present. Since it's discarded, it will not interfere with the logic around `bt_dev.cmd_send`. Signed-off-by: Aleksander Wasaznik <aleksander.wasaznik@nordicsemi.no>
After the previous commit, this function no longer has any users. Signed-off-by: Aleksander Wasaznik <aleksander.wasaznik@nordicsemi.no>
This is purely a syntactical refactor. Signed-off-by: Aleksander Wasaznik <aleksander.wasaznik@nordicsemi.no>
5a00a19 to
b7d1ea2
Compare
|
I'm not opposed to this change, but another reason (besides being an optimisation) for the reuse is that it completely eliminates one category of bugs, namely deadlocks caused by |
and to continue on this thought: unless my memory fails me, the feature was originally inspired by wanting to get rid of fighting deadlocks like this. It was introduced in commit 50678b0 but unfortunately the commit message doesn't seem to go into much detail besides mentioning "reducing pressure on the RX buffer pool". The feature would have its 7th anniversary in a few days :) Btw, might be worth to do some profiling to see what kind of impact this has on the performance of Bluetooth Mesh, since that actually depends on a continuous stream of HCI commands (as opposed to ACL packets) for the transmission of data. @PavelVPV FYI. |
|
The CI error is unrelated. |
|
I think the deadlocking problems the host has can really only be solved a principled way: We must stop blocking the event stream from the controller. All these priority levels and extra pools and stuff are just band-aids without technical soundness. I mean all events must be treated as ISR. And I mean the event buffer must be made available for the next event immediately without any dependencies. If an event handler must keep data around, it must have it's own buffers (it can copy data into or give to the driver for DMA), or it can choose to drop the data. The handler cannot delay the event stream. It's also not true that deadlocks caused by |
I cherry-picked these 4 commits to the nrf connect sdk fork and ran our performance tests. The relay node performance is not changed. Throughput varies within the margin of error. |
That's a good point. And as a general design approach what you suggest sounds like the only safe bet. What mitigates the current situation a little bit is the concept of "discardable" events (e.g. advertising reports) but that doesn't cover everything that could be in the transport queue. |
Delete tests that did not end up bringing any value. What ended up happening is busy-work to "make the test pass" without understanding what's their original purpose. Worse, the CI change-based testing is broken and doesn't pick them up, even by PRs modifying the tests themeselves. See zephyrproject-rtos#68008 Signed-off-by: Jonathan Rico <jonathan.rico@nordicsemi.no>
Delete tests that did not end up bringing any value. What ended up happening is busy-work to "make the test pass" without understanding what's their original purpose. Worse, the CI change-based testing is broken and doesn't pick them up, even by PRs modifying the tests themeselves. See #68008 Signed-off-by: Jonathan Rico <jonathan.rico@nordicsemi.no>
bt_buf_get_cmd_completeis broken due to #64158, and fixing it would require changing its signature and put even more complexity into the HCI drivers, as it would require the drivers to perform an even deeper peek into the event in order to observe the opcode.Instead of the above, this patch removes the use of
bt_buf_get_cmd_completeand adds logic to allow the host to accept command complete events in normal event buffers.The above means performing a copy into the destination buffer, which is the original command buffer. This is a small inefficiency for now, but we should strive to redesign the host into a streaming architecture as much as possible and handle events immediately instead of retaining buffers.
This fixes #64158: Like all command completed events, the completion event for
BT_HCI_OP_HOST_NUM_COMPLETED_PACKETSis now placed in normal event buffers. The the logic where the host discards this event is already present. Since it's discarded, it will not interfere with the logic aroundbt_dev.cmd_send.