Bluetooth: controller: Fix premature connection event close #28089
@jjamesson FYI, please try the following Kconfig options in your hci_uart application:
Below is the throughput test of an nRF52840 hci_uart board connected to an nRF52840 throughput sample, connected to each other using 5 wires and a UART at 1 Mbps baud rate:
No. of times to force MD bit to be set in Tx PDU after a successful
transmission of non-empty PDU.

This will prolong the connection event to from being closed in cases
Would it make sense to allow runtime reconfiguration of this value? When it's non-zero, does it mean higher power consumption for all connections?
Would it make sense to allow runtime reconfiguration of this value?
I can't think of an easy way for this value to be determined at runtime.
If a new Tx buffer is supplied every 2.6 ms, and a BLE Trx for an empty PDU takes 388 us, this number shall be higher than ~7.
When it's non zero does it mean higher power consumption for all connections?
Yes, power consumption per connection event with a non-empty PDU will be higher. I will make this feature kick in only if the Trx count in a radio event is BT_CTLR_TX_BUFFERS or more; this way only high-throughput scenarios will be affected.
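The ~7 figure above can be reproduced with a small calculation. The following is only a sketch of that arithmetic, not code from this PR: the helper name `force_md_count_estimate` and its parameters are hypothetical, and it assumes the count is simply the number of empty-PDU exchanges needed to cover the interval between two Tx buffers, rounded up.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of the Zephyr controller): estimate how many
 * empty-PDU Tx/Rx exchanges are needed to keep a connection event open until
 * the next Tx buffer arrives.
 *
 * tx_supply_interval_us: time between two Tx buffers from the application
 * empty_trx_us:          duration of one empty-PDU Tx/Rx exchange
 */
static uint32_t force_md_count_estimate(uint32_t tx_supply_interval_us,
					uint32_t empty_trx_us)
{
	/* A zero-length exchange would make the division meaningless */
	assert(empty_trx_us > 0U);

	/* Integer ceiling of interval / exchange time */
	return (tx_supply_interval_us + empty_trx_us - 1U) / empty_trx_us;
}
```

With the values quoted in the comment (2600 us between buffers, 388 us per empty exchange), the helper returns 7, since 2600 / 388 is roughly 6.7.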
@@ -50,6 +50,11 @@ static uint16_t trx_cnt;
static uint8_t mic_state;
#endif /* CONFIG_BT_CTLR_LE_ENC */

#if defined(CONFIG_BT_CTLR_FORCE_MD_COUNT) && \
As I am a big fan of avoiding the preprocessor, I would do something like
#if defined(CONFIG_BT_CTLR_FORCE_MD_COUNT) && (CONFIG_BT_CTLR_FORCE_MD_COUNT > 0)
#define FORCE_MD_COUNT CONFIG_BT_CTLR_FORCE_MD_COUNT
#else
#define FORCE_MD_COUNT 0
#endif
and use that everywhere else as a C value.
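To illustrate the suggestion, here is a minimal sketch of how the macro might then be used in ordinary C conditions instead of `#if` blocks. The helper `force_md_consume` and its counter argument are hypothetical and not taken from the PR; when the Kconfig symbol is absent the macro falls back to 0 and the compiler can drop the branch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Single preprocessor block, as suggested in the review comment */
#if defined(CONFIG_BT_CTLR_FORCE_MD_COUNT) && (CONFIG_BT_CTLR_FORCE_MD_COUNT > 0)
#define FORCE_MD_COUNT CONFIG_BT_CTLR_FORCE_MD_COUNT
#else
#define FORCE_MD_COUNT 0
#endif

/* Hypothetical helper: consume one forced-MD credit if the feature is
 * enabled and credits remain. Returns true when the MD bit should be forced.
 */
static bool force_md_consume(uint8_t *remaining)
{
	assert(remaining != NULL);

	/* Plain C condition; constant-folded away when FORCE_MD_COUNT is 0 */
	if ((FORCE_MD_COUNT > 0) && (*remaining > 0U)) {
		(*remaining)--;
		return true;
	}

	return false;
}
```

The benefit is that both branches are always seen by the compiler, so code for the disabled configuration cannot silently rot.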
yes, will reduce the conditional compile noise.
Force-pushed from 2928fb7 to 3bf04a7
Hi, @cvinayak,
I have tried the contents of this PR with my embedded platform (see thread #27981), and indeed the throughput increased for me. However, I am unable to match the speeds that you have demonstrated. Where I previously had throughputs of ~410 kbps and ~100 kbps (with connection intervals of 6 units and 31 units respectively), I now achieve throughputs of ~570 kbps and ~360 kbps. This is definitely better, but I still expect this number could go higher.
I did some troubleshooting, and I see that the controller occasionally utilizes the connection interval pretty well, and at other times starts stuttering again when sending back ACKs. See the image below, where I use a connection interval of 31 units.
With regards,
Force-pushed from 3bf04a7 to f6e16bc
@jjamesson I suspect the rate of new Tx buffers made available is not fast enough, hence BLE connection events are closed. The feature to keep alive using I have updated the PR, please fetch latest for your testing.
Yes, I can easily view the TX times via an oscilloscope. I see that one TX packet does indeed take 2.6 ms to transfer, and for two sequential packets the transfer time is 5.2 ms (without an ACK in between). Thus I would assume that I did try values 8 and 9 as well, without any increase in throughput. Having PS. Perhaps it was meant this way, but the latest commit you made resulted in a throughput decrease in general: I am able to achieve a throughput of 478 kbps with this SHA, whereas with the latest but one commit it was 568 kbps.
Please note the image displayed previously. The green signal represents the ACKs from the controller. As I see it, in some places the ACKs are not very frequent, but then all of a sudden the controller starts doing faster bursts. Would this be expected if the number of TX buffers is 0?
With regards,
Hi @jjamesson
The commit added was ae42af0
Reasons for connection events to close:
My throughput numbers are with nRF52dk_nrf52832 running the throughput sample from the NCS repository as the peer. This application will not close the connection event if received PDUs have the MD bit set. With the > 690 kbps throughput reported by the application, I am assuming I don't have CRC errors. If you can provide me an nRF sniffer log for your latest throughput testing, I can conclude on the reason for your bursts.
-Vinayak
Hi, @cvinayak,
When I tried this, I cloned your repo and checked out the github_hci_uart_throughput branch, so I was referring to the 2 latest commits in your branch, namely f6e16bc5 and a922a9e4.
I see that the bursts were present in the a922a9e4 commit, and I'm not able to reproduce them with the latest f6e16bc5 commit. The incoming ACKs are more controlled, but then again, the throughput is decreased by the absence of these bursts, and I still get few ACKs per connection interval. Here's the nRF sniffer log for a connection interval of 31. Maybe this uncovers something.
With regards,
Force-pushed from f6e16bc to 3890910
Force-pushed from 3890910 to 6d1188a
@jjamesson from the sniffer trace, a new Tx packet was not made available within the keep-alive forced MD bit count of time (I observe ~3.9 ms between packets no. 2712 and 2727). The next connection event did not have all its BT_CTLR_TX_BUFFERS filled, hence the keep-alive logic did not fire. There is a similar pattern around packets 492 to 507, etc. I am on Slack with the nickname vich, and available in Zephyr's Bluetooth channel. In case our time zones overlap, we can discuss live.
@jjamesson please use this link to join Slack if you would like to discuss this in real-time: https://tinyurl.com/y5glwylp
Highlights of discussion in Slack:
Force-pushed from cccb4b4 to a55c6f2
Force-pushed from a55c6f2 to e54230f
Force-pushed from e54230f to 7bb9920
@jjamesson Please try a build of hci_uart with the latest commits in this PR, and let me know if the throughput is satisfactory. I have added a draft automatic calculation of the Force MD bit count, hence you do not need to set any explicit Kconfig options other than configuring the max data length and Tx buffer size.
@nordic-krch could you please test this with the
@cvinayak this needs a rebase now.
Force-pushed from 7bb9920 to 8b24349
All in order. @MaureenHelm and @ioannisg we really need this for 2.4.0, it fixes an impactful issue with throughput.
I think it is safe to get this in at this late stage of the release, considering that it has been tested extensively while it was waiting for approvals.
Remove commented out code. Signed-off-by: Vinayak Kariappa Chettimada <vich@nordicsemi.no>
Minor relocation of lll_conn_flush function to place it along with non-static functions. Signed-off-by: Vinayak Kariappa Chettimada <vich@nordicsemi.no>
Move tx_ull_dequeue function to be placed along with other static function definitions. Signed-off-by: Vinayak Kariappa Chettimada <vich@nordicsemi.no>
Add a static inline interface to get ULL context reference count. Signed-off-by: Vinayak Kariappa Chettimada <vich@nordicsemi.no>
Refactor ull_conn_tx_ack function as it no longer needs to return the connection context back to caller. Signed-off-by: Vinayak Kariappa Chettimada <vich@nordicsemi.no>
Fix premature connection event close due to the new Tx buffers not being de-multiplexed and routed to connection's Lower Link Layer context when they arrive while being inside the connection's radio event. Also, fix master prepare to demux and enqueue two Tx buffers so that MD bit can be set correctly. Relates to zephyrproject-rtos#27981. Signed-off-by: Vinayak Kariappa Chettimada <vich@nordicsemi.no>
Optimize the Tx PDU preparation: an empty PDU only needs its MD bit to be modified, while other fields need to be initialised only at power up. Signed-off-by: Vinayak Kariappa Chettimada <vich@nordicsemi.no>
Fix the check that decides to close a connection event, which was missing a check on MD bit being set for empty PDU to be Tx-ed out. Signed-off-by: Vinayak Kariappa Chettimada <vich@nordicsemi.no>
Add Force MD bit feature wherein connection events can be extended to match the rate at which applications provide new Tx data. Fixes zephyrproject-rtos#27981. Signed-off-by: Vinayak Kariappa Chettimada <vich@nordicsemi.no>
Add Force MD bit feature wherein connection events can be extended to match the rate at which applications provide new Tx data. The MD bit in Tx PDUs is forced on once Tx/Rx has happened CONFIG_BT_CTLR_TX_BUFFERS times in a connection event. The assumption here is that if the controller's Tx buffers are full, the application will probably send more Tx data during the connection event, and keeping the connection event alive will help improve overall throughput. Signed-off-by: Vinayak Kariappa Chettimada <vich@nordicsemi.no>
Added feature to in-system measure incoming Tx throughput. Signed-off-by: Vinayak Kariappa Chettimada <vich@nordicsemi.no>
Added automatic runtime calculation of Forced MD bit count based on incoming Tx buffer throughput. Signed-off-by: Vinayak Kariappa Chettimada <vich@nordicsemi.no>
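The Force MD bit behaviour described in the commit messages above can be sketched roughly as follows. This is an illustration under stated assumptions, not the controller's actual code: the struct `conn_event_state`, the functions `next_tx_md` and `event_should_close`, and the `TX_BUFFERS` value of 3 are all hypothetical stand-ins. The close rule assumes the usual Link Layer behaviour that a connection event may end once neither side signals more data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-in for CONFIG_BT_CTLR_TX_BUFFERS */
#define TX_BUFFERS 3U

/* Hypothetical per-event bookkeeping */
struct conn_event_state {
	uint16_t trx_cnt;      /* Tx/Rx exchanges completed in this event */
	uint8_t force_md_left; /* forced-MD PDUs still allowed in this event */
};

/* Decide the MD bit of the next Tx PDU (possibly an empty PDU) */
static bool next_tx_md(struct conn_event_state *s, bool more_tx_queued)
{
	assert(s != NULL);

	/* Real data pending: MD is set regardless of the force logic */
	if (more_tx_queued) {
		return true;
	}

	/* Only force MD in busy events, i.e. after TX_BUFFERS exchanges */
	if ((s->trx_cnt >= TX_BUFFERS) && (s->force_md_left > 0U)) {
		s->force_md_left--;
		return true;
	}

	return false;
}

/* The event may close only when neither the local Tx PDU nor the
 * received PDU has its MD bit set.
 */
static bool event_should_close(bool tx_md, bool rx_md)
{
	return !tx_md && !rx_md;
}
```

Gating on the exchange count means low-traffic connections keep closing their events early, so only high-throughput events pay the extra power cost.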
Force-pushed from 8b24349 to 2f622fb
Fixes the following:
Fix premature connection event close due to the new Tx buffers not being de-multiplexed and routed to connection's Lower Link Layer context when they arrive while being inside the connection's radio event.
Fix central role event prepare to demux and enqueue two Tx buffers so that MD bit can be set correctly.
Add force MD bit feature wherein connection events can be extended to match the rate at which applications provide new Tx data.
The MD bit in Tx PDUs is forced on once Tx/Rx has happened CONFIG_BT_CTLR_TX_BUFFERS times in a connection event. The assumption here is that if the controller's Tx buffers are full, the application will probably send more Tx data during the connection event, and keeping the connection event alive will help improve overall throughput.
Added feature to in-system measure incoming Tx throughput.
Added automatic runtime calculation of Forced MD bit count based on incoming Tx buffer throughput.
Fixes #27981.
Signed-off-by: Vinayak Kariappa Chettimada <vich@nordicsemi.no>
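As a rough illustration of what an in-system measurement of incoming Tx throughput could look like, the sketch below accumulates the bytes handed to the controller and divides by the elapsed time. It is not the implementation in this PR: the type `tx_rate_meter` and the functions `tx_rate_add` and `tx_rate_bps` are hypothetical, and a free-running 32-bit microsecond timestamp is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical measurement window */
struct tx_rate_meter {
	uint64_t bytes;    /* payload bytes received since start_us */
	uint32_t start_us; /* timestamp at which the window started */
};

/* Account for one incoming Tx buffer of len bytes */
static void tx_rate_add(struct tx_rate_meter *m, uint16_t len)
{
	assert(m != NULL);
	m->bytes += len;
}

/* Incoming Tx throughput in bits per second over the window so far */
static uint32_t tx_rate_bps(const struct tx_rate_meter *m, uint32_t now_us)
{
	uint32_t elapsed_us;

	assert(m != NULL);

	/* Unsigned subtraction stays correct across timer wrap-around */
	elapsed_us = now_us - m->start_us;
	if (elapsed_us == 0U) {
		return 0U;
	}

	/* 64-bit intermediate avoids overflow in bytes * 8 * 1e6 */
	return (uint32_t)((m->bytes * 8U * 1000000U) / elapsed_us);
}
```

A rate obtained this way could then feed a calculation of how long the controller has to wait for the next Tx buffer, and hence how many forced-MD exchanges are worth spending.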