Bluetooth controller extended advertisement crashes in lll layer #48812
Comments
Some observations:
Request:
Frankly, if the issue is not something that I can reproduce with simple steps at my end, it will take a lot of effort interacting and speculating. |
Are you referring to the Bluetooth Qualification Process?
The driver implementation closely follows the H4 sample implementation, which uses UART. It implements the same rx_thread for controller -> host communication. The only difference on that path is that the data retrieved by this thread is copied and passed to our layer for transmission instead of going through the UART IRQ flow. When receiving, the only difference is that net_buf_put(tx_queue) is called from the system workqueue instead of the UART ISR. I can capture the HCI communication, which might verify that the controller is receiving the HCI commands correctly.
We have not modified the source code of the controller. It is the one in the ncs-zephyr repository.
I will test several devices with BT_ASSERT enabled and report back with the results.
I will check the UART IRQ priority; however, I assume that it is unchanged from the default priority given by nrf_hal. The custom transport layer uses the same UART API as the H4 or H5 sample, so I assume the priorities should be the same.
|
As long as all Controller's downstream command and data path, i.e. LL interface functions in
Then why do you state this:
You have modified the Controller's IRQ priorities; they can now overlap with the UART IRQ priority level. |
Disabled by Kconfig Option
I can test the same use case using default values for, however in previous version we used it:
If it might help to discover the origin of the issue, we will retest using the default priorities of the controller BT part. |
I will send a PR when I get time to not be able disabling of
I am unable to assist further without details of the assertions and detailed steps to reproduce a crash using a simple sample. I do not rule out bugs in the Controller though; you may try the latest upstream Zephyr Project repository to see if things are already fixed. |
I have performed the following test:
Removed:
Running on ten devices overnight, none of them crashed. I have one question regarding the H4 controller setup. Why is there no controller driver abstraction layer or module? The controller has to be built from the H4 sample and modified if the user requires any additional functionality from the controller chip. The ideal situation would be a clean main function, with the H4 controller driver enabled and the UART interface on which it should operate defined. |
Glad to hear :-)
The H4 sample, I assume you are referring to
Maybe I am misunderstanding; @jhedberg, maybe you have some comments on the H4 sample question. |
So the only possible option for a user to add functionality to the controller without compromising Bluetooth conformance is to implement some simple logic that can be interfaced using vendor-specific HCI commands? If I add an additional thread to the controller, dedicated to processing data from another UART interface, will it compromise Bluetooth conformance? How far can one go in adding logic to the controller without compromising Bluetooth conformance? Is there some API which can be used to easily implement vendor-specific HCI commands without directly modifying the
I can see there are a lot of Zephyr-specific HCI commands already available inside the controller. Does the Host somehow use these capabilities in |
Closing this as the bug itself is resolved, please open a discussion to continue the conversation. |
Zephyr version:
NCS 2.0.0 - Zephyr tag v3.0.99-ncs1
I'm running a Bluetooth host and controller combination of nRF9160 (Host) and nRF52833 (Controller) using the BT_LL_SW_SPLIT variant. The application is very complicated, so I cannot provide a reproducible sample. I have managed to capture a stack trace using Memfault. The problem occurs randomly, roughly once every few hours.
lll.c
Note that the trace was captured with CONFIG_BT_ASSERT=n on the controller.
We had issues with advertising raising "radio tx not ready". In production we disabled the assert to avoid unnecessary restarts, at the cost of some advertisements not being transmitted.
I can try to catch some debug output using BT_ASSERT; however, the assert string is usually lost due to deferred logging. Catching the issue with Memfault while BT_ASSERT is enabled is also a possibility, but requires more work.
Afterwards I tried to optimize IRQ times with these settings (which replaced Zero Latency Interrupts), and I haven't tested the assert variant since:
Basic description of the application:
It broadcasts a large number of advertising sets:
6 advertising sets using BT legacy advertising (1 connectable).
6 advertising sets using BT long range (Coded PHY).
The advertising rate of all sets is 4 Hz.
The application performs regular ADV_DATA updates for the advertisements at a rate of 1 Hz to 4 Hz.
Communication between the Controller and the Host is implemented by a custom layer. I can provide HCI traces using RTT BT debug if necessary. However, we are using the same layer to implement other communication between the chips and have had no problems with it. It is basically H5 (confirmed messages and retransmission) over a multiplexed UART with flow control. The controller and host do not seem to generate any error logs.
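To put the advertising load above into numbers, here is a rough back-of-the-envelope sketch in C. It assumes the 4 Hz rate means a 250 ms advertising interval and that all 12 sets are active at the same time. The helper names are hypothetical and are not part of the Zephyr API. The average budget is only an indication: it ignores that a legacy advertising event normally spans three primary channels and that Coded PHY packets take longer on air.

```c
#include <assert.h>
#include <stdint.h>

/* BLE advertising intervals are expressed in units of 0.625 ms (625 us). */
#define ADV_INTERVAL_UNIT_US 625u

/* Hypothetical helper: convert an advertising rate in Hz to an
 * advertising interval in 0.625 ms units.
 */
static uint32_t adv_interval_units(uint32_t rate_hz)
{
	assert(rate_hz > 0);
	return (1000000u / rate_hz) / ADV_INTERVAL_UNIT_US;
}

/* Hypothetical helper: total advertising events per second when
 * num_sets sets all advertise at rate_hz.
 */
static uint32_t adv_events_per_second(uint32_t num_sets, uint32_t rate_hz)
{
	return num_sets * rate_hz;
}

/* Hypothetical helper: average time in microseconds available per
 * advertising event if the events were spread evenly over one second.
 */
static uint32_t avg_event_budget_us(uint32_t num_sets, uint32_t rate_hz)
{
	uint32_t events = adv_events_per_second(num_sets, rate_hz);

	assert(events > 0);
	return 1000000u / events;
}
```

With the figures from the description, adv_interval_units(4) gives 400 units (250 ms), adv_events_per_second(12, 4) gives 48 events per second, and avg_event_budget_us(12, 4) gives roughly 20.8 ms per event on average. On top of this, 12 sets updated at 1 Hz to 4 Hz correspond to between 12 and 48 ADV_DATA updates per second over HCI.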
Configuration of the host(BT part):
Configuration of the controller(BT part):
What other outputs might be helpful to identify the origin of the problem?