New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bt_rand() is called over HCI when BT_HOST_CRYPTO=y, even if BT_CTLR_LE_ENC=n #22202
Comments
The correct way for the host to test LE Encryption support in the controller is:
The reasons why the host is avoiding to use a local entropy source are mainly historical: in the early days of Zephyr (3-4 years ago) most boards did not have an implementation for it, which resulted in them easily falling back to using a "dummy" PRNG which was something similar to what you suggested with I'm fine with falling back to the local entropy support if the controller indicates lack of encryption support, but we have to ensure that it's not a dummy one. In any other scenario the Bluetooth init ( |
On a combined build CONFIG_BT_CTLR_CRYPTO is selected by default and the corresponding crypto.c file that calls into ecb.c functions. Switching to CONFIG_BT_HOST_CRYPTO (if there is no crypto hw acceleration in controller) on a combined build results in that deadlock |
@George-Stefan The controller shouldn't do calls into to the host. And currently bt_rand is used by the controller. For example to create the access address of the connection (ull_master.c). @cvinayak @jhedberg So I would rename the implementation and all uses of bt_rand in the controlelr to bt_ctlr_rand(), and then introduce bt_rand which calls this function. That would solve the recursion problem. For the host we should also handle the case where the controller does not provide the LE Rand function. That would mean that the host would possibly do a call to the entropy device. But if there is no entropy device in the host it appears that we have a problem. The host would need these random numbers for the Privacy feature. Which relies on secure random numbers. Using k_uptime would not be good enough for this. Is there a non-secure rand function implement somewhere? We appear to use bt_rand most places, not all of those need secure random numbers I guess? |
We could kind of do that.. I suppose. The reason I hesitate is that the core spec states LE Encryption as the feature which makes this command available. That said, instead of checking for supported features the host could be checking for supported commands (which is what we do when the core spec doesn't clearly specify what feature enables support for an optional command) |
...or then we just pretend ignorance and keep the host code presuming that this command is always available ;) |
So the use case we need to consider is a privacy-enabled host build with a controller without encryption, and where the host does not have a access to secure random numbers. Using pseudo-random numbers with privacy would mean you could track it across RPA update simply because the rand part of the RPA would be known. @jhedberg Yeah I meant that we could include the function and set the function as supported, in which case the host would have to read the supported commands instead of features. Is there any possible drawbacks to this? |
Maybe the discussion I started should be taken to it's own issue? In any case I think what should be done to solve this problem is what I already mentioned:
|
The host is already reading the supported commands and storing them into |
In my opinion this is good enough. We just need to fix the recursion problem. |
Fix infinite recursion in host-based bt_rand function. This would call HCI LE Random Number command, which would in turn call bt_rand, causing an infinite recursion. bt_rand -> prng_reseed -> BT_HCI_OP_LE_RAND -> le_rand -> bt_rand To solve this issue the controller should avoid doing calls into the host, so all calls to bt_rand in the controller should be replaced with a call to a controller function. Fixes zephyrproject-rtos#22202 Signed-off-by: Joakim Andersson <joakim.andersson@nordicsemi.no>
@alexandru-porosanu-nxp Although I don't get the exact same crash as you do, setting BT_CTLR_CRYPTO=n is definitively not working on Nordic either. |
Fix infinite recursion in host-based bt_rand function. This would call HCI LE Random Number command, which would in turn call bt_rand, causing an infinite recursion. bt_rand -> prng_reseed -> BT_HCI_OP_LE_RAND -> le_rand -> bt_rand To solve this issue the controller should avoid doing calls into the host, so all calls to bt_rand in the controller should be replaced with a call to a controller function. Fixes #22202 Signed-off-by: Joakim Andersson <joakim.andersson@nordicsemi.no>
Describe the bug
If LE encryption is not supported (like the case for VEGA controllers), and host encryption is enabled, there will still be calls to the controller for obtaining the necessary 8 bytes of random data.
This is done in two places in subsys/bluetooth/host/crypto.c, in prng_reseed() and in prng_init().
The problem is that the sequence of calls is actually a kind of infinite recursion, as per below
prng_reseed() -> BT_HCI_OP_LE_RAND -> le_rand() -> bt_rand() (host) -> prng_reseed()
The call to bt_rand() in host is done because only one implementation of the bt_rand() function (either from controller or from host) is being linked at a time.
Although I have not tested this, I suppose that this could be reproducible on Nordic as well, if setting BT_CTLR_LE_ENC_SUPPORT to n and enabling host crypto. Will test later on, when I can get access to a Nordic board.
To Reproduce
Steps to reproduce the behavior:
This is due to the fact that eventually the stack will run out.
Expected behavior
The controller random function should not be called when BT_CTLR_LE_ENC_SUPPORT is not enabled. This is according the the BLE spec, vol 2, part E, 3.1 LE Controller Requirements: where the HCI_LE_Rand command is noted as C4; per the note "C4: Mandatory if LE Feature (LE Encryption) is supported, otherwise excluded.".
Environment (please complete the following information):
Additional context
One quick-fix would be to re-implement prng_reseed() and prng_init() to call locally the entropy device. Another dirty fix would be to just use the kernel ticks (k_uptime_get) as the 8 bytes of 'entropy'.
The text was updated successfully, but these errors were encountered: