Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ble_gattc_disc_all_svcs() fails on ESP32S3 with several peripherals (IDFGH-8664) #10109

Closed
3 tasks done
kojibuta opened this issue Nov 4, 2022 · 60 comments
Closed
3 tasks done
Labels
Resolution: Done Issue is done internally Status: Done Issue is done internally Type: Bug bugs in IDF

Comments

@kojibuta
Copy link

kojibuta commented Nov 4, 2022

Answers checklist.

  • I have read the documentation ESP-IDF Programming Guide and the issue is not addressed there.
  • I have updated my IDF branch (master or release) to the latest version and checked that the issue is present there.
  • I have searched the issue tracker for a similar issue and not found a similar issue.

IDF version.

ESP-IDF v4.4.2-388-g755ce1077d

Operating System used.

macOS

How did you build your project?

Command line with idf.py

If you are using Windows, please specify command line type.

No response

Development Kit.

Custom board

Power Supply used.

USB

What is the expected behavior?

When using NimBLE (BLE central role) on ESP32-S3 it is not possible to discover services on some peripheral devices. The same code runs smoothly on ESP32 without errors.

After connection is established with a peripheral (event BLE_GAP_EVENT_CONNECT received with event->connect.status == 0) calling ble_gattc_disc_all_svcs should result in invokation of callback function ble_gatt_disc_svc_fn with error->status == 0.

What is the actual behavior?

When running the code on ESP32-S3, after connection is established with a peripheral, calling ble_gattc_disc_all_svcs results in immediate invokation of callback function ble_gatt_disc_svc_fn with error->status == 7.

Steps to reproduce.

  1. Flash central test_app to a ESP32-S3 board
  2. Power on the board
  3. Start monitoring the log
  4. Power on the peripheral
  5. Error appears on the log and service discovery fails
  6. Flash central test_app to a ESP32 board
  7. Power on the board
  8. Start monitoring the log
  9. Power on the peripheral
  10. Service discovery is successful and no error appears on the log

ESP32-S3 logs (SleepO2 device OK, B02T device KO)
ESP32-S3 board + SleepO2 device: esp32-s3-peripheral-1.log
ESP32 board + B02T device: esp32-s3-peripheral-2.log

ESP32 logs (SleepO2 and B02T device OK)
ESP32 board + SleepO2 device: esp32-peripheral-1.log
ESP32 board + B02T device: esp32-peripheral-2.log

Debug Logs.

--- idf_monitor on /dev/cu.usbserial-120 115200 ---
--- Quit: Ctrl+] | Menu: Ctrl+T | Help: Ctrl+T followed by Ctrl+H ---
ESP-ROM:esp32s3-20210327
Build:Mar 27 2021
rst:0x1 (POWERON),boot:0x8 (SPI_FAST_FLASH_BOOT)
SPIWP:0xee
mode:DIO, clock div:1
load:0x3fce3808,len:0x1670
load:0x403c9700,len:0xbb8
load:0x403cc700,len:0x2e8c
entry 0x403c9954
I (24) boot: ESP-IDF v4.4.2-388-g755ce1077d 2nd stage bootloader
I (25) boot: compile time 18:46:00
I (25) boot: chip revision: 0
I (28) boot.esp32s3: Boot SPI Speed : 80MHz
I (32) boot.esp32s3: SPI Mode       : DIO
I (37) boot.esp32s3: SPI Flash Size : 2MB
I (42) boot: Enabling RNG early entropy source...
I (47) boot: Partition Table:
I (51) boot: ## Label            Usage          Type ST Offset   Length
I (58) boot:  0 nvs              WiFi data        01 02 00009000 00006000
I (66) boot:  1 phy_init         RF data          01 01 0000f000 00001000
I (73) boot:  2 factory          factory app      00 00 00010000 00100000
I (81) boot: End of partition table
I (85) esp_image: segment 0: paddr=00010020 vaddr=3c050020 size=0e780h ( 59264) map
I (104) esp_image: segment 1: paddr=0001e7a8 vaddr=3fc95b70 size=01870h (  6256) load
I (105) esp_image: segment 2: paddr=00020020 vaddr=42000020 size=479c0h (293312) map
I (163) esp_image: segment 3: paddr=000679e8 vaddr=3fc973e0 size=0213ch (  8508) load
I (165) esp_image: segment 4: paddr=00069b2c vaddr=40374000 size=11b64h ( 72548) load
I (184) esp_image: segment 5: paddr=0007b698 vaddr=50000000 size=00010h (    16) load
I (192) boot: Loaded app from partition at offset 0x10000
I (192) boot: Disabling RNG early entropy source...
I (204) cpu_start: Pro cpu up.
I (205) cpu_start: Starting app cpu, entry point is 0x403751d8
0x403751d8: call_start_cpu1 at /Users/roberto/esp/esp-idf/components/esp_system/port/cpu_start.c:148

I (0) cpu_start: App cpu up.
I (219) cpu_start: Pro cpu start user code
I (219) cpu_start: cpu freq: 160000000
I (219) cpu_start: Application information:
I (221) cpu_start: Project name:     test_app
I (226) cpu_start: App version:      driver-v1.1.6-b13-3-gbf6b4e6-di
I (233) cpu_start: Compile time:     Nov  4 2022 18:45:55
I (239) cpu_start: ELF file SHA256:  bc8b6e398803187d...
I (245) cpu_start: ESP-IDF:          v4.4.2-388-g755ce1077d
I (252) heap_init: Initializing. RAM available for dynamic allocation:
I (259) heap_init: At 3FC9B798 len 0004DF78 (311 KiB): D/IRAM
I (265) heap_init: At 3FCE9710 len 00005724 (21 KiB): STACK/DRAM
I (272) heap_init: At 3FCF0000 len 00008000 (32 KiB): DRAM
I (278) heap_init: At 600FE000 len 00002000 (8 KiB): RTCRAM
I (285) spi_flash: detected chip: gd
I (289) spi_flash: flash io: dio
W (293) spi_flash: Detected size(16384k) larger than the size in the binary image header(2048k). Using the size in the binary image header.
I (306) sleep: Configure to isolate all GPIO pins in sleep state
I (313) sleep: Enable automatic switching of GPIO sleep configuration
I (320) coexist: coexist rom version e7ae62f
I (325) cpu_start: Starting scheduler on PRO CPU.
I (0) cpu_start: Starting scheduler on APP CPU.
W (407) BT_INIT: esp_bt_controller_mem_release not implemented, return OK
W (407) BT_INIT: esp_bt_controller_mem_release not implemented, return OK
I (407) BT_INIT: BT controller compile version [421c279]
I (417) phy_init: phy_version 503,13653eb,Jun  1 2022,17:47:08
I (457) system_api: Base MAC address is not set
I (457) system_api: read default base MAC address from EFUSE
I (457) BT_INIT: Bluetooth MAC: f4:12:fa:cb:81:b6

I (467) TEST_APP: test_app_host_task
I (477) NimBLE: GAP procedure initiated: stop advertising.

I (477) TEST_APP: test_app_on_sync
I (477) TEST_APP: test_app_scan: scanning...
I (487) NimBLE: GAP procedure initiated: discovery; 
I (487) NimBLE: own_addr_type=0 filter_policy=0 passive=0 limited=0 filter_duplicates=1 
I (497) NimBLE: duration=forever
I (507) NimBLE: 

I (547) TEST_APP: test_app_gap_event: 7 BLE_GAP_EVENT_DISC
I (547) TEST_APP: test_app_gap_event: 7 BLE_GAP_EVENT_DISC
I (557) TEST_APP: test_app_gap_event: 7 BLE_GAP_EVENT_DISC
I (567) TEST_APP: test_app_gap_event: 7 BLE_GAP_EVENT_DISC
I (577) TEST_APP: test_app_gap_event: 7 BLE_GAP_EVENT_DISC
I (577) TEST_APP: test_app_gap_event: 7 BLE_GAP_EVENT_DISC
I (587) TEST_APP: test_app_gap_event: 7 BLE_GAP_EVENT_DISC
I (807) TEST_APP: test_app_gap_event: 7 BLE_GAP_EVENT_DISC
I (807) TEST_APP: test_app_gap_event: name miaomiao
I (877) TEST_APP: test_app_gap_event: 7 BLE_GAP_EVENT_DISC
I (877) TEST_APP: test_app_gap_event: 7 BLE_GAP_EVENT_DISC
I (877) TEST_APP: test_app_gap_event: name WGX_iBeacon
I (897) TEST_APP: test_app_gap_event: 7 BLE_GAP_EVENT_DISC
I (897) TEST_APP: test_app_gap_event: 7 BLE_GAP_EVENT_DISC
I (897) TEST_APP: test_app_gap_event: name B6
I (1127) TEST_APP: test_app_gap_event: 7 BLE_GAP_EVENT_DISC
I (1127) TEST_APP: test_app_gap_event: 7 BLE_GAP_EVENT_DISC
I (1127) TEST_APP: test_app_gap_event: 7 BLE_GAP_EVENT_DISC
I (1127) TEST_APP: test_app_gap_event: name Holy-IOT
I (1137) TEST_APP: test_app_gap_event: 7 BLE_GAP_EVENT_DISC
I (1137) TEST_APP: test_app_gap_event: 7 BLE_GAP_EVENT_DISC
I (1147) TEST_APP: test_app_gap_event: name Jinou_Beacon
I (1937) TEST_APP: test_app_gap_event: 7 BLE_GAP_EVENT_DISC
I (1937) TEST_APP: test_app_gap_event: 7 BLE_GAP_EVENT_DISC
I (1937) TEST_APP: test_app_gap_event: name ThermoBeacon
I (2807) TEST_APP: test_app_gap_event: 7 BLE_GAP_EVENT_DISC
I (2807) TEST_APP: test_app_gap_event: name DuoEK 1429
I (2807) TEST_APP: test_app_gap_event: 7 BLE_GAP_EVENT_DISC
I (5137) TEST_APP: test_app_gap_event: 7 BLE_GAP_EVENT_DISC
I (5147) TEST_APP: test_app_gap_event: 7 BLE_GAP_EVENT_DISC
I (5147) TEST_APP: test_app_gap_event: name BPM-188
I (5147) NimBLE: GAP procedure initiated: connect; 
I (5157) NimBLE: peer_addr_type=0 peer_addr=
I (5157) NimBLE: a4:c1:38:84:01:cc
I (5167) NimBLE:  scan_itvl=16 scan_window=16 itvl_min=24 itvl_max=40 latency=0 supervision_timeout=256 min_ce_len=0 max_ce_len=0 own_addr_type=0
I (5177) NimBLE: 

I (5177) TEST_APP: test_app_connect: connecting...
I (5237) TEST_APP: test_app_gap_event: 0 BLE_GAP_EVENT_CONNECT
I (5237) TEST_APP: test_app_gap_event: status=0
I (5237) TEST_APP: print_conn_desc: handle=1 our_ota_addr_type=0 our_ota_addr=f4:12:fa:cb:81:b6
I (5247) TEST_APP: print_conn_desc: our_id_addr_type=0 our_id_addr=f4:12:fa:cb:81:b6
I (5257) TEST_APP: print_conn_desc: peer_ota_addr_type=0 peer_ota_addr=a4:c1:38:84:01:cc
I (5267) TEST_APP: print_conn_desc: peer_id_addr_type=0 peer_id_addr=a4:c1:38:84:01:cc
I (5267) TEST_APP: print_conn_desc: conn_itvl=40 conn_latency=0 supervision_timeout=256 encrypted=0 authenticated=0 bonded=0
I (5287) NimBLE: GATT procedure initiated: discover all services

E (5567) TEST_APP: test_app_svc_disced: conn_handle=1 uuid= error_status=7
I (5567) NimBLE: GAP procedure initiated: connect; 
I (5567) NimBLE: peer_addr_type=0 peer_addr=
I (5577) NimBLE: a4:c1:38:84:01:cc
I (5577) NimBLE:  scan_itvl=16 scan_window=16 itvl_min=24 itvl_max=40 latency=0 supervision_timeout=256 min_ce_len=0 max_ce_len=0 own_addr_type=0
I (5587) NimBLE: 

I (5627) TEST_APP: test_app_gap_event: 0 BLE_GAP_EVENT_CONNECT
I (5627) TEST_APP: test_app_gap_event: status=0
I (5627) TEST_APP: print_conn_desc: handle=1 our_ota_addr_type=0 our_ota_addr=f4:12:fa:cb:81:b6
I (5637) TEST_APP: print_conn_desc: our_id_addr_type=0 our_id_addr=f4:12:fa:cb:81:b6
I (5647) TEST_APP: print_conn_desc: peer_ota_addr_type=0 peer_ota_addr=a4:c1:38:84:01:cc
I (5657) TEST_APP: print_conn_desc: peer_id_addr_type=0 peer_id_addr=a4:c1:38:84:01:cc
I (5657) TEST_APP: print_conn_desc: conn_itvl=40 conn_latency=0 supervision_timeout=256 encrypted=0 authenticated=0 bonded=0
I (5677) NimBLE: GATT procedure initiated: discover all services

E (5967) TEST_APP: test_app_svc_disced: conn_handle=1 uuid= error_status=7
I (5967) NimBLE: GAP procedure initiated: connect; 
I (5967) NimBLE: peer_addr_type=0 peer_addr=
I (5977) NimBLE: a4:c1:38:84:01:cc
I (5977) NimBLE:  scan_itvl=16 scan_window=16 itvl_min=24 itvl_max=40 latency=0 supervision_timeout=256 min_ce_len=0 max_ce_len=0 own_addr_type=0
I (5987) NimBLE: 

I (6017) TEST_APP: test_app_gap_event: 0 BLE_GAP_EVENT_CONNECT
I (6017) TEST_APP: test_app_gap_event: status=0
I (6017) TEST_APP: print_conn_desc: handle=1 our_ota_addr_type=0 our_ota_addr=f4:12:fa:cb:81:b6
I (6027) TEST_APP: print_conn_desc: our_id_addr_type=0 our_id_addr=f4:12:fa:cb:81:b6
I (6037) TEST_APP: print_conn_desc: peer_ota_addr_type=0 peer_ota_addr=a4:c1:38:84:01:cc
I (6047) TEST_APP: print_conn_desc: peer_id_addr_type=0 peer_id_addr=a4:c1:38:84:01:cc
I (6047) TEST_APP: print_conn_desc: conn_itvl=40 conn_latency=0 supervision_timeout=256 encrypted=0 authenticated=0 bonded=0
I (6067) NimBLE: GATT procedure initiated: discover all services

E (6367) TEST_APP: test_app_svc_disced: conn_handle=1 uuid= error_status=7
I (6367) NimBLE: GAP procedure initiated: connect; 
I (6367) NimBLE: peer_addr_type=0 peer_addr=
I (6377) NimBLE: a4:c1:38:84:01:cc
I (6377) NimBLE:  scan_itvl=16 scan_window=16 itvl_min=24 itvl_max=40 latency=0 supervision_timeout=256 min_ce_len=0 max_ce_len=0 own_addr_type=0
I (6387) NimBLE: 

I (6407) TEST_APP: test_app_gap_event: 0 BLE_GAP_EVENT_CONNECT
I (6407) TEST_APP: test_app_gap_event: status=0
I (6407) TEST_APP: print_conn_desc: handle=1 our_ota_addr_type=0 our_ota_addr=f4:12:fa:cb:81:b6
I (6417) TEST_APP: print_conn_desc: our_id_addr_type=0 our_id_addr=f4:12:fa:cb:81:b6
I (6427) TEST_APP: print_conn_desc: peer_ota_addr_type=0 peer_ota_addr=a4:c1:38:84:01:cc
I (6427) TEST_APP: print_conn_desc: peer_id_addr_type=0 peer_id_addr=a4:c1:38:84:01:cc
I (6437) TEST_APP: print_conn_desc: conn_itvl=40 conn_latency=0 supervision_timeout=256 encrypted=0 authenticated=0 bonded=0
I (6447) NimBLE: GATT procedure initiated: discover all services

E (6717) TEST_APP: test_app_svc_disced: conn_handle=1 uuid= error_status=7
I (6717) TEST_APP: test_app_gap_event: 1 BLE_GAP_EVENT_DISCONNECT

More Information.

Error always occur using some peripheral devices, but not with every device.
Error occurs using ESP-IDF release/v4.4, release/v5.0 and latest versions.

Some peripheral causing the error are:

  • Wellue/Viatom Blood Pressure Monitor B02T
  • Viatom Thermomether AOJ-20A

Some peripheral working without errors are:

  • Viatom Pulsi Oximeter SleepO2
  • Viatom Pulsi Oximeter O2Ring
  • Viatom Blood Pressure Monitor BP2
@kojibuta kojibuta added the Type: Bug bugs in IDF label Nov 4, 2022
@espressif-bot espressif-bot added the Status: Opened Issue is new label Nov 4, 2022
@github-actions github-actions bot changed the title ble_gattc_disc_all_svcs() fails on ESP32S3 with several peripherals ble_gattc_disc_all_svcs() fails on ESP32S3 with several peripherals (IDFGH-8664) Nov 4, 2022
@kojibuta
Copy link
Author

kojibuta commented Nov 4, 2022

Here is the complete project directory: test_app.zip

@kojibuta
Copy link
Author

kojibuta commented Nov 6, 2022

I created an identical app using bluedroid and the bahavior is exactly the same, the only difference is the error code that is 0x3B using Bluedroid while it is 7 using NimBLE.

@SumeetSingh19
Copy link
Collaborator

SumeetSingh19 commented Nov 8, 2022

Hi @kojibuta,
Could you please share the "DEBUG" logs for the success and failed cases.
To enable debug logs, do:

  1. idf,py menuconfig > Component config > Bluetooth > NimBLE options > NimBLE Host log verbosity > Set it to "Debug logs"
  2. idf.py menuconfig > Log Output > Default log verbosity > Set it to "Debug"

Save and quit the menuconfig, build the central, flash, and monitor and you should see log output starting with D
e.g. "D (744) NimBLE: registered service 0x1800 with handle=1"
This means that debug logs are enabled. You may need to re-enable them when building the application again.

@SumeetSingh19
Copy link
Collaborator

Also, try connecting the peripherals you mentioned with third party BLE testing tools like nRF Connect (as central) and go for service discovery. Please share the logs for the same.

@kojibuta
Copy link
Author

kojibuta commented Nov 8, 2022

Hi @SumeetSingh19,

here are the debug logs running the same source code with two different devices (Blood Pressure Monitor B02T and Thermomether AOJ-20A). Log files 1 and 2 running on ESP32-S3-WROOM-1 (connection failure), log files 3 and 4 running on ESP32-WROOM-32D (successful connect/discovery).

  1. Log of ESP32-S3 using Wellue/Viatom Blood Pressure Monitor B02T, advertised name "BPM-188": esp32-s3-peripheral-debug-3.log

  2. Log of ESP32-S3 using Viatom Thermomether AOJ-20A, advertised name "AOJ-20A": esp32-s3-peripheral-debug-4.log

  3. Log of ESP32 using Wellue/Viatom Blood Pressure Monitor B02T, advertised name "BPM-188": esp32-peripheral-debug-3.log

  4. Log of ESP32 using Viatom Thermomether AOJ-20A, advertised name "AOJ-20A": esp32-peripheral-debug-4.log

Here are the logs using nRF-Connect on iOS (not really verbose BTW):

Please note that I had to slightly modify the code to start connection with device "AOJ-20A" because the function ble_hs_adv_parse_fields() is not parsing correctly the advertisement. The function return BLE_HS_EBADDATA because the advertisement is padded with 0x00 bytes up to the length of 25 bytes (i.e. 3 extra bytes):

I (3354) TEST_APP: test_app_gap_event: event_type BLE_HCI_ADV_RPT_EVTYPE_SCAN_RSP
I (3364) TEST_APP: 0x3fc9e9c5   08 09 41 4f 4a 2d 32 30  41 02 0a 00 09 ff 00 00  |..AOJ-20A.......|
I (3374) TEST_APP: 0x3fc9e9d5   00 00 00 00 00 00 00 00  00                       |.........|

I think the function should be more "forgiving" with zero-padded advertisements in order to support a wider range of physical devices. You can always think of padding zeros as records having a length of 0. The above advertisement you have 6 records (instead of 3) with lengths 8, 2, 9, 0, 0, 0.

Here is the modified main.c file: test_app.zip

@kojibuta
Copy link
Author

Hi @SumeetSingh19,

Apparently the controller is sending a different HCI command to the host and this triggers the error.

image

I tried running the same app on Unexpected Maker TinyS3 development board with ESP32-S3FN8 MCU and I get the same results.

I would like to know if you plan to address this problem in the near future, otherwise I must go back to ESP32-WROOM-32E module for my device. I would like to use the new ESP32-S3-WROOM-1 because it has better overall performances but I need to support several BLE devices that do not work properly with ESP32-S3.

@SumeetSingh19
Copy link
Collaborator

Hi @kojibuta, the controller doesn't send different HCI commands, actually, the first byte of the command is different because the connection handles in the two chips start at 0 on esp32 and 1 on esp32s3.

@SumeetSingh19
Copy link
Collaborator

HI @kojibuta

i checked your logs and compared it with success logs . The failure scenario in your case is that even though connection packet has been sent, the actual connection is not happening. As seen in logs, we see a disconnect with reason code 0x3E. This basically points that the actual over the air connection with remote is failed.

This behaviour also points that this may be specific in the communication of S3 and the particular remote device. Am sure, S3 would be working with other devices , even in your tests ?

I wanted to check, if it is feasible for you to get a OTA log to check the packets ? Since from host side, we have already posted the command with valid parameters.

@kojibuta
Copy link
Author

Hi @SumeetSingh19

I run into this issue while porting an existing firmware from ESP32 to ESP32-S3. The firmware has been working very well on a custom BLE/MQTT gateway I designed for several years. I am in the process of re-designing the custom gateway and I would like to upgrade to ESP32-S3. Unfortunately some of the BLE peripherals my customers are actively using do not work with ESP32-S3.

This behaviour also points that this may be specific in the communication of S3 and the particular remote device. Am sure, S3 would be working with other devices , even in your tests ?

The ESP32-S3 board and my firmware both work well with many devices by the way. But not with all of them. A subset of the devices I must support is not working.

Since all devices work well using the same firmware on ESP32 (instead of ESP32-S3) I thought it could be a BLE stack implementation bug specific to ESP32-S3.

I wanted to check, if it is feasible for you to get a OTA log to check the packets ? Since from host side, we have already posted the command with valid parameters.

I really would like to send you all the logs you need, but I don't know how/where to get a OTA log.
Is it something I can enable on ESP32?
Do I need some external BLE packet sniffer? If yes, can you point me to the correct one?

@kojibuta
Copy link
Author

kojibuta commented Dec 1, 2022

Hi @SumeetSingh19,

Since from host side, we have already posted the command with valid parameters.

Do you mean it can be a RF related problem?
Could it be something related to the fact that ESP32 also supports legacy bluetooth but ESP32-S3 only supports BLE?
Are the two CPU using a different BLE radio stack?

@SumeetSingh19
Copy link
Collaborator

Do you mean it could be a RF related problem?

So, we need OTA logs to be sure of this, and hence I requested the logs.

Could it be something related to the fact that ESP32 also supports legacy Bluetooth but ESP32-S3 only supports BLE?

Well, I believe you are using NimBLE as a stack for both chips. So, eventually, it should not matter, as these are standalone BLE operations.

Are the two CPU using a different BLE radio stack?

The underlying CPU has Xtensa architecture support (one is LX6 while the other is LX7). So they are the same architecture.

@kojibuta
Copy link
Author

kojibuta commented Dec 9, 2022

Hi @SumeetSingh19,

So, we need OTA logs to be sure of this, and hence I requested the logs.

I don't know how to get the OTA logs. Can you please tell me how to do it?

@SatishSolankeEsp
Copy link
Contributor

Hi @kojibuta ,
Greetings of the day !!!
I have gone through the logs and info on the JIRA, esp32s3 connection is failing because the reason of 0x3e connection failed to be established. The connection is failing with the 0x3e means a couple of factors might be involved so we need to check each and every scenario. The problem can be reproduced on any release so we can start debugging on release/v5.0, We need your help with the debug logs.
I will provide you with the debug binary to find the root cause of the issue since you do not have an OTA.
Can you please confirm that you are using controller lib 2b9445a6 and IDF release/v5.0?
How many iterations need to run the reproduced issue? could you please let me know the peripheral device BLE version and feature support? Does adv stop while entering into the connection mode?
Thanks,
Satish

@kojibuta
Copy link
Author

Hi @SatishSolankeEsp,
I checked out release/5.0, compiled and run the test_app.
In the logs I find the following lines:

I (290) cpu_start: ESP-IDF:          v5.0-beta1-824-ga8ef7570ca-dirt
I (524) coexist: coexist rom version e7ae62f
I (580) BT_INIT: BT controller compile version [76c24c9]
I (580) phy_init: phy_version 503,13653eb,Jun  1 2022,17:47:08

Apparently I am using controller lib 76c24c9 and IDF release/v5.0.

The issue always occurs immediately. Connection is never established, no matter how many times I try.
Some peripheral models never work. Some (different) peripheral models always work.

The scan is canceled before initiating the connection. Here is the nimble API function call sequence:

ble_gap_disc_cancel()
ble_hs_id_infer_auto()
ble_gap_connect()

The device supports Bluetooth LE 4.0. No other specs are available.
Here is the link to the supplier website https://www.viatomtech.com/aoj-20a-b.

The test_app initiates connection as soon as it finds a device with name "AOJ-20A".
Here is a complete log of last test_app run.
connect-failure.log

@SumeetSingh19
Copy link
Collaborator

SumeetSingh19 commented Dec 22, 2022

Hi @kojibuta ,
Thanks for the input, I have attached two files for you.
The .patch file is the git diff patch that I need you to apply to your idf.
The .a file is a controller lib file that I need you to put at esp-idf/components/bt/controller/lib_esp32c3_family/esp32s3 in place of the original file.

As discussed earlier could you please reproduce the issue of connection fail with debug controller lib and the provided patch to provide us the logs?
Also since it is easily reproduce could you provide the logs with the controller lib and without the patch as well?
Make sure that the version is release/v5.0.

0x3EDebugPrints.zip

@kojibuta
Copy link
Author

Hi @SumeetSingh19,

Thanks for the input, I have attached two files for you.

I don't see the attachments.

@SumeetSingh19
Copy link
Collaborator

@kojibuta , you should see a zip file now. It contains the two files.

@kojibuta
Copy link
Author

Hi @SumeetSingh19,

sorry but I don't see any attchment. Here is the screenshot of my github page.

image

@SumeetSingh19
Copy link
Collaborator

Hi @kojibuta ,
The link is in the original comment, but I'll reference it here anyway.
0x3EDebugPrints.zip

@kojibuta
Copy link
Author

Hi @SumeetSingh19,

thank you, now I can see it.
Don't know why but it was not visible.
Going to do what you suggest ASAP.

Thank you.

@kojibuta
Copy link
Author

Hi @SumeetSingh19,

I completed the tests you suggested.

Unfortunately the libbtdm_app.a file you sent seems to cause system failure on the ESP32S3:

E (729) BLE_INIT: controller init failed

I performed three tests:

  1. applying the patch only
  2. overwriting libbtdm_app.a file only
  3. with both modifications

Here are the log files of each of the three tests.

Patch only
2022_12_23_esp32s3-log-patch-only.txt

Controller lib only
2022_12_23_esp32s3-log-libbtdm_app-only.txt

Patch + controller lib
2022_12_23_esp32s3-log-libbtdm_app-and-patch.txt

I deleted and re-installed the SDK from scratch so to have a "clean" installation.

idf.py --version
ESP-IDF v5.1-dev-2509-gcfef24863f-dirty

@kojibuta
Copy link
Author

Hi @SumeetSingh19 ,

I think the controller init failed depends on bt controller lib compile version mismatch.

The latest "release/v5.0" SDK version comes with 80abacd:

I (583) BT_INIT: BT controller compile version [80abacd]

The modified lib you sent instead is version 76c24c9 (as per my previous post).

Maybe you can point me to the correct commit hash to use the same source code as the modified library version you sent.

@SatishSolankeEsp
Copy link
Contributor

Hi @kojibuta ,
Understood, conflict with the controller lib version will provide you debug lib on the tip of 80abacd.
Thanks,
Satish

@SatishSolankeEsp
Copy link
Contributor

Hi @kojibuta ,
Could you please reproduce the issue using the below controller debug lib on the tip of release/v5.0?

BT controller compile version [ae171b2]

esp32s3_debug_bin.zip

Thanks,
Satish

@kojibuta
Copy link
Author

Hi @SumeetSingh19,

Please find attached the log file you requested using the modified controller lib, plus the log using the original lib for completeness.

Modified BT controller lib
2022_12_28_libbtdm_ae171b2_modified.log

I (25) boot: ESP-IDF v5.0-494-g490216a2ac-dirty 2nd stage bootloader
I (763) BT_INIT: BT controller compile version [ae171b2]

Un-modified (original) BT controller lib
2022_12_28_libbtdm_80abacd_original.log

I (25) boot: ESP-IDF v5.0-494-g490216a2ac-dirty 2nd stage bootloader
I (583) BT_INIT: BT controller compile version [80abacd]

Let me lnow if I can do something else to help.

@SatishSolankeEsp
Copy link
Contributor

Hi @kojibuta ,
Do you able to reproduce the issue only with controller lib without applying the patch given by @SumeetSingh19 ?
Thanks,
Satish

@kojibuta
Copy link
Author

Hi @SatishSolankeEsp,

Here is the log without the patch (fresh clone from github) and modified BT controller lib.
2022_12_30_libbtdm_ae171b2_no_patch.log

The connection attempt is at the end of the log file, it starts after the following lines:

I (7263) TEST_APP: test_app_gap_event: 7 BLE_GAP_EVENT_DISC
I (7273) TEST_APP: test_app_gap_event: BLE_GAP_EVENT_DISC event_type BLE_HCI_ADV_RPT_EVTYPE_SCAN_RSP
I (7283) TEST_APP: 0x3fca1341   08 09 41 4f 4a 2d 32 30  41 02 0a 00 09 ff 00 00  |..AOJ-20A.......|
I (7283) TEST_APP: 0x3fca1351   00 00 00 00 00 00 00 00  00                       |.........|
I (7293) TEST_APP: test_app_gap_event: name "AOJ-20A"

@SatishSolankeEsp
Copy link
Contributor

SatishSolankeEsp commented Feb 3, 2023

HI @kojibuta ,
Thanks for the log and OTA which help us find the root cause of the issue and fix it.

Please try this fix by putting this controller lib of esp32s3 on the tip of release/v5.0.
IDF TIP release/v5.0:

image

Controller lib HEAD TIP: (controller commit id 80abacdd)

image

Replace this attached lib in the below path:

/esp-idf/components/bt/controller/lib_esp32c3_family/esp32s3

ble_conn_fail_fix_esp32s3.zip

Let me know the result and share the OTA as well.

Thanks,
Satish

@kojibuta
Copy link
Author

kojibuta commented Feb 3, 2023

Hi @SatishSolankeEsp,

Great news! Your fix is working perfectly. All my ESP32-S3 boards successfully connect to the peripherals and perform the service discovery.

Thank you very much for your support.

Please find below some logs and an OTA capture file recorded during some successful connections.

[log file]
log_aoj-20a_success.txt

[log file debug level]
log_aoj-20a_success_debug.txt

[wireshark OTA capture]
2023_02_03_connect_success.pcapng.zip

@SatishSolankeEsp
Copy link
Contributor

Hi @kojibuta ,
Great !!! Thanks a lot for the quick turnaround time. We will merge soon this fix on release/v5.0 meanwhile you can use the above lib.
CC @Isl2017
Thanks,
Satish

@kojibuta
Copy link
Author

kojibuta commented Feb 6, 2023

Hi @SatishSolankeEsp,

Apparently the FIX causes a regression. Sorry for the bad news.

Using the library with your fix solves connection problems with several devices (e.g. "AOJ-20A").

Unfortunately other devices (e.g. "DuoEK 1429"), that are working without the patch, with the fix cannot connect anymore.

What I can see in the debug logs is that the connection is setup correctly but the "exchange mtu" procedure fails. The connect is retried 3 times but MTU exchange always fail and the disconnect event is generated.

To better clarify:

  • device "AOJ-20A": works with FIX, does not work with stable/v5.0
  • device "DuoEK 1429": does not work with your FIX, works with stable/v5.0

Please find attaches some logs, hope they can help.

[ESP32-S3 - using lib fix - device "AOJ-20A" successful connect]
2023_02_06_esp32s3_lib_fix_aoj20a_success.txt

[ESP32-S3 - using lib fix - device "DuoEK 1429" connect error (log+air)]
2023_02_06_esp32s3_lib_fix_duoek_error.txt
2023_02_06_esp32s3_lib_fix_duoek_connect_error.pcapng.zip
2023_02_06_esp32s3_lib_fix_duoek_connect_error_2.pcapng.zip

[ESP32-S3 - stable/v5.0 - device "AOJ-20A" connect error]
2023_02_06_esp32s3_stable_v50_aoj20a_error.txt

[ESP32-S3 - stable/v5.0 - device "DuoEK 1429" successful connect]
2023_02_06_esp32s3_stable_v50_duoek_success.txt

I also found some errors with service discovery using the stable/v5.0 SDK version. After successful connection and MTU exchange, service discovery fails and causes ESP32-S3 to hang. Maybe this is all part of the same problem.

[ESP32-S3 - stable/v5.0 - device "DuoEK 1429" successful connect + service discovery error (log+air)]
2023_02_06_esp32s3_stable_v50_duoek_connect_discovery_error.txt
2023_02_06_esp32s3_stable_v50_duoek_discovery_error.pcapng.zip
2023_02_06_esp32s3_stable_v50_duoek_discovery_error_2.pcapng.zip

@SatishSolankeEsp
Copy link
Contributor

Hi @kojibuta ,
We are working on it and will tell you the procedure for how to use it with a new fix.
Thanks,
Satish

@SatishSolankeEsp
Copy link
Contributor

Hi @kojibuta ,
Please try the fix below for all your devices and let me know the result.
S3_fix_conn_fail.zip
Thanks,
Satish

@kojibuta
Copy link
Author

Hi @SatishSolankeEsp,

I already tested the lib S3_fix_conn_fail.zip with several devices supported by my IoT gateway app. It is working with all of them.

Devices tested so far:

  • Ambrosia BluCon (for FreestyleLibre)
  • Eve Door (Apple HomeKit over BLE)
  • Philips Smart Plug
  • MiaoMiao (for FreestyleLibre)
  • NetAtmo Smoke sensor (Apple HomeKit over BLE)
  • Omron Evolv BPM
  • Omron M7 BPM
  • Omron RS3 BPM
  • Omron Viva Smart Scale
  • Omron X4 BPM
  • Onvis SMS1 Motion sensor (Apple HomeKit over BLE)
  • Viatom B02T BPM
  • Viatom BP2 EKG + BPM
  • Viatom DuoEK EKG
  • Viatom SleepO2
  • Viatom AOJ-20A Thermometer
  • Wellue O2Ring

I will make some more tests in the upcoming days and let you know the results.

Thank you very much for your support.

@kojibuta
Copy link
Author

Hi @SatishSolankeEsp,

I also run the test_app (NimBLE + ESP32-S3) and the service discovery error with device Viatom DuoEK is still present.

nimble_error_log.txt

Using Bluedroid instead I get no error at all and all devices work correctly.

Apparently your fix is working perfectly with Bluedroid (all devices working) but not with NimBLE (some device still not working).

Let me know if I can help you sending more debug/OTA logs.

@SatishSolankeEsp
Copy link
Contributor

SatishSolankeEsp commented Feb 13, 2023

Hi @kojibuta ,
Can you please tell me how many devices still have the issue with nimble?
share the OTA of Viatom DuoEK .
Great !!! all the devices work with Bluedroid which was not the case earlier.
Thanks,
Satish

@kojibuta
Copy link
Author

Hi @SatishSolankeEsp,

After extensive testing I can confirm that all the test devices I have are currently working with ESP32-S3 and Bluedroid.

I can also confirm that roughly a half of the same devices is not working as expected using NimBLE instead.

Please find attached some logs files both of working and not working devices (2+2).
Each archive contains the debug log and the OTA log of a test session.

NOTE: the device E66 is a Fitness Bracelet using Nordic nRF52832 chip. It is a very common chip so I think it is important to find out why NimBLE is not working with this device.

[Successful tests]
BPM188_success.zip
O2Ring_success.zip

[Failed tests]
DuoOK_fail.zip
E66_fail.zip

@SatishSolankeEsp
Copy link
Contributor

Hi @kojibuta ,
I have checked a couple of time fail and successful OTA logs but there is no difference all the device works fine, and there is packet exchange going on. I've included for you the attached screenshot.
MicrosoftTeams-image (12)
MicrosoftTeams-image (11)
MicrosoftTeams-image (10)

@SumeetSingh19 ,
The controller issue is solved now, please do check failure is because of nimble since Bluedroid works fine which was not the earlier case.
Thanks,
Satish

@espressif-bot espressif-bot added Resolution: Done Issue is done internally Status: Done Issue is done internally and removed Status: Opened Issue is new labels Apr 24, 2023
@kojibuta
Copy link
Author

Hi @SatishSolankeEsp,

I tried compiling my firmware using version ESP-IDF v5.0.2 and ESP32-S3 does not work yet.

Do you think your patch to the bluetooth controller lib will be merged into the master branch any time?

@SatishSolankeEsp
Copy link
Contributor

Hi @kojibuta ,
Yes, controller lib is not yet merged, you can continue with the lib I shared.
will get back to you on these.
Thanks,
Satish

@SatishSolankeEsp
Copy link
Contributor

HI @kojibuta ,
I have checked internally, We are not Merging the MR in the controller codebase as a remote device issue but, provided you the library with a workaround by selecting the cs#1 algorithm.

As per Bluetooth specification, we have selected the correct algorithm.
BLE4.0 ESP32 channel select algorithm #1
BLE5.0 ESP32S3 channel select algorithm #2

I have come up with a solution for it, the remote device fails to connect or the remote device supports ble4.2,4.0 for them you can disable the CONFIG_BT_BLE_50_FEATURES_SUPPORTED flag in the IDF menu config and re-try to connect.
This way you can use any ESP-IDF version.

Please let me know if these procedures are useful for you.

Thanks,
Satish

@kojibuta
Copy link
Author

Hi @SatishSolankeEsp,

Unfortunately disabling CONFIG_BT_BLE_50_FEATURES_SUPPORTED does not work for me.
I am currently compiling with "ESP-IDF v5.0.2".
Below you can find the "Bluetooth -> Bluedroid options" settings for my project.

image

@SatishSolankeEsp
Copy link
Contributor

Hi @kojibuta ,
Look like you are trying the thing on an older commit ID,
could you please try the same test case on the latest IDFV5.0?
image

Our fix is merged with the below commit id :
image

Thanks,
Satish

@kojibuta
Copy link
Author

kojibuta commented Nov 9, 2023

Hi @SatishSolankeEsp,
Using the workaround you suggest (i.e. CONFIG_BT_BLE_50_FEATURES_SUPPORTED=n) I am now able to use stable relase v5.1.1. It works with all devices on ESP32 and ESP32-S3. Thank you.
If I understand correctly, disabling CONFIG_BT_BLE_50_FEATURES_SUPPORTED=n on ESP32-S3 enables channel select algorithm #1 by default. While enabling BLE5.0 features enables channel select algorithm #2 by default.
Recently I started working with ESP32-C6 because I would like to add Zigbee support to my firmware. I found the same problem I encountered with ESP32-S3 but unfortunately the workaround does not work. That is, disabling BLE5.0 features does not switch to channel select algorithm #1. I also tried disabling CONFIG_BT_LE_50_FEATURE_SUPPORT=n without success.
I found other teams are facing similar problems (e.g. https://esp32.com/viewtopic.php?t=28832).
Is there a programmatic way of choosing which channel selection algorithm to use before calling esp_ble_gattc_open() ?
This would be very helpful because we would like support both BLE4.2 and BLE5.0 features.
Thank you.

@SatishSolankeEsp
Copy link
Contributor

HI @kojibuta
I'll look into it and get back to you. Are you trying to use esp32c6 in a central or peripheral role?
Thanks,
Satish

@kojibuta
Copy link
Author

Hi @SatishSolankeEsp,
I am trying to use ESP32-C6 in a central role.
I actively use ESP32 and ESP32-S3 in a central role and I would like to port my firmware on ESP32-C6 too.

@SatishSolankeEsp
Copy link
Contributor

Hi @kojibuta ,
Look like we have not ported the change on esp32c6 from esp32s3, will share the controller lib on Monday. Please let me know which IDF release are you using of esp32c6(IDFv5.1 or IDFv5.2).

Thanks,
Satish

@kojibuta
Copy link
Author

Hi @SatishSolankeEsp,

At the moment I am using the latest master branch, because of another bug in the Wi-Fi stack fixed recently #12478.
I can use either the 5.1 or the 5.2 for the production build, let me know which one is going to get the fix from esp32s3.
Thank you

@SatishSolankeEsp
Copy link
Contributor

Hi @kojibuta ,
That is fine could you please replace attach lib in below path esp-idf/components/bt/controller/lib_esp32c6/esp32c6-bt-lib
Lib is for CS#1 algo so peripheral should be used ble4.2 4.0 .
esp32c6_lib_fix.zip

let me know the result.
Thanks,
Satish

@kojibuta
Copy link
Author

Hi @SatishSolankeEsp,

The lib fix solves the issue with channel select algorithm on esp32c6. Now I am able to connect to a 4.0/4.2 device successfully.

Does the fix automatically select the correct channel select algorithm based on the device?
Or is it statically linked and it always use CS#2 if CONFIG_BT_LE_50_FEATURE_SUPPORT=y ?

Thank you

@SatishSolankeEsp
Copy link
Contributor

Hi @kojibuta ,
We will merge the change on IDF version 5.1,5.2, and master as well, the procedure would be the same as esp32s3.
Thanks,
Satish

@nicklasb
Copy link

nicklasb commented May 5, 2024

Hi @kojibuta , We will merge the change on IDF version 5.1,5.2, and master as well, the procedure would be the same as esp32s3. Thanks, Satish

Has these fixes been merged?
I am having similar issues with two T-Beams connecting to each others.
Basically I am being connected to, then calling ble_gattc_disc_all_svcs() in the BLE_GAP_EVENT_CONNECT callback and gets a 7 for return value.
Just would be quite odd if I wasn't connected while actually being connected to.

@ShenWeilong
Copy link
Contributor

hi @kojibuta @nicklasb ,

We have merged the change on versions 5.1, 5.2, and the master branch.
You can try the latest tag and disable the 5.0 features. Then the CS#2 will be disabled as well.
The following configs should be disabled:

  • NIMBLE host: CONFIG_BT_NIMBLE_50_FEATURE_SUPPORT=n
  • Blurdroid host: CONFIG_BT_LE_50_FEATURE_SUPPORT=n

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Resolution: Done Issue is done internally Status: Done Issue is done internally Type: Bug bugs in IDF
Projects
None yet
Development

No branches or pull requests

6 participants