Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kernel panic on i2c_new_master_bus (IDFGH-11838) #12929

Closed
3 tasks done
babagreensheep opened this issue Jan 4, 2024 · 30 comments
Closed
3 tasks done

Kernel panic on i2c_new_master_bus (IDFGH-11838) #12929

babagreensheep opened this issue Jan 4, 2024 · 30 comments
Assignees
Labels
Resolution: NA Issue resolution is unavailable Status: Done Issue is done internally Type: Bug bugs in IDF

Comments

@babagreensheep
Copy link

babagreensheep commented Jan 4, 2024

Answers checklist.

  • I have read the documentation ESP-IDF Programming Guide and the issue is not addressed there.
  • I have updated my IDF branch (master or release) to the latest version and checked that the issue is present there.
  • I have searched the issue tracker for a similar issue and not found a similar issue.

IDF version.

v5.3-dev-1196-gece73357ca

Espressif SoC revision.

ESP32-D0WD-V3 (revision v3.1)

Operating System used.

Linux

How did you build your project?

VS Code IDE

If you are using Windows, please specify command line type.

None

Development Kit.

custom kit

Power Supply used.

USB

What is the expected behavior?

I2C master bus is installed created when running i2c_new_master_bus().

What is the actual behavior?

I am getting a Guru Meditation Error: Core 0 panic'ed (Interrupt wdt timeout on CPU0). error when attempting to run i2c_new_master_bus().

This function was working perfectly for a while, but suddenly crashed after I recompiled the code for a different change.

Steps to reproduce.

This seems to be triggered purely by a call to i2c_new_master_bus(). No other code is running at the time as I am unit testing.

static void display_bus_initialise(unsigned char SDA, unsigned char SCL) {
  if (display_bus_initialised) return;
  // Configuration
  i2c_master_bus_config_t display_bus_config = {
      .i2c_port = -1,  // -1 means auto selecting
      .sda_io_num = static_cast<gpio_num_t>(SDA),
      .scl_io_num = static_cast<gpio_num_t>(SCL),
      .clk_source = I2C_CLK_SRC_DEFAULT,  // Default clock source
      .glitch_ignore_cnt = 7,             // Filtering glitches
      .intr_priority = 0,                 // 0 = default
      .trans_queue_depth = 2,  // Maximum number of asynchronous transactions
      .flags = 1,
  };
  ESP_ERROR_CHECK(i2c_new_master_bus(&display_bus_config, &display_bus_handle));
  display_bus_initialised = true;
  ESP_LOGI(TAG("display_bus_initialise()"),
           "Bus handle: %p | SDA: %i | SCL: %i",
           display_bus_handle,
           SDA,
           SCL);
};

Called from the following function (which never actually gets called).

static void *add_display_device(unsigned char addr,
                                unsigned char SDA,
                                unsigned char SCL) {
  display_bus_initialise(SDA, SCL);
  i2c_device_config_t sh1106_config = {
      .dev_addr_length = I2C_ADDR_BIT_LEN_7,  // 7 bit address
      .device_address = addr,                 // 0x3C
      .scl_speed_hz = 400000                  // 400MHZ
  };
  i2c_master_dev_handle_t device_handle;
  ESP_ERROR_CHECK(i2c_master_bus_add_device(display_bus_handle,
                                            &sh1106_config,
                                            &device_handle));
  ESP_LOGI(TAG("add_display_device()"),
           "Bus handle: %p | Device handle: %p | Addr: %i | SDA: %i | SCL: %i",
           display_bus_handle,
           device_handle,
           addr,
           SDA,
           SCL);
  return static_cast<void *>(device_handle);
};

Debug Logs.

D (521) i2c.common: new bus(0) at 0x3ffb60c0
I (521) gpio: GPIO[23]| InputEn: 1| OutputEn: 1| OpenDrain: 1| Pullup: 1| Pulldown: 0| Intr:0 
I (531) gpio: GPIO[22]| InputEn: 1| OutputEn: 1| OpenDrain: 1| Pullup: 1| Pulldown: 0| Intr:0 
D (541) i2c.common: bus clock source frequency: 80000000hz
V (551) intr_alloc: esp_intr_alloc_intrstatus (cpu 0): checking args
V (551) intr_alloc: esp_intr_alloc_intrstatus (cpu 0): Args okay. Resulting flags 0x50E
D (561) intr_alloc: Connected src 49 to int 12 (cpu 0)
Guru Meditation Error: Core  0 panic'ed (Interrupt wdt timeout on CPU0). 

Core  0 register dump:
PC      : 0x40083281  PS      : 0x00050034  A0      : 0x400eaaed  A1      : 0x3ffb1030  
0x40083281: _xt_lowint1 at [redacted]/esp32/esp-idf/components/xtensa/xtensa_vectors.S:1240

0x400eaaed: i2c_new_master_bus at esp32/esp-idf/components/esp_driver_i2c/i2c_master.c:858

A2      : 0x00000000  A3      : 0x3ffb0370  A4      : 0x40082298  A5      : 0x3ffb1010  
0x40082298: shared_intr_isr at [redacted]/esp32/esp-idf/components/esp_hw_support/intr_alloc.c:435

A6      : 0x3ffb6370  A7      : 0x3ffb4ae0  A8      : 0x000005a8  A9      : 0x400871da  
0x400871da: _frxt_int_enter at [redacted]/esp32/esp-idf/components/freertos/FreeRTOS-Kernel/portable/xtensa/portasm.S:140

A10     : 0x000005a8  A11     : 0x00000000  A12     : 0x00000000  A13     : 0x3ffb0ff0  
A14     : 0x3ffb0058  A15     : 0xffffffff  SAR     : 0x00000004  EXCCAUSE: 0x00000005  
EXCVADDR: 0x00000000  LBEG    : 0x4000c46c  LEND    : 0x4000c477  LCOUNT  : 0x00000000  
0x4000c46c: memset in ROM

0x4000c477: memset in ROM

Core  0 was running in ISR context:
EPC1    : 0x400d27bf  EPC2    : 0x00000000  EPC3    : 0x00000000  EPC4    : 0x40083281
0x400d27bf: uart_hal_write_txfifo at [redacted]/esp32/esp-idf/components/hal/uart_hal_iram.c:27

0x40083281: _xt_lowint1 at [redacted]/esp32/esp-idf/components/xtensa/xtensa_vectors.S:1240



Backtrace: 0x4008327e:0x3ffb1030 0x400eaaea:0x3ffb4a80 0x400d6598:0x3ffb4ad0 0x400d65e3:0x3ffb4b20 0x400d665c:0x3ffb4b60 0x400d6522:0x3ffb4b90 0x400eb79c:0x3ffb4fc0 0x40086db1:0x3ffb4ff0
0x4008327e: _xt_lowint1 at[ redacted]/esp32/esp-idf/components/xtensa/xtensa_vectors.S:1240

0x400eaaea: i2c_ll_enable_intr_mask at [redacted]/esp32/esp-idf/components/hal/esp32/include/hal/i2c_ll.h:204
 (inlined by) i2c_new_master_bus at [redacted]/esp32/esp-idf/components/esp_driver_i2c/i2c_master.c:857

0x400d6598: display_bus_initialise(unsigned char, unsigned char) at [redacted]/hardware/sh1106/sh1106.cpp:43

0x400d65e3: add_display_device(unsigned char, unsigned char, unsigned char) at [redacted]/hardware/sh1106/sh1106.cpp:57

0x400d665c: Hardware::SH1106::SH1106(unsigned char, unsigned char, unsigned char) at [redacted]/hardware/sh1106/sh1106.cpp:85

0x400d6522: app_main at [redacted]/main/main.cpp:37

0x400eb79c: main_task at [redacted]/esp32/esp-idf/components/freertos/app_startup.c:208

0x40086db1: vPortTaskWrapper at [redacted]/esp32/esp-idf/components/freertos/FreeRTOS-Kernel/portable/xtensa/port.c:134



Core  1 register dump:
PC      : 0x400851e6  PS      : 0x00060734  A0      : 0x800d2656  A1      : 0x3ffb5ba0  
0x400851e6: esp_cpu_wait_for_intr at [redacted]/esp32/esp-idf/components/esp_hw_support/cpu.c:145

A2      : 0x00000000  A3      : 0x00000000  A4      : 0x3ffaf8d0  A5      : 0x3ffaf8b0  
A6      : 0x40081c58  A7      : 0x00000001  A8      : 0x800d87ce  A9      : 0x3ffb5b60  
0x40081c58: ipc_task at [redacted]/esp32/esp-idf/components/esp_system/esp_ipc.c:48

A10     : 0x00000000  A11     : 0x00000000  A12     : 0x3ffaf8b0  A13     : 0x3ffaf880  
A14     : 0x00000001  A15     : 0x3ffb5c88  SAR     : 0x00000000  EXCCAUSE: 0x00000005  
EXCVADDR: 0x00000000  LBEG    : 0x00000000  LEND    : 0x00000000  LCOUNT  : 0x00000000  


Backtrace: 0x400851e3:0x3ffb5ba0 0x400d2653:0x3ffb5bc0 0x40087ba6:0x3ffb5be0 0x40086db1:0x3ffb5c00
0x400851e3: xt_utils_wait_for_intr at [redacted]/esp32/esp-idf/components/xtensa/include/xt_utils.h:81
 (inlined by) esp_cpu_wait_for_intr at [redacted]/esp32/esp-idf/components/esp_hw_support/cpu.c:132

0x400d2653: esp_vApplicationIdleHook at [redacted]/esp32/esp-idf/components/esp_system/freertos_hooks.c:59

0x40087ba6: prvIdleTask at [redacted]/esp32/esp-idf/components/freertos/FreeRTOS-Kernel/tasks.c:4273 (discriminator 1)

0x40086db1: vPortTaskWrapper at [redacted]/esp32/esp-idf/components/freertos/FreeRTOS-Kernel/portable/xtensa/port.c:134

More Information.

No response

@babagreensheep babagreensheep added the Type: Bug bugs in IDF label Jan 4, 2024
@espressif-bot espressif-bot added the Status: Opened Issue is new label Jan 4, 2024
@github-actions github-actions bot changed the title Kernel panic on i2c_new_master_bus Kernel panic on i2c_new_master_bus (IDFGH-11838) Jan 4, 2024
@suda-morris
Copy link
Collaborator

.flags = 1, looks weird, do you want .flags.enable_internal_pullup=1?

@babagreensheep babagreensheep closed this as not planned Won't fix, can't repro, duplicate, stale Jan 5, 2024
@babagreensheep babagreensheep reopened this Jan 5, 2024
@babagreensheep
Copy link
Author

babagreensheep commented Jan 5, 2024

.flags = 1, looks weird, do you want .flags.enable_internal_pullup=1?

I get this error when building. I may be wrong but I think in terms of actual structure data they are the same?

error: either all initializer clauses should be designated or none of them should be
   45 |       .flags.enable_internal_pullup = 1,
      |    

EDIT: Made it look prettier:

  i2c_master_bus_config_t display_bus_config = {
      .i2c_port = -1,  // -1 means auto selecting
      .sda_io_num = static_cast<gpio_num_t>(SDA),
      .scl_io_num = static_cast<gpio_num_t>(SCL),
      .clk_source = I2C_CLK_SRC_DEFAULT,  // Default clock source
      .glitch_ignore_cnt = 7,             // Filtering glitches
      .intr_priority = 0,                 // 0 = default
      .trans_queue_depth = 2,  // Maximum number of asynchronous transactions
      .flags = {1},
  };

@mbratch
Copy link

mbratch commented Mar 7, 2024

I am having exactly the same problem with i2c_new_master_bus under ESP-IDF v5.2.0. My platform is an ESP32 Pico Kit V4, which has an ESP32-PICO-D4. Note in my configuration I am using external pull-ups, so I set internal pull-ups to false.

I had this all working using the now-deprecated I2C API and am trying to port to the new I2C API.

I'm getting the panic at the line where I call i2c_new_master_bus:

esp_err_t Point::switchInit(void)
{
    i2c_master_bus_config_t i2c_mst_config {};

    i2c_mst_config.clk_source = I2C_CLK_SRC_DEFAULT;
    i2c_mst_config.i2c_port = _i2cPort;    // this variable is const set to I2C_NUM_0
    i2c_mst_config.sda_io_num = static_cast<gpio_num_t>(pinSda);    // pinSda is const set to 21
    i2c_mst_config.scl_io_num = static_cast<gpio_num_t>(pinScl);    // pinScl is const set to 22
    i2c_mst_config.glitch_ignore_cnt = 7;
    i2c_mst_config.flags.enable_internal_pullup = false;

    return i2c_new_master_bus(&i2c_mst_config, &_i2cBusHandle); // _i2cBusHandle is a static member of the Point class
}

Here is the panic:

I (2058) BD-PTB: Initializing point settings
I (2063) gpio: GPIO[21]| InputEn: 1| OutputEn: 1| OpenDrain: 1| Pullup: 0| Pulldown: 0| Intr:0
I (2073) gpio: GPIO[22]| InputEn: 1| OutputEn: 1| OpenDrain: 1| Pullup: 0| Pulldown: 0| Intr:0
Guru Meditation Error: Core  0 panic'ed (Interrupt wdt timeout on CPU0).

Core  0 register dump:
PC      : 0x4000bfe7  PS      : 0x00060034  A0      : 0x8008a69e  A1      : 0x3ffb1fd0
A2      : 0x00000001  A3      : 0x00060023  A4      : 0xfffffff0  A5      : 0x3ffbd1b0
A6      : 0x00000003  A7      : 0x3ffc4954  A8      : 0x80089f8f  A9      : 0x3ffb1f80
A10     : 0x3ffc4954  A11     : 0xffffffff  A12     : 0x800820c4  A13     : 0x00060023
A14     : 0x3ffc4970  A15     : 0x0000cdcd  SAR     : 0x0000000f  EXCCAUSE: 0x00000005
EXCVADDR: 0x00000000  LBEG    : 0x4000c46c  LEND    : 0x4000c477  LCOUNT  : 0x00000000
Core  0 was running in ISR context:
EPC1    : 0x400d4603  EPC2    : 0x00000000  EPC3    : 0x00000000  EPC4    : 0x4000bfe7


Backtrace: 0x4000bfe4:0x3ffb1fd0 0x4008a69b:0x3ffb1fe0 0x40082444:0x3ffb2000 0x400834d1:0x3ffb2020 0x400f1006:0x3ffbd230 0x400dee69:0x3ffbd280 0x400d90c1:0x3ffbd2a0 0x400d90e8:0x3ffbd2d0 0x400d93a8:0x3ffbd2f0 0x401c5b0b:0x3ffbd360 0x4008a2f9:0x3ffbd390

Running a backtrace analysis, I get this (line Point.cpp:48 is where the function call is made):

C:\Users\-\.espressif\tools\xtensa-esp32-elf\esp-12.2.0_20230208\xtensa-esp32-elf\bin\xtensa-esp32-elf-addr2line.exe -fe build\BD-PTB.elf 0x4000bfe4:0x3ffb1fd0 0x4008a69b:0x3ffb1fe0 0x40082444:0x3ffb2000 0x400834d1:0x3ffb2020 0x400f1006:0x3ffbd230 0x400dee69:0x3ffbd280 0x400d90c1:0x3ffbd2a0 0x400d90e8:0x3ffbd2d0 0x400d93a8:0x3ffbd2f0 0x401c5b0b:0x3ffbd360 0x4008a2f9:0x3ffbd390
??
??:0
vPortClearInterruptMaskFromISR
C:/Users/mdbra/.espressif/frameworks/esp-idf-v5.2/components/freertos/FreeRTOS-Kernel/portable/xtensa/include/freertos/portmacro.h:564
shared_intr_isr
C:/Users/-/.espressif/frameworks/esp-idf-v5.2/components/esp_hw_support/intr_alloc.c:452
_xt_lowint1
C:/Users/-/.espressif/frameworks/esp-idf-v5.2/components/xtensa/xtensa_vectors.S:1240
i2c_ll_master_set_filter
C:/Users/-/.espressif/frameworks/esp-idf-v5.2/components/hal/esp32/include/hal/i2c_ll.h:551
_ZN5Point10switchInitEv
C:/Users/-/Documents/Projects/ESP32/BD-PTB/main/Point.cpp:48
_ZN6PointsC2Ev
C:/Users/-/Documents/Projects/ESP32/BD-PTB/include/Points.hpp:50
_ZN6Points8instanceEv
C:/Users/-/Documents/Projects/ESP32/BD-PTB/include/Points.hpp:22 (discriminator 2)
app_main
C:/Users/-/Documents/Projects/ESP32/BD-PTB/main/main.cpp:297
main_task
C:/Users/-/.espressif/frameworks/esp-idf-v5.2/components/freertos/app_startup.c:208
vPortTaskWrapper
C:/Users/-/.espressif/frameworks/esp-idf-v5.2/components/freertos/FreeRTOS-Kernel/portable/xtensa/port.c:134

Perhaps the library call has an issue in a C++ build environment?

@mythbuster5
Copy link
Collaborator

Hello @mbratch If you are using esp32. I guess you didn't set enable_internal_pullup to true or no external pull-up. Yes, ESP32 has a very weird behavior, it will fail in a very incomprehensible way if no any pull-ups on pins when initialize the bus. So, please modify your enable_internal_pullup to true please.

@mbratch
Copy link

mbratch commented Mar 8, 2024

Hello @mbratch If you are using esp32. I guess you didn't set enable_internal_pullup to true or no external pull-up. Yes, ESP32 has a very weird behavior, it will fail in a very incomprehensible way if no any pull-ups on pins when initialize the bus. So, please modify your enable_internal_pullup to true please.

@mythbuster5 I have sufficient external pull-ups in my system design. As I mentioned in my post it has been working fine for quite some time with the old i2c API. Setting internal pullups to true will cause it to fail because the resistance of the internal pullups is quite high. My design requires the external values I use.

There is a problem with the new API in some circumstances as is seen by several different reports. I can capture other stack traces if that will help. This problem is reproducible.

@babagreensheep
Copy link
Author

Hello @mbratch If you are using esp32. I guess you didn't set enable_internal_pullup to true or no external pull-up. Yes, ESP32 has a very weird behavior, it will fail in a very incomprehensible way if no any pull-ups on pins when initialize the bus. So, please modify your enable_internal_pullup to true please.

I doubt this is the cause since pull up is set to true on my various tests and the same issue persists.

@mythbuster5
Copy link
Collaborator

mythbuster5 commented Mar 8, 2024

I want to get more clue @babagreensheep @mbratch So I want to know this issue it's not related any transaction?

  1. The Guru Meditation happens just after i2c_new_master_bus is get called right?
  2. And you have already tried both external / internal pull-ups, but it doesn't work?
  3. And both of you use ESP32, not other chips like ESP32-C, ESP32-S?
  4. And both of you use Cpp environment?

@mbratch
Copy link

mbratch commented Mar 8, 2024

@mythbuster5

I want to get more clue @babagreensheep @mbratch So I want to know this issue it's not related any transaction?

There are other users in #12103 with the same problem symptoms you might also want to ask these details of. Their additional input could prove helpful.

  1. The Guru Meditation happens just after i2c_new_master_bus is get called right?

Correct. As shown in my stack trace i2c_new_master_bus does not return before the error occurs.

  1. And you have already tried both external / internal pull-ups, but it doesn't work?

I only use external pull-ups in this design. The internal pull-up values in the ESP32 are quite a bit higher than those recommended for I2C bus designs operating at 100-400Khz. The ESP-IDF I2C documentation also strongly recommends external pull-ups for higher speed bus selection. I designed my system a few years ago and the hardware design hasn't changed. The I2C has worked flawlessly since I started using ESP-IDF ver 4.3 and up through 5.2.0 using the prior I2C API. When I changed to the newest I2C API, I have this problem.

  1. And both of you use ESP32, not other chips like ESP32-C, ESP32-S?

True for me. I only have ESP32.

  1. And both of you use Cpp environment?

True for me.

I'll add that I am not using BLE. I am using wifi, but I am initializing i2c well before wifi is initialized and enabled.

When I have some time over the next couple of days I will capture additional stack traces to see if there's any variation at all.

@mbratch
Copy link

mbratch commented Mar 8, 2024

I am now running ESP-IDF 5.2.1, but the exact same issue still exists. I wasn't expecting it to go away, but just wanted to state the scenario.

I turned on I2C debug logging in the sdkconfig and let the device reboot a few times. Each time, the following error was displayed with exactly the same stack trace. So it's not timing dependent, but deterministic.

There are only two additional log messages from i2c debug.

I (2056) BD-PTB: [+0:00:01] Running SW version 1.5.3e5
I (2066) BD-PTB: Initializing point settings
D (2066) i2c.common: new bus(0) at 0x3ffc4880
I (2071) gpio: GPIO[21]| InputEn: 1| OutputEn: 1| OpenDrain: 1| Pullup: 0| Pulldown: 0| Intr:0
I (2081) gpio: GPIO[22]| InputEn: 1| OutputEn: 1| OpenDrain: 1| Pullup: 0| Pulldown: 0| Intr:0
D (2091) i2c.common: bus clock source frequency: 80000000hz
Guru Meditation Error: Core  0 panic'ed (Interrupt wdt timeout on CPU0).

Core  0 register dump:
PC      : 0x4008a55c  PS      : 0x00060034  A0      : 0x8008241d  A1      : 0x3ffb1fd0
A2      : 0x3ffb0060  A3      : 0xffffffff  A4      : 0x00000003  A5      : 0x00060023
A6      : 0x3ffc4880  A7      : 0x0000cdcd  A8      : 0x8008a51d  A9      : 0x0000cdcd
A10     : 0x00000001  A11     : 0x00000000  A12     : 0xb33fffff  A13     : 0x3ffbd170
A14     : 0x00000003  A15     : 0x3ffc4958  SAR     : 0x0000000f  EXCCAUSE: 0x00000005
EXCVADDR: 0x00000000  LBEG    : 0x4000c46c  LEND    : 0x4000c477  LCOUNT  : 0x00000000
Core  0 was running in ISR context:
EPC1    : 0x400d45fb  EPC2    : 0x00000000  EPC3    : 0x00000000  EPC4    : 0x4008a55c

Backtrace: 0x4008a559:0x3ffb1fd0 0x4008241a:0x3ffb2000 0x400834d1:0x3ffb2020 0x400f0fc6:0x3ffbd1f0 0         x400ded95:0x3ffbd240 0x400d90b9:0x3ffbd280 0x400d90e0:0x3ffbd2b0 0x400d93a0:0x3ffbd2d0 0x401c5e6b:0x         3ffbd340 0x4008a2f9:0x3ffbd370

With the corresponding translation:

C:\Users\-\.espressif\tools\xtensa-esp32-elf\esp-12.2.0_20230208\xtensa-esp32-elf\bin\xtensa-esp32-elf-addr2line.exe -fe build\BD-PTB.elf 0x4000bfe4:0x3ffb1fd0 0x4008a69b:0x3ffb1fe0 0x40082444:0x3ffb2000 0x400834d1:0x3ffb2020 0x400f0fc6:0x3ffbd1f0 0x400ded95:0x3ffbd240 0x400d90b9:0x3ffbd280 0x400d90e0:0x3ffbd2b0 0x400d93a0:0x3ffbd2d0 0x401c5e6b:0x3ffbd340 0x4008a2f9:0x3ffbd370
??
??:0
vPortClearInterruptMaskFromISR
C:/Users/-/.espressif/frameworks/esp-idf-v5.2.1/components/freertos/FreeRTOS-Kernel/portable/xtensa/include/freertos/portmacro.h:564
shared_intr_isr
C:/Users/-/.espressif/frameworks/esp-idf-v5.2.1/components/esp_hw_support/intr_alloc.c:452
_xt_lowint1
C:/Users/-/.espressif/frameworks/esp-idf-v5.2.1/components/xtensa/xtensa_vectors.S:1240
i2c_ll_master_set_filter
C:/Users/-/.espressif/frameworks/esp-idf-v5.2.1/components/hal/esp32/include/hal/i2c_ll.h:551
_ZN5Point10switchInitEv
C:/Users/-/Documents/Projects/ESP32/BD-PTB/main/Point.cpp:48
_ZN6PointsC2Ev
C:/Users/-/Documents/Projects/ESP32/BD-PTB/include/Points.hpp:50
_ZN6Points8instanceEv
C:/Users/-/Documents/Projects/ESP32/BD-PTB/include/Points.hpp:22 (discriminator 2)
app_main
C:/Users/-/Documents/Projects/ESP32/BD-PTB/main/main.cpp:297
main_task
C:/Users/-/.espressif/frameworks/esp-idf-v5.2.1/components/freertos/app_startup.c:208
vPortTaskWrapper
C:/Users/mdbra/.espressif/frameworks/esp-idf-v5.2.1/components/freertos/FreeRTOS-Kernel/portable/xtensa/port.c:134

I was hoping there would be a little more debug that might help, but not the case.
Let me know if there's anything else I can try.

@babagreensheep
Copy link
Author

I want to get more clue @babagreensheep @mbratch So I want to know this issue it's not related any transaction?

  1. The Guru Meditation happens just after i2c_new_master_bus is get called right?

Yes - based on the stack trace.

  1. And you have already tried both external / internal pull-ups, but it doesn't work?

Yes.

  1. And both of you use ESP32, not other chips like ESP32-C, ESP32-S?

Think so? My report has this info.

  1. And both of you use Cpp environment?

Yes.

Thanks so much for looking into this it's been bugging me for really long.

@mbratch
Copy link

mbratch commented Mar 9, 2024

I've done a little digging on this and the wdt occurs in this call in i2c_new_master_bus, which just updates the interrupt mask.

    i2c_ll_enable_intr_mask(hal->dev, I2C_LL_MASTER_EVENT_INTR);

@mbratch
Copy link

mbratch commented Mar 10, 2024

Craziest thing.... my setup is now working. The problem is, I don't know why. I had just updated to 5.2.1 using the Windows offline installer and reconfigured the vscode plugin to use the new version. My most recent debugging attempts on this issue were using that. I had added some debug statements into i2c_master.c in the component library. Afterwards, to get rid of the debug statements, rather than hunt them down and remove them I decided to just pull the file from the 5.2.1 tagged repository to refresh it.

But then I noticed things started working. Honestly, I don't recall if I got some successes at the tail end of debug, or only after the refresh. So why is it working?

  • I have not changed anything at all in my init sequence or the configuration of the i2c, so that's not why
  • Perhaps having certain debug turned on in configuration makes it work? I think unlikely as I did turn anything off I might have turned on.
  • That leaves me with the notion that it's possible that, for whatever reason, the i2c_master.c I was using before wasn't quite right. I assume the Windows offline installer does the right thing. Unfortunately, I have no way to confirm this. Whenever I upgrade to a new ESP-IDF I always start with a "full clean" before build. After an ESP-IDF upgrade I always do a "full clean" of course to ensure no older components or binaries are around.

So here we are. I'm running ESP-IDF 5.2.1 and the new I2C API appears to be working fine on my ESP32 at this point. I am just uncomfortable when things just start working as if by magic. :p

I think this does prove that there's nothing intrinsically wrong with an ESP32 that would prevent the API from working. Nor is the problem related to pull-up resistors in my case. It may have to do with configuration or ensuring use of the latest version of ESP-IDF and the associated API on the ESP32. All speculative.

@mythbuster5
Copy link
Collaborator

@mbratch @babagreensheep Thanks guys. I found a very different behavior of timeout fsm on ESP32 compared with other esp chips. So I find the thing is when we call i2c_ll_enable_intr_mask this in initialization state, and the interrupt works, but at this moment, no I2C transaction starts. So the interrupt always reports timeout error without any following handle. So you can see guru meditation here.

I'm trying to delete i2c_ll_enable_intr_mask in initialize in initialize state, and seems it works better. So, can you do me a favor, delete i2c_ll_enable_intr_mask in i2c_new_master_bus and see the result. I would be appreciate if I get any response.

The changes can be:

ret = esp_intr_alloc_intrstatus(i2c_periph_signal[i2c_port_num].irq, isr_flags, (uint32_t)i2c_ll_get_interrupt_status_reg(hal->dev), I2C_LL_MASTER_EVENT_INTR, i2c_master_isr_handler_default, i2c_master, &i2c_master->base->intr_handle);
ESP_GOTO_ON_ERROR(ret, err, TAG, "install i2c master interrupt failed");
atomic_init(&i2c_master->status, I2C_STATUS_IDLE);

// **i2c_ll_enable_intr_mask(hal->dev, I2C_LL_MASTER_EVENT_INTR); This line!!!**
i2c_ll_master_set_filter(hal->dev, bus_config->glitch_ignore_cnt);

xSemaphoreGive(i2c_master->cmd_semphr);

@mbratch
Copy link

mbratch commented Mar 11, 2024

@mythbuster5 thanks for the update. Great find! Very much appreciate you digging into it.

I will try the change. Since my system seems to be working already (probably a tenuous situation -- luck of the draw), it may be difficult to confirm the positive effect in my case, but I can verify whether it works reliably with the change.

@mythbuster5
Copy link
Collaborator

Is it possible to get any update on your side? @babagreensheep

@babagreensheep
Copy link
Author

@mythbuster5 Sorry give me some time to test this and I'll let you know! I should add that the issue has been intermittent on my end so I'll need to run a number of tests to confirm.

@mythbuster5
Copy link
Collaborator

@babagreensheep Thanks. If we can confirm this is the root cause, this issues can be solved quickly~

@ceribus
Copy link

ceribus commented Mar 14, 2024

@mythbuster5 I applied the fix you provided. Now it no longer fails at i2c_new_master_bus, but the task watchdog gets triggered by i2c_master_probe, even if xfer_timeout_ms is set to zero.
I'm running v5.2.1 on an ESP32-D0WDQ6 rev. 1 and could not reproduce the issue on an ESP32-C3 rev. 3 & rev. 4.
image

@mythbuster5
Copy link
Collaborator

@ceribus Thanks, ESP32 is always the special, let me test.

@mythbuster5
Copy link
Collaborator

mythbuster5 commented Mar 14, 2024

@ceribus Try this. Find function i2c_master_probe add one line i2c_ll_enable_intr_mask(hal->dev, I2C_LL_MASTER_EVENT_INTR); just before i2c_ll_update(hal->dev); to see it works or not.

Sorry for the inconvenience, esp32 really has a different behavior compared with others, I'm also collection issues and fix them as soon as possible. Glad to hear from you soon, Thanks.

@ceribus
Copy link

ceribus commented Mar 14, 2024

@mythbuster5 Thanks for the quick reply. It actually fixes the error, but i2c_master_probe now always returns ESP_OK the first time it is called, regardless of whether the device is connected or not.

EDIT: I tried it again to compare the behavior of the esp32 against the esp32c3, probing for 3 different addresses that are not connected. On the esp32c3 it returns ESP_ERR_NOT_FOUND every time. But on the esp32 it returns ESP_OK the first time and ESP_ERR_TIMEOUT the other two times.

@mythbuster5
Copy link
Collaborator

mythbuster5 commented Mar 15, 2024

@ceribus That is very strange, because it works good on my side. Can you provide the picture of ESP32 probe logic analyzer and let me see what happens on data line. (PS, the expected behavior is address+nack if device not connected.) For timeout, did you check the xfer_timeout_ms parameter? And could you please check pull-ups on board. (you can also set .flag.enable_internal_pullup = true to have a try)

Analyzer picture is very important for me to go head, but now I haven't got one. If any clue you can provide, that is very thankful.~~

EDIT: And you said you are on v5.2.1, can you show me your latest commit?

@mythbuster5
Copy link
Collaborator

@ceribus I reproduced issue on my ESP32 without any pull-up connected. So I guess your I2C pins are not given by proper pull-ups.

@ceribus
Copy link

ceribus commented Mar 15, 2024

@mythbuster5 You're right, I forgot to enable internal pullups. It works when they're enabled.
After connecting and writing a command to the sensor, i get the following error message
image
I'll connect the logic analyzer later and send you a screenshot then.
Yes, I'm using 5.2.1, git log -1shows commit a322e6bdad4b6675d4597fb2722eea2851ba88cb (HEAD, tag: v5.2.1)

@mythbuster5
Copy link
Collaborator

@ceribus writing a command with probe or i2c_master_transmit? Yeah.. I wanna see on logic analyzer. And your usage code maybe? ~

@mythbuster5
Copy link
Collaborator

@ceribus I guess it fails here https://github.com/espressif/esp-idf/blob/v5.2.1/components/driver/i2c/i2c_master.c#L512 Could you please help me check what happens here? (because I haven't reproduced then..) I'm going to enhance logic here. If my guess is correct, you calls probe or something before transmit, and probe fails (maybe timeout or sth else). The status kept the previous logic and when calling transmit, it continues the logic, so it gonna fail.

@ceribus
Copy link

ceribus commented Mar 15, 2024

@mythbuster5 You're right, it fails in i2c_master.c:512..
I added logging right before L512 and it shows that i2c_master->status is always I2C_STATUS_TIMEOUT.
Purely by chance, I found out that adding avTaskDelay(1); right after L395 (beginning of the while loop in s_i2c_send_commands) fixes the issue. I honestly have no idea why, but it works now..

EDIT: I tried with and without probing first, but that doesn't seem to make a difference.
My code calls i2c_new_master_bus then i2c_master_bus_add_device for three devices of which one is present on the bus. It then probes for all three devices and starts communication with the first device using i2c_master_transmit.

@mythbuster5
Copy link
Collaborator

Fine @ceribus thanks for giving the new clue, the timeout will only happen in one situation on v5.2.1. That is something wrong on SCL/SDA bus, the hardware triggers a timeout interrupt or arbitration interrupt https://github.com/espressif/esp-idf/blob/v5.2.1/components/driver/i2c/i2c_master.c#L580. You might need to check the timeout value you gave and what happened on bus.

@ceribus
Copy link

ceribus commented Mar 18, 2024

@mythbuster5 I attached the logic analyzer to the bus and it seems that the driver fails with I2C_STATUS_TIMEOUT because the slave sometimes does ~8ms of clock stretching during i2c_master_transmit
Screenshot 2024-03-18 112306
Screenshot 2024-03-18 112415

With the old i2c driver I avoided this issue by calling i2c_set_timeout with I2C_LL_MAX_TIMEOUT.

@mythbuster5
Copy link
Collaborator

@ceribus So i2c_set_timeout in old driver can solve your problem right? If so, I will add an option in new driver. Thanks for giving such clue!

@espressif-bot espressif-bot added Status: Done Issue is done internally Resolution: NA Issue resolution is unavailable and removed Status: Opened Issue is new labels Mar 19, 2024
nebkat pushed a commit to nebkat/esp-idf that referenced this issue Apr 9, 2024
…_probe issue,

and probe might failed.
Fixed I2C cannot return err code when nack detected
Closes espressif#13213,
Closes espressif#12929,
Closes espressif#13398,
espressif-bot pushed a commit that referenced this issue Apr 9, 2024
…_probe issue,

and probe might failed.
Fixed I2C cannot return err code when nack detected
Closes #13213,
Closes #12929,
Closes #13398,
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Resolution: NA Issue resolution is unavailable Status: Done Issue is done internally Type: Bug bugs in IDF
Projects
None yet
Development

No branches or pull requests

6 participants