Lan8710a ok in v3 but unstable in v4, corrupt heap errors (IDFGH-2317) #4454

Closed
will-emmerson opened this issue Dec 6, 2019 · 5 comments

@will-emmerson

Environment

  • Development Kit: Olimex-POE-ISO rev B, Olimex Gateway rev E
  • Module or chip used: ESP32-WROOM-32
  • IDF version: v3.3 or master
  • Build System: CMake
  • Operating System: Linux
  • Power Supply: USB

Problem Description

I have been trying to debug a strange problem with Ethernet: with any v4 version, I intermittently get corrupt heap errors such as the ones below when I try to do an OTA update or download a file:

I (11359) esp_https_ota: Starting OTA...
I (11359) esp_https_ota: Writing to partition subtype 16 at offset 0x120000
CORRUPT HEAP: multi_heap.c:432 detected at 0x3ffc7010
abort() was called at PC 0x400879e7 on core 0
0x400879e7: multi_heap_assert at /home/willemmerson/esp/esp-idf/components/heap/multi_heap_platform.h:59
 (inlined by) multi_heap_malloc_impl at /home/willemmerson/esp/esp-idf/components/heap/multi_heap.c:432


ELF file SHA256: 3e9953136fce01fad133518a7498c10d864b1e00b2410f31fbe5025711540b49

Backtrace: 0x40085475:0x3ffbed30 0x4008586d:0x3ffbed50 0x400879e7:0x3ffbed70 0x40087d95:0x3ffbed90 0x4008224d:0x3ffbedb0 0x4008227d:0x3ffbedd0 0x4008ad7d:0x3ffbedf0 0x400e9475:0x3ffbee10 0x400e94eb:0x3ffbee30 0x400e955c:0x3ffbee50 0x400e9f59:0x3ffbee70 0x400ea131:0x3ffbee90 0x400f9a0a:0x3ffbeec0 0x40126199:0x3ffbeee0 0x400d94d1:0x3ffbef00 0x40127ad6:0x3ffbef20 0x40087f3d:0x3ffbef50
0x40085475: invoke_abort at /home/willemmerson/esp/esp-idf/components/esp32/panic.c:157

0x4008586d: abort at /home/willemmerson/esp/esp-idf/components/esp32/panic.c:174

0x400879e7: multi_heap_assert at /home/willemmerson/esp/esp-idf/components/heap/multi_heap_platform.h:59
 (inlined by) multi_heap_malloc_impl at /home/willemmerson/esp/esp-idf/components/heap/multi_heap.c:432

0x40087d95: multi_heap_malloc at /home/willemmerson/esp/esp-idf/components/heap/multi_heap_poisoning.c:191

0x4008224d: heap_caps_malloc at /home/willemmerson/esp/esp-idf/components/heap/heap_caps.c:115

0x4008227d: heap_caps_malloc_default at /home/willemmerson/esp/esp-idf/components/heap/heap_caps.c:144

0x4008ad7d: malloc at /home/willemmerson/esp/esp-idf/components/newlib/heap.c:32

0x400e9475: mem_malloc at /home/willemmerson/esp/esp-idf/components/lwip/lwip/src/core/mem.c:237

0x400e94eb: do_memp_malloc_pool at /home/willemmerson/esp/esp-idf/components/lwip/lwip/src/core/memp.c:254

0x400e955c: memp_malloc at /home/willemmerson/esp/esp-idf/components/lwip/lwip/src/core/memp.c:350 (discriminator 2)

0x400e9f59: pbuf_alloc_reference at /home/willemmerson/esp/esp-idf/components/lwip/lwip/src/core/pbuf.c:336

0x400ea131: pbuf_alloc at /home/willemmerson/esp/esp-idf/components/lwip/lwip/src/core/pbuf.c:237

0x400f9a0a: ethernetif_input at /home/willemmerson/esp/esp-idf/components/lwip/port/esp32/netif/ethernetif.c:166

0x40126199: esp_netif_receive at /home/willemmerson/esp/esp-idf/components/esp_netif/lwip/esp_netif_lwip.c:660

0x400d94d1: eth_stack_input at /home/willemmerson/esp/esp-idf/components/esp_eth/src/esp_eth.c:89

0x40127ad6: emac_esp32_rx_task at /home/willemmerson/esp/esp-idf/components/esp_eth/src/esp_eth_mac_esp32.c:255

0x40087f3d: vPortTaskWrapper at /home/willemmerson/esp/esp-idf/components/freertos/port.c:143

Or this one:

CORRUPT HEAP: Bad tail at 0x3ffccf0a. Expected 0xbaad5678 got 0xf9df9a30
abort() was called at PC 0x40082b4e on core 0
0x40082b4e: lock_acquire_generic at /home/willemmerson/esp/esp-idf/components/newlib/locks.c:143


ELF file SHA256: d91a330eb1636750deae43532c4496f7d0ead8dd4cf6c06973b17037c18b16c0

Backtrace: 0x4008fb09:0x3ffbe710 0x4008ff01:0x3ffbe730 0x40082b4e:0x3ffbe750 0x40082c71:0x3ffbe780 0x40150add:0x3ffbe7a0 0x4014c649:0x3ffbea60 0x4014c5d1:0x3ffbeab0 0x40091dd3:0x3ffbeae0 0x4008246e:0x3ffbeb00 0x400958f5:0x3ffbeb20 0x4012f0a9:0x3ffbeb40 0x4011f683:0x3ffbeb60 0x40123d88:0x3ffbeb80 0x40129216:0x3ffbebb0 0x4012ea7a:0x3ffbebd0 0x4011d269:0x3ffbebf0 0x4011d2e8:0x3ffbec10 0x400925d1:0x3ffbec40
0x4008fb09: invoke_abort at /home/willemmerson/esp/esp-idf/components/esp32/panic.c:157

0x4008ff01: abort at /home/willemmerson/esp/esp-idf/components/esp32/panic.c:174

0x40082b4e: lock_acquire_generic at /home/willemmerson/esp/esp-idf/components/newlib/locks.c:143

0x40082c71: _lock_acquire_recursive at /home/willemmerson/esp/esp-idf/components/newlib/locks.c:171

0x40150add: _vfiprintf_r at /builds/idf/crosstool-NG/.build/xtensa-esp32-elf/src/newlib/newlib/libc/stdio/vfprintf.c:853 (discriminator 2)

0x4014c649: fiprintf at /builds/idf/crosstool-NG/.build/xtensa-esp32-elf/src/newlib/newlib/libc/stdio/fiprintf.c:48

0x4014c5d1: __assert_func at /builds/idf/crosstool-NG/.build/xtensa-esp32-elf/src/newlib/newlib/libc/stdlib/assert.c:58 (discriminator 8)

0x40091dd3: multi_heap_free at /home/willemmerson/esp/esp-idf/components/heap/multi_heap_poisoning.c:214 (discriminator 1)

0x4008246e: heap_caps_free at /home/willemmerson/esp/esp-idf/components/heap/heap_caps.c:272

0x400958f5: free at /home/willemmerson/esp/esp-idf/components/newlib/heap.c:47

0x4012f0a9: ethernet_free_rx_buf_l2 at /home/willemmerson/esp/esp-idf/components/lwip/port/esp32/netif/ethernetif.c:67

0x4011f683: pbuf_free at /home/willemmerson/esp/esp-idf/components/lwip/lwip/src/core/pbuf.c:783

0x40123d88: tcp_input at /home/willemmerson/esp/esp-idf/components/lwip/lwip/src/core/tcp_in.c:582

0x40129216: ip4_input at /home/willemmerson/esp/esp-idf/components/lwip/lwip/src/core/ipv4/ip4.c:760

0x4012ea7a: ethernet_input at /home/willemmerson/esp/esp-idf/components/lwip/lwip/src/netif/ethernet.c:186

0x4011d269: tcpip_thread_handle_msg at /home/willemmerson/esp/esp-idf/components/lwip/lwip/src/api/tcpip.c:180

0x4011d2e8: tcpip_thread at /home/willemmerson/esp/esp-idf/components/lwip/lwip/src/api/tcpip.c:154

0x400925d1: vPortTaskWrapper at /home/willemmerson/esp/esp-idf/components/freertos/port.c:143
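
For anyone unfamiliar with the "Bad tail" message above: with comprehensive heap poisoning enabled, the allocator places a known pattern after each block and checks it later, so a mismatch means something wrote past the end of a buffer. The following is only a simplified host-side model of that idea, not the ESP-IDF implementation. The helper names `guarded_malloc` and `guarded_tail_ok` are hypothetical; the pattern value is the one printed in the log.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Tail pattern as printed in the "Bad tail" log line above. */
#define TAIL_CANARY 0xbaad5678u

/* Hypothetical helper: allocate `size` usable bytes followed by a
 * 4-byte canary. Returns NULL if the allocation fails. */
static void *guarded_malloc(size_t size)
{
    uint8_t *p = malloc(size + sizeof(uint32_t));
    if (p == NULL) {
        return NULL;
    }
    uint32_t canary = TAIL_CANARY;
    memcpy(p + size, &canary, sizeof(canary));
    return p;
}

/* Hypothetical helper: returns 1 if the canary placed after the
 * `size` usable bytes is still intact, 0 if it was overwritten. */
static int guarded_tail_ok(const void *ptr, size_t size)
{
    uint32_t tail;
    assert(ptr != NULL);
    memcpy(&tail, (const uint8_t *)ptr + size, sizeof(tail));
    return tail == TAIL_CANARY;
}
```

Writing exactly `size` bytes into such a block leaves the check passing; writing even a few bytes more clobbers the canary, which is the situation the "Expected 0xbaad5678 got ..." line reports.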

The strange thing is that it seems to be a heat-related problem: it only happens once the board has been in use for a while, and if I put it in the freezer (yes, I did that) the problem goes away until the board heats up again. But it's not a hardware problem, because it works fine in v3.3.

The problem can be made to happen sooner by enabling softAP, but that is not the cause, because it still happens with softAP disabled.

I am guessing that the problem has to do with the clock source, i.e. that it's set up incorrectly in v4. Or perhaps the LAN8710A isn't supported anymore? The working Ethernet code I used in v3.3 is just taken from the Ethernet example:

#define DEFAULT_ETHERNET_PHY_CONFIG phy_lan8720_default_ethernet_config
#define PIN_SMI_MDC 23
#define PIN_SMI_MDIO 18
#define PIN_PHY_POWER 12
eth_phy_base_t PHY_ADDRESS = PHY0;
eth_clock_mode_t CLOCK_MODE = ETH_CLOCK_GPIO17_OUT; 

static void phy_device_power_enable_via_gpio(bool enable)
{
    if (!enable)
        phy_lan8720_default_ethernet_config.phy_power_enable(false);
    gpio_set_level(PIN_PHY_POWER, (int)enable);
    // Allow the power up/down to take effect, min 300us
    vTaskDelay(1);
    if (enable)
        phy_lan8720_default_ethernet_config.phy_power_enable(true);
}

static void eth_gpio_config_rmii(void) {
    phy_rmii_configure_data_interface_pins();
    phy_rmii_smi_configure_pins(PIN_SMI_MDC, PIN_SMI_MDIO);
}

void ethernet_init() {
    gpio_pad_select_gpio(PIN_PHY_POWER);
    gpio_set_direction(PIN_PHY_POWER,GPIO_MODE_OUTPUT);
    eth_config_t config = DEFAULT_ETHERNET_PHY_CONFIG;
    config.phy_addr = PHY_ADDRESS;
    config.gpio_config = eth_gpio_config_rmii;
    config.tcpip_input = tcpip_adapter_eth_input;
    config.clock_mode = CLOCK_MODE;
    config.phy_power_enable = phy_device_power_enable_via_gpio;
    ESP_ERROR_CHECK(esp_eth_init(&config));
    ESP_ERROR_CHECK(esp_eth_enable());
    ESP_LOGI(TAG, "Waiting for ethernet");
}

The non-working code for v4 is mostly taken from the Ethernet example:

static esp_err_t phy_device_power_enable_via_gpio() {
    gpio_pad_select_gpio(PIN_PHY_POWER);
    gpio_set_direction(PIN_PHY_POWER, GPIO_MODE_OUTPUT);
    gpio_set_level(PIN_PHY_POWER, 1);
    vTaskDelay(100 / portTICK_PERIOD_MS);
    return ESP_OK;
}

void app_main(void) {
    esp_netif_init();
    ESP_ERROR_CHECK(esp_event_loop_create_default());
    esp_netif_config_t cfg = ESP_NETIF_DEFAULT_ETH();
    esp_netif_t *eth_netif = esp_netif_new(&cfg);
    ESP_ERROR_CHECK(esp_eth_set_default_handlers(eth_netif));
    ESP_ERROR_CHECK(esp_event_handler_register(IP_EVENT, IP_EVENT_ETH_GOT_IP, &got_ip_event_handler, NULL));

    eth_mac_config_t mac_config = ETH_MAC_DEFAULT_CONFIG();
    esp_eth_mac_t *mac = esp_eth_mac_new_esp32(&mac_config);

    eth_phy_config_t phy_config = ETH_PHY_DEFAULT_CONFIG();
    phy_config.phy_addr = 0;
    esp_eth_phy_t *phy = esp_eth_phy_new_lan8720(&phy_config);

    esp_eth_config_t config = ETH_DEFAULT_CONFIG(mac, phy);
    config.on_lowlevel_init_done = phy_device_power_enable_via_gpio;

    esp_eth_handle_t eth_handle = NULL;
    ESP_ERROR_CHECK(esp_eth_driver_install(&config, &eth_handle));
    ESP_ERROR_CHECK(esp_netif_attach(eth_netif, eth_handle));
}

sdkconfig.defaults:

# Ethernet
CONFIG_ETH_ENABLED=y
CONFIG_ETH_USE_ESP32_EMAC=y
CONFIG_ETH_PHY_INTERFACE_RMII=y
# CONFIG_ETH_PHY_INTERFACE_MII is not set
# CONFIG_ETH_RMII_CLK_INPUT is not set
CONFIG_ETH_RMII_CLK_OUTPUT=y
# CONFIG_ETH_RMII_CLK_OUTPUT_GPIO0 is not set
CONFIG_ETH_RMII_CLK_OUT_GPIO=17
CONFIG_ETH_SMI_MDC_GPIO=23
CONFIG_ETH_SMI_MDIO_GPIO=18
# CONFIG_ETH_PHY_USE_RST is not set
CONFIG_ETH_DMA_BUFFER_SIZE=512
CONFIG_ETH_DMA_RX_BUFFER_NUM=10
CONFIG_ETH_DMA_TX_BUFFER_NUM=10
# CONFIG_ETH_USE_SPI_ETHERNET is not set
# CONFIG_ETH_USE_OPENETH is not set

CONFIG_ESPTOOLPY_FLASHSIZE_4MB=y
CONFIG_PARTITION_TABLE_TWO_OTA=y            
CONFIG_ESP32_PANIC_PRINT_HALT=y     
CONFIG_HEAP_POISONING_COMPREHENSIVE=y       
CONFIG_ESP32_ENABLE_COREDUMP_TO_FLASH=y    
CONFIG_FREERTOS_USE_TRACE_FACILITY=y
CONFIG_FREERTOS_USE_STATS_FORMATTING_FUNCTIONS=y

Using phy_device_power_enable_via_gpio was the only way I could get it working, but I don't think this is the issue: this code isn't even needed on the Olimex Gateway, yet that board has the same problem.

Code to reproduce this issue

https://gist.github.com/will-emmerson/557ce34c456358997ae3c1ff240311f2

@github-actions github-actions bot changed the title Lan8710a ok in v3 but unstable in v4, corrupt heap errors Lan8710a ok in v3 but unstable in v4, corrupt heap errors (IDFGH-2317) Dec 6, 2019
@szmodz

szmodz commented Dec 6, 2019

probably related:
#4406

@suda-morris suda-morris self-assigned this Dec 9, 2019
@suda-morris
Collaborator

suda-morris commented Dec 9, 2019

@will-emmerson
Thanks for reporting this issue and for your analysis. As a first step, let's increase ETH_DMA_BUFFER_SIZE (e.g. to 1524).
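
A rough sketch of the arithmetic behind that suggestion, assuming the setting is the size of each individual DMA buffer: a full-size Ethernet frame is 1518 bytes including the FCS (1522 with an 802.1Q VLAN tag), so with 512-byte buffers it has to be spread over several buffers, while a 1524-byte buffer holds a whole frame. The helper `dma_buffers_needed` is hypothetical and does not reproduce the driver's logic. It only shows the buffer count, and does not by itself explain the corruption. In the sdkconfig from the report the change would be `CONFIG_ETH_DMA_BUFFER_SIZE=1524`.

```c
#include <assert.h>
#include <stddef.h>

/* Maximum standard Ethernet frame sizes, including the 4-byte FCS. */
#define ETH_MAX_FRAME_LEN      1518u  /* untagged */
#define ETH_MAX_VLAN_FRAME_LEN 1522u  /* with one 802.1Q VLAN tag */

/* Hypothetical helper: number of DMA buffers of `buf_size` bytes
 * needed to hold one frame of `frame_len` bytes (ceiling division). */
static size_t dma_buffers_needed(size_t frame_len, size_t buf_size)
{
    assert(buf_size > 0);
    return (frame_len + buf_size - 1) / buf_size;
}
```

With these definitions, `dma_buffers_needed(ETH_MAX_FRAME_LEN, 512)` is 3, while `dma_buffers_needed(ETH_MAX_FRAME_LEN, 1524)` and `dma_buffers_needed(ETH_MAX_VLAN_FRAME_LEN, 1524)` are both 1.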

@will-emmerson
Author

will-emmerson commented Dec 9, 2019

Thanks, that appears to have fixed it. Not entirely sure why it seemed to be heat related though.

@benpeoples

@will-emmerson
Thanks for reporting this issue and for your analysis. As a first step, let's increase ETH_DMA_BUFFER_SIZE (e.g. to 1524).

Hello! So this just fixed my similar issue. Can we look at pushing this into the documentation?

@bbulkow

bbulkow commented Jan 6, 2021

Note: this problem still exists in the latest ESP-IDF as of 1/6/21. The shipping default for the DMA buffer size is still 512. I would like to know whether this is a known issue with certain Ethernet PHYs in general or only with the LAN8710A, or whether an underlying driver issue has been resolved so that this config change is no longer needed.

0xFEEDC0DE64 pushed a commit to 0xFEEDC0DE64/esp-idf that referenced this issue May 5, 2021