Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

httpclient has unexpected and unhadled error that causes the call to not return during esp_https_ota (IDFGH-4543) #6364

Closed
arntdr opened this issue Jan 6, 2021 · 33 comments

Comments

@arntdr
Copy link

arntdr commented Jan 6, 2021

Environment

  • Development Kit: [none]
  • Module or chip used: [ESP32-WROOM-32D]
  • IDF version (run git describe --tags to find it):
    v4.2-47-g2532ddd9f
  • Build System: [idf.py]
  • Compiler version (run xtensa-esp32-elf-gcc --version to find it):
    xtensa-esp32-elf-gcc (crosstool-NG esp-2020r3) 8.4.0
  • Operating System: [Linux]
  • Using an IDE?: [No]
  • Power Supply: [external 3.3V]

Problem Description

Sometimes when I call esp_err_t ret = esp_https_ota(&config); the function never returns. It prints a few log messages referencing EAGAIN. There is no more OTA relevant information in the log.

Expected Behavior

  1. esp_https_ota should not block for 10+ hours
  2. As a user of esp_https_ota i expect the library to handle any errors or return a suitable esp_err_t
  3. It looks like EAGAIN is related to the non-blocking socket API, but esp_https_ota is blocking (as noted here). Thus it seems EAGAIN should not be a possible errno even in the internal implementation of esp_https_ota

Actual Behavior

esp_http_client_config_t config = {
        .url = image_location,
        .cert_pem = (char *)server_cert_pem_start,
        .event_handler = _http_event_handler,
	.timeout_ms = 20000,
};
esp_err_t ret = esp_https_ota(&config);
ESP_LOGI(TAG, "sometimes the above call never returns");

When esp_https_ota does not return, I get log output that looks like this:

D (08:01:36.759) esp_https_ota: Written image length 1104558
D (08:01:36.766) esp_https_ota: Written image length 1104847
D (08:01:36.772) esp_https_ota: Written image length 1105136
D (08:01:36.779) esp_https_ota: Written image length 1105425
D (08:01:36.785) esp_https_ota: Written image length 1105714
E (08:01:56.782) TRANS_SSL: esp_tls_conn_read error, errno=No more processes
W (08:01:56.783) HTTP_CLIENT: esp_transport_read returned:-26880 and errno:11 
D (08:01:56.792) esp_https_ota: Written image length 1105920

No more OTA logs are printed. The MQTT client on the system keeps working.

The issue only occurs on the office Wi-Fi network. When using an alternative wifi network, that is used for development only, the issue went away.

Steps to reproduce

I have not found a good way to reproduce this issue. As it is dependent on the wifi network used, I expect that this is tricky to reproduce. When testing in my environment I ran the code above on startup, patched esp_https_ota to do esp_restart before esp_https_ota_finish. This results in a system that downloads the ota image, then reboots. Eventually, it will hang in esp_https_ota as described above.

Other items

It may be that the issue is a lack of error handling in esp_https_ota. Alternatively, there is an error in the HTTP client that should not occur.

Please let me know if there is any additional information I can provide.

@github-actions github-actions bot changed the title httpclient has unexpected and unhadled error that causes the call to not return during esp_https_ota httpclient has unexpected and unhadled error that causes the call to not return during esp_https_ota (IDFGH-4543) Jan 6, 2021
@Alvin1Zhang
Copy link
Collaborator

Thanks for reporting and the detailed report, sorry for the inconvenience, we will look into.

@arntdr
Copy link
Author

arntdr commented Jan 7, 2021

Thank you @Alvin1Zhang . I disconnected the internet cable from by development router while the system was downloading the OTA image. And got the same error as described above. This may be a way for you to reproduce the issue.

D (08:57:45.691) esp_https_ota: Written image length 380035
D (08:57:45.697) esp_https_ota: Written image length 380324
D (08:57:45.702) esp_https_ota: Written image length 380613
D (08:57:45.708) esp_https_ota: Written image length 380902 // disconnected ethernet from router here
E (08:58:05.708) TRANS_SSL: esp_tls_conn_read error, errno=No more processes
W (08:58:05.709) HTTP_CLIENT: esp_transport_read returned:-26880 and errno:11
D (08:58:05.714) esp_https_ota: Written image length 380928

@shubhamkulkarni97
Copy link
Contributor

Hi @arntdr, thanks for reporting this issue.

I have attached a patch below, which should fix the issue. Please try the patch and let me know if it fixes the issue.

While running the example after applying the patch please enable keep_alive config in esp_http_client_config_t, by adding following line:

.keep_alive = true,

0001-esp_tls-Add-option-to-enable-keep_alive-timeout.patch.zip

Thanks,
Shubham

@AxelLin
Copy link
Contributor

AxelLin commented Jan 12, 2021

@shubhamkulkarni97
I'm wondering if this patch works or not if using http rather than htttps for OTA.

@shubhamkulkarni97
Copy link
Contributor

@AxelLin, this patch works only with HTTPS, changes for HTTP are under development.

@AxelLin
Copy link
Contributor

AxelLin commented Jan 12, 2021

BTW, below logic looks strange to me:

+    int keep_idle = timeout_ms > 1000 ? timeout_ms / 1000 : cfg->keep_idle;

The real keep_idle setting may be different from user provided keep_idle setting.
What's wrong with int keep_idle = cfg->keep_idle;?

@shubhamkulkarni97
Copy link
Contributor

shubhamkulkarni97 commented Jan 12, 2021

@AxelLin keep_idle time should be ideally equal to connection timeout, that's why it was added this way. However, I'll consider this point in internal discussion

@AxelLin
Copy link
Contributor

AxelLin commented Jan 12, 2021

@AxelLin keep_idle time should be ideally equal to connection timeout, that's why it was added this way. However, I'll consider this point in internal discussion

keep_idle time is nothing to do with connection timeout.
keep_idle will be checked when pcb is in ESTABLISHED or CLOSE_WAIT state.

@arntdr
Copy link
Author

arntdr commented Jan 13, 2021

Thank you for looking in to this @shubhamkulkarni97. Git could not apply your patch to v4.2-47-g2532ddd9f. I was able to apply it to master (commit 2bfdd03), but my project crashes on boot with this SDK version:

I (1077) boot: Loaded app from partition at offset 0x20000
I (1077) boot: Disabling RNG early entropy source...
D (1089) efuse: coding scheme 0
/home/arnt/.local/esp_v4_2/esp-idf/components/freertos/queue.c:1441 (xQueueSemaphoreTake)- assert failed!
Re-enable cpu cache.

abort() was called at PC 0x4009a794 on core 0
Setting breakpoint at 0x400961f1 and returning...

@shubhamkulkarni97
Copy link
Contributor

@arntdr Sorry for not mentioning that the patch is for master branch.
Regarding crash,

  • Please check if all the git submodules are updated
  • Please remove build directory, sdkconfig file and try performing a clean build

@arntdr
Copy link
Author

arntdr commented Jan 13, 2021

@shubhamkulkarni97 regarding your suggestions:

  • I already did git submodule update in IDF and idf.py clean in the project (and I tried rm build now, for good measure).
  • I don't understand how our project is supposed to build (and run) correctly without the options we have enabled in menuconfig. When I deleted sdkconfig, the build failed with the error noted below
../components/ble/ble_gatt_server.c:11:10: fatal error: esp_bt.h: No such file or directory
 #include "esp_bt.h"

@mahavirj
Copy link
Member

@arntdr

/home/arnt/.local/esp_v4_2/esp-idf/components/freertos/queue.c:1441 (xQueueSemaphoreTake)- assert failed!
Re-enable cpu cache.
abort() was called at PC 0x4009a794 on core 0
Setting breakpoint at 0x400961f1 and returning..

Any chance you have log level set to DEBUG or VERBOSE ?

config LOG_DEFAULT_LEVEL_INFO
If yes, please revert it back to INFO. This looks like different issue with current master branch, will fix that separately.

@arntdr
Copy link
Author

arntdr commented Jan 13, 2021

Thanks, @mahavirj I was able to run the patch with debug level info.

It appears that the patch provided by @shubhamkulkarni97 solved the issue. This is much appreciated, thank you again.

When can we expect this fix to land on a public branch in the repo? What is the timeline for a release that includes this fix?

@mahavirj
Copy link
Member

@arntdr This has been merged internally, it will likely appear on github with next codebase sync event (automated process but it should happen in next few days). For release, this will be part of v4.3 release, expected (roughly) towards end of Feb timeline.

@mahavirj
Copy link
Member

mahavirj commented Feb 2, 2021

Fixed with ba1c8ce

@mahavirj mahavirj closed this as completed Feb 2, 2021
@softdel1003
Copy link

Hi Mahavirj,

I am at the head commit 47b96db (HEAD, origin/release/v4.3), and still getting same error message for OTA when using slower network.

E (19188) TRANS_SSL: esp_tls_conn_read error, errno=No more processes
W (19188) HTTP_CLIENT: esp_transport_read returned:-26880 and errno:11

I have below configuration for https client

esp_http_client_config_t config =
{
		.url = (char *) szUrl,
	        .timeout_ms		= 20000,
		.keep_alive_enable = true,
};

@ESP-YJM
Copy link
Collaborator

ESP-YJM commented Sep 6, 2021

@softdel1003 Could you please change timeout_ms to 30000. Maybe 20000 couldn't trigger TCP keep-alive abnormal by default TCP keep-alive configuration.

@softdel1003
Copy link

@ESP-YJM,

I have set the timeout_ms to 30000, but still getting same issue
E (538089) TRANS_SSL: esp_tls_conn_read error, errno=No more processes
W (538089) HTTP_CLIENT: esp_transport_read returned:-26880 and errno:11

@ESP-YJM
Copy link
Collaborator

ESP-YJM commented Sep 7, 2021

@softdel1003 Could you please enable debug log output level to Debug and also enable LWIP debug(Component config->LWIP->Enable LWIP debug->Enable socket debug messages and Enale TCP debug). After enable above and share your log file to me. Thanks.

@softdel1003
Copy link

@ESP-YJM please find attached log with debug log enable for socket debug log and TCP debug
log.gateway.20210907153946.txt

@ESP-YJM
Copy link
Collaborator

ESP-YJM commented Sep 8, 2021

@softdel1003 I think this print log in your log is normal. This log means that device want to read data, but no data to read. And your OTA process is still working. But for most cases, you can modify your code as following patch.

@@ -997,12 +997,16 @@ int esp_http_client_read(esp_http_client_handle_t client, char *buffer, int len)
                 }
                 ESP_LOG_LEVEL(sev, TAG, "esp_transport_read returned:%d and errno:%d ", rlen, errno);
             }
+            if (rlen == ESP_TLS_ERR_SSL_WANT_READ || errno == EAGAIN) {
+                // EAGAIN
+                ESP_LOGD(TAG, "Received EAGAIN! rlen = %d, errno %d", rlen, errno);
+                return ridx;
+            }
             if (rlen < 0 && ridx == 0 && !esp_http_client_is_complete_data_received(client)) {
                 http_dispatch_event(client, HTTP_EVENT_ERROR, esp_transport_get_error_handle(client->transport), 0);
                 return ESP_FAIL;
-            } else {
-                return ridx;
             }
+            return ridx;
         }
         res_buffer->output_ptr = buffer + ridx;
         http_parser_execute(client->parser, client->parser_settings, res_buffer->data, rlen);

@softdel1003
Copy link

@ESP-YJM Where shall I add this code, could you be more elaborate?

@ESP-YJM
Copy link
Collaborator

ESP-YJM commented Sep 8, 2021

@softdel1003
Copy link

@ESP-YJM I have done below changes in code, but still getting same issue. please find attached logs

    if (rlen <= 0) {
        if (errno != 0) {
            esp_log_level_t sev = ESP_LOG_WARN;
            /* On connection close from server, recv should ideally return 0 but we have error conversion
             * in `tcp_transport` SSL layer which translates it `-1` and hence below additional checks */
            if (rlen == -1 && errno == ENOTCONN && client->response->is_chunked) {
                /* Explicit call to parser for invoking `message_complete` callback */
                http_parser_execute(client->parser, client->parser_settings, res_buffer->data, 0);
                /* ...and lowering the message severity, as closed connection from server side is expected in chunked transport */
                sev = ESP_LOG_DEBUG;
            }
            ESP_LOG_LEVEL(sev, TAG, "esp_transport_read returned:%d and errno:%d ", rlen, errno);
        }
        if (rlen < 0 && ridx == 0 && !esp_http_client_is_complete_data_received(client)) {
        	 http_dispatch_event(client, HTTP_EVENT_ERROR, esp_transport_get_error_handle(client->transport), 0);
            return ESP_FAIL;
        } else {

log.gateway.20210908150055.txt

            return ridx;
        }
    }

@ESP-YJM
Copy link
Collaborator

ESP-YJM commented Sep 8, 2021

@softdel1003 Yes, i say this log is a normal issue. Does this log that causes the OTA fail?

@AxelLin
Copy link
Contributor

AxelLin commented Sep 8, 2021

@ESP-YJM please find attached log with debug log enable for socket debug log and TCP debug
log.gateway.20210907153946.txt

Above log shows: (This might be off-topic but this should not happen. It looks something wrong.)

�[0;32mI (168503) mbedtls�[0;31mE (168505) task_wdt: Task watchdog got triggered. The following tasks did not reset the watchdog in time:�[0m
�[0;31mE (168505) task_wdt: - IDLE (CPU 0)�[0m
�[0;31mE (168505) task_wdt: Tasks currently running:�[0m
�[0;31mE (168505) task_wdt: CPU 0: OTA Task�[0m
�[0;31mE (168505) task_wdt: Print CPU 0 (current core) backtrace�[0m

Backtrace:0x40111DEA:0x3FFB07E0 0x40082926:0x3FFB0800 0x400D5226:0x3FFF3050 0x400D5F5D:0x3FFF3070 0x400D47ED:0x3FFF3090 0x4000BD83:0x3FFF30B0 0x4000117D:0x3FFF30D0 0x400592FE:0x3FFF30F0 0x4005937A:0x3FFF3110 0x40058BBF:0x3FFF3130 0x40188D57:0x3FFF3160 0x4018ED86:0x3FFF3180 0x4018EF41:0x3FFF3490 0x4019A35D:0x3FFF34C0 0x4008EF5D:0x3FFF34F0 0x401609A5:0x3FFF3540 0x40161202:0x3FFF3570 0x401606B6:0x3FFF37C0 0x401610D7:0x3FFF37E0 0x4015563D:0x3FFF3800 0x40159F5B:0x3FFF3820 0x40195C9D:0x3FFF3840 0x40150F8C:0x3FFF3860 0x400E7756:0x3FFF38A0 0x400E7A65:0x3FFF3990 0x400E7B70:0x3FFF39E0 0x4008CC89:0x3FFF3A10
0x40111dea: task_wdt_isr at /home/sachink/project4.2/esp-idf/components/esp_common/src/task_wdt.c:189
0x40082926: _xt_lowint1 at /home/sachink/project4.2/esp-idf/components/freertos/port/xtensa/xtensa_vectors.S:1105
0x400d5226: uart_tx_char at /home/sachink/project4.2/esp-idf/components/vfs/vfs_uart.c:160 (discriminator 1)
0x400d5f5d: uart_write at /home/sachink/project4.2/esp-idf/components/vfs/vfs_uart.c:217
0x400d47ed: esp_vfs_write at /home/sachink/project4.2/esp-idf/components/vfs/vfs.c:423 (discriminator 4)
0x40188d57: __sprint_r at /builds/idf/crosstool-NG/.build/xtensa-esp32-elf/src/newlib/newlib/libc/stdio/vfprintf.c:433
(inlined by) __sprint_r at /builds/idf/crosstool-NG/.build/xtensa-esp32-elf/src/newlib/newlib/libc/stdio/vfprintf.c:403
0x4018ed86: _vfprintf_r at /builds/idf/crosstool-NG/.build/xtensa-esp32-elf/src/newlib/newlib/libc/stdio/vfprintf.c:1773 (discriminator 1)
0x4018ef41: vprintf at /builds/idf/crosstool-NG/.build/xtensa-esp32-elf/src/newlib/newlib/libc/stdio/vprintf.c:34 (discriminator 5)
0x4019a35d: esp_log_writev at /home/sachink/project4.2/esp-idf/components/log/log.c:189
0x4008ef5d: esp_log_write at /home/sachink/project4.2/esp-idf/components/log/log.c:199
0x401609a5: mbedtls_esp_debug at /home/sachink/project4.2/esp-idf/components/mbedtls/port/mbedtls_debug.c:81 (discriminator 13)
0x40161202: debug_send_line at /home/sachink/project4.2/esp-idf/components/mbedtls/mbedtls/library/debug.c:102
(inlined by) mbedtls_debug_print_msg at /home/sachink/project4.2/esp-idf/components/mbedtls/mbedtls/library/debug.c:145
0x401606b6: mbedtls_ssl_read at /home/sachink/project4.2/esp-idf/components/mbedtls/mbedtls/library/ssl_tls.c:8681
0x401610d7: __wrap_mbedtls_ssl_read at /home/sachink/project4.2/esp-idf/components/mbedtls/port/dynamic/esp_ssl_tls.c:99 (discriminator 9)
0x4015563d: esp_mbedtls_read at /home/sachink/project4.2/esp-idf/components/esp-tls/esp_tls_mbedtls.c:161
0x40159f5b: esp_tls_conn_read at /home/sachink/project4.2/esp-idf/components/esp-tls/esp_tls.h:496
(inlined by) ssl_read at /home/sachink/project4.2/esp-idf/components/tcp_transport/transport_ssl.c:168
0x40195c9d: esp_transport_read at /home/sachink/project4.2/esp-idf/components/tcp_transport/transport.c:170
0x40150f8c: esp_http_client_read at /home/sachink/project4.2/esp-idf/components/esp_http_client/esp_http_client.c:950
0x400e7756: ReadWriteOTAFile at /home/sachink/project4.2/esp-idf_GATEWAY/components/OTA/ota.c:319
0x400e7a65: PerformFirmwareUpdate at /home/sachink/project4.2/esp-idf_GATEWAY/components/OTA/ota.c:549
0x400e7b70: TaskFirmwareUpdate at /home/sachink/project4.2/esp-idf_GATEWAY/components/OTA/ota.c:688
0x4008cc89: vPortTaskWrapper at /home/sachink/project4.2/esp-idf/components/freertos/port/xtensa/port.c:168

@softdel1003
Copy link

@ESP-YJM If you could check the logs then on line number 10947, you can see below logs where OTA fails

D (137979) SSL TLS: RX left 16408 bytes�[0m
�[0;31mE (137983) TRANS_SSL: esp_tls_conn_read error, errno=No more processes�[0m
D (137991) HTTP_CLIENT: need_read=7, byte_to_read=7, rlen=-26880, ridx=121�[0m
�[0;33mW (137997) HTTP_CLIENT: esp_transport_read returned:-26880 and errno:11 �[0m
D (138005) HTTP_CLIENT: is_data_remain=1, is_chunked=0, content_length=1974083�[0m
lwip_select(63, 0x3fff4c58, 0x0, 0x3fff4c60, tvsec=3 tvusec=0)
tcp_slowtmr: processing active pcb
tcp_slowtmr: processing active pcb
tcp_slowtmr: polling application
tcp_slowtmr: processing active pcb
tcp_slowtmr: processing active pcb
tcp_slowtmr: polling application
tcp_slowtmr: processing active pcb
tcp_keepalive: sending KEEPALIVE probe to 20.73.224.89
tcp_keepalive: tcp_ticks 222 pcb->tmr 211 pcb->keep_cnt_sent 0
tcp_keepalive: seqno 1887117953 ackno 1290708190 err 0.
lwip_selscan: fd=62 ready for reading

@shubhamkulkarni97
Copy link
Contributor

shubhamkulkarni97 commented Sep 8, 2021

Hi @softdel1003,

  • How frequently is the issue observed? Are these intermittent failures or it fails every time?
  • I have pushed a branch in my fork which has same fix as suggested by @ESP-YJM. Can you try pulling the branch and try running OTA again? Just to check if there were some issues while applying the patch.
  • Please share logs if you are facing OTA failures on above branch as well.

Also, instead of running custom OTA implementation, please try running default OTA example that is bundled with IDF.

@ESP-YJM
Copy link
Collaborator

ESP-YJM commented Sep 9, 2021

@ESP-YJM If you could check the logs then on line number 10947, you can see below logs where OTA fails

D (137979) SSL TLS: RX left 16408 bytes�[0m
�[0;31mE (137983) TRANS_SSL: esp_tls_conn_read error, errno=No more processes�[0m
D (137991) HTTP_CLIENT: need_read=7, byte_to_read=7, rlen=-26880, ridx=121�[0m
�[0;33mW (137997) HTTP_CLIENT: esp_transport_read returned:-26880 and errno:11 �[0m
D (138005) HTTP_CLIENT: is_data_remain=1, is_chunked=0, content_length=1974083�[0m
lwip_select(63, 0x3fff4c58, 0x0, 0x3fff4c60, tvsec=3 tvusec=0)
tcp_slowtmr: processing active pcb
tcp_slowtmr: processing active pcb
tcp_slowtmr: polling application
tcp_slowtmr: processing active pcb
tcp_slowtmr: processing active pcb
tcp_slowtmr: polling application
tcp_slowtmr: processing active pcb
tcp_keepalive: sending KEEPALIVE probe to 20.73.224.89
tcp_keepalive: tcp_ticks 222 pcb->tmr 211 pcb->keep_cnt_sent 0
tcp_keepalive: seqno 1887117953 ackno 1290708190 err 0.
lwip_selscan: fd=62 ready for reading

@softdel1003 I don't think the OTA fail, you can check line 11076. OTA still works.

@ESP-YJM
Copy link
Collaborator

ESP-YJM commented Sep 9, 2021

@ESP-YJM please find attached log with debug log enable for socket debug log and TCP debug
log.gateway.20210907153946.txt

Above log shows: (This might be off-topic but this should not happen. It looks something wrong.)

�[0;32mI (168503) mbedtls�[0;31mE (168505) task_wdt: Task watchdog got triggered. The following tasks did not reset the watchdog in time:�[0m
�[0;31mE (168505) task_wdt: - IDLE (CPU 0)�[0m
�[0;31mE (168505) task_wdt: Tasks currently running:�[0m
�[0;31mE (168505) task_wdt: CPU 0: OTA Task�[0m
�[0;31mE (168505) task_wdt: Print CPU 0 (current core) backtrace�[0m

Backtrace:0x40111DEA:0x3FFB07E0 0x40082926:0x3FFB0800 0x400D5226:0x3FFF3050 0x400D5F5D:0x3FFF3070 0x400D47ED:0x3FFF3090 0x4000BD83:0x3FFF30B0 0x4000117D:0x3FFF30D0 0x400592FE:0x3FFF30F0 0x4005937A:0x3FFF3110 0x40058BBF:0x3FFF3130 0x40188D57:0x3FFF3160 0x4018ED86:0x3FFF3180 0x4018EF41:0x3FFF3490 0x4019A35D:0x3FFF34C0 0x4008EF5D:0x3FFF34F0 0x401609A5:0x3FFF3540 0x40161202:0x3FFF3570 0x401606B6:0x3FFF37C0 0x401610D7:0x3FFF37E0 0x4015563D:0x3FFF3800 0x40159F5B:0x3FFF3820 0x40195C9D:0x3FFF3840 0x40150F8C:0x3FFF3860 0x400E7756:0x3FFF38A0 0x400E7A65:0x3FFF3990 0x400E7B70:0x3FFF39E0 0x4008CC89:0x3FFF3A10
0x40111dea: task_wdt_isr at /home/sachink/project4.2/esp-idf/components/esp_common/src/task_wdt.c:189
0x40082926: _xt_lowint1 at /home/sachink/project4.2/esp-idf/components/freertos/port/xtensa/xtensa_vectors.S:1105
0x400d5226: uart_tx_char at /home/sachink/project4.2/esp-idf/components/vfs/vfs_uart.c:160 (discriminator 1)
0x400d5f5d: uart_write at /home/sachink/project4.2/esp-idf/components/vfs/vfs_uart.c:217
0x400d47ed: esp_vfs_write at /home/sachink/project4.2/esp-idf/components/vfs/vfs.c:423 (discriminator 4)
0x40188d57: __sprint_r at /builds/idf/crosstool-NG/.build/xtensa-esp32-elf/src/newlib/newlib/libc/stdio/vfprintf.c:433
(inlined by) __sprint_r at /builds/idf/crosstool-NG/.build/xtensa-esp32-elf/src/newlib/newlib/libc/stdio/vfprintf.c:403
0x4018ed86: _vfprintf_r at /builds/idf/crosstool-NG/.build/xtensa-esp32-elf/src/newlib/newlib/libc/stdio/vfprintf.c:1773 (discriminator 1)
0x4018ef41: vprintf at /builds/idf/crosstool-NG/.build/xtensa-esp32-elf/src/newlib/newlib/libc/stdio/vprintf.c:34 (discriminator 5)
0x4019a35d: esp_log_writev at /home/sachink/project4.2/esp-idf/components/log/log.c:189
0x4008ef5d: esp_log_write at /home/sachink/project4.2/esp-idf/components/log/log.c:199
0x401609a5: mbedtls_esp_debug at /home/sachink/project4.2/esp-idf/components/mbedtls/port/mbedtls_debug.c:81 (discriminator 13)
0x40161202: debug_send_line at /home/sachink/project4.2/esp-idf/components/mbedtls/mbedtls/library/debug.c:102
(inlined by) mbedtls_debug_print_msg at /home/sachink/project4.2/esp-idf/components/mbedtls/mbedtls/library/debug.c:145
0x401606b6: mbedtls_ssl_read at /home/sachink/project4.2/esp-idf/components/mbedtls/mbedtls/library/ssl_tls.c:8681
0x401610d7: __wrap_mbedtls_ssl_read at /home/sachink/project4.2/esp-idf/components/mbedtls/port/dynamic/esp_ssl_tls.c:99 (discriminator 9)
0x4015563d: esp_mbedtls_read at /home/sachink/project4.2/esp-idf/components/esp-tls/esp_tls_mbedtls.c:161
0x40159f5b: esp_tls_conn_read at /home/sachink/project4.2/esp-idf/components/esp-tls/esp_tls.h:496
(inlined by) ssl_read at /home/sachink/project4.2/esp-idf/components/tcp_transport/transport_ssl.c:168
0x40195c9d: esp_transport_read at /home/sachink/project4.2/esp-idf/components/tcp_transport/transport.c:170
0x40150f8c: esp_http_client_read at /home/sachink/project4.2/esp-idf/components/esp_http_client/esp_http_client.c:950
0x400e7756: ReadWriteOTAFile at /home/sachink/project4.2/esp-idf_GATEWAY/components/OTA/ota.c:319
0x400e7a65: PerformFirmwareUpdate at /home/sachink/project4.2/esp-idf_GATEWAY/components/OTA/ota.c:549
0x400e7b70: TaskFirmwareUpdate at /home/sachink/project4.2/esp-idf_GATEWAY/components/OTA/ota.c:688
0x4008cc89: vPortTaskWrapper at /home/sachink/project4.2/esp-idf/components/freertos/port/xtensa/port.c:168

@AxelLin Yes, it is off-topic. It seems task watchdog be triggered. I think it maybe open the log output to Debug level and log output is too much to cost much time to print, and cause the task watchdog. Maybe not a problem, need disable the debug log and test it.

@softdel1003
Copy link

Hi @shubhamkulkarni97,

I have matched the my local branch esp_http_client.txt(.c) with yours same has been attatched here, but still getting same error
�[0;31mE (380495) TRANS_SSL: esp_tls_conn_read error, errno=No more processes�[0m
D (380503) HTTP_CLIENT: need_read=7, byte_to_read=7, rlen=-26880, ridx=121�[0m
�[0;33mW (380509) HTTP_CLIENT: esp_transport_read returned:-26880 and errno:11 �[0m
D (380517) HTTP_CLIENT: Received EAGAIN! rlen = -26880, errno 11�[0m
D (380523) HTTP_CLIENT: is_data_remain=1, is_chunked=0, content_length=1974083�[0m
lwip_select(63, 0x3fff4be8, 0x0, 0x3fff4bf0, tvsec=3 tvusec=0)
tcp_slowtmr: processing active pcb
tcp_slowtmr: processing active pcb
tcp_slowtmr: polling application
tcp_slowtmr: processing active pcb
tcp_slowtmr: processing active pcb
tcp_slowtmr: polling application
tcp_slowtmr: processing active pcb
tcp_keepalive: sending KEEPALIVE probe to 20.73.224.89
tcp_keepalive: tcp_ticks 355 pcb->tmr 344 pcb->keep_cnt_sent 0
tcp_keepalive: seqno 732374796 ackno 3337200867 err
log.gateway.20210908184132.txt

esp_http_client.txt

@AxelLin
Copy link
Contributor

AxelLin commented Sep 13, 2021

@softdel1003 I think this print log in your log is normal. This log means that device want to read data, but no data to read. And your OTA process is still working. But for most cases, you can modify your code as following patch.

@@ -997,12 +997,16 @@ int esp_http_client_read(esp_http_client_handle_t client, char *buffer, int len)
                 }
                 ESP_LOG_LEVEL(sev, TAG, "esp_transport_read returned:%d and errno:%d ", rlen, errno);
             }
+            if (rlen == ESP_TLS_ERR_SSL_WANT_READ || errno == EAGAIN) {
+                // EAGAIN
+                ESP_LOGD(TAG, "Received EAGAIN! rlen = %d, errno %d", rlen, errno);
+                return ridx;
+            }
             if (rlen < 0 && ridx == 0 && !esp_http_client_is_complete_data_received(client)) {
                 http_dispatch_event(client, HTTP_EVENT_ERROR, esp_transport_get_error_handle(client->transport), 0);
                 return ESP_FAIL;
-            } else {
-                return ridx;
             }
+            return ridx;
         }
         res_buffer->output_ptr = buffer + ridx;
         http_parser_execute(client->parser, client->parser_settings, res_buffer->data, rlen);

Hi @ESP-YJM
Just want to point out the v4.3 branch also needs this fix: 1c57dbc

BTW, sometimes I got ESP_ERR_HTTP_CONNECT error with errno 11 (EAGAIN).
I'm not sure if such case is related to this fix or it's a different issue.

@ESP-YJM
Copy link
Collaborator

ESP-YJM commented Sep 13, 2021

@AxelLin Yes, we have created backport to v4.3 internal. I think the error ESP_ERR_HTTP_CONNECT is not related to this fix.
But you can merge this patch and test. If you still find the error, you can rasie a issue to discuss.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants