httpclient has unexpected and unhandled error that causes the call to not return during esp_https_ota (IDFGH-4543) #6364
Comments
Thanks for the detailed report, and sorry for the inconvenience. We will look into it. |
Thank you @Alvin1Zhang . I disconnected the internet cable from my development router while the system was downloading the OTA image, and got the same error as described above. This may be a way for you to reproduce the issue.
|
Hi @arntdr, thanks for reporting this issue. I have attached a patch below, which should fix the issue. Please try the patch and let me know if it fixes the issue. When running the example after applying the patch, please enable the keep_alive config in esp_http_client_config_t by adding the following line:
0001-esp_tls-Add-option-to-enable-keep_alive-timeout.patch.zip Thanks, |
@shubhamkulkarni97 |
@AxelLin, this patch works only with HTTPS, changes for HTTP are under development. |
BTW, below logic looks strange to me:
The real keep_idle setting may be different from user provided keep_idle setting. |
@AxelLin keep_idle time should be ideally equal to connection timeout, that's why it was added this way. However, I'll consider this point in internal discussion |
keep_idle time has nothing to do with connection timeout. |
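For readers following the keep_idle discussion: in TCP keepalive, the idle time is how long a connection may stay silent before the first probe is sent, which is a separate setting from a connect timeout. Below is a minimal sketch using plain socket options as found on Linux and in lwIP. It is not the contents of the attached patch, and the helper name enable_tcp_keepalive and the values in the usage note are hypothetical.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: enable TCP keepalive on an existing TCP socket.
 *   idle_s     seconds of silence before the first probe is sent
 *   interval_s seconds between unanswered probes
 *   count      unanswered probes before the connection is reported dead
 * Returns 0 on success, -1 on failure (errno is set by setsockopt). */
int enable_tcp_keepalive(int fd, int idle_s, int interval_s, int count)
{
    int on = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof(idle_s)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval_s, sizeof(interval_s)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof(count)) < 0)
        return -1;
    return 0;
}
```

With idle 5 s, interval 2 s and count 3, a silent peer would be detected after roughly 5 + 2 * 3 = 11 seconds, independent of whatever timeout was used to establish the connection.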
Thank you for looking into this @shubhamkulkarni97. Git could not apply your patch to v4.2-47-g2532ddd9f. I was able to apply it to master (commit 2bfdd03), but my project crashes on boot with this SDK version:
|
@arntdr Sorry for not mentioning that the patch is for master branch.
|
@shubhamkulkarni97 regarding your suggestions:
|
Any chance you have log level set to esp-idf/components/log/Kconfig Line 22 in 2bfdd03
INFO . This looks like different issue with current master branch, will fix that separately.
|
Thanks, @mahavirj I was able to run the patch with debug level info. It appears that the patch provided by @shubhamkulkarni97 solved the issue. This is much appreciated, thank you again. When can we expect this fix to land on a public branch in the repo? What is the timeline for a release that includes this fix? |
@arntdr This has been merged internally, it will likely appear on github with next codebase sync event (automated process but it should happen in next few days). For release, this will be part of v4.3 release, expected (roughly) towards end of Feb timeline. |
Fixed with ba1c8ce |
Hi Mahavirj, I am at the head commit 47b96db (HEAD, origin/release/v4.3), and still getting the same error message for OTA when using a slower network:
E (19188) TRANS_SSL: esp_tls_conn_read error, errno=No more processes
I have the configuration below for the https client:
|
@softdel1003 Could you please change |
I have set the timeout_ms to 30000, but still getting same issue |
@softdel1003 Could you please enable debug log output level to |
@ESP-YJM please find attached log with debug log enable for socket debug log and TCP debug |
@softdel1003 I think this line in your log is normal. It means that the device wants to read data, but there is no data to read. Your OTA process is still working. For most cases, you can modify your code as in the following patch.
|
@ESP-YJM Where shall I add this code, could you be more elaborate? |
@softdel1003 Sorry for that. You can add code before https://github.com/espressif/esp-idf/blob/master/components/esp_http_client/esp_http_client.c#L1000. |
@ESP-YJM I have done below changes in code, but still getting same issue. please find attached logs
log.gateway.20210908150055.txt
|
@softdel1003 Yes, as I said, this log is normal. Is this log what causes the OTA to fail? |
Above log shows (this might be off-topic, but this should not happen; something looks wrong):
I (168503) mbedtls
E (168505) task_wdt: Task watchdog got triggered. The following tasks did not reset the watchdog in time:
Backtrace:0x40111DEA:0x3FFB07E0 0x40082926:0x3FFB0800 0x400D5226:0x3FFF3050 0x400D5F5D:0x3FFF3070 0x400D47ED:0x3FFF3090 0x4000BD83:0x3FFF30B0 0x4000117D:0x3FFF30D0 0x400592FE:0x3FFF30F0 0x4005937A:0x3FFF3110 0x40058BBF:0x3FFF3130 0x40188D57:0x3FFF3160 0x4018ED86:0x3FFF3180 0x4018EF41:0x3FFF3490 0x4019A35D:0x3FFF34C0 0x4008EF5D:0x3FFF34F0 0x401609A5:0x3FFF3540 0x40161202:0x3FFF3570 0x401606B6:0x3FFF37C0 0x401610D7:0x3FFF37E0 0x4015563D:0x3FFF3800 0x40159F5B:0x3FFF3820 0x40195C9D:0x3FFF3840 0x40150F8C:0x3FFF3860 0x400E7756:0x3FFF38A0 0x400E7A65:0x3FFF3990 0x400E7B70:0x3FFF39E0 0x4008CC89:0x3FFF3A10 |
@ESP-YJM If you check the logs, on line number 10947 you can see the log below, where OTA fails:
D (137979) SSL TLS: RX left 16408 bytes |
Hi @softdel1003,
Also, instead of running custom OTA implementation, please try running default OTA example that is bundled with IDF. |
@softdel1003 I don't think the OTA failed. You can check line 11076: OTA still works. |
@AxelLin Yes, it is off-topic. It seems the task watchdog was triggered. I think the cause may be that the log output level is set to Debug, and printing that much log output takes long enough to trigger the task watchdog. It may not be a real problem; the debug log needs to be disabled and the test repeated. |
I have matched my local esp_http_client.c with yours (attached here as esp_http_client.txt), but I am still getting the same error. |
Hi @ESP-YJM BTW, sometimes I got ESP_ERR_HTTP_CONNECT error with errno 11 (EAGAIN). |
@AxelLin Yes, we have created backport to v4.3 internal. I think the error |
Environment
Output of git describe --tags: v4.2-47-g2532ddd9f
Output of xtensa-esp32-elf-gcc --version: xtensa-esp32-elf-gcc (crosstool-NG esp-2020r3) 8.4.0
Problem Description
Sometimes when I call esp_err_t ret = esp_https_ota(&config); the function never returns. It prints a few log messages referencing EAGAIN. There is no more OTA-relevant information in the log.
Expected Behavior
esp_https_ota should not block for 10+ hours.
For esp_https_ota, I expect the library to handle any errors or return a suitable esp_err_t.
EAGAIN is related to the non-blocking socket API, but esp_https_ota is blocking (as noted here). Thus it seems EAGAIN should not be a possible errno even in the internal implementation of esp_https_ota.
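One nuance on the EAGAIN point: on POSIX-style sockets, a blocking socket that has a receive timeout set reports an expired timeout as EAGAIN (or EWOULDBLOCK), so the errno can show up without any non-blocking mode being involved. The sketch below demonstrates this on a local socket pair. Whether the ESP-IDF transport layer reaches EAGAIN this way is an assumption, not something this thread confirms, and the function name errno_after_timed_out_recv is hypothetical.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: create a connected socket pair, set a receive timeout
 * on one end, and try to read while the peer stays silent.
 * timeout_ms must be > 0 (a zero timeout means "block forever").
 * Returns the errno observed by recv(), 0 if recv() unexpectedly
 * succeeded, or -1 if the setup failed. */
int errno_after_timed_out_recv(int timeout_ms)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    struct timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    if (setsockopt(sv[0], SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    char buf[16];
    errno = 0;
    /* The socket is still in blocking mode; recv() waits up to timeout_ms. */
    ssize_t n = recv(sv[0], buf, sizeof(buf), 0);
    int saved = (n < 0) ? errno : 0;

    close(sv[0]);
    close(sv[1]);
    return saved;
}
```

If this is what happens inside the client, the EAGAIN log line by itself would indicate a read that timed out rather than a misuse of the socket API; the hang would then come from how the caller reacts to it.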
Actual Behavior
When esp_https_ota does not return, I get log output that looks like this:
No more OTA logs are printed. The MQTT client on the system keeps working.
The issue only occurs on the office Wi-Fi network. When using an alternative Wi-Fi network that is used for development only, the issue went away.
Steps to reproduce
I have not found a good way to reproduce this issue. As it is dependent on the Wi-Fi network used, I expect that this is tricky to reproduce. When testing in my environment, I ran the code above on startup and patched esp_https_ota to do esp_restart before esp_https_ota_finish. This results in a system that downloads the OTA image, then reboots. Eventually, it will hang in esp_https_ota as described above.
Other items
It may be that the issue is a lack of error handling in esp_https_ota. Alternatively, there is an error in the HTTP client that should not occur.
Please let me know if there is any additional information I can provide.