-
Notifications
You must be signed in to change notification settings - Fork 7.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
advanced https ota failed oftenly (IDFGH-2249) #4394
Comments
|
I try 101 times and 22 of that failed. The target server is on amazon s3, I do not know that did it support chunked-encoding actually. The following log show more detail. It looks like the ota procedure read a part of data and after that can not read anymore. D (66873) esp_https_ota: Written image length 277570 W (81523) mbedtls: ssl_tls.c:4350 ssl_get_next_record() returned -26624 (-0x6800) W (81523) mbedtls: ssl_tls.c:8341 mbedtls_ssl_read_record() returned -26624 (-0x6800) E (81533) esp-tls: read error :-26624: W (93193) mbedtls: ssl_tls.c:4350 ssl_get_next_record() returned -26624 (-0x6800) W (93193) mbedtls: ssl_tls.c:8341 mbedtls_ssl_read_record() returned -26624 (-0x6800) E (93203) esp-tls: read error :-26624: |
which toolchain are you using? tls issues should be solved in 2019r2 |
how to check the toolchain version? |
that toolchain is the old one. |
So is that the root cause which causing the ota failed?(the old toolchain version). Are there no need to modify or upgrade the esp-idf version? (v4.0-released now). Here is the log with 'debug' levle: --- idf_monitor on /dev/ttyUSB0 115200 --- rst:0x1 (POWERON_RESET),boot:0x17 (SPI_FAST_FLASH_BOOT) I (178) esp_image: segment 3: paddr=0x00035938 vaddr=0x40080400 size=0x0a6d8 ( 42712) load I (418) esp_image: segment 5: paddr=0x000d1dc8 vaddr=0x4008aad8 size=0x05cd0 ( 23760) load I (439) boot: Loaded app from partition at offset 0x10000 I (0) cpu_start: App cpu up. D (532) memory_layout: Reserved memory range 0x40080000 - 0x400907a5 D (539) memory_layout: Building list of available memory regions: I (2807) wifi: new:<11,2>, old:<1,0>, ap:<255,255>, sta:<11,2>, prof:1 D (3807) wpa: EAPOL-Key type=2 D (3807) wpa: key_info 0x8a (ver=2 keyidx=0 rsvd=0 Pairwise Ack) D (3807) wpa: key_length=16 key_data_length=22 D (3817) wpa: WPA: RX EAPOL-Key - hexdump(len=121): D (3817) wpa: 01 D (3857) wpa: 00 D (3887) wpa: dd D (3927) wpa: 8f D (3957) wpa: 00 D (3987) wpa: 00 D (4027) wpa: 00 D (4057) wpa: 00 D (4077) wpa: WPA 1/4-Way Handshake D (4077) wpa: RSN: msg 1/4 key data - hexdump(len=22): D (4087) wpa: dd D (4117) wpa: 00 D (4137) wpa: WPA: PTK derivation - A1=30:ae:a4:08:93:b8 A2=78:44:76:f8:b4:44 D (4147) wpa: WPA: PMK - hexdump(len=32): D (4147) wpa: fe D (4177) wpa: d9 D (4217) wpa: D (4217) wpa: WPA: PTK - hexdump(len=48): D (4217) wpa: 0c D (4257) wpa: 4c D (4287) wpa: df D (4317) wpa: D (4327) wpa: WPA: WPA IE for msg 2/4
D (4327) wpa: 30 D (4367) wpa: 00 D (4377) wpa: WPA Send EAPOL-Key 2/4 D (4377) wpa: WPA: TX EAPOL-Key - hexdump(len=121): D (4387) wpa: 01 D (4417) wpa: 00 D (4457) wpa: 7b D (4487) wpa: e0 D (4517) wpa: 00 D (4557) wpa: 00 D (4587) wpa: 6c D (4627) wpa: 04 D (4657) wpa: IEEE 802.1X RX: version=1 type=3 length=151 D (4657) wpa: EAPOL-Key type=2 D (4657) wpa: key_info 0x13ca (ver=2 keyidx=0 rsvd=0 Pairwise Install Ack MIC Secure Encr) D (4657) wpa: key_length=16 key_data_length=56 D (4667) wpa: WPA: RX EAPOL-Key - hexdump(len=155): D (4667) wpa: 01 D (4707) wpa: 01 D (4737) wpa: dd D (4777) wpa: 8f D (4807) wpa: dd D (4837) wpa: 00 D (4877) wpa: bc D (4907) wpa: 8c D (4937) wpa: 56 D (4977) wpa: 69 D (4997) wpa: WPA 3/4-Way Handshake D (5007) wpa: WPA: No WPA/RSN IE for this AP known. Trying to get from scan results D (5007) wpa: WPA: Found the current AP from updated scan results D (5017) wpa: tx 4/4 txcb_flags=8192 D (5017) wpa: WPA Send EAPOL-Key 4/4 D (5027) wpa: WPA: TX EAPOL-Key - hexdump(len=99): D (5027) wpa: 01 D (5067) wpa: 01 D (5097) wpa: 00 D (5127) wpa: 00 D (5167) wpa: 00 D (5197) wpa: 00 D (5237) wpa: 00 D (5247) wpa: WPA: Installing PTK to the driver. D (5247) wpa: WPA: Installing GTK to the driver (keyidx=1 tx=0 len=16). D (5257) wpa: WPA: Key negotiation completed with 78:44:76:f8:b4:44 [PTK=CCMP GTK=CCMP] I (5257) wifi: connected with test, channel 11, 40D, bssid = 78:44:76:f8:b4:44 D (5267) system_event: SYSTEM_EVENT_STA_CONNECTED, ssid:test, ssid_len:4, bssid:78:44:76:f8:b4:44, channel:11, authmode:3 D (5287) tcpip_adapter: dhcp client init ip/mask/gw to all-0 D (6967) system_event: SYSTEM_EVENT_AP_STA_GOT_IP6 address fe80:0000:0000:0000:32ae:a4ff:fe08:93b8 D (7467) tcpip_adapter: if0 dhcpc cb I (7477) tcpip_adapter: sta ip: 192.168.200.9, mask: 255.255.255.0, gw: 192.168.200.1 I (7497) example_connect: Connected to test D (10877) HTTP_CLIENT: on_message_begin |
@enricop Any specific reason why do you think this could be toolchain version related issue? @huazhenjiang Error code from earlier log |
I have been tried to add 10 seconds timeout for testing this issue, but it looks not work. the following log show the ota failed with timeout : [2019-11-29 14:38:36.120] lwip_recv_tcp: top while sock->lastdata=0x3ffd0d5c |
@huazhenjiang Is it possible to share reference code (this will need your S3 url to be public), I will try to reproduce at my end? One more suggestion that could be tried out is to modify application code per https://github.com/espressif/esp-idf/blob/master/examples/system/ota/advanced_https_ota/main/advanced_https_ota_example.c#L137 (not sure it will help or not but worth trying) |
OK, I could share all of the testing files including the firmware bin, s3 public url, log and the sniffing log with wireshark ... etc. |
@mahavirj I have the same question. IDF version is V3.2 286202c。I not set the timeout_ms(actually if i set timeout_ms=5000, the error still appear) see the log: |
@huazhenjiang I tried updating from the link that you have mentioned in readme.txt, with timeout set to 60000. The example worked fine for around 100 iterations. Can you please try setting timeout to 60000? This is a workaround, however we are trying to dig deeper to check is any bug exists. I have attached a patch for setting the timeout, for your reference. |
@shubhamkulkarni97 |
@huazhenjiang @hehao9051 I have attached a patch to fix advanced_https_ota example. It has two changes,
Can you please try running the example on release/4.0 branch after applying this patch? Feel free to get-back if you face any errors. |
I have been tried this patch and it works. The success rate of ota is improved to 97 %. |
@shubhamkulkarni97 thanks your work, but the error is still appear, when data_read == 0 && errno == EAGAIN, the next esp_https_ota_perform will return data_read == 0 && errno != EAGAIN. |
@huazhenjiang Issue will get closed once fix is officially merged in github repository. Final fix will depend on some additional tests and applicability across different ESP-IDF release (v3.x/v4.x). @hehao9051 We are trying some additional tests, will keep you posted on this. |
@hehao9051 In which scenario do you observe this problem? Can you please explain in detail? |
@huazhenjiang @hehao9051 I have attached a patch, it should fix the issues related to timeout in OTA Update. We have added an option in Can you please try running the example after applying this patch? |
I am trying to apply your patch, but I am not successful. I am on newest commit of release/v.4.0 - 20e6d46. How can i apply it? git apply ../../Downloads/bugfix_for_ota.patch/bugfix_for_ota.patch |
@Arthedian I have attached another patch, which should be applied successfully. |
|
@shubhamkulkarni97 I tried the patch which you entered 10 days ago and it worked on WiFi for me, so I want to thank you for it. I am trying to use it with SIM800L but it is taking too long and still saying “HW FIFO Overflow”. I probably know where is the problem, and I already tried to tweak some uart setting but without any luck. Do you know how to fix it? |
@Arthedian OTA Update may take some time depending on data rates. For HW FIFO Overflow issue, can you please provide logs? |
@shubhamkulkarni97 I created a new issue for this problem here |
example_test.py is added to test advanced_https_ota_example and native ota_example. Closes #4394
example_test.py is added to test advanced_https_ota_example and native ota_example. Closes #4394
example_test.py is added to test advanced_https_ota_example and native ota_example. Closes #4394
Environment:
Development Kit: ESP32-DevKitC
Kit version (for WroverKit/PicoKit/DevKitC): v2
Module or chip used: ESP32-WROOM-32
IDF version (run git describe --tags to find it): v4.0-released
Operating System: Windows 10
Power Supply: USB
###Problem Description
I try the advanced https ota example to test ota . And I observer the ota procedure get low success rate. The ota get success sometimes, but worse oftenly. The following show the log :
###Expected Behavior
ota successful oftenly
###Actual Behavior
ota failed oftenly
###Steps to repropduce
Just copy a pem of https server and run the advanced https ota example
###Code to reproduce this issue
none.
###Debug Logs
I (1990) wifi: new:<11,2>, old:<1,0>, ap:<255,255>, sta:<11,2>, prof:1
I (2960) wifi: state: init -> auth (b0)
I (2970) wifi: state: auth -> assoc (0)
I (2970) wifi: state: assoc -> run (10)
I (2980) wifi: connected with test, channel 11, 40D, bssid = 78:44:76:f8:b4:44
I (2990) wifi: pm start, type: 1
I (5110) tcpip_adapter: sta ip: 192.168.200.18, mask: 255.255.255.0, gw: 192.168.200.1
I (5110) example_connect: Connected to test
I (5110) example_connect: IPv4 address: 192.168.200.18
I (5120) example_connect: IPv6 address: fe80:0000:0000:0000:266f:28ff:fed7:a8f4
I (5130) advanced_https_ota_example: Starting Advanced OTA example
I (12030) esp_https_ota: Starting OTA...
I (12030) esp_https_ota: Writing to partition subtype 16 at offset 0x110000
I (15450) advanced_https_ota_example: Running firmware version: 1.1.5-3-gebd9eab-dirty
E (23230) esp-tls: read error :-76:
E (23230) TRANS_SSL: esp_tls_conn_read error, errno=No more processes
E (213020) esp-tls: read error :-76:
E (213020) TRANS_SSL: esp_tls_conn_read error, errno=No more processes
E (259580) esp-tls: read error :-76:
E (259580) TRANS_SSL: esp_tls_conn_read error, errno=No more processes
E (264660) esp-tls: read error :-76:
E (264660) TRANS_SSL: esp_tls_conn_read error, errno=No more processes
E (319000) esp-tls: read error :-76:
E (319000) TRANS_SSL: esp_tls_conn_read error, errno=No more processes
E (336510) esp-tls: read error :-76:
E (336510) TRANS_SSL: esp_tls_conn_read error, errno=No more processes
I (341510) esp_https_ota: Connection closed, all data received
I (341510) esp_image: segment 0: paddr=0x00110020 vaddr=0x3f400020 size=0x2c480 (181376) map
I (341650) esp_image: segment 1: paddr=0x0013c4a8 vaddr=0x3ffb0000 size=0x03540 ( 13632)
I (341660) esp_image: segment 2: paddr=0x0013f9f0 vaddr=0x40080000 size=0x00400 ( 1024)
0x40080000: _WindowOverflow4 at /home/huazhen/Work/mmwave-esp32/esp-idf/components/freertos/xtensa_vectors.S:1778
I (341660) esp_image: segment 3: paddr=0x0013fdf8 vaddr=0x40080400 size=0x00218 ( 536)
I (341670) esp_image: segment 4: paddr=0x00140018 vaddr=0x400d0018 size=0xb96fc (759548) map
0x400d0018: _stext at ??:?
E (342250) esp_image: invalid segment length 0xffffffff
E (342260) advanced_https_ota_example: ESP_HTTPS_OTA upgrade failed...
I (343260) wifi: state: run -> init (0)
I (343260) wifi: pm stop, total sleep time: 299923894 us / 340271911 us
I (343260) wifi: new:<11,0>, old:<11,2>, ap:<255,255>, sta:<11,2>, prof:1
I (343270) example_connect: Wi-Fi disconnected, trying to reconnect...
E (343270) wifi: esp_wifi_connect 1151 wifi not start
ESP_ERROR_CHECK failed: esp_err_t 0x3002 (ESP_ERR_WIFI_NOT_STARTED)I (343280) wifi: flush txq
I (343280) wifi: stop sw txq
I (343290) wifi: lmac stop hw txq
I (12) boot: ESP-IDF 1.1.5-3-gebd9eab-dirty 2nd stage bootloader
I (12) boot: compile time 10:00:50
I (12) boot: Enabling RNG early entropy source...
I (17) boot: SPI Speed : 40MHz
I (21) boot: SPI Mode : DIO
I (25) boot: SPI Flash Size : 4MB
I (29) boot: Partition Table:
I (33) boot: ## Label Usage Type ST Offset Length
I (40) boot: 0 nvs WiFi data 01 02 00009000 00004000
I (47) boot: 1 otadata OTA data 01 00 0000d000 00002000
I (55) boot: 2 phy_init RF data 01 01 0000f000 00001000
I (62) boot: 3 factory factory app 00 00 00010000 00100000
I (70) boot: 4 ota_0 OTA app 00 10 00110000 00100000
I (77) boot: 5 ota_1 OTA app 00 11 00210000 00100000
I (85) boot: End of partition table
I (89) boot: Defaulting to factory image
I (94) esp_image: segment 0: paddr=0x00010020 vaddr=0x3f400020 size=0x1dafc (121596) map
I (146) esp_image: segment 1: paddr=0x0002db24 vaddr=0x3ffb0000 size=0x024ec ( 9452) load
I (150) esp_image: segment 2: paddr=0x00030018 vaddr=0x400d0018 size=0x8eb04 (584452) map
0x400d0018: _stext at ??:?
I (362) esp_image: segment 3: paddr=0x000beb24 vaddr=0x3ffb24ec size=0x00d60 ( 3424) load
I (363) esp_image: segment 4: paddr=0x000bf88c vaddr=0x40080000 size=0x00400 ( 1024) load
0x40080000: _WindowOverflow4 at /home/huazhen/Work/mmwave-esp32/esp-idf/components/freertos/xtensa_vectors.S:1778
I (369) esp_image: segment 5: paddr=0x000bfc94 vaddr=0x40080400 size=0x1036c ( 66412) load
I (415) boot: Loaded app from partition at offset 0x10000
I (416) boot: Disabling RNG early entropy source...
The text was updated successfully, but these errors were encountered: