Crash in OTA agent in function prvPublishStatusMessage() #3110
Comments
Thank you for reporting this. We are working on reproducing and fixing it. |
Hello @pvyawaha , I don't know if it's the same problem but the OTA agent keeps crashing after starting up. We are using AWS FreeRTOS (v202012.00) with an ESP32 board and the aws_iot_ota_agent.h.
System information
|
Hello @dFohlen, are you running the OTA demo as is, or is it integrated with your application? Are you also sharing the connection for other MQTT operations? |
Hello @bgklika , We are working on integrating the LTS OTA library in this repo and will be verifying the issue. |
Hi @pvyawaha, Thanks for the answer
Integrated, but it's mainly copied from the test sources:
Not currently (in this example), but we want to use the OTA agent for that. |
Hello, do you still see the issue with the latest version of the library, or can this ticket be closed? |
I fixed that bug locally with a patch before reporting the issue. If the bug is actually resolved in the main repo, you can close the issue. |
I've fixed it too. After updating to ESP-IDF 4.2 and using the MQTT + OTA agent, the problem no longer occurred. |
Thank you, closing this issue. |
Describe the bug
While receiving OTA job blocks, it is possible for prvPublishStatusMessage() to use a NULL pointer from pxAgentCtx->pcOTA_Singleton_ActiveJobName.
When a couple of blocks are received (even duplicates) but writing to file/flash fails, the following code block in prvProcessDataHandler()
/* Free any remaining string memory holding the job name since this job is done. */
if( xOTA_Agent.pcOTA_Singleton_ActiveJobName != NULL )
{
    vPortFree( xOTA_Agent.pcOTA_Singleton_ActiveJobName );
    xOTA_Agent.pcOTA_Singleton_ActiveJobName = NULL;
}
is executed.
At the same time, the next block may already have been received; on the next task cycle it is processed by the same prvProcessDataHandler(), and the code
xErr = xOTA_ControlInterface.prvUpdateJobStatus( &xOTA_Agent, eJobStatus_FailedWithVal, ( int32_t ) xCloseResult, ( int32_t ) xResult );
is called.
Unfortunately, prvPublishStatusMessage(), called from prvUpdateJobStatus_Mqtt(), does not check pxAgentCtx->pcOTA_Singleton_ActiveJobName (or pxAgentCtx->pcThingName) before executing this code block:
` /* Try to build the dynamic job status topic . */
ulTopicLen = ( uint32_t ) snprintf( pcTopicBuffer, /*lint -e586 Intentionally using snprintf. */
sizeof( pcTopicBuffer ),
pcOTA_JobStatus_TopicTemplate,
pxAgentCtx->pcThingName,
pxAgentCtx->pcOTA_Singleton_ActiveJobName );
`
As a result, an exception occurs.
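The kind of guard that would avoid this can be sketched as follows. This is only an illustration, not the patch linked under "Code to FIX the bug" and not the library's code: the struct `AgentCtxSketch_t`, the function `buildJobStatusTopicSketch()`, and the template string are hypothetical stand-ins, with the template shape inferred from the topic printed in the log output. The idea is to refuse to build the topic when either string pointer is NULL, and to treat a truncated result as a failure as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the agent context: only the two fields named in this report. */
typedef struct
{
    const char * pcThingName;
    const char * pcOTA_Singleton_ActiveJobName;
} AgentCtxSketch_t;

/* Assumed template shape, inferred from the job status topic seen in the log output. */
static const char pcJobStatusTopicTemplateSketch[] = "$aws/things/%s/jobs/%s/update";

/* Builds the job status topic into pcTopicBuffer.
 * Returns the topic length, or 0 if the topic cannot be built safely. */
size_t buildJobStatusTopicSketch( const AgentCtxSketch_t * pxAgentCtx,
                                  char * pcTopicBuffer,
                                  size_t xBufferSize )
{
    int lLen;

    assert( pcTopicBuffer != NULL );
    assert( xBufferSize > 0U );

    pcTopicBuffer[ 0 ] = '\0';

    /* The job name may already have been freed and set to NULL by an earlier
     * failure path, so check both strings before handing them to snprintf. */
    if( ( pxAgentCtx == NULL ) ||
        ( pxAgentCtx->pcThingName == NULL ) ||
        ( pxAgentCtx->pcOTA_Singleton_ActiveJobName == NULL ) )
    {
        return 0U;
    }

    lLen = snprintf( pcTopicBuffer,
                     xBufferSize,
                     pcJobStatusTopicTemplateSketch,
                     pxAgentCtx->pcThingName,
                     pxAgentCtx->pcOTA_Singleton_ActiveJobName );

    /* A negative value is an encoding error; a value >= xBufferSize means truncation. */
    if( ( lLen < 0 ) || ( ( size_t ) lLen >= xBufferSize ) )
    {
        pcTopicBuffer[ 0 ] = '\0';
        return 0U;
    }

    return ( size_t ) lLen;
}
```

A caller would skip the publish when the function returns 0, so a second failed block after the job name has been released results in a skipped status update instead of a crash.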
System information
Which hardware board or part numbers?
ESP32_DevKitC_V4
IDE used
ESP-IDF 3.3
Operating System [Windows|Linux|MacOS]
Windows 10
Version of FreeRTOS (run `git describe --tags` to find it)
202012.00
Project
Custom Application
If your project is a Custom Application, please add the relevant code snippet in the section "Code to reproduce the bug".
No special code is needed.
Expected behavior
Even if the OTA image is bad or the network connection is unstable, the application should not crash.
Screenshots or console output
Log output:
`3671 137397 [OTA Agent Task] [prvRequestFileBlock_Mqtt] OK: $aws/things/ac:67:b2:6e:81:4c/streams/AFR_OTA-61976ca5-1a73-4568-aebb-85536f0be3f1/get/cbor
3672 137397 [OTA Agent Task] [prvExecuteHandler] Called handler. Current State [WaitingForFileBlock] Event [RequestFileBlock] New state [WaitingForFileBlock]
3673 137494 [OTA Agent Task] [prvIngestDataBlock] Received file block 478, size 4096
3674 137505 [OTA Agent Task] [prvIngestDataBlock] Remaining: 103
3675 137505 [OTA Agent Task] [prvExecuteHandler] Called handler. Current State [WaitingForFileBlock] Event [ReceivedFileBlock] New state [WaitingForFileBlock]
3676 137508 [OTA Agent Task] [prvRequestFileBlock_Mqtt] OK: $aws/things/ac:67:b2:6e:81:4c/streams/AFR_OTA-61976ca5-1a73-4568-aebb-85536f0be3f1/get/cbor
3677 137508 [OTA Agent Task] [prvExecuteHandler] Called handler. Current State [WaitingForFileBlock] Event [RequestFileBlock] New state [WaitingForFileBlock]
3678 137596 [OTA Agent Task] [prvIngestDataBlock] Received file block 478, size 4096
3679 137597 [OTA Agent Task] [prvIngestDataBlock] block 478 is a DUPLICATE. 103 blocks remaining.
3680 137597 [OTA Agent Task] [prvExecuteHandler] Called handler. Current State [WaitingForFileBlock] Event [ReceivedFileBlock] New state [WaitingForFileBlock]
3681 137600 [OTA Agent Task] [prvRequestFileBlock_Mqtt] OK: $aws/things/ac:67:b2:6e:81:4c/streams/AFR_OTA-61976ca5-1a73-4568-aebb-85536f0be3f1/get/cbor
3682 137600 [OTA Agent Task] [prvExecuteHandler] Called handler. Current State [WaitingForFileBlock] Event [RequestFileBlock] New state [WaitingForFileBlock]
3683 137694 [OTA Agent Task] [prvIngestDataBlock] Received file block 479, size 4096
3684 137704 [OTA Agent Task] [prvIngestDataBlock] Remaining: 102
3685 137705 [OTA Agent Task] [prvExecuteHandler] Called handler. Current State [WaitingForFileBlock] Event [ReceivedFileBlock] New state [WaitingForFileBlock]
3686 137708 [OTA Agent Task] [prvRequestFileBlock_Mqtt] OK: $aws/things/ac:67:b2:6e:81:4c/streams/AFR_OTA-61976ca5-1a73-4568-aebb-85536f0be3f1/get/cbor
3687 137708 [OTA Agent Task] [prvExecuteHandler] Called handler. Current State [WaitingForFileBlock] Event [RequestFileBlock] New state [WaitingForFileBlock]
3688 137795 [OTA Agent Task] [prvIngestDataBlock] Received file block 479, size 4096
3689 137795 [OTA Agent Task] [prvIngestDataBlock] block 479 is a DUPLICATE. 102 blocks remaining.
3690 137796 [OTA Agent Task] [prvExecuteHandler] Called handler. Current State [WaitingForFileBlock] Event [ReceivedFileBlock] New state [WaitingForFileBlock]
3691 137799 [OTA Agent Task] [prvRequestFileBlock_Mqtt] OK: $aws/things/ac:67:b2:6e:81:4c/streams/AFR_OTA-61976ca5-1a73-4568-aebb-85536f0be3f1/get/cbor
3692 137799 [OTA Agent Task] [prvExecuteHandler] Called handler. Current State [WaitingForFileBlock] Event [RequestFileBlock] New state [WaitingForFileBlock]
E (138064) ota_pal: Couldn't flash at the offset 1966080
I (138065) ota_pal: prvPAL_SetPlatformImageState, 3
W (138065) ota_pal: Set image as invalid!
I (138070) esp_ota_ops: aws_esp_ota_get_boot_flags: 1
I (138076) esp_ota_ops: [0] aflags/seq:0x2/0x1, pflags/seq:0xffffffff/0x0
I (138083) esp_ota_ops: aws_esp_ota_set_boot_flags: 3 0
I (138089) esp_ota_ops: [1] aflags/seq:0xffffffff/0x0, pflags/seq:0x2/0x1
3693 137898 [OTA Agent Task] [prvIngestDataBlock] Received file block 480, size 4096
3694 137900 [OTA Agent Task] [prvIngestDataBlock] Error (-1) writing file block
3695 137900 [OTA Agent Task] [prvStopRequestTimer] Stopping request timer.
3696 137900 [OTA Agent Task] [prvProcessDataMessage] Aborting due to IngestResult_t error -9
3697 138053 [OTA Agent Task] [prvPublishStatusMessage] Msg: {"status":"FAILED","statusDetails":{"reason":"0x27000000: 0xfffffff7"}}
E (138466) ota_pal: Couldn't flash at the offset 1966080
I (138468) ota_pal: prvPAL_SetPlatformImageState, 3
W (138468) ota_pal: Set image as invalid!
I (138472) esp_ota_ops: aws_esp_ota_get_boot_flags: 1
I (138478) esp_ota_ops: [0] aflags/seq:0x2/0x1, pflags/seq:0x3/0x0
I (138485) esp_ota_ops: aws_esp_ota_set_boot_flags: 3 0
I (138491) esp_ota_ops: [1] aflags/seq:0x3/0x0, pflags/seq:0x2/0x1
3698 138295 [OTA Agent Task] [prvPublishStatusMessage] 'FAILED' to $aws/things/ac:67:b2:6e:81:4c/jobs/AFR_OTA-2c6e8e81-8e7b-46de-9fa5-c996dae3577d/update
3699 138300 [OTA Agent Task] [DEBUG][SPRKPLG_PROC][138300] Sparkplug context updated flag set
3700 138300 [OTA Agent Task] [INFO ][OTA MGR][138300] Received eOTA_JobEvent_Fail callback from OTA Agent.
3701 138300 [OTA Agent Task] [prvExecuteHandler] Called handler. Current State [WaitingForFileBlock] Event [ReceivedFileBlock] New state [WaitingForFileBlock]
3702 138301 [OTA Agent Task] [prvIngestDataBlock] Received file block 480, size 4096
3703 138302 [OTA Agent Task] [prvIngestDataBlock] Error (-1) writing file block
3704 138302 [OTA Agent Task] [prvStopRequestTimer] Stopping request timer.
3705 138303 [OTA Agent Task] [prvProcessDataMessage] Aborting due to IngestResult_t error -9
3706 138377 [appCoreTask] [DEBUG][SPRKPLG_PROC][138377] UpdateWiFiMetricsPeriodic
3707 138381 [appCoreTask] [DEBUG][SPRKPLG_PROC][138381] Publishing NDATA message. Attempt #1
Guru Meditation Error: Core 0 panic'ed (LoadProhibited). Exception was unhandled.
Core 0 register dump:
PC : 0x400014fd PS : 0x00060130 A0 : 0x800da444 A1 : 0x3fffbf70
A2 : 0x00000000 A3 : 0xfffffffc A4 : 0x000000ff A5 : 0x0000ff00
A6 : 0x00ff0000 A7 : 0xff000000 A8 : 0x00000000 A9 : 0x00000011
A10 : 0x3fffc35c A11 : 0x3ffbf7d8 A12 : 0x3fffc36d A13 : 0x00010000
A14 : 0x00000001 A15 : 0x00000005 SAR : 0x00000013 EXCCAUSE: 0x0000001c
EXCVADDR: 0x00000000 LBEG : 0x400014fd LEND : 0x4000150d LCOUNT : 0xffffffff
ELF file SHA256: 40c8cf172026c04b
Backtrace: 0x400014fd:0x3fffbf70 0x400da441:0x3fffbf80 0x400d84da:0x3fffc290 0x4011abec:0x3fffc350 0x4011aed3:0x3fffc470 0x40118f85:0x3fffc510 0x40117b11:0x3fffc540 0x40118d76:0x3fffc560
0x400da441: _svfprintf_r at /Users/ivan/e/newlib_xtensa-2.2.0-bin/newlib_xtensa-2.2.0/xtensa-esp32-elf/newlib/libc/stdio/../../../.././newlib/libc/stdio/vfprintf.c:1529
0x400d84da: snprintf at /Users/ivan/e/newlib_xtensa-2.2.0-bin/newlib_xtensa-2.2.0/xtensa-esp32-elf/newlib/libc/stdio/../../../.././newlib/libc/stdio/snprintf.c:116
0x4011abec: prvPublishStatusMessage at H:/Work/Klika-tech/Cubic/sources/crrr-cbcq/esp32_firmware/src/amazon-freertos/libraries/freertos_plus/aws/ota/src/mqtt/aws_iot_ota_mqtt.c:395
0x4011aed3: prvUpdateJobStatus_Mqtt at H:/Work/Klika-tech/Cubic/sources/crrr-cbcq/esp32_firmware/src/amazon-freertos/libraries/freertos_plus/aws/ota/src/mqtt/aws_iot_ota_mqtt.c:768
0x40118f85: prvProcessDataHandler at H:/Work/Klika-tech/Cubic/sources/crrr-cbcq/esp32_firmware/src/amazon-freertos/libraries/freertos_plus/aws/ota/src/aws_iot_ota_agent.c:3331
0x40117b11: prvExecuteHandler at H:/Work/Klika-tech/Cubic/sources/crrr-cbcq/esp32_firmware/src/amazon-freertos/libraries/freertos_plus/aws/ota/src/aws_iot_ota_agent.c:3331
0x40118d76: prvOTAAgentTask at H:/Work/Klika-tech/Cubic/sources/crrr-cbcq/esp32_firmware/src/amazon-freertos/libraries/freertos_plus/aws/ota/src/aws_iot_ota_agent.c:3331
Rebooting...
`
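A note on reading the numbers in this log: the second field of the status reason, 0xfffffff7, is the 32-bit two's-complement form of -9, which matches the "IngestResult_t error -9" line and is presumably the xResult argument passed to prvUpdateJobStatus(). EXCCAUSE 0x0000001c is decimal 28, reported here as LoadProhibited, and EXCVADDR 0x00000000 is consistent with a read through a NULL pointer. The helper below is a hypothetical convenience for that conversion, not part of the OTA library.

```c
#include <stdint.h>

/* Hypothetical helper: interpret a 32-bit hex field from the log as a signed value. */
int32_t signedFromHexField( uint32_t ulField )
{
    /* Values above INT32_MAX represent negative numbers in two's complement,
     * so subtract 2^32 using a wider type to avoid implementation-defined casts. */
    if( ulField <= ( uint32_t ) INT32_MAX )
    {
        return ( int32_t ) ulField;
    }

    return ( int32_t ) ( ( int64_t ) ulField - 4294967296LL );
}
```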
To reproduce
Steps to reproduce the behavior:
Code to FIX the bug
Please find the patch here https://gist.github.com/bgklika/ceeaa6fa35c1bdfa31a9e67e220b9c21