-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BUG] No signal to coreMQTT-Agent network manager that OTA job has stopped, once OTA is successful #73
Comments
@NightSkySK Thanks for the bug report. We will look into it. |
@paulbartell Did you managed to look or replicate this issue |
Is the same thing I am seeing. I have successfully run the OTA update, but the AWS Jobs console always reports it as a failure, as the expected "SUCCESS" update to job/{job_id}/update never happens. This can be seen here:
The last thing AWS hears (at timestamp 190624) is that the job is still in progress! |
I've finally found some time to catch the exact logs of the situation which I've observed. Full log: Initial device configuartion: I (871) App_Version: Version: t0.2.4
I (871) App_Version: Project Name: mSpaceGate
I (871) App_Version: Compile Time: 13:31:44
I (881) App_Version: Compile Date: Mar 2 2024
I (881) App_Version: SOFTWARE VERSION: 0.2.4
I (911) ota_over_mqtt: OTA over MQTT demo, Application version 0.2.4 Once new job is received the other task are suspended ( I (17291) ota_over_mqtt: Received job message callback, size 492.
I (17301) AWS_OTA: Extracted parameter: [key: value]=[execution.jobId: AFR_OTA-mSpaceGate_t0_2_5]
I (17311) AWS_OTA: Extracted parameter: [key: value]=[execution.jobDocument.afr_ota.streamname: AFR_OTA-a5ccc605-4e35-4db8-a17f-ed918d372f7c]
I (17331) AWS_OTA: Extracted parameter: [key: value]=[execution.jobDocument.afr_ota.protocols: ["MQTT"]]
I (17341) AWS_OTA: Extracted parameter: [key: value]=[filepath: /]
I (17341) AWS_OTA: Extracted parameter: [key: value]=[filesize: 1312496]
I (17351) AWS_OTA: Extracted parameter: [key: value]=[fileid: 0]
I (17361) AWS_OTA: Extracted parameter: [key: value]=[certfile: Code Verify Key]
I (17361) AWS_OTA: Extracted parameter [ sig-sha256-ecdsa: MEUCIQDsSDL3R+0JJpIK0HfC/mrhfjg1... ]
I (17371) AWS_OTA: Job document was accepted. Attempting to begin the update.
I (17381) AWS_OTA: Job parsing success: OtaJobParseErr_t=OtaJobParseErrNone, Job name=AFR_OTA-mSpaceGate_t0_2_5
I (17391) ota_over_mqtt: Received OtaJobEventReceivedJob callback from OTA Agent.
I (17401) AWS_OTA: otaPal_GetPlatformImageState
I (17401) report: OTA started. Preventing coreMQTT-Agent commands from being enqueued.
I (17401) esp_ota_ops: aws_esp_ota_get_boot_flags: 1
I (17411) sub_pub_unsub: OTA started. Preventing coreMQTT-Agent commands from being enqueued.
I (17421) esp_ota_ops: [0] aflags/seq:0x2/0x1, pflags/seq:0xffffffff/0x0
I (17431) core_mqtt_agent_manager: OTA started. After new firmware is fully downloaded device reboot already with new SW version ( I (225191) AWS_OTA: Number of blocks remaining: 1
I (225191) ota_over_mqtt: OTA Event processing completed. Freeing the event buffer to pool.
I (225201) AWS_OTA: Current State=[WaitingForFileBlock], Event=[ReceivedFileBlock], New state=[WaitingForFileBlock]
I (226171) coreMQTT: Publishing message to $aws/things/GA913F0001/streams/AFR_OTA-a5ccc605-4e35-4db8-a17f-ed918d372f7c/get/cbor.
I (226171) ota_over_mqtt: Sent PUBLISH packet to broker $aws/things/GA913F0001/streams/AFR_OTA-a5ccc605-4e35-4db8-a17f-ed918d372f7c/get/cbor to broker.
I (226191) AWS_OTA: Published to MQTT topic to request the next block: topic=$aws/things/GA913F0001/streams/AFR_OTA-a5ccc605-4e35-4db8-a17f-ed918d372f7c/get/cbor
I (226201) AWS_OTA: Current State=[WaitingForFileBlock], Event=[RequestFileBlock], New state=[WaitingForFileBlock]
I (227291) coreMQTT: De-serialized incoming PUBLISH packet: DeserializerResult=MQTTSuccess.
I (227291) coreMQTT: State record updated. New state=MQTTPublishDone.
I (227301) AWS_OTA: Received valid file block: Block index=320, Size=1776
I (227311) AWS_OTA: Received final block of the update.
I (227931) AWS_OTA: Signature verification succeeded.
I (227931) AWS_OTA: Received entire update and validated the signature.
I (228301) coreMQTT: Publishing message to $aws/things/GA913F0001/jobs/AFR_OTA-mSpaceGate_t0_2_5/update.
I (229541) coreMQTT: Ack packet deserialized with result: MQTTSuccess.
I (229541) coreMQTT: State record updated. New state=MQTTPublishDone.
I (229541) ota_over_mqtt: Sent PUBLISH packet to broker $aws/things/GA913F0001/jobs/AFR_OTA-mSpaceGate_t0_2_5/update to broker.
I (229551) ota_over_mqtt: Received OtaJobEventActivate callback from OTA Agent.
I (229541) coreMQTT: De-serialized incoming PUBLISH packet: DeserializerResult=MQTTSuccess.
I (229561) esp_image: segment 0: paddr=001f0020 vaddr=3f400020 size=3aecch (241356) map
I (229571) coreMQTT: State record updated. New state=MQTTPublishDone.
I (229591) ota_over_mqtt: Received update response: $aws/things/GA913F0001/jobs/AFR_OTA-mSpaceGate_t0_2_5/update/accepted{"timestamp":1709385169}af.
I (229651) esp_image: segment 1: paddr=0022aef4 vaddr=3ffbdb60 size=05124h ( 20772)
I (229671) esp_image: segment 2: paddr=00230020 vaddr=400d0020 size=e0210h (918032) map
I (229951) esp_image: segment 3: paddr=00310238 vaddr=3ffc2c84 size=00964h ( 2404)
I (229951) esp_image: segment 4: paddr=00310ba4 vaddr=40080000 size=1fb24h (129828)
I (229991) esp_image: segment 0: paddr=001f0020 vaddr=3f400020 size=3aecch (241356) map
I (230071) esp_image: segment 1: paddr=0022aef4 vaddr=3ffbdb60 size=05124h ( 20772)
I (230081) esp_image: segment 2: paddr=00230020 vaddr=400d0020 size=e0210h (918032) map
I (230351) esp_image: segment 3: paddr=00310238 vaddr=3ffc2c84 size=00964h ( 2404)
I (230361) esp_image: segment 4: paddr=00310ba4 vaddr=40080000 size=1fb24h (129828)
I (230961) wifi:state: run -> init (0)
I (230961) wifi:pm stop, total sleep time: 163061635 us / 227233436 us
I (230961) wifi:<ba-del>idx:0, tid:0
I (230961) wifi:new:<11,0>, old:<11,2>, ap:<255,255>, sta:<11,2>, prof:1
E (230971) wifi:NAN WiFi stop
I (230971) core_mqtt_agent_manager: WiFi disconnected.
I (230971) NETWORKING: Disconnected. Connecting to the AP again...
I (230981) NETWORKING: Disconnected. Connecting to the AP again...
I (231041) wifi:flush txq
I (231041) wifi:stop sw txq
I (231041) wifi:lmac stop hw txq
ets Jul 29 2019 12:21:46
rst:0xc (SW_CPU_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0030,len:7128
load:0x40078000,len:15692
ho 0 tail 12 room 4
load:0x40080400,len:4
0x40080400: _init at ??:?
load:0x40080404,len:3876
entry 0x4008064c
I (31) boot: ESP-IDF v5.1.2-684-g7cbee80fb9 2nd stage bootloader
I (31) boot: compile time Mar 2 2024 13:10:08
I (33) boot: Multicore bootloader
I (37) boot: chip revision: v3.1
I (40) boot.esp32: SPI Speed : 40MHz
I (45) boot.esp32: SPI Mode : DIO
I (50) boot.esp32: SPI Flash Size : 4MB
I (54) boot: Enabling RNG early entropy source...
I (60) boot: Partition Table:
I (63) boot: ## Label Usage Type ST Offset Length
I (71) boot: 0 nvs WiFi data 01 02 00010000 00006000
I (78) boot: 1 otadata OTA data 01 00 00016000 00002000
I (85) boot: 2 phy_init RF data 01 01 00018000 00001000
I (93) boot: 3 ota_0 OTA app 00 10 00020000 001c2000
I (100) boot: 4 ota_1 OTA app 00 11 001f0000 001c2000
I (108) boot: 5 storage WiFi data 01 02 003b2000 00009000
I (115) boot: 6 settings WiFi data 01 02 003bb000 00010000
I (123) boot: End of partition table
I (167) esp_image: segment 0: paddr=001f0020 vaddr=3f400020 size=3aecch (241356) map
I (255) esp_image: segment 1: paddr=0022aef4 vaddr=3ffbdb60 size=05124h ( 20772) load
I (263) esp_image: segment 2: paddr=00230020 vaddr=400d0020 size=e0210h (918032) map
I (595) esp_image: segment 3: paddr=00310238 vaddr=3ffc2c84 size=00964h ( 2404) load
I (596) esp_image: segment 4: paddr=00310ba4 vaddr=40080000 size=1fb24h (129828) load
I (670) boot: Loaded app from partition at offset 0x1f0000
I (671) boot: Disabling RNG early entropy source...
I (682) cpu_start: Multicore app
I (683) cpu_start: Pro cpu up.
I (683) cpu_start: Starting app cpu, entry point is 0x400814c0
0x400814c0: call_start_cpu1 at C:/Espressif/v5.1/esp-idf/components/esp_system/port/cpu_start.c:159
I (672) cpu_start: App cpu up.
I (703) cpu_start: Pro cpu start user code
I (703) cpu_start: cpu freq: 160000000 Hz
I (703) cpu_start: Application information:
I (708) cpu_start: Project name: mSpaceGate
I (713) cpu_start: App version: t0.2.5
I (718) cpu_start: Compile time: Mar 2 2024 14:04:43
I (724) cpu_start: ELF file SHA256: 3a93f59e049c6ac9...
I (730) cpu_start: ESP-IDF: v5.1.2-684-g7cbee80fb9
I (736) cpu_start: Min chip rev: v0.0
I (741) cpu_start: Max chip rev: v3.99
I (745) cpu_start: Chip rev: v3.1 Verification of new SW version no is ok and new SW is set for testing: I (12473) AWS_OTA: In self test mode.
I (12483) AWS_OTA: New image has a higher version number than the current image: New image version=0.2.5, Previous image version=0.2.4
I (12493) AWS_OTA: Image version is valid: Begin testing file: File ID=0
I (12503) AWS_OTA: otaPal_SetPlatformImageState, 1
W (12503) AWS_OTA: Set image as testing! During validation of new image other tasks are put on hold again: I (16653) AWS_OTA: Job parsing success: OtaJobParseErr_t=OtaJobParseErrNone, Job name=AFR_OTA-mSpaceGate_t0_2_5
I (16663) coreMQTT: State record updated. New state=MQTTPublishDone.
I (16673) ota_over_mqtt: Received OtaJobEventReceivedJob callback from OTA Agent.
I (16683) ota_over_mqtt: Received update response: $aws/things/GA913F0001/jobs/AFR_OTA-mSpaceGate_t0_2_5/update/accepted{"timestamp":1709385188}0:c5:a8",
"firmware": "t0.2.5",
"rssi": -52
}t0_2_5","status":"IN_PROGRESS","statusDetails":{"self_test":"ready","updatedBy":"0x00020004"},"queuedAt":1709384957,"startedAt":1709384966,"lastUpdatedAt":1709385169,"versionNumber":15,"executionNumber":1,"jobDocument":{"afr_ota":{"protocols":["MQTT"],"streamname":"AFR_OTA-a5ccc605-4e35-4db8-a17f-ed918d372f7c","files":[{"filepath":"/","filesize":1312496,"fileid":0,"certfile":"Code Verify Key","sig-sha256-ecdsa":"MEUCIQDsSDL3R+0JJpIK0HfC/mrhfjg1NbLWwWpPmPKyCDlCKQIge2wNUqam3yhEO5pmO5C5IW7zX0AzhJkUxRQZ6D/qkkA="}]}}}}.
I (16693) report: OTA started. Preventing coreMQTT-Agent commands from being enqueued.
I (16693) AWS_OTA: otaPal_GetPlatformImageState
I (16763) sub_pub_unsub: OTA started. Preventing coreMQTT-Agent commands from being enqueued.
I (16773) esp_ota_ops: aws_esp_ota_get_boot_flags: 1
I (16773) core_mqtt_agent_manager: OTA started.
I (16783) esp_ota_ops: [1] aflags/seq:0x1/0x2, pflags/seq:0x2/0x1
I (16793) ota_over_mqtt: OTA Event processing completed. Freeing the event buffer to pool.
I (16803) AWS_OTA: Current State=[CreatingFile], Event=[ReceivedJobDocument], New state=[CreatingFile]
I (16813) AWS_OTA: Beginning self-test.
I (16823) AWS_OTA: otaPal_GetPlatformImageState
I (16823) esp_ota_ops: aws_esp_ota_get_boot_flags: 1
I (16833) esp_ota_ops: [1] aflags/seq:0x1/0x2, pflags/seq:0x2/0x1
I (16833) ota_over_mqtt: Received OtaJobEventStartTest callback from OTA Agent.
I (16843) AWS_OTA: otaPal_SetPlatformImageState, 2
I (16853) AWS_OTA: Set image as valid one! OTA Job is completed, but other tasks never receive flag I (21063) ota_over_mqtt: New image validation succeeded in self test mode.
I (21073) coreMQTT: State record updated. New state=MQTTPublishDone.
I (21073) AWS_OTA: Current State=[WaitingForJob], Event=[StartSelfTest], New state=[WaitingForJob]
I (21083) ota_over_mqtt: Received update response: $aws/things/GA913F0001/jobs/AFR_OTA-mSpaceGate_t0_2_5/update/accepted{"timestamp":1709385192}0:c5:a8",
"firmware": "t0.2.5",
"rssi": -52
}t0_2_5","status":"IN_PROGRESS","statusDetails":{"self_test":"ready","updatedBy":"0x00020004"},"queuedAt":1709384957,"startedAt":1709384966,"lastUpdatedAt":1709385169,"versionNumber":15,"executionNumber":1,"jobDocument":{"afr_ota":{"protocols":["MQTT"],"streamname":"AFR_OTA-a5ccc605-4e35-4db8-a17f-ed918d372f7c","files":[{"filepath":"/","filesize":1312496,"fileid":0,"certfile":"Code Verify Key","sig-sha256-ecdsa":"MEUCIQDsSDL3R+0JJpIK0HfC/mrhfjg1NbLWwWpPmPKyCDlCKQIge2wNUqam3yhEO5pmO5C5IW7zX0AzhJkUxRQZ6D/qkkA="}]}}}}.
I (23153) coreMQTT: De-serialized incoming PUBLISH packet: DeserializerResult=MQTTSuccess.
I (23153) coreMQTT: State record updated. New state=MQTTPubAckSend.
I (23153) ota_over_mqtt: Received job message callback, size 24.
I (23163) AWS_OTA: No active job available in received job document: OtaJobParseErr_t=OtaJobParseErrNoActiveJobs
I (23173) ota_over_mqtt: Received OtaJobEventNoActiveJob callback from OTA Agent.
I (23183) AWS_OTA: otaPal_GetPlatformImageState
I (23183) esp_ota_ops: aws_esp_ota_get_boot_flags: 1
I (23193) esp_ota_ops: [1] aflags/seq:0x2/0x2, pflags/seq:0x2/0x1
I (23203) ota_over_mqtt: OTA Event processing completed. Freeing the event buffer to pool.
I (23213) AWS_OTA: Current State=[WaitingForJob], Event=[ReceivedJobDocument], New state=[CreatingFile]
I (31943) ota_over_mqtt: OTA agent suspend by timer.
W (32943) AWS_OTA: OTA Timer handle NULL for Timerid=0, can't stop.
I (32943) AWS_OTA: OTA Agent is suspended.
I (32943) AWS_OTA: Current State=[Suspended], Event=[Suspend], New state=[Suspended]
I (152943) ota_over_mqtt: OTA agent resumed by timer.
I (153943) AWS_OTA: otaPal_GetPlatformImageState
I (153943) esp_ota_ops: aws_esp_ota_get_boot_flags: 1
I (153943) esp_ota_ops: [1] aflags/seq:0x2/0x2, pflags/seq:0x2/0x1
I (153943) AWS_OTA: Current State=[RequestingJob], Event=[Resume], New state=[RequestingJob]
I (154393) ota_over_mqtt: Subscribed to topic $aws/things/GA913F0001/jobs/notify-next.
I (154393) AWS_OTA: Subscribed to MQTT topic: $aws/things/GA913F0001/jobs/notify-next
I (155393) coreMQTT: Publishing message to $aws/things/GA913F0001/jobs/$next/get.
I (156603) coreMQTT: Ack packet deserialized with result: MQTTSuccess.
I (156603) coreMQTT: State record updated. New state=MQTTPublishDone.
I (156603) ota_over_mqtt: Sent PUBLISH packet to broker $aws/things/GA913F0001/jobs/$next/get to broker.
I (156603) coreMQTT: De-serialized incoming PUBLISH packet: DeserializerResult=MQTTSuccess.
I (156613) AWS_OTA: Current State=[WaitingForJob], Event=[RequestJobDocument], New state=[WaitingForJob]
I (156623) coreMQTT: State record updated. New state=MQTTPublishDone.
I (156643) ota_over_mqtt: Received job message callback, size 53.
I (156653) AWS_OTA: No active job available in received job document: OtaJobParseErr_t=OtaJobParseErrNoActiveJobs
I (156663) ota_over_mqtt: Received OtaJobEventNoActiveJob callback from OTA Agent.
I (156663) AWS_OTA: otaPal_GetPlatformImageState
I (156673) esp_ota_ops: aws_esp_ota_get_boot_flags: 1
I (156683) esp_ota_ops: [1] aflags/seq:0x2/0x2, pflags/seq:0x2/0x1
I (156683) ota_over_mqtt: OTA Event processing completed. Freeing the event buffer to pool.
I (156693) AWS_OTA: Current State=[WaitingForJob], Event=[ReceivedJobDocument], New state=[CreatingFile] Sorry for big number of logs, but I tried to explain as detailed as I can. |
I've tried to update the static void prvOtaAppCallback(OtaJobEvent_t event, void *pData) {
OtaErr_t err = OtaErrUninitialized;
switch (event) {
case OtaJobEventActivate:
ESP_LOGI(TAG, "Received OtaJobEventActivate callback from OTA Agent.");
/**
* Activate the new firmware image immediately. Applications can choose to
* postpone the activation to a later stage if needed.
*/
err = OTA_ActivateNewImage();
/**
* Activation of the new image failed. This indicates an error that requires
* a follow up through manual activation by resetting the device. The demo
* reports the error and shuts down the OTA agent.
*/
ESP_LOGE(TAG, "New image activation failed.");
/* Shutdown OTA Agent, if it is required that the unsubscribe operations are
* not performed while shutting down please set the second parameter to 0
* instead of 1. */
OTA_Shutdown(0, 1);
break;
case OtaJobEventFail:
ESP_LOGI(TAG, "Received an OtaJobEventFail notification from OTA Agent.");
/* Signal coreMQTT-Agent network manager that an OTA job has stopped. */
xCoreMqttAgentManagerPost(CORE_MQTT_AGENT_OTA_STOPPED_EVENT);
break;
case OtaJobEventStartTest:
/* This demo just accepts the image since it was a good OTA update and
* networking and services are all working (or we would not have made it
* this far). If this were some custom device that wants to test other
* things before validating new image, this would be the place to kick off
* those tests before calling OTA_SetImageState() with the final result of
* either accepted or rejected. */
ESP_LOGI(TAG, "Received OtaJobEventStartTest callback from OTA Agent.");
err = OTA_SetImageState(OtaImageStateAccepted);
if (err == OtaErrNone) {
ESP_LOGI(TAG, "New image validation succeeded in self test mode.");
} else {
ESP_LOGE(TAG, "Failed to set image state as accepted with error %d.",
err);
}
break;
case OtaJobEventProcessed:
ESP_LOGI(
TAG,
"OTA Event processing completed. Freeing the event buffer to pool.");
configASSERT(pData != NULL);
prvOTAEventBufferFree((OtaEventData_t *)pData);
break;
case OtaJobEventSelfTestFailed:
ESP_LOGI(TAG,
"Received OtaJobEventSelfTestFailed callback from OTA Agent.");
/* Requires manual activation of previous image as self-test for
* new image downloaded failed.*/
ESP_LOGE(TAG,
"OTA Self-test failed for new image. shutting down OTA Agent.");
/* Shutdown OTA Agent, if it is required that the unsubscribe operations are
* not performed while shutting down please set the second parameter to 0
* instead of 1. */
OTA_Shutdown(0, 1);
/* Signal coreMQTT-Agent network manager that an OTA job has stopped. */
xCoreMqttAgentManagerPost(CORE_MQTT_AGENT_OTA_STOPPED_EVENT);
break;
case OtaJobEventReceivedJob:
ESP_LOGI(TAG, "Received OtaJobEventReceivedJob callback from OTA Agent.");
/* Signal coreMQTT-Agent network manager that an OTA job has started. */
xCoreMqttAgentManagerPost(CORE_MQTT_AGENT_OTA_STARTED_EVENT);
break;
case OtaJobEventUpdateComplete:
ESP_LOGI(TAG,
"Received OtaJobEventUpdateComplete callback from OTA Agent.");
/* Signal coreMQTT-Agent network manager that an OTA job has stopped. */
xCoreMqttAgentManagerPost(CORE_MQTT_AGENT_OTA_STOPPED_EVENT);
break;
case OtaJobEventNoActiveJob:
ESP_LOGI(TAG, "Received OtaJobEventNoActiveJob callback from OTA Agent.");
/* No more jobs available in IoTCore, no further actions on this event. */
break;
default:
ESP_LOGW(TAG,
"Received an unhandled callback event from OTA Agent, "
"event = %d",
event);
break;
}
} but there is the same behaviour The solution which works for me is to add the static void prvOtaAppCallback(OtaJobEvent_t event, void *pData) {
OtaErr_t err = OtaErrUninitialized;
switch (event) {
case OtaJobEventActivate:
ESP_LOGI(TAG, "Received OtaJobEventActivate callback from OTA Agent.");
/**
* Activate the new firmware image immediately. Applications can choose to
* postpone the activation to a later stage if needed.
*/
err = OTA_ActivateNewImage();
/**
* Activation of the new image failed. This indicates an error that requires
* a follow up through manual activation by resetting the device. The demo
* reports the error and shuts down the OTA agent.
*/
ESP_LOGE(TAG, "New image activation failed.");
/* Shutdown OTA Agent, if it is required that the unsubscribe operations are
* not performed while shutting down please set the second parameter to 0
* instead of 1. */
OTA_Shutdown(0, 1);
break;
case OtaJobEventFail:
ESP_LOGI(TAG, "Received an OtaJobEventFail notification from OTA Agent.");
/* Signal coreMQTT-Agent network manager that an OTA job has stopped. */
xCoreMqttAgentManagerPost(CORE_MQTT_AGENT_OTA_STOPPED_EVENT);
break;
case OtaJobEventStartTest:
/* This demo just accepts the image since it was a good OTA update and
* networking and services are all working (or we would not have made it
* this far). If this were some custom device that wants to test other
* things before validating new image, this would be the place to kick off
* those tests before calling OTA_SetImageState() with the final result of
* either accepted or rejected. */
ESP_LOGI(TAG, "Received OtaJobEventStartTest callback from OTA Agent.");
err = OTA_SetImageState(OtaImageStateAccepted);
if (err == OtaErrNone) {
ESP_LOGI(TAG, "New image validation succeeded in self test mode.");
} else {
ESP_LOGE(TAG, "Failed to set image state as accepted with error %d.",
err);
}
xCoreMqttAgentManagerPost(CORE_MQTT_AGENT_OTA_STOPPED_EVENT);
break;
case OtaJobEventProcessed:
ESP_LOGI(
TAG,
"OTA Event processing completed. Freeing the event buffer to pool.");
configASSERT(pData != NULL);
prvOTAEventBufferFree((OtaEventData_t *)pData);
break;
case OtaJobEventSelfTestFailed:
ESP_LOGI(TAG,
"Received OtaJobEventSelfTestFailed callback from OTA Agent.");
/* Requires manual activation of previous image as self-test for
* new image downloaded failed.*/
ESP_LOGE(TAG,
"OTA Self-test failed for new image. shutting down OTA Agent.");
/* Shutdown OTA Agent, if it is required that the unsubscribe operations are
* not performed while shutting down please set the second parameter to 0
* instead of 1. */
OTA_Shutdown(0, 1);
/* Signal coreMQTT-Agent network manager that an OTA job has stopped. */
xCoreMqttAgentManagerPost(CORE_MQTT_AGENT_OTA_STOPPED_EVENT);
break;
case OtaJobEventReceivedJob:
ESP_LOGI(TAG, "Received OtaJobEventReceivedJob callback from OTA Agent.");
/* Signal coreMQTT-Agent network manager that an OTA job has started. */
xCoreMqttAgentManagerPost(CORE_MQTT_AGENT_OTA_STARTED_EVENT);
break;
case OtaJobEventNoActiveJob:
ESP_LOGI(TAG, "Received OtaJobEventNoActiveJob callback from OTA Agent.");
/* No more jobs available in IoTCore, no further actions on this event. */
break;
default:
ESP_LOGW(TAG,
"Received an unhandled callback event from OTA Agent, "
"event = %d",
event);
break;
}
} LOG: with this function definition both task are start working again after OTA completed: I (21180) ota_over_mqtt: New image validation succeeded in self test mode.
I (21190) coreMQTT: State record updated. New state=MQTTPublishDone.
I (21190) report: OTA stopped. No longer preventing coreMQTT-Agent commands from being enqueued.
I (21190) AWS_OTA: Current State=[WaitingForJob], Event=[StartSelfTest], New state=[WaitingForJob]
I (21210) sub_pub_unsub: OTA stopped. No longer preventing coreMQTT-Agent commands from being enqueued.
I (21200) ota_over_mqtt: Received update response: $aws/things/GA913F0001/jobs/AFR_OTA-mSpaceLock-t0_2_7/update/accepted{"timestamp":1709393030}0:c5:a8",
"firmware": "t0.2.7",
"rssi": -56
}t0_2_7","status":"IN_PROGRESS","statusDetails":{"self_test":"ready","updatedBy":"0x00020006"},"queuedAt":1709392696,"startedAt":1709392802,"lastUpdatedAt":1709393006,"versionNumber":15,"executionNumber":1,"jobDocument":{"afr_ota":{"protocols":["MQTT"],"streamname":"AFR_OTA-508f39ed-54b2-44e9-bb2d-37c3a845ab0f","files":[{"filepath":"/","filesize":1312512,"fileid":0,"certfile":"Code Verify Key","sig-sha256-ecdsa":"MEQCIBkZyQTR8ho/2aZAoLz07TGXPEbKzjneoCg+yCeXBjiUAiAsoCRg4Fd6bIk5cC+h9YmLEDylKOo1jHGsaRfu9pKU7g=="}]}}}}.
I (21230) core_mqtt_agent_manager: OTA stopped.
I (21780) coreMQTT: De-serialized incoming PUBLISH packet: DeserializerResult=MQTTSuccess.
I (21780) coreMQTT: State record updated. New state=MQTTPubAckSend.
I (21780) ota_over_mqtt: Received job message callback, size 24.
I (21790) AWS_OTA: No active job available in received job document: OtaJobParseErr_t=OtaJobParseErrNoActiveJobs
I (21800) ota_over_mqtt: Received OtaJobEventNoActiveJob callback from OTA Agent.
I (21810) AWS_OTA: otaPal_GetPlatformImageState
I (21810) esp_ota_ops: aws_esp_ota_get_boot_flags: 1
I (21820) esp_ota_ops: [1] aflags/seq:0x2/0x4, pflags/seq:0x2/0x3
I (21820) ota_over_mqtt: OTA Event processing completed. Freeing the event buffer to pool.
I (21830) AWS_OTA: Current State=[WaitingForJob], Event=[ReceivedJobDocument], New state=[CreatingFile]
I (24740) sub_pub_unsub: Task "SubPub0" sending subscribe request to coreMQTT-Agent for topic filter: /filter/SubPub0 with id 4
I (25050) sub_pub_unsub: Subscribe 4 for topic filter /filter/SubPub0 succeeded for task "SubPub0".
I (25050) sub_pub_unsub: Task "SubPub0" sending publish request to coreMQTT-Agent with message "SubPub0" on topic "/filter/SubPub0" with ID 5.
I (25070) sub_pub_unsub: Task "SubPub0" waiting for publish 5 to complete.
I (26050) coreMQTT: Publishing message to /filter/SubPub0. I will prepare pull request with that change proposal. |
My fix was this. Not sure if this lines up with the desired flow...
|
Hi,
It seems appropriate to trigger xCoreMqttAgentManagerPost(CORE_MQTT_AGENT_OTA_STOPPED_EVENT) again at this point to resume other tasks. However, if the OTA is shut down here, the entire OTA service would be disabled after this. The reference code provided is intended to demonstrate the ability to run OTA with the simplest error handling approach. I will test it with xCoreMqttAgentManagerPost(CORE_MQTT_AGENT_OTA_STOPPED_EVENT) added and consider raising a PR for the same. Appreciate the contributions from all of you. |
Closing as PRs are merged. Thanks for your contributions! |
Describe the bug
After OTA job is finished and new firmware is verified successfully, there is no signal to coreMQTT-Agent network manager that an OTA job has stopped.
The signal is released only if
OtaJobEventFail
System information
"Code to reproduce bug"
.Expected behavior
After OtaJobEventStartTest is done and image validation is finished there should be signal
xCoreMqttAgentManagerPost(CORE_MQTT_AGENT_OTA_STOPPED_EVENT)
like this:
and only after this flag other tasks can start their activity again.
Actually based on FreeRTOS documentation there should be one more case
OtaJobEventUpdateComplete = 7
, but I can't find this event in : https://github.com/FreeRTOS/iot-reference-esp32c3/blob/main/main/demo_tasks/ota_over_mqtt_demo/ota_over_mqtt_demo.c#L516-L609Additional question
In case
OtaJobEventSelfTestFailed
OTA_Shutdown is executed, should program also triggerxCoreMqttAgentManagerPost(CORE_MQTT_AGENT_OTA_STOPPED_EVENT);
in that case or ot will be covered byOtaJobEventFail
any way?Screenshots or console output
Sorry, I didn't record console output for that case. How ever after sucessful OTA update other task kept beeing inactive. And only way to recover from this stage was to manually reboot the device
The text was updated successfully, but these errors were encountered: