
LoRaWAN: FUOTA callback function mistakenly called after early-exit failure from too many missing fragments #72764

Open

knitHacker opened this issue May 14, 2024 · 2 comments

Labels: area: LoRa, bug, priority: low

@knitHacker

Describe the bug
The issue occurs when trying to complete a FUOTA session through the LoRaWAN stack. A log message indicates that the FUOTA session has completed and the user's callback function is called, even though the session actually failed because too many fragments of the block were never received.

In a FUOTA session, if the number of missing fragments is larger than the maximum redundancy, the function FragDecoderProcess (in the loramac-node code base) returns FRAG_SESSION_FINISHED and marks the error by setting FragDecoder.Status.MatrixError to 1. When processing data fragments, the LoRaWAN frag_transport service assumes that a FRAG_SESSION_FINISHED return value means the session completed successfully and wraps up the session as if it had succeeded, without checking the MatrixError value (called memory_error in the Zephyr LoRaWAN service code). It also calls the user's callback function, which is meant to signal a successful FUOTA session, and prints misleading log messages indicating success. The only place memory_error appears to be examined is in the frag status response message.
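For reference, here is a minimal sketch of the kind of check that appears to be missing. It follows LoRaMac-node's FragDecoder API as I understand it (FragDecoderProcess, FragDecoderGetStatus, MatrixError); the function name handle_data_fragment, the finished_cb parameter, the log module name, and the log messages are placeholders, not the actual symbols used in the Zephyr frag_transport service:

```c
#include <zephyr/logging/log.h>
#include "FragDecoder.h" /* LoRaMac-node fragmentation decoder */

LOG_MODULE_REGISTER(fuota_sketch, LOG_LEVEL_INF);

/* Sketch: only treat FRAG_SESSION_FINISHED as success if the decoder
 * did not flag an unrecoverable matrix/memory error.
 */
static void handle_data_fragment(uint16_t frag_counter, uint8_t *frag_data,
				 void (*finished_cb)(void))
{
	int32_t ret = FragDecoderProcess(frag_counter, frag_data);

	if (ret == FRAG_SESSION_FINISHED) {
		FragDecoderStatus_t status = FragDecoderGetStatus();

		if (status.MatrixError != 0) {
			/* Too many fragments were lost for the configured
			 * redundancy to recover: the image is incomplete,
			 * so do not report success or call the callback.
			 */
			LOG_ERR("FUOTA session failed: unrecoverable missing fragments");
			return;
		}

		LOG_INF("FUOTA session finished successfully");
		if (finished_cb != NULL) {
			finished_cb();
		}
	}
}
```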

Please also mention any information which could help others to understand the problem you're facing:
I am working on a custom board with an STM32WL and using AWS as the FUOTA server.

To Reproduce
Steps to reproduce the behavior:
Set CONFIG_LORAWAN_FRAG_TRANSPORT_MAX_REDUNDANCY very low and start a session with a large file, so that more fragments go missing than the redundancy can recover when reassembling the file (an example configuration is shown below).
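For example, a prj.conf along these lines should make the failure easy to trigger (the redundancy value is illustrative, and the exact set of LoRaWAN/services options depends on the application):

```
CONFIG_LORAWAN=y
CONFIG_LORAWAN_SERVICES=y
CONFIG_LORAWAN_FRAG_TRANSPORT_MAX_REDUNDANCY=5
```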

I believe I first saw it when running multiple AWS FUOTA sessions in succession while debugging. When a new session was started before the previous one had completed, AWS would send the new session setup message and then resume sending from where it had left off in the previous session. This meant the device missed the first part of the block, or worse, received only the redundancy packets without any of the original fragments.

Expected behavior
The stack should not call the user callback function that is meant to run after a successful FUOTA session, and it should not log that the FUOTA session finished successfully.

Impact
An incomplete binary will be set up to be booted on the next reboot, and that reboot may happen right after the user's callback function is called, since rebooting into the new image is the suggested usage of that callback (see the sketch below).
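To illustrate why this matters, here is roughly how such a callback is typically wired up (modeled on the Zephyr FUOTA sample; the exact API names may differ in your tree): the callback marks the downloaded image for MCUboot and reboots, so calling it after a failed session would boot an incomplete binary.

```c
#include <zephyr/dfu/mcuboot.h>
#include <zephyr/lorawan/lorawan.h>
#include <zephyr/sys/reboot.h>

/* Typical "FUOTA finished" callback: it assumes the downloaded image is
 * complete, marks it for upgrade and reboots into it. If the stack calls
 * this after a failed session, an incomplete binary gets booted.
 */
static void fuota_finished(void)
{
	boot_request_upgrade(BOOT_UPGRADE_PERMANENT);
	sys_reboot(SYS_REBOOT_WARM);
}

int main(void)
{
	/* ... configure the radio and join the LoRaWAN network ... */
	lorawan_frag_transport_run(fuota_finished);
	return 0;
}
```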

Environment (please complete the following information):

@knitHacker added the bug label on May 14, 2024

Hi @knitHacker! We appreciate you submitting your first issue for our open-source project. 🌟

Even though I'm a bot, I can assure you that the whole community is genuinely grateful for your time and effort. 🤖💙

@martinjaeger
Member

Thanks @knitHacker for your report. I'm quite busy with other work at the moment, but will have a look asap, probably next week.

@nashif added the priority: low label on May 21, 2024