Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix hardfault handler to SOS again #2381

Merged
merged 2 commits into from
Dec 7, 2021

Conversation

elcojacobs
Copy link
Contributor

@elcojacobs elcojacobs commented Dec 4, 2021

This fixes the broken hardfault handler as reported in #2380

I added the .align 4 line, which I think might be the actual fix for the compiler warning that the commit that introduced the bug intended to address.

It is very easy to test the hard fault handler, so it seems that the compiler warning was 'fixed' in the commit below without any testing! 馃槰

7b6e208

I think that you are actually causing a hardfault in the hardfault handler now, resulting in an endless loop! @avtolstoy @sergeuz

Test app:

void setup()
{
    int (*bad_instruction)(void) = nullptr;
    bad_instruction();
}

void loop()
{
    HAL_Delay_Milliseconds(1);
}

@elcojacobs
Copy link
Contributor Author

elcojacobs commented Dec 5, 2021

It seems that the hardfault handler is part of the system layer, so it cannot be easily hotfixed without changing how the system layer is downloaded to the device.
Because this is not fixable in the user app, we depend on Particle to do a hotfix release ASAP.

Or does the hardfault handler have weak linkage so we can override it?
That would be useful anyway, so we can store the address causing the hardfault in backup ram and report on reboot.

@elcojacobs elcojacobs changed the title Fix/hardfault handler Fix hardfault handler to SOS again Dec 5, 2021
@sergeuz
Copy link
Member

sergeuz commented Dec 7, 2021

Thanks for the PR, @elcojacobs. I'm not sure how this issue sneaked through testing, but your fix resolves it for me 馃憤

@sergeuz sergeuz added this to the 3.2.0 milestone Dec 7, 2021
@sergeuz sergeuz added the bug label Dec 7, 2021
@elcojacobs
Copy link
Contributor Author

Thanks for merging the fix. Will build this into a release soon too?

@sergeuz
Copy link
Member

sergeuz commented Dec 7, 2021

Will build this into a release soon too?

We will include this fix in the next 3.x release that we're hoping to get out this month.

@sergeuz sergeuz merged commit 5af90dd into particle-iot:develop Dec 7, 2021
@elcojacobs
Copy link
Contributor Author

Can you please do better than 'this month'?

This is a critical bug that causes a complete device freeze. It causes large batches of beer to be ruined for my customers. Note that even the watchdog stops working with this bug.

You don't have to fast track 3.2.0 for me, just merge only the hardfault fix into the last release candidate.

@elcojacobs
Copy link
Contributor Author

Can you discuss a hotfix release internally and inform us of the outcome?
This fix doesn't conflict with any other changes and should be an easy merge and release. I think the type of bug definitely warrants a hotfix release.

Waiting more than a week is unacceptable for my customers. If you cannot do a hotfix release, we have two options:

  • Force all our customers to do a temporary flash over usb with hotfixed system images we compile ourselves. We already flash the user app over local wifi using ymodem, but the system layer comes from the cloud.
  • Change the app to download the system images from our servers instead of particle. Where would we make that change and how should we host the binaries?

@sergeuz
Copy link
Member

sergeuz commented Dec 8, 2021

That's the only timeline I have at the moment, but the possibility of a hotfix release is being discussed in the context of your ticket for the original issue.

@keeramis keeramis mentioned this pull request Jan 13, 2022
6 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
2 participants