[BUG] Watchdog timer failing to trigger on Archim 2.0 #13558

Closed
marcio-ao opened this issue Apr 2, 2019 · 4 comments

Comments

@marcio-ao
Contributor

On Archim 2.0, we have noticed some rare occasions where a printer locks up and goes into thermal runaway without triggering a reboot. The printer's UI is frozen and unresponsive.

I suspect the underlying cause may be a corrupted USB flash drive or noise on the ribbon cable, but in any case, it seems like the watchdog timer should prevent the printer from locking up and entering thermal runaway. However, this has not been happening in all cases.

Looking through the code, I noticed that very few places call watchdog_reset(), which is reassuring, but Marlin/src/sd/Sd2Card.cpp stands out as possibly a place where it should not be called. In the interest of safety, perhaps Sd2Card should call thermalManager.manage_heater(); rather than watchdog_reset().

Unfortunately, since we have USB_FLASH_DRIVE_SUPPORT enabled, that code isn't even reached, so something else must be causing the watchdog timer to fail to activate on the Archim. Any thoughts?

@boelle changed the title from "Watchdog timer failing to trigger on Archim 2.0" to "[BUG] Watchdog timer failing to trigger on Archim 2.0" on Jul 21, 2019
@boelle closed this as completed on Jul 21, 2019
@boelle reopened this on Jul 24, 2019
@marcio-ao
Contributor Author

Since there was some activity on this, let me add some more information. We had some additional instances where this was happening, and I suspect that the watchdog timer is effective for code that runs in an infinite loop (such as "for(;;)"), but it is not effective in code that gets stuck in a recursive loop that happens to involve a call to idle(). In the case where we had a watchdog failure, our UI was requesting a toolhead change, but changing the toolhead somehow caused the idle() function to be called before active_extruder was changed, which caused our UI to request another toolhead change, etc. This recursion was such that I presume only parts of the idle() function had a chance to execute.
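
The recursion described above can be illustrated with a small host-side simulation. This is only a sketch under stated assumptions, not Marlin code: `SimState`, `sim_idle()`, `sim_ui_update()`, `sim_tool_change()` and `run()` are hypothetical names, and the model assumes that the watchdog is fed early in the idle routine while heater control happens after the UI update. The `max_depth` cap exists only so the simulation terminates; real firmware would keep recursing.

```cpp
#include <cassert>

// Hypothetical stand-ins for firmware state; these are not Marlin's real definitions.
struct SimState {
    bool buggy_order;            // true: idle runs before active_extruder is updated
    int  max_depth;              // simulation-only cap on nested tool changes
    int  active_extruder = 0;    // tool currently selected
    int  requested_extruder = 1; // tool the UI keeps asking for
    int  depth = 0;              // current nesting of tool changes
    int  watchdog_feeds = 0;     // how often the watchdog was reset
    int  heater_updates = 0;     // how often heater control actually ran
    bool stuck = false;          // set once the recursion hits the cap
    SimState(bool buggy, int cap) : buggy_order(buggy), max_depth(cap) {}
};

void sim_tool_change(SimState &s);

// The UI asks for a tool change whenever the active tool is not the requested one.
void sim_ui_update(SimState &s) {
    if (s.active_extruder != s.requested_extruder) sim_tool_change(s);
}

// Assumed shape of the idle routine: feed the watchdog, run the UI, then manage heaters.
void sim_idle(SimState &s) {
    ++s.watchdog_feeds;          // early part of idle: always executes
    sim_ui_update(s);            // may recurse into another tool change
    if (s.stuck) return;         // in a real lockup, control never gets back here
    ++s.heater_updates;          // late part of idle: heater control
}

void sim_tool_change(SimState &s) {
    if (s.depth >= s.max_depth) { s.stuck = true; return; }
    ++s.depth;
    if (s.buggy_order) {
        sim_idle(s);                              // UI still sees the old tool
        s.active_extruder = s.requested_extruder;
    } else {
        s.active_extruder = s.requested_extruder; // update first, then idle
        sim_idle(s);
    }
    --s.depth;
}

// Runs one top-level idle call and returns the resulting counters.
SimState run(bool buggy_order, int max_depth) {
    assert(max_depth > 0);
    SimState s(buggy_order, max_depth);
    sim_idle(s);
    return s;
}
```

In this model, `run(true, 5)` feeds the watchdog six times without a single heater update, while `run(false, 5)` completes with two feeds and two heater updates. That matches the suspicion that a recursive path through idle() can keep the watchdog satisfied while heater management never runs.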

One thing I noticed is that watchdog_reset() is called in the code that reads the temperature. It seems like the watchdog should instead be reset in the code that compares the read temperature to the target temperature and turns the heaters on or off based on that measurement. What keeps your house from burning down is not measuring the temperature -- it is turning off the heater when the temperature is too hot.
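
To make the difference between the two reset points concrete, here is a minimal tick-based sketch. It is not Marlin's implementation: `Watchdog`, `FeedPoint` and `simulate()` are hypothetical, and the model assumes that temperature reads keep running on every tick while the heater control step stops running after a given tick.

```cpp
// Where the (simulated) watchdog gets reset.
enum class FeedPoint { OnRead, OnControl };

// Hypothetical software model of a watchdog that fires after timeout_ticks without a feed.
struct Watchdog {
    int  timeout_ticks;
    int  ticks_since_feed;
    bool fired;
    explicit Watchdog(int timeout) : timeout_ticks(timeout), ticks_since_feed(0), fired(false) {}
    void feed() { ticks_since_feed = 0; }
    void tick() { if (++ticks_since_feed >= timeout_ticks) fired = true; }
};

// Returns the tick at which the watchdog fired, or -1 if it never fired.
// The heater control step runs only while t < control_stalls_at.
int simulate(FeedPoint fp, int control_stalls_at, int total_ticks, int timeout_ticks) {
    Watchdog wd(timeout_ticks);
    for (int t = 0; t < total_ticks; ++t) {
        // Temperature read: assumed to keep happening even after the lockup.
        if (fp == FeedPoint::OnRead) wd.feed();

        // Heater control: compare against target and switch the heater.
        if (t < control_stalls_at) {
            if (fp == FeedPoint::OnControl) wd.feed();
        }

        wd.tick();
        if (wd.fired) return t;
    }
    return -1;
}
```

With a timeout of 4 ticks and the control step stalling at tick 10, `simulate(FeedPoint::OnControl, 10, 100, 4)` returns 12, whereas `simulate(FeedPoint::OnRead, 10, 100, 4)` returns -1 because the reads alone keep the watchdog fed.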

@boelle
Contributor

boelle commented Sep 24, 2019

@marcio-ao I assume this is still a problem?

@marcio-ao
Contributor Author

@boelle: We haven't encountered this issue in a while. Closing this ticket.

@github-actions

github-actions bot commented Jul 4, 2020

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked and limited conversation to collaborators Jul 4, 2020

2 participants