We loaded the 202305 release-branch image from the community builds onto an Accton-AS7716-32X and ran the T1 test cases. During the run we observed a CPU stall with the following details:
Issue Description:
A hard lockup was reported with "swapper" as the running task. On Linux, "swapper" is the per-CPU idle task (PID 0), not a daemon that moves processes between main memory and secondary storage, so the message indicates that the CPU was idle when the lockup was detected. This matches the test context: the system was idle at the time it became stuck.
I reported the same issue to the Accton team, and they responded with the following analysis:
According to our understanding, "NMI watchdog: Watchdog detected" usually appears when the NOS operates on the hardware but does not wait for a response.
When the NOS wants to handle an event interrupt, it first disables the IRQ, handles the current interrupt promptly, and then enables the IRQ again.
If the interrupt handling goes wrong and the IRQ is not re-enabled within the period set by the watchdog, this type of message is printed.
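To make the sequence Accton describes easier to follow, here is a toy Python model of it. It is only a sketch and not kernel code: `CpuModel`, `simulate`, the one-tick-per-second timer, and the check interval are all simplifying assumptions made for illustration. The only value taken from Linux is the default `watchdog_thresh` of 10 seconds.

```python
from dataclasses import dataclass


@dataclass
class CpuModel:
    """Toy model of one CPU as seen by a hard-lockup detector (hypothetical)."""
    irq_enabled: bool = True
    timer_ticks: int = 0        # advanced by the periodic (maskable) timer interrupt
    last_seen_ticks: int = -1   # tick count recorded at the previous watchdog check

    def timer_interrupt(self) -> None:
        # A maskable timer interrupt is only delivered while IRQs are enabled.
        if self.irq_enabled:
            self.timer_ticks += 1

    def nmi_watchdog_check(self) -> bool:
        # An NMI cannot be masked, so this check runs even with IRQs disabled.
        # No tick progress since the previous check is treated as a hard lockup.
        stuck = self.timer_ticks == self.last_seen_ticks
        self.last_seen_ticks = self.timer_ticks
        return stuck


def simulate(irq_off_seconds: int, watchdog_thresh: int = 10) -> bool:
    """Return True if keeping IRQs off for irq_off_seconds trips the watchdog.

    Assumed timeline: one normal watchdog period, then the IRQ-off window,
    then one more normal period. The timer ticks once per simulated second
    and the watchdog checks every watchdog_thresh seconds.
    """
    cpu = CpuModel()
    off_start = watchdog_thresh + 1                 # first second with IRQs off
    off_end = watchdog_thresh + irq_off_seconds     # last second with IRQs off
    detected = False
    for second in range(1, off_end + watchdog_thresh + 1):
        # Disable the IRQ, "handle" the interrupt, then enable the IRQ again.
        cpu.irq_enabled = not (off_start <= second <= off_end)
        cpu.timer_interrupt()
        if second % watchdog_thresh == 0 and cpu.nmi_watchdog_check():
            detected = True
    return detected
```

In this model a handler that re-enables the IRQ quickly, for example `simulate(3)`, never trips the check, while one that keeps the IRQ off for a full watchdog period, for example `simulate(10)`, does.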
Description
Image Version:
SONiC-OS-202305.366435-a49860cc7
SONiC NOS Debian kernel (5.10.140-1)
Current Behavior: The board hung with the above CPU stall error message.
Expected Behavior: The board should not hang, and the stall should not occur.
Please advise whether this is a known issue, or suggest a solution to avoid this CPU hard lockup.