IPMI: Trigger OPAL TI in abort path.
The current assert/abort implementation for BMC-based systems invokes a
CEC reboot after printing the backtrace. This means the BMC is never
notified of the OPAL crash/termination, which can lead to a never-ending
IPL loop if OPAL keeps aborting very early in the boot path.

Trigger a software xstop (OPAL TI) to inform the BMC about the OPAL
termination. The BMC is capable of catching the checkstop signal and
facilitating the reboot (IPL) of the host.

With the AutoReboot policy, OpenBMC handles checkstop signals and counts
them against the reboot counter. In cases where OPAL crashes before the
host reaches runtime, OpenBMC will move the system into the Quiesced state
after 3 or so IPL/reboot attempts so that the system can be debugged. When
OPAL triggers a software checkstop, all CPU threads are stopped and moved
to the quiesced state. Hence OPAL doesn't need to explicitly stop all CPUs
before calling software xstop.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
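
The reboot-counter behaviour described in the message can be illustrated with a small model. This is only a sketch under stated assumptions, not OpenBMC code: the names `bmc_policy`, `bmc_on_checkstop`, `bmc_on_runtime_reached` and the limit `MAX_BOOT_ATTEMPTS` are hypothetical, and the value 3 is taken from the "3 or so" wording above.

```c
#include <stdbool.h>

/* Hypothetical limit: the commit message only says "3 or so" attempts. */
#define MAX_BOOT_ATTEMPTS 3

enum bmc_action { BMC_ACTION_REBOOT, BMC_ACTION_QUIESCE };

/* Toy model of the BMC-side state; not an OpenBMC data structure. */
struct bmc_policy {
	bool auto_reboot;	/* is the AutoReboot policy enabled? */
	int attempts_left;	/* IPL attempts left before quiescing */
};

static void bmc_policy_init(struct bmc_policy *p, bool auto_reboot)
{
	p->auto_reboot = auto_reboot;
	p->attempts_left = MAX_BOOT_ATTEMPTS;
}

/* Host reached runtime: the boot counter is reset. */
static void bmc_on_runtime_reached(struct bmc_policy *p)
{
	p->attempts_left = MAX_BOOT_ATTEMPTS;
}

/* A checkstop (such as an OPAL TI) was seen: count it and decide. */
static enum bmc_action bmc_on_checkstop(struct bmc_policy *p)
{
	if (!p->auto_reboot || p->attempts_left <= 0)
		return BMC_ACTION_QUIESCE;
	p->attempts_left--;
	return BMC_ACTION_REBOOT;
}
```

In this model an OPAL that aborts on every boot is re-IPLed three times and then left quiesced for debugging, while a host that reaches runtime gets a fresh counter, so a later OPAL TI triggers a normal re-IPL.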
maheshsal authored and oohal committed Nov 14, 2019
1 parent 83a92e3 commit a810d1f
Showing 1 changed file with 23 additions and 7 deletions.
30 changes: 23 additions & 7 deletions hw/ipmi/ipmi-attn.c
@@ -14,6 +14,7 @@
#include <skiboot.h>
#include <stack.h>
#include <timebase.h>
#include <xscom.h>

/* Use same attention SRC for BMC based machine */
DEFINE_LOG_ENTRY(OPAL_RC_ATTN, OPAL_PLATFORM_ERR_EVT,
@@ -67,18 +68,33 @@ void __attribute__((noreturn)) ipmi_terminate(const char *msg)
	 */
	p9_sbe_terminate();

	/* Terminate called before initializing IPMI (early abort) */
	if (!ipmi_present()) {
		if (platform.cec_reboot)
			platform.cec_reboot();
		goto out;
	}
	/*
	 * Trigger software xstop (OPAL TI). It will stop all the CPU threads
	 * moving them into quiesced state. OCC will collect all FIR data.
	 * Upon checkstop signal, BMC will then decide whether to reboot/IPL or
	 * not depending on AutoReboot policy, if any. This helps in cases
	 * where OPAL is crashing/terminating before host reaches to runtime.
	 * With OpenBMC AutoReboot policy, in such cases, it will make sure
	 * that system is moved to Quiesced state after 3 or so attempts to
	 * IPL. Without OPAL TI, OpenBMC will never know that OPAL is
	 * terminating and system would go into never ending IPL'ing loop.
	 *
	 * Once the system reaches to runtime, OpenBMC resets the boot counter.
	 * Hence next time when BMC receives the OPAL TI, it will IPL the
	 * system if AutoReboot is enabled. We don't need to worry about self
	 * rebooting.
	 */

	xscom_trigger_xstop();
	/*
	 * Control will not reach here if software xstop has been supported and
	 * enabled. If not supported then fallback to cec reboot path below.
	 */

	/* Reboot call */
	if (platform.cec_reboot)
		platform.cec_reboot();

out:
	while (1)
		time_wait_ms(100);
}
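
The ordering the new comments describe (try the software xstop first, fall back to a CEC reboot only if control returns, then spin) can be sketched outside skiboot as follows. This is an illustrative model, not the skiboot implementation: `term_ops`, `terminate_path` and the stub functions are hypothetical, and the function returns the chosen path instead of never returning so that it can be exercised in a test.

```c
#include <stdbool.h>
#include <stddef.h>

enum term_path { TERM_XSTOP, TERM_CEC_REBOOT, TERM_SPIN };

/* Hypothetical hooks standing in for the xstop and platform reboot calls. */
struct term_ops {
	bool (*trigger_xstop)(void);	/* true if the xstop took effect */
	void (*cec_reboot)(void);	/* optional reboot hook, may be NULL */
};

/* Pick the termination path in the order the commit describes. */
static enum term_path terminate_path(const struct term_ops *ops)
{
	/* Preferred: software xstop, so the BMC sees a checkstop. */
	if (ops->trigger_xstop && ops->trigger_xstop())
		return TERM_XSTOP;

	/* Fallback: xstop unsupported or disabled, so reboot directly. */
	if (ops->cec_reboot) {
		ops->cec_reboot();
		return TERM_CEC_REBOOT;
	}

	/* Nothing else available: the caller would spin forever here. */
	return TERM_SPIN;
}

/* Stubs for demonstration only. */
static int reboot_calls;
static bool xstop_ok(void) { return true; }
static bool xstop_unsupported(void) { return false; }
static void fake_reboot(void) { reboot_calls++; }
```

With `xstop_ok` the reboot hook is never called, which mirrors the comment that control does not reach the reboot call when software xstop is supported and enabled.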
