Skip to content

Commit

Permalink
core/fast-reset: Do not enable by default
Browse files Browse the repository at this point in the history
Fast reboot started life as a debug hack and it escaped into the wild when
Stewart enabled it by default. There was some reasons for this, but the
main one is that a full reboot takes somewhere between one and five
minutes. For those of us who spend all day rebooting their POWER systems
this is great, but the utility for end users has always been pretty
questionable.

Rebooting a system should be a fairly infrequent activity in the field
with the main reasons for doing one being:

1) Kernel updates,
2) Misbehaving hardware

Although 1) can be performed by kexec we have found that it fails due to
2) occasionally. The reason for 2) is usually hardware getting itself
into a bad state. The universal fix for that type of hardware problem
is turning the hardware off and back on again so it's preferable that
a reboot actually does that.

This patch refactors the reboot handling OPAL calls so that fast-reboot
is only used by default when explicitly enabled, or manually invoked.
This allows developers to continue to use fast-reboot without expecting
users deal with its quirks (and understand how a "normal" reboot,
fast-reboot and MPIPL differ).

This has two user visible changes:

1. Full reboot is now the default. In order to get fast-reboot as the
   default the nvram option needs to be set:

	nvram -p ibm,skiboot --update-config fast-reset=1

2. The nvram option to force a fast-reboot even when some part of
   skiboot has called disable_fast_reboot() has changed from
   'fast-reset=im-feeling-lucky' to 'force-fast-reset=1' because
   it's impossible to actually use that 'feature' if fast-reboot is
   off by default.

	nvram -p ibm,skiboot --update-config force-fast-reset=1

Cc: Stewart Smith <stewart@flamingspork.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
  • Loading branch information
oohal committed Feb 26, 2020
1 parent 10bbcd0 commit ee07f2c
Show file tree
Hide file tree
Showing 2 changed files with 25 additions and 20 deletions.
2 changes: 1 addition & 1 deletion core/fast-reboot.c
Expand Up @@ -118,7 +118,7 @@ void fast_reboot(void)
}

if (fast_reboot_disabled &&
nvram_query_eq_dangerous("fast-reset", "im-feeling-lucky")) {
nvram_query_eq_dangerous("force-fast-reset", "1")) {
/* Do fast reboot even if it's been disabled */
prlog(PR_NOTICE, "RESET: Ignoring fast reboot disabled: %s\n",
fast_reboot_disabled);
Expand Down
43 changes: 24 additions & 19 deletions core/platform.c
Expand Up @@ -48,32 +48,37 @@ static int64_t opal_cec_power_down(uint64_t request)
}
opal_call(OPAL_CEC_POWER_DOWN, opal_cec_power_down, 1);

static int64_t opal_cec_reboot(void)
static int64_t full_reboot(void)
{
prlog(PR_NOTICE, "OPAL: Reboot request...\n");

opal_quiesce(QUIESCE_HOLD, -1);

if (proc_gen == proc_gen_p8) {
/*
* Bugs in P8 mean fast reboot isn't 100% reliable when cores
* are busy, so only attempt if explicitly *enabled*.
*/
if (nvram_query_eq_safe("fast-reset", "1"))
fast_reboot();

} else if (!nvram_query_eq_safe("fast-reset", "0")) {
/* Try fast-reset unless explicitly disabled */
fast_reboot();
}

console_complete_flush();

if (platform.cec_reboot)
return platform.cec_reboot();

return OPAL_SUCCESS;
}

static int64_t opal_cec_reboot(void)
{
opal_quiesce(QUIESCE_HOLD, -1);

/*
* Fast-reset was enabled by default for a long time in an attempt to
* make it more stable by exercising it more frequently. This resulted
* in a fair amount of pain due to mis-behaving hardware and confusion
* about what a "reset" is supposed to do exactly. Additionally,
* secure variables require a full reboot to work at all.
*
* Due to all that fast-reset should only be used if it's explicitly
* enabled. It started life as a debug hack and should remain one.
*/
if (nvram_query_eq_safe("fast-reset", "1"))
fast_reboot();

return full_reboot();
}
opal_call(OPAL_CEC_REBOOT, opal_cec_reboot, 0);

static int64_t opal_cec_reboot2(uint32_t reboot_type, char *diag)
Expand Down Expand Up @@ -105,15 +110,15 @@ static int64_t opal_cec_reboot2(uint32_t reboot_type, char *diag)
console_complete_flush();
return xscom_trigger_xstop();
case OPAL_REBOOT_FULL_IPL:
disable_fast_reboot("full IPL reboot requested");
return opal_cec_reboot();
prlog(PR_NOTICE, "Reboot: Full reboot requested");
return full_reboot();
case OPAL_REBOOT_MPIPL:
prlog(PR_NOTICE, "Reboot: OS reported error. Performing MPIPL\n");
console_complete_flush();
if (platform.terminate)
platform.terminate("OS reported error. Performing MPIPL\n");
else
opal_cec_reboot();
full_reboot();
for (;;);
break;
case OPAL_REBOOT_FAST:
Expand Down

0 comments on commit ee07f2c

Please sign in to comment.