core/direct-controls: improve p9_stop_thread error handling
p9_stop_thread should fail the operation if it finds the thread was
already quiesced. This implies something else is doing direct controls
on the thread (e.g., pdbg) or there is some exceptional condition we
don't know how to deal with. Proceeding here would cause things to
trample on each other, for example the hard lockup watchdog trying to
send a sreset to the core while it is stopped for debugging with pdbg
will end in tears.
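
As an illustration of the first change, here is a minimal, self-contained
sketch of the "fail if already quiesced" check. It is not the skiboot code:
struct sketch_thread, sketch_stop_thread() and the SKETCH_* return codes are
hypothetical stand-ins, and the real function talks to hardware through
xscom rather than flipping fields in a struct.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for OPAL return codes; values are illustrative. */
enum sketch_rc { SKETCH_SUCCESS = 0, SKETCH_BUSY = -2 };

/* Hypothetical model of one thread's direct-control state. */
struct sketch_thread {
	bool quiesced;      /* already stopped, possibly by another agent */
	bool stop_written;  /* set once we have issued our own stop */
};

/*
 * Refuse to stop a thread that is already quiesced: someone else
 * (for example a debugger) may own it, so report busy and do nothing.
 */
static int sketch_stop_thread(struct sketch_thread *t)
{
	assert(t != NULL);

	if (t->quiesced) {
		fprintf(stderr, "Could not stop thread: quiesced already\n");
		return SKETCH_BUSY;
	}

	/* Only now is it safe to issue our own stop control. */
	t->stop_written = true;
	t->quiesced = true;
	return SKETCH_SUCCESS;
}
```

The point of the early return is that no stop is ever written to a thread
we did not quiesce ourselves, so a later resume cannot release a thread
that another agent is holding.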

If p9_stop_thread times out waiting for the thread to quiesce, do
not hit it with a core_start direct control, because we don't know
what state things are in and doing more things at this point is worse
than doing nothing. There is no good recipe described in the workbook
to de-assert the core_stop control if it fails to quiesce the thread.
After timing out here, the thread may eventually quiesce and get
stuck, but that's simpler to debug than undefined behaviour.
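
The timeout behaviour described above can be sketched the same way. This is
a simplified model under stated assumptions, not the actual implementation:
struct wait_core, wait_poll_quiesced(), wait_stop_thread() and the WAIT_*
codes are hypothetical names, and the poll count stands in for the real
timeout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for OPAL return codes; values are illustrative. */
enum wait_rc { WAIT_SUCCESS = 0, WAIT_HARDWARE = -6 };

/* Hypothetical model of a core that may or may not quiesce in time. */
struct wait_core {
	int polls_until_quiesced; /* negative means it never quiesces */
	int stop_writes;          /* number of stop controls issued */
	int cont_writes;          /* number of continue controls issued */
};

/* Poll once; reports true once the modelled thread has quiesced. */
static bool wait_poll_quiesced(struct wait_core *c)
{
	if (c->polls_until_quiesced < 0)
		return false;
	if (c->polls_until_quiesced == 0)
		return true;
	c->polls_until_quiesced--;
	return false;
}

/* Issue a stop, then poll up to max_polls times for quiescence. */
static int wait_stop_thread(struct wait_core *c, int max_polls)
{
	assert(c != NULL);

	c->stop_writes++;
	for (int i = 0; i < max_polls; i++) {
		if (wait_poll_quiesced(c))
			return WAIT_SUCCESS;
	}

	/*
	 * Timed out. Deliberately do NOT issue a continue here: the
	 * thread state is unknown, so report the failure and stop.
	 */
	return WAIT_HARDWARE;
}
```

In this model cont_writes stays at zero on the failure path, which mirrors
the removal of the P9_THREAD_CONT write from the timeout branch.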

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
npiggin authored and stewartsmith committed May 6, 2018
1 parent 5a1463d commit 0e27cc8
Showing 1 changed file with 5 additions and 9 deletions.
14 changes: 5 additions & 9 deletions core/direct-controls.c
@@ -496,10 +496,12 @@ static int p9_stop_thread(struct cpu_thread *cpu)
 	rc = p9_thread_quiesced(cpu);
 	if (rc < 0)
 		return rc;
-	if (rc)
-		prlog(PR_WARNING, "Stopping thread %u:%u:%u warning:"
-		      " thread is quiesced already.\n",
+	if (rc) {
+		prlog(PR_ERR, "Could not stop thread %u:%u:%u:"
+		      " Thread is quiesced already.\n",
 		      chip_id, core_id, thread_id);
+		return OPAL_BUSY;
+	}
 
 	if (xscom_write(chip_id, dctl_addr, P9_THREAD_STOP(thread_id))) {
 		prlog(PR_ERR, "Could not stop thread %u:%u:%u:"
@@ -522,12 +524,6 @@ static int p9_stop_thread(struct cpu_thread *cpu)
 		      " Unable to quiesce thread.\n",
 		      chip_id, core_id, thread_id);
 
-	if (xscom_write(chip_id, dctl_addr, P9_THREAD_CONT(thread_id))) {
-		prlog(PR_ERR, "Could not resume thread %u:%u:%u:"
-		      " Unable to write EC_DIRECT_CONTROLS.\n",
-		      chip_id, core_id, thread_id);
-	}
-
 	return OPAL_HARDWARE;
 }
 
