Skip to content

UefiCpuPkg/MpInitLib: Ensure AP wake up on WakeUpByInitSipiSipi mode #11215

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

philnoh2
Copy link
Contributor

If AP loop mode is not Halt loop, it is found that the AP in ApInitDone state exists the loop mode immediately when WAKEUP_AP_SIGNAL is signaled at the first step of AP wake up. When WakeUpByInitSipiSipi is enabled, it is possible that the AP could clear StartupApSignal buffer before the BSP sends Init-Startup IPI to the AP. In the case, it causes that the BSP frees AP reset vector buffer even though AP wake up is in process. This results in an exception error. To ensure AP wake up, it is expected that the AP is in the known non-executable state before the AP wakeup signal.

For example, the related code in WakeUpAp function is as follow.

WakeUpAP ()
{
  ...
  } else {
    ...
    //
    // Wakeup specified AP
    //
    ASSERT (CpuMpData->InitFlag == ApInitDone);
    *(UINT32 *)CpuData->StartupApSignal = WAKEUP_AP_SIGNAL;
    if (ResetVectorRequired) {
      ...
      } else {
        SendInitSipiSipi (
          CpuInfoInHob[ProcessorNumber].ApicId,
          (UINT32)ExchangeInfo->BufferStart
          );
      }
    }

    //
    // Wait specified AP waken up
    //
    WaitApWakeup (CpuData->StartupApSignal);
  }

  if (ResetVectorRequired) {
    FreeResetVector (CpuMpData);
  }
  ...
}
  • Breaking change?
    • Breaking change - Does this PR cause a break in build or boot behavior?
    • Examples: Does it add a new library class or move a module to a different repo.
  • Impacts security?
    • Security - Does this PR have a direct security impact?
    • Examples: Crypto algorithm change or buffer overflow fix.
  • Includes tests?
    • Tests - Does this PR include any explicit test code?
    • Examples: Unit tests or integration tests.

How This Was Tested

The update has been verified on AMD platforms. We met the case/issue when testing an MP test tool for EFI shell. The tool enables and disables APs sequentially through gEfiMpServiceProtocolGuid->EnableDisableAP service. The error was detected by the stress test (e.g. 10000 times) in the tool. It fixes the issue.

Integration Instructions

N/A

@LiuZhiguang001
Copy link
Contributor

I agree this is an issue, thanks for identifying it.
However, sending Init to AP is too heavy.
How about moving the line "*(UINT32 *)CpuData->StartupApSignal = WAKEUP_AP_SIGNAL;" after InitSipiSipi and before "WaitApWakeup (CpuData->StartupApSignal);"

@philnoh2
Copy link
Contributor Author

I agree this is an issue, thanks for identifying it. However, sending Init to AP is too heavy. How about moving the line "*(UINT32 *)CpuData->StartupApSignal = WAKEUP_AP_SIGNAL;" after InitSipiSipi and before "WaitApWakeup (CpuData->StartupApSignal);"

Thank you for your review and suggestion. Previously we reviewed the suggestion. At that time, we thought that the way wouldn't be applicable because ApWakeupFunction function on the AP (InitSipiSipi) would reach the code that clears StartupApSignal buffer, before the BSP sets WAKEUP_AP_SIGNAL to the buffer. It is found that the function clears the buffer in the middle where is before invoking the AP function. In the case, the AP would not take a chance to clear the buffer in the function. Then, would result in an infinite loop in WaitApWakeup function. It looks like setting WAKEUP_AP_SIGNAL before InitSipiSipi is required on the current code flow for AP wake up.

For the PR update, we considered that currently ResetProcessorToIdleState function is limitedly called in the following cases. We thought that sending Init IPI to AP in the function could have the limited impact.

CheckThisAP/CheckAllAPs functions : In case timeout is raised in the functions (as a failure)
EnableDisableApWorker : In case EnableDisableAP service (PPI, Protocol) is called (e.g. UefiCpuPkg UnitTest calls it)

@LiuZhiguang001
Copy link
Contributor

I agree this is an issue, thanks for identifying it. However, sending Init to AP is too heavy. How about moving the line "*(UINT32 *)CpuData->StartupApSignal = WAKEUP_AP_SIGNAL;" after InitSipiSipi and before "WaitApWakeup (CpuData->StartupApSignal);"

Thank you for your review and suggestion. Previously we reviewed the suggestion. At that time, we thought that the way wouldn't be applicable because ApWakeupFunction function on the AP (InitSipiSipi) would reach the code that clears StartupApSignal buffer, before the BSP sets WAKEUP_AP_SIGNAL to the buffer. It is found that the function clears the buffer in the middle where is before invoking the AP function. In the case, the AP would not take a chance to clear the buffer in the function. Then, would result in an infinite loop in WaitApWakeup function. It looks like setting WAKEUP_AP_SIGNAL before InitSipiSipi is required on the current code flow for AP wake up.

For the PR update, we considered that currently ResetProcessorToIdleState function is limitedly called in the following cases. We thought that sending Init IPI to AP in the function could have the limited impact.

CheckThisAP/CheckAllAPs functions : In case timeout is raised in the functions (as a failure) EnableDisableApWorker : In case EnableDisableAP service (PPI, Protocol) is called (e.g. UefiCpuPkg UnitTest calls it)

Thanks for the detailed explanation. You are right; my suggestion also has issue. :)

If AP loop mode is not Halt loop, it is found that the AP in ApInitDone
state exists the loop mode immediately when WAKEUP_AP_SIGNAL is signaled
at the first step of AP wake up. When WakeUpByInitSipiSipi is enabled,
it is possible that the AP could clear StartupApSignal buffer before the
BSP sends Init-Startup IPI to the AP. In the case, it causes that the BSP
frees AP reset vector buffer even though AP wake up is in process. This
results in an exception error. To ensure AP wake up, it is expected that
the AP is in the known non-executable state before the AP wakeup signal.

Signed-off-by: Phil Noh <Phil.Noh@amd.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants