Various fixes for unleashed board #1

Merged
avpatel merged 5 commits into master from atish_fix_1 on Dec 21, 2018

Conversation

atishp04
Collaborator

No description provided.

Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Atish Patra <atish.patra@wdc.com>
@avpatel avpatel merged commit f02926c into master Dec 21, 2018
@avpatel avpatel deleted the atish_fix_1 branch December 23, 2018 15:05
avpatel pushed a commit that referenced this pull request Mar 10, 2023
…ready

When a boot hart executes sbi_hsm_hart_start() to start a secondary hart,
next_arg1, next_addr and next_mode for the latter are stored in the scratch
area after the state has been set to SBI_HSM_STATE_START_PENDING.
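
For reference, these fields live in OpenSBI's per-hart struct sbi_scratch;
a simplified excerpt (field names as in include/sbi/sbi_scratch.h, other
fields omitted):

    /* Simplified excerpt of the per-hart scratch area; only the
     * fields relevant to this discussion are shown. */
    struct sbi_scratch {
        /* ... */
        unsigned long next_arg1; /* arg1 register value for next stage */
        unsigned long next_addr; /* jump address of next booting stage */
        unsigned long next_mode; /* privilege mode of next booting stage */
        /* ... */
    };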

The secondary hart waits in the loop with wfi() in sbi_hsm_hart_wait() at
that time. However, the "wfi" instruction is not guaranteed to wait for an
interrupt to be received by the hart; it is just a hint to the CPU.
According to the RISC-V Privileged Architecture spec v20211203, even an
implementation of "wfi" as a "nop" is legal.
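
For illustration, the wait loop in sbi_hsm_hart_wait() is essentially the
following (a simplified sketch, not the verbatim source):

    /* Sketch of the wait loop: the hart may leave it on any
     * wakeup, or immediately if "wfi" behaves as a "nop", as
     * soon as the state becomes SBI_HSM_STATE_START_PENDING. */
    while (atomic_read(&hdata->state) != SBI_HSM_STATE_START_PENDING)
        wfi();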

So, the secondary might leave the loop in sbi_hsm_hart_wait() as soon as
its state has been set to SBI_HSM_STATE_START_PENDING, even if it got no
IPI or it got an IPI unrelated to sbi_hsm_hart_start(). This could lead to
the following race condition when booting Linux, for example:

  Boot hart (#0)                        Secondary hart (#1)
  runs Linux startup code               waits in sbi_hsm_hart_wait()

  sbi_ecall(SBI_EXT_HSM,
            SBI_EXT_HSM_HART_START,
            ...)
  enters sbi_hsm_hart_start()
  sets state of hart #1 to START_PENDING
                                        leaves sbi_hsm_hart_wait()
                                        runs to the end of init_warmboot()
                                        returns to scratch->next_addr
                                        (next_addr can be garbage here)

  sets next_addr, etc. for hart #1
  (no good: hart #1 has already left)

  sends IPI to hart #1
  (no good either)

If this happens, the secondary hart jumps to a wrong next_addr at the end
of init_warmboot(), which leads to a system hang or crash.

To reproduce the issue more reliably, one could add a delay in
sbi_hsm_hart_start() after setting the hart's state but before sending
IPI to that hart:

    hstate = atomic_cmpxchg(&hdata->state, SBI_HSM_STATE_STOPPED,
                            SBI_HSM_STATE_START_PENDING);
    ...
  + sbi_timer_mdelay(10);
    init_count = sbi_init_count(hartid);
    rscratch->next_arg1 = arg1;
    rscratch->next_addr = saddr;

The issue can be reproduced, for example, in a QEMU VM with '-machine virt'
and 2 or more CPUs, with Linux as the guest OS.
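
For example, a run along these lines should trigger it (the firmware and
kernel image paths below are placeholders):

    qemu-system-riscv64 -machine virt -smp 4 -nographic \
        -bios build/platform/generic/firmware/fw_jump.bin \
        -kernel /path/to/linux/Image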

This patch moves writing of next_arg1, next_addr and next_mode for the
secondary hart before setting its state to SBI_HSM_STATE_START_PENDING.
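
In other words, the fixed ordering in sbi_hsm_hart_start() looks roughly
like this (a sketch based on the description here, with 'smode' as an
assumed name for the mode argument; not the verbatim patch):

    /* Publish the start parameters for the target hart first... */
    rscratch->next_arg1 = arg1;
    rscratch->next_addr = saddr;
    rscratch->next_mode = smode;
    /* ...and only then mark the hart as runnable, so a spuriously
     * woken hart can never observe stale next_addr, etc. */
    hstate = atomic_cmpxchg(&hdata->state, SBI_HSM_STATE_STOPPED,
                            SBI_HSM_STATE_START_PENDING);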

In theory, it is possible that two or more harts enter sbi_hsm_hart_start()
for the same target hart simultaneously. To make sure the current hart has
exclusive access to the scratch area of the target hart at that point, a
per-hart 'start_ticket' is used. It is initially 0. The current hart tries
to acquire the ticket first (set it to 1) at the beginning of
sbi_hsm_hart_start() and only proceeds if it has successfully acquired it.

The target hart reads next_addr, etc., and then releases the ticket
(sets it to 0) before calling sbi_hart_switch_mode(). This way, even if
some other hart manages to enter sbi_hsm_hart_start() after the ticket has
been released but before the target hart jumps to next_addr, it will not
cause problems.

atomic_cmpxchg() already has "acquire" semantics, among other things, so
no additional barriers are needed in hsm_start_ticket_acquire(). No hart
can perform or observe the update of *rscratch before 'start_ticket' is
set to 1.
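
A minimal sketch of the acquire side, using OpenSBI's atomic helpers (the
function and field names follow the description above; an illustration,
not the verbatim patch):

    /* Try to take the per-hart start ticket; returns true on
     * success. atomic_cmpxchg() provides the required "acquire"
     * ordering, so no extra barrier is needed here. */
    static bool hsm_start_ticket_acquire(struct sbi_hsm_data *hdata)
    {
        return atomic_cmpxchg(&hdata->start_ticket, 0, 1) == 0;
    }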

atomic_write() only imposes ordering of writes, so an explicit barrier is
needed in hsm_start_ticket_release() to ensure its "release" semantics.
This guarantees that reads of scratch->next_addr, etc., in
sbi_hsm_hart_start_finish() cannot happen after 'start_ticket' has been
released.
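
Correspondingly, a sketch of the release side, assuming OpenSBI's
RISCV_FENCE() macro for the explicit barrier:

    /* Ensure all prior accesses to the scratch area complete
     * before the ticket becomes visible as free again. */
    static void hsm_start_ticket_release(struct sbi_hsm_data *hdata)
    {
        RISCV_FENCE(rw, w);
        atomic_write(&hdata->start_ticket, 0);
    }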

Signed-off-by: Evgenii Shatokhin <e.shatokhin@yadro.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
atishp04 added a commit that referenced this pull request Aug 24, 2023
Rivos-specific firmware requires the following setup before OpenSBI
firmware execution starts.

1. R-code setup
2. MCRR/MDRR setup from LRAM
3. PMARR setup

Currently, fw_rivos does all 3 operations if FW_RIVOS=y is provided
at build time. Eventually, #1 should be done by a separate R-code stage,
as we need to perform more than a jump to LRAM.

Ideally, #2 & #3 should be done in a stage prior to OpenSBI. In the
upstream boot flow, U-Boot SPL can be used to support that. However,
Rivos firmware already has a multi-stage boot flow defined. Eventually,
we should be able to drop the code under FW_RIVOS and just load
OpenSBI as the last-stage resident M-mode firmware (fw4-m).

Signed-off-by: Atish Patra <atishp@rivosinc.com>
MikeAmi pushed a commit to MikeAmi/opensbi that referenced this pull request Jan 26, 2024
Use _NPROCESSORS_ONLN for both Quartz64 UEFI build and edk2/BaseTools
pbug44 added a commit to pbug44/opensbi that referenced this pull request Apr 20, 2024
…-src#1 when setting

the MXSTATUS on the thead9xx, which has 32-bit processors that don't seem
to understand this extended status.

So give it some workaround code to make sure we're on CPU #0 (the boot proc).
Since I'm adding this in final_init it may be too late, but we'll see.
If anything, I'll add another commit to shove it into early_init.
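
A rough sketch of such a guard (hypothetical: current_hartid() is OpenSBI's
helper for the running hart, 0x7c0 is assumed here to be the T-Head mxstatus
CSR number, and 'mxstatus_val' is a placeholder):

    /* Hypothetical workaround: only the boot hart (hart 0)
     * writes the T-Head extended mxstatus CSR. */
    if (current_hartid() == 0)
        csr_write(0x7c0, mxstatus_val);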