PMP access failure #103

Closed
lbmeng opened this issue Apr 2, 2019 · 20 comments

@lbmeng
Collaborator

lbmeng commented Apr 2, 2019

On SiFive FU540, OpenSBI sets up the PMP regions as follows:

PMP0: 0x0000000080000000-0x000000008001ffff (A)
PMP1: 0x0000000000000000-0x0000007fffffffff (A,R,W,X)

With the above settings, when booting my kernel, I got:

sbi_trap_error: hart1: trap handler failed (error -5)
sbi_trap_error: hart1: mcause=0x0000000000000001 mtval=0xffffffff80200034
sbi_trap_error: hart1: mepc=0xffffffff80200034 mstatus=0x8000000a00006900
sbi_trap_error: hart1: ra=0x0000000080200032 sp=0x00000000ff798c40
sbi_trap_error: hart1: gp=0x00000000ff79ce70 tp=0x0000000000000001
sbi_trap_error: hart1: s0=0x0000000000000001 s1=0x00000000ff7991f0
sbi_trap_error: hart1: a0=0x0000000030000000 a1=0x00000000ff7991f0
sbi_trap_error: hart1: a2=0x8000000000000000 a3=0x0000000000000007
sbi_trap_error: hart1: a4=0x0000000000000000 a5=0x0000000000002c6a
sbi_trap_error: hart1: a6=0x0000000000000006 a7=0x00000000ff79d6be
sbi_trap_error: hart1: s2=0x00000000fff9ebc2 s3=0xffffffff00000000
sbi_trap_error: hart1: s4=0x0000000000000003 s5=0x00000000ff7a5c88
sbi_trap_error: hart1: s6=0x00000000fffebdb8 s7=0x0000000000000000
sbi_trap_error: hart1: s8=0x0000000000000000 s9=0x0000000000000000
sbi_trap_error: hart1: s10=0x0000000084000040 s11=0x0000000084979a78
sbi_trap_error: hart1: t0=0x8000000000080b4e t1=0x0000000040000000
sbi_trap_error: hart1: t2=0x0000000000000ff0 t3=0x0000000080b4eff8
sbi_trap_error: hart1: t4=0x00000000300000cf t5=0x0000000080000000
sbi_trap_error: hart1: t6=0x000000000000001e

From the log above, both mepc and mtval point to 0xffffffff80200034, which is the address that stvec is programmed with. My code is supposed to jump to stvec after satp is programmed with valid page tables. Both mstatus.MPP and mstatus.SPP are set to S-mode, so what seems to have happened is that after satp was written, the next instruction immediately trapped to stvec (mstatus.SPP <= 1), and at the same time another exception was raised when trying to fetch instructions from stvec (mstatus.MPP <= 1).
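
For reference, a minimal sketch of how the mcause and mstatus values in the log can be decoded. Python is used only for illustration; the helper name and the table are hypothetical, and the bit positions are the RV64 ones from the privileged spec (SPP at bit 8, MPP at bits 12:11, FS at bits 14:13).

```python
# Illustrative decode of the trap CSRs printed in the log.
# decode_mstatus and MCAUSE_NAMES are hypothetical names, not OpenSBI code.

# A few synchronous exception codes (mcause with the interrupt bit clear).
MCAUSE_NAMES = {
    0: "instruction address misaligned",
    1: "instruction access fault",
    2: "illegal instruction",
    5: "load access fault",
    7: "store/AMO access fault",
    12: "instruction page fault",
}


def decode_mstatus(mstatus: int) -> dict:
    """Extract the privilege-related fields from an RV64 mstatus value."""
    return {
        "SPP": (mstatus >> 8) & 0x1,   # previous privilege for S-mode traps
        "MPP": (mstatus >> 11) & 0x3,  # previous privilege for M-mode traps
        "FS": (mstatus >> 13) & 0x3,   # floating-point unit state
    }


fields = decode_mstatus(0x8000000A00006900)
print(MCAUSE_NAMES[0x1])  # cause reported in the log
print(fields)             # MPP == 1 means the M-mode trap came from S-mode
```

Note that mcause 1 is an instruction access fault rather than an instruction page fault (12), which is why the PMP configuration is the first suspect here rather than the page tables.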

After I changed OpenSBI to not program the PMP region 0 for the firmware image, like below:

PMP0: 0x0000000000000000-0x0000007fffffffff (A,R,W,X)

Then my kernel successfully boots.

So it seems that accessing 0xffffffff80200034 somehow falls into the access check of PMP region 0. But its address is not in the range at all. I can't figure out why this is the case.
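
A minimal sketch of the architectural PMP check for the two regions above, assuming the faulting virtual address translates to physical 0x80200034 (as the ra value suggests). The function name and the permission strings are hypothetical; the rule modelled is the one from the privileged spec: the lowest-numbered matching entry decides, and an S-mode access that matches no entry is denied.

```python
# Simplified model of PMP matching for S/U-mode accesses.
# Entries are (start, inclusive end, permissions) in priority order,
# copied from the PMP0/PMP1 lines printed by OpenSBI above.
PMP_ENTRIES = [
    (0x0000000080000000, 0x000000008001FFFF, ""),     # PMP0: firmware, no S-mode access
    (0x0000000000000000, 0x0000007FFFFFFFFF, "RWX"),  # PMP1: everything else
]


def pmp_allows(paddr: int, access: str) -> bool:
    """Return True if an S/U-mode access of the given type ('R', 'W' or 'X') is permitted."""
    for start, end, perms in PMP_ENTRIES:
        if start <= paddr <= end:
            # The first (lowest-numbered) matching entry decides the outcome.
            return access in perms
    # No entry matched: S/U-mode accesses are denied.
    return False


print(pmp_allows(0x80200034, "X"))  # fetch from the kernel text
print(pmp_allows(0x80000000, "X"))  # fetch from the firmware region
```

Under this model the fetch at 0x80200034 matches only PMP1 and should be allowed, which is what makes the reported fault surprising.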

@avpatel
Collaborator

avpatel commented Apr 2, 2019

PMP0 entry has been there for quite some time. This is to protect the RAM area where OpenSBI firmware is running.

I tried latest OpenSBI + U-Boot on SiFive unleashed but I was not able to reproduce this issue. What is different about your setup?

For me, PAGE_OFFSET is 0xffffffe000000000, why is PAGE_OFFSET 0xffffffff80000000 in your case?

@avpatel
Collaborator

avpatel commented Apr 2, 2019

I quickly tried PAGE_OFFSET 0xffffffff80000000 on QEMU/virt machine using Linux-5.1-rc2 kernel and it worked fine for me.

It seems you're using a non-upstream kernel, because at 0xffffffff80200034 I have the clear_bss() part of head.S.

If you are using the https://github.com/riscv/riscv-linux tree, then please stop using it, because that tree is obsolete and lacks many upstream changes.

@lbmeng
Collaborator Author

lbmeng commented Apr 2, 2019

Thanks for checking this. I am not using a Linux kernel, but my own OS kernel instead.

@avpatel
Collaborator

avpatel commented Apr 2, 2019

The real issue is that the memory used by the OpenSBI firmware is not marked as reserved in the DTS passed to Linux and U-Boot. This needs to be either fixed in the DTS itself, or OpenSBI has to update the DTS before passing it to Linux or U-Boot.

There is already an existing issue for it: #70

We use PMP to protect the OpenSBI firmware in order to safeguard it from buggy S-mode software.

@lbmeng
Collaborator Author

lbmeng commented Apr 2, 2019

I doubt this is related to #70.

As you can see from the ra register, it is still 0x80200032 (the physical address). The PMP check happens right after satp is written, when virtual address translation is on.

@avpatel
Collaborator

avpatel commented Apr 2, 2019

Please fix your OS kernel because we cannot allow S-mode access to firmware memory protected using PMP0.

@lbmeng
Collaborator Author

lbmeng commented Apr 3, 2019

Please fix your OS kernel because we cannot allow S-mode access to firmware memory protected using PMP0.

My OS kernel does not access any firmware memory range. This is confirmed.
The same kernel works perfectly fine in QEMU with OpenSBI + U-Boot.

I posted here in case someone knows any potential issues of PMP.
I will try to create a test case to trigger the issue.

@avpatel
Collaborator

avpatel commented Apr 3, 2019

Initially, I encountered a few issues with PMP checking on QEMU, but those turned out to be QEMU bugs which are now fixed in upstream QEMU.

What you are seeing can also be some HW errata (who knows).

For a test case, you can either come up with a test payload in OpenSBI or you can use U-Boot MM/MD commands to show the PMP behaviour.

@lbmeng
Collaborator Author

lbmeng commented Apr 3, 2019

Please try the test case lbmeng@f3ba28f

Steps:

  • build test.bin via "make PLATFORM=sifive/fu540"
  • copy the generated test payload test.bin to somewhere out of the tree
  • create the OpenSBI firmware image plus the test payload via "make PLATFORM=sifive/fu540 FW_PAYLOAD_PATH=test.bin FU540_ENABLED_HART_MASK=0x02"
  • burn it to the SD card, and run it on the Unleashed board

Log below:

PMP0: 0x0000000080000000-0x000000008001ffff (A)
PMP1: 0x0000000000000000-0x0000007fffffffff (A,R,W,X)
sbi_trap_error: hart1: trap handler failed (error -5)
sbi_trap_error: hart1: mcause=0x0000000000000001 mtval=0xffffffff802000d0
sbi_trap_error: hart1: mepc=0xffffffff802000d0 mstatus=0x8000000a00006900
sbi_trap_error: hart1: ra=0x000000008000074c sp=0x0000000080013e80
sbi_trap_error: hart1: gp=0x0000000000000000 tp=0x0000000080013f00
sbi_trap_error: hart1: s0=0x0000000080013e90 s1=0x0000000080013f00
sbi_trap_error: hart1: a0=0x0000000030000000 a1=0x0000000082200000
sbi_trap_error: hart1: a2=0x8000000000000000 a3=0xffffffff802000d0
sbi_trap_error: hart1: a4=0x0000000080203008 a5=0x0000000080203000
sbi_trap_error: hart1: a6=0x0000000082200000 a7=0x0000000080200000
sbi_trap_error: hart1: s2=0x0000000080009550 s3=0xffffffff00000000
sbi_trap_error: hart1: s4=0x0000000000000000 s5=0x0000000000000000
sbi_trap_error: hart1: s6=0x0000000000000001 s7=0x0000000000000005
sbi_trap_error: hart1: s8=0x0000000000002000 s9=0x0000000000000000
sbi_trap_error: hart1: s10=0x0000000000000000 s11=0x0000000000000000
sbi_trap_error: hart1: t0=0x8000000000080202 t1=0x0000000040000000
sbi_trap_error: hart1: t2=0x0000000000000ff0 t3=0x0000000080202ff8
sbi_trap_error: hart1: t4=0x00000000300000cf t5=0x0000000080000000
sbi_trap_error: hart1: t6=0x0000000082200000

@avpatel
Collaborator

avpatel commented Apr 3, 2019

Based on your test code, it seems speculative cache access for S-mode is creating problems for you.

As per your test code, you are mapping 0xffffffff80000000 (V) -> 0x80000000 (P). This means you are also mapping the initial part of RAM, which gives speculative cache accesses the freedom to fetch memory from 0x80000000, hence it fails for you.

In Linux, we start mapping from kernel load address onwards so we never see this issue.

Try creating 2M/4KB mappings and don't map memory where firmware is running.
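
A small sketch of the overlap being described, assuming the faulting address maps to physical 0x80200034 and using the PMP0 range from the log. It only compares the physical range covered by a 1 GiB page with that of a 2 MiB page; it does not model speculation or any hardware behaviour, and the helper names are hypothetical.

```python
# Compare the physical range covered by a 1 GiB gigapage with that of a
# 2 MiB megapage containing the same address, against the firmware region.
GIB_PAGE = 1 << 30
MIB2_PAGE = 2 << 20

FIRMWARE = (0x80000000, 0x8001FFFF)  # PMP0 range (inclusive), from the log


def page_range(paddr: int, page_size: int) -> tuple:
    """Return the inclusive (start, end) of the naturally aligned page holding paddr."""
    base = paddr & ~(page_size - 1)
    return base, base + page_size - 1


def overlaps(a: tuple, b: tuple) -> bool:
    """True if two inclusive ranges share at least one byte."""
    return a[0] <= b[1] and b[0] <= a[1]


giga = page_range(0x80200034, GIB_PAGE)   # 0x80000000 .. 0xBFFFFFFF
mega = page_range(0x80200034, MIB2_PAGE)  # 0x80200000 .. 0x803FFFFF

print(overlaps(giga, FIRMWARE))  # the gigapage also covers the firmware
print(overlaps(mega, FIRMWARE))  # a 2 MiB mapping from 0x80200000 does not
```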

@avpatel
Collaborator

avpatel commented Apr 3, 2019

I think in your test code the instruction slli a0, a0, 28 should be slli a0, a0, 18.

@lbmeng
Collaborator Author

lbmeng commented Apr 3, 2019

Based on your test code, it seems speculative cache access for S-mode is creating problems for you.

As per your test code, you are mapping 0xffffffff80000000 (V) -> 0x80000000 (P). This means you are also mapping the initial part of RAM, which gives speculative cache accesses the freedom to fetch memory from 0x80000000, hence it fails for you.

My understanding is that speculative access fetches several more instructions after the current pc into the pipeline for better performance. In my test code, the pc never gets a chance to be within the firmware memory range (0x80000000-0x8001ffff), hence there should be no speculative access falling into that range.

In Linux, we start mapping from kernel load address onwards so we never see this issue.

Try creating 2M/4KB mappings and don't map memory where firmware is running.

Yes, I see Linux is using 2M/4KB mappings and does not map the lower 2MB. But per my reading of the privileged spec 1.10, what the test code does is not wrong.

@lbmeng
Collaborator Author

lbmeng commented Apr 3, 2019

I think in your test code the instruction slli a0, a0, 28 should be slli a0, a0, 18.

Bit 28 is the PPN[2] and PPN[2] maps 1GiB.
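
To make the bit positions concrete, a minimal sketch of an Sv39 leaf PTE for a 1 GiB page. The field layout (PPN[0] at bits 18:10, PPN[1] at bits 27:19, PPN[2] at bits 53:28) is from the privileged spec; the function names are hypothetical and the physical address 0x80000000 with flags 0xcf is only an example value, not taken from the test code.

```python
# Sv39 PTE flag bits (low byte of the PTE).
PTE_V, PTE_R, PTE_W, PTE_X, PTE_A, PTE_D = 0x01, 0x02, 0x04, 0x08, 0x40, 0x80


def sv39_gigapage_pte(paddr: int, flags: int) -> int:
    """Build a leaf PTE at the top level, mapping a 1 GiB aligned physical address."""
    assert paddr % (1 << 30) == 0, "gigapages must be 1 GiB aligned"
    ppn2 = paddr >> 30            # one PPN[2] step corresponds to 1 GiB
    return (ppn2 << 28) | flags   # PPN[2] starts at bit 28; PPN[1] and PPN[0] are zero


def sv39_ppn_fields(pte: int) -> dict:
    """Split a PTE into its three PPN fields."""
    return {
        "PPN0": (pte >> 10) & 0x1FF,      # 9 bits
        "PPN1": (pte >> 19) & 0x1FF,      # 9 bits
        "PPN2": (pte >> 28) & 0x3FFFFFF,  # 26 bits
    }


flags = PTE_V | PTE_R | PTE_W | PTE_X | PTE_A | PTE_D  # 0xcf
pte = sv39_gigapage_pte(0x80000000, flags)
print(hex(pte))
print(sv39_ppn_fields(pte))
```

With these example values the gigapage index is 2, so shifting it left by 28 places it in PPN[2], which matches the shift amount discussed above.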

@avpatel
Collaborator

avpatel commented Apr 4, 2019

I think in your test code the instruction slli a0, a0, 28 should be slli a0, a0, 18.

Bit 28 is the PPN[2] and PPN[2] maps 1GiB.

Got it, there is no issue here.

@andreas-schwab

Is that the same as issue 65?

@lbmeng
Collaborator Author

lbmeng commented Apr 8, 2019

Is that the same as issue 65?

No.

@avpatel
Collaborator

avpatel commented Apr 15, 2019

Hi Bin,

I believe Andrew answered your query.

Please close this issue because it's not related to OpenSBI.

Regards,
Anup

@lbmeng
Collaborator Author

lbmeng commented Apr 22, 2019

Closing this as Andrew confirmed it is a silicon erratum.

@lbmeng lbmeng closed this as completed Apr 22, 2019
@kprovost

Is there a pointer with information about this silicon erratum somewhere?

@lbmeng
Collaborator Author

lbmeng commented Aug 8, 2019 via email

lbmeng added a commit to lbmeng/riscv-sbi-doc that referenced this issue Mar 7, 2020
S-mode software needs a way to know memory used by SBI firmware so
that it can correctly mark such memory as reserved.

Related discussion:
riscv-software-src/opensbi#103

Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
lbmeng added a commit to lbmeng/riscv-sbi-doc that referenced this issue Mar 10, 2020
S-mode software needs a way to know memory used by SBI firmware so
that it can correctly mark such memory as reserved.

Related discussion:
riscv-software-src/opensbi#103

Signed-off-by: Bin Meng <bmeng.cn@gmail.com>