PMP access failure #103

lbmeng · 2019-04-02T03:09:41Z

On SiFive FU540, OpenSBI sets up the PMP regions as follows:

PMP0: 0x0000000080000000-0x000000008001ffff (A)
PMP1: 0x0000000000000000-0x0000007fffffffff (A,R,W,X)
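
As a reading aid: both ranges above are naturally aligned powers of two (128 KiB and 512 GiB), which is what one would expect from NAPOT-mode PMP entries. The sketch below assumes NAPOT encoding (the log does not say which address-matching mode is used), and the helper names are hypothetical, not OpenSBI functions. It only shows how such a range relates to a pmpaddr value.

```python
def napot_encode(base, size):
    """Return the pmpaddr value for a NAPOT region (assumed encoding mode)."""
    assert size >= 8 and size & (size - 1) == 0, "size must be a power of two >= 8"
    assert base % size == 0, "base must be aligned to size"
    # pmpaddr holds address bits [..:2]; the region size is encoded as trailing ones.
    return (base >> 2) | ((size >> 3) - 1)


def napot_decode(pmpaddr):
    """Return (first, last) byte address covered by a NAPOT pmpaddr value."""
    trailing_ones = 0
    while (pmpaddr >> trailing_ones) & 1:
        trailing_ones += 1
    size = 1 << (trailing_ones + 3)
    base = (pmpaddr & ~((1 << (trailing_ones + 1)) - 1)) << 2
    return base, base + size - 1


first, last = napot_decode(napot_encode(0x80000000, 128 * 1024))
print(hex(first), hex(last))  # 0x80000000 0x8001ffff
```

Under that assumption, PMP0 would correspond to pmpaddr 0x20003fff and PMP1 to a pmpaddr with 36 trailing ones.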

With the above settings, when booting my kernel, I got:

sbi_trap_error: hart1: trap handler failed (error -5)
sbi_trap_error: hart1: mcause=0x0000000000000001 mtval=0xffffffff80200034
sbi_trap_error: hart1: mepc=0xffffffff80200034 mstatus=0x8000000a00006900
sbi_trap_error: hart1: ra=0x0000000080200032 sp=0x00000000ff798c40
sbi_trap_error: hart1: gp=0x00000000ff79ce70 tp=0x0000000000000001
sbi_trap_error: hart1: s0=0x0000000000000001 s1=0x00000000ff7991f0
sbi_trap_error: hart1: a0=0x0000000030000000 a1=0x00000000ff7991f0
sbi_trap_error: hart1: a2=0x8000000000000000 a3=0x0000000000000007
sbi_trap_error: hart1: a4=0x0000000000000000 a5=0x0000000000002c6a
sbi_trap_error: hart1: a6=0x0000000000000006 a7=0x00000000ff79d6be
sbi_trap_error: hart1: s2=0x00000000fff9ebc2 s3=0xffffffff00000000
sbi_trap_error: hart1: s4=0x0000000000000003 s5=0x00000000ff7a5c88
sbi_trap_error: hart1: s6=0x00000000fffebdb8 s7=0x0000000000000000
sbi_trap_error: hart1: s8=0x0000000000000000 s9=0x0000000000000000
sbi_trap_error: hart1: s10=0x0000000084000040 s11=0x0000000084979a78
sbi_trap_error: hart1: t0=0x8000000000080b4e t1=0x0000000040000000
sbi_trap_error: hart1: t2=0x0000000000000ff0 t3=0x0000000080b4eff8
sbi_trap_error: hart1: t4=0x00000000300000cf t5=0x0000000080000000
sbi_trap_error: hart1: t6=0x000000000000001e

From the above log, both mepc and mtval point to 0xffffffff80200034, which is the address that stvec is programmed with. My code is supposed to jump to stvec after satp is programmed with valid page tables. Both mstatus.MPP and mstatus.SPP are set to S-mode, so what happened is that after satp was written, the next instruction immediately trapped to stvec (mstatus.SPP <= 1), but at the same time another exception was raised when trying to fetch instructions from stvec (mstatus.MPP <= 1).
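
For readers who want to check the register values themselves, here is a minimal sketch that decodes mcause and the two previous-privilege fields of mstatus from the log. The field positions (SPP at bit 8, MPP at bits 12:11) and the cause codes come from the privileged spec; the function and table names are made up for illustration.

```python
MCAUSE_NAMES = {
    0: "instruction address misaligned",
    1: "instruction access fault",
    2: "illegal instruction",
    5: "load access fault",
    7: "store/AMO access fault",
    12: "instruction page fault",
}
PRIV_NAMES = {0: "U", 1: "S", 3: "M"}


def decode_trap(mcause, mstatus):
    """Decode the exception cause and the previous-privilege fields (RV64)."""
    code = mcause & ((1 << 63) - 1)
    return {
        "interrupt": bool(mcause >> 63),
        "cause": MCAUSE_NAMES.get(code, "other (%d)" % code),
        "MPP": PRIV_NAMES.get((mstatus >> 11) & 0x3, "reserved"),
        "SPP": PRIV_NAMES[(mstatus >> 8) & 0x1],
    }


info = decode_trap(0x0000000000000001, 0x8000000A00006900)
print(info["cause"], info["MPP"], info["SPP"])  # instruction access fault S S
```

Note that cause 1 is an instruction access fault rather than an instruction page fault (cause 12), which is what one would expect from a failed PMP check rather than a missing translation.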

After I changed OpenSBI to not program the PMP region 0 for the firmware image, like below:

PMP0: 0x0000000000000000-0x0000007fffffffff (A,R,W,X)

Then my kernel successfully boots.

So it seems that accessing 0xffffffff80200034 somehow falls into the access check of PMP region 0. But its address is not in the range at all. I can't figure out why this is the case.
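
To make the expectation concrete, here is a small sketch of the PMP priority rule (the lowest-numbered matching entry decides) applied to the two regions from the log. It assumes the virtual address 0xffffffff80200034 translates to the physical address 0x80200034, which matches the ra value in the log but is not printed anywhere. The function name is hypothetical.

```python
# (first, last, permissions) in priority order, copied from the boot log.
PMP_REGIONS = [
    (0x0000000080000000, 0x000000008001FFFF, ""),     # PMP0: no R/W/X for S-mode
    (0x0000000000000000, 0x0000007FFFFFFFFF, "RWX"),  # PMP1
]


def pmp_check(paddr, access):
    """Return (matching entry index, allowed) for an S/U-mode access.

    access is one of 'R', 'W' or 'X'.
    """
    for index, (first, last, perms) in enumerate(PMP_REGIONS):
        if first <= paddr <= last:
            return index, access in perms
    return None, False  # no entry matches: the S/U-mode access fails


print(pmp_check(0x80200034, "X"))  # (1, True): the fetch should be allowed by PMP1
print(pmp_check(0x80000000, "X"))  # (0, False): only firmware memory hits PMP0
```

By this model the fetch should match PMP1 and succeed, which is why the observed fault is surprising.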

avpatel · 2019-04-02T04:19:05Z

PMP0 entry has been there for quite some time. This is to protect the RAM area where OpenSBI firmware is running.

I tried the latest OpenSBI + U-Boot on the SiFive Unleashed but I was not able to reproduce this issue. What is different about your setup?

For me, PAGE_OFFSET is 0xffffffe000000000. Why is PAGE_OFFSET 0xffffffff80000000 in your case?

avpatel · 2019-04-02T04:48:09Z

I quickly tried PAGE_OFFSET 0xffffffff80000000 on the QEMU/virt machine using the Linux-5.1-rc2 kernel and it worked fine for me.

It seems you're using a non-upstream kernel, because at 0xffffffff80200034 I have the clear_bss() part of head.S.

If you are using the https://github.com/riscv/riscv-linux tree then please stop using it, because that tree is obsolete and lacks many upstream changes.

lbmeng · 2019-04-02T05:58:52Z

Thanks for checking this. I am not using a Linux kernel, but my own OS kernel instead.

avpatel · 2019-04-02T06:02:10Z

The real issue is that the memory used by the OpenSBI firmware is not marked as reserved in the DTS passed to Linux and U-Boot. This needs to be either fixed in the DTS itself, or OpenSBI has to update the DTS before passing it to Linux or U-Boot.

There is already an existing issue for it: #70

We use PMP to protect the OpenSBI firmware in order to safeguard it from buggy S-mode software.

lbmeng · 2019-04-02T06:53:51Z

I doubt this is related to #70.

As you can see, the ra register is still 0x80200032 (the physical address). The PMP check happens right after satp is written and virtual address translation is turned on.

avpatel · 2019-04-02T07:57:37Z

Please fix your OS kernel because we cannot allow S-mode access to firmware memory protected using PMP0.

lbmeng · 2019-04-03T03:32:24Z

> Please fix your OS kernel because we cannot allow S-mode access to firmware memory protected using PMP0.

My OS kernel does not access any firmware memory range. This is confirmed.
The same kernel works perfectly fine in QEMU with OpenSBI + U-Boot.

I posted here in case someone knows any potential issues of PMP.
I will try to create a test case to trigger the issue.

avpatel · 2019-04-03T03:53:04Z

Initially, I encountered a few issues with PMP checking on QEMU, but those turned out to be QEMU bugs which are now fixed in upstream QEMU.

What you are seeing could also be a HW erratum (who knows).

For a test case, you can either come up with a test payload in OpenSBI or use the U-Boot MM/MD commands to show the PMP behaviour.

lbmeng · 2019-04-03T10:32:45Z

Please try the test case lbmeng@f3ba28f

Steps:

1. Build test.bin via "make PLATFORM=sifive/fu540".
2. Copy the generated test payload test.bin to somewhere outside the tree.
3. Create the OpenSBI firmware image plus the test payload via "make PLATFORM=sifive/fu540 FW_PAYLOAD_PATH=test.bin FU540_ENABLED_HART_MASK=0x02".
4. Burn it to the SD card and run it on the Unleashed board.

Log below:

PMP0: 0x0000000080000000-0x000000008001ffff (A)
PMP1: 0x0000000000000000-0x0000007fffffffff (A,R,W,X)
sbi_trap_error: hart1: trap handler failed (error -5)
sbi_trap_error: hart1: mcause=0x0000000000000001 mtval=0xffffffff802000d0
sbi_trap_error: hart1: mepc=0xffffffff802000d0 mstatus=0x8000000a00006900
sbi_trap_error: hart1: ra=0x000000008000074c sp=0x0000000080013e80
sbi_trap_error: hart1: gp=0x0000000000000000 tp=0x0000000080013f00
sbi_trap_error: hart1: s0=0x0000000080013e90 s1=0x0000000080013f00
sbi_trap_error: hart1: a0=0x0000000030000000 a1=0x0000000082200000
sbi_trap_error: hart1: a2=0x8000000000000000 a3=0xffffffff802000d0
sbi_trap_error: hart1: a4=0x0000000080203008 a5=0x0000000080203000
sbi_trap_error: hart1: a6=0x0000000082200000 a7=0x0000000080200000
sbi_trap_error: hart1: s2=0x0000000080009550 s3=0xffffffff00000000
sbi_trap_error: hart1: s4=0x0000000000000000 s5=0x0000000000000000
sbi_trap_error: hart1: s6=0x0000000000000001 s7=0x0000000000000005
sbi_trap_error: hart1: s8=0x0000000000002000 s9=0x0000000000000000
sbi_trap_error: hart1: s10=0x0000000000000000 s11=0x0000000000000000
sbi_trap_error: hart1: t0=0x8000000000080202 t1=0x0000000040000000
sbi_trap_error: hart1: t2=0x0000000000000ff0 t3=0x0000000080202ff8
sbi_trap_error: hart1: t4=0x00000000300000cf t5=0x0000000080000000
sbi_trap_error: hart1: t6=0x0000000082200000

avpatel · 2019-04-03T11:04:24Z

Based on your test code, it seems cache speculative access for S-mode is creating problems for you.

As per your test code, you are mapping 0xffffffff80000000 (V) -> 0x80000000 (P). This means you are also mapping the initial part of RAM, which gives cache speculative access the freedom to fetch memory from 0x80000000; hence it fails for you.

In Linux, we start mapping from kernel load address onwards so we never see this issue.

Try creating 2M/4KB mappings and don't map memory where firmware is running.
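
A rough sketch of why the mapping granularity matters here: it checks whether the naturally aligned page containing the faulting physical address overlaps the PMP0-protected firmware range from the log. The page sizes are the Sv39 leaf sizes; the physical address 0x80200034 and the helper name are assumptions made for illustration.

```python
GIGAPAGE = 1 << 30  # Sv39 level-2 leaf (1 GiB)
MEGAPAGE = 1 << 21  # Sv39 level-1 leaf (2 MiB)
FW_FIRST, FW_LAST = 0x80000000, 0x8001FFFF  # PMP0 range from the log


def page_overlaps_firmware(paddr, page_size):
    """True if the naturally aligned page containing paddr overlaps the firmware range."""
    first = paddr & ~(page_size - 1)
    last = first + page_size - 1
    return first <= FW_LAST and FW_FIRST <= last


print(page_overlaps_firmware(0x80200034, GIGAPAGE))  # True
print(page_overlaps_firmware(0x80200034, MEGAPAGE))  # False
```

With a 1 GiB leaf the translation entry spans 0x80000000-0xbfffffff and therefore includes the firmware, while a 2 MiB leaf starting at 0x80200000 does not.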

avpatel · 2019-04-03T11:23:29Z

I think in your test code the instruction slli a0, a0, 28 should be slli a0, a0, 18.

lbmeng · 2019-04-03T15:05:10Z

> Based on your test code, it seems cache speculative access for S-mode is creating problems for you.
>
> As per your test code, you are mapping 0xffffffff80000000 (V) -> 0x80000000 (P). This means you are also mapping the initial part of RAM, which gives cache speculative access the freedom to fetch memory from 0x80000000; hence it fails for you.

My understanding is that cache speculative access fetches several more instructions after the current pc into the pipeline for better performance. In my test code, the pc never gets a chance to be within the firmware memory range (0x80000000-0x8001ffff), hence there should be no speculative access falling into that range.

> In Linux, we start mapping from kernel load address onwards so we never see this issue.
>
> Try creating 2M/4KB mappings and don't map memory where firmware is running.

Yes, I see that Linux uses 2M/4KB mappings and does not map the lower 2MB. But per my reading of the privileged spec 1.10, what the test code does is not wrong.

lbmeng · 2019-04-03T15:07:19Z

> I think in your test code the instruction slli a0, a0, 28 should be slli a0, a0, 18.

Bit 28 is the start of PPN[2] in the PTE, and PPN[2] maps 1 GiB.
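
To illustrate the shift amount, here is a minimal sketch of how an Sv39 gigapage leaf PTE is assembled. The bit positions follow the privileged spec (flags in bits 7:0, PPN[2] starting at bit 28); the flag combination and helper names are assumptions chosen for the example, not taken from the test code.

```python
PTE_V, PTE_R, PTE_W, PTE_X = 1 << 0, 1 << 1, 1 << 2, 1 << 3
PTE_A, PTE_D = 1 << 6, 1 << 7
PPN2_SHIFT = 28  # PPN[2] starts at bit 28 of an Sv39 PTE


def gigapage_pte(paddr, flags):
    """Build a level-2 leaf PTE mapping the 1 GiB region at paddr."""
    assert paddr % (1 << 30) == 0, "gigapages must be 1 GiB aligned"
    ppn2 = paddr >> 30  # physical address bits above the 1 GiB offset
    return (ppn2 << PPN2_SHIFT) | flags


def vpn2(vaddr):
    """Index into the root page table for an Sv39 virtual address."""
    return (vaddr >> 30) & 0x1FF


flags = PTE_V | PTE_R | PTE_W | PTE_X | PTE_A | PTE_D
print(hex(gigapage_pte(0x80000000, flags)))  # 0x200000cf
print(vpn2(0xFFFFFFFF80000000))              # 510
```

A shift of 18 would instead place the value inside the PPN[0]/PPN[1] fields, which must be zero in a gigapage leaf.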

avpatel · 2019-04-04T03:46:54Z

> > I think in your test code the instruction slli a0, a0, 28 should be slli a0, a0, 18.
>
> Bit 28 is the start of PPN[2] in the PTE, and PPN[2] maps 1 GiB.

Got it, there is no issue here.

andreas-schwab · 2019-04-08T07:48:07Z

Is that the same as issue 65?

lbmeng · 2019-04-08T14:13:21Z

> Is that the same as issue 65?

No.

avpatel · 2019-04-15T04:37:05Z

Hi Bin,

I believe Andrew answered your query.

Please close this issue because it's not related to OpenSBI.

Regards,
Anup

lbmeng · 2019-04-22T14:56:09Z

Closing this as Andrew confirmed it is a silicon erratum.

kprovost · 2019-07-25T16:53:02Z

Is there a pointer with information about this silicon erratum somewhere?

lbmeng · 2019-08-08T01:48:57Z

> Is there a pointer with information about this silicon erratum somewhere?

I don't know. Maybe Andrew or the folks from SiFive know.

Note: I still have not had time to investigate other combinations of the PMP settings and PTE settings. I suspect the erratum is not limited to gigapage usage.

Regards,
Bin

S-mode software needs a way to know memory used by SBI firmware so that it can correctly mark such memory as reserved. Related discussion: riscv-software-src/opensbi#103 Signed-off-by: Bin Meng <bmeng.cn@gmail.com>

lbmeng closed this as completed Apr 22, 2019

lbmeng mentioned this issue Mar 7, 2020

Introduce Physical Memory Protection Extension riscv-non-isa/riscv-sbi-doc#37

Closed
