sbi_trap_error: hart2: trap handler failed (error -5) #65

Closed
andreas-schwab opened this issue Feb 18, 2019 · 33 comments

@andreas-schwab

sbi_trap_error: hart2: trap handler failed (error -5)
sbi_trap_error: hart2: mcause=0xd mtval=0x155574e000
sbi_trap_error: hart2: mepc=0x80003dba mstatus=0x8000000a00027822
sbi_trap_error: hart2: ra=0x80001eaa sp=0x80011d60
sbi_trap_error: hart2: gp=0x2aaaccdd18 tp=0x155573cb30
sbi_trap_error: hart2: s0=0x80011dc0 s1=0x5e
sbi_trap_error: hart2: a0=0x20000 a1=0x80011df0
sbi_trap_error: hart2: a2=0x15555eee96 a3=0x8000000a00006022
sbi_trap_error: hart2: a4=0x80011d6b a5=0x155574e001
sbi_trap_error: hart2: a6=0x0 a7=0x155574e001
sbi_trap_error: hart2: s2=0xee s3=0x3
sbi_trap_error: hart2: s4=0x96 s5=0x20
sbi_trap_error: hart2: s6=0x0 s7=0x15555eee96
sbi_trap_error: hart2: s8=0x74 s9=0x14774
sbi_trap_error: hart2: s10=0x58782b5b s11=0x34fe3e52
sbi_trap_error: hart2: t0=0x4 t1=0x155574dffd
sbi_trap_error: hart2: t2=0x55 t3=0x85a303
sbi_trap_error: hart2: t4=0x0 t5=0x0
sbi_trap_error: hart2: t6=0x15

The kernel was booted via u-boot.

@atishp04
Collaborator

Do you have the U-Boot SMP patches included?
Without them, you should provide FU540_ENABLED_HART_MASK=0x02 as a compile-time option.

Refer: https://github.com/riscv/opensbi/blob/master/docs/platform/sifive_fu540.md

@andreas-schwab
Author

Yes, of course.

@andreas-schwab
Author

I'm also often seeing trap errors while opensbi is still booting.

sbi_trap_error: hart1: misaligned store handler failed (error -10)

OpenSBI v0.1 (Feb 14 2019 12:10:27)
sbi_trap_error: hart1: mcause=0x6 mtval=0xfffffffffffffff9


   ____                    _____ ____ _____
  / __ \                  / ____|  _ \_   _|
 | |  | |_ __   ___ _ __ | (___ | |_) || |
 | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
 | |__| | |_) |  __/ | | |____) | |_) || |_
  \____/| .__/ \___|_| |_|_____/|____/_____|
        | |
        |_|

sbi_trap_error: hart1: mepc=0x80005bae mstatus=0xa00001800
Platform Name : SiFive Freedom U540
sbi_trap_error: hart1: ra=0x8000674a sp=0xffffffffffffffd9
Platform HART Features : RV64ACDFIMSU
sbi_trap_error: hart1: gp=0x8006e78 tp=0x80013f00
Platform Max HARTs : 5
sbi_trap_error: hart1: s0=0x80013e90 s1=0x8000b020
Current Hart : 2
sbi_trap_error: hart1: a0=0x8000b020 a1=0x8000b021
Firmware Base : 0x80000000
sbi_trap_error: hart1: a2=0x5 a3=0x0
Firmware Size : 88 KB
sbi_trap_error: hart1: a4=0x8000b736 a5=0xa
Runtime SBI Version : 0.1
sbi_trap_error: hart1: a6=0x8000a000 a7=0x1

sbi_trap_error: hart1: s2=0x2 s3=0x8000b018
PMP0: 0x0000000080000000-0x000000008001ffff (Asbi_trap_error: hart1: s4=0xfffffffffffffffd s5=0x0
)
sbi_trap_error: hart1: s6=0x1 s7=0x5
PMP1: 0x0000000000000000-0x0000007fffffffff (Asbi_trap_error: hart1: s8=0x2000 s9=0x0
,Rsbi_trap_error: hart1: s10=0x0 s11=0x0
,Wsbi_trap_error: hart1: t0=0xa000 t1=0x4
,Xsbi_trap_error: hart1: t2=0x0 t3=0x0
)
sbi_trap_error: hart1: t4=0x0 t5=0x0
sbi_trap_error: hart1: t6=0x0
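For readers following the dumps above, the mcause values can be decoded with the exception-code table from the RISC-V privileged specification. The helper below is only a reading aid; the function name `mcause_exception_name` is made up for this sketch and is not part of OpenSBI.

```c
#include <stdint.h>

/*
 * Map an RV64 mcause value to the exception name given in the RISC-V
 * privileged specification. Hypothetical helper, not an OpenSBI API.
 */
const char *mcause_exception_name(uint64_t mcause)
{
    /* The top bit distinguishes interrupts from synchronous exceptions. */
    if (mcause >> 63)
        return "interrupt";

    switch (mcause) {
    case 0:  return "instruction address misaligned";
    case 1:  return "instruction access fault";
    case 2:  return "illegal instruction";
    case 3:  return "breakpoint";
    case 4:  return "load address misaligned";
    case 5:  return "load access fault";
    case 6:  return "store/AMO address misaligned";
    case 7:  return "store/AMO access fault";
    case 8:  return "environment call from U-mode";
    case 9:  return "environment call from S-mode";
    case 11: return "environment call from M-mode";
    case 12: return "instruction page fault";
    case 13: return "load page fault";
    case 15: return "store/AMO page fault";
    default: return "reserved or unknown";
    }
}
```

With this table, mcause=0xd in the first dump reads as a load page fault, and mcause=0x6 in the boot-time dump reads as a store/AMO address misaligned exception.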

@avpatel
Collaborator

avpatel commented Feb 19, 2019

That's strange. Atish and I don't see this issue.

Can you try the pre-built riscv64 toolchain from Bootlin at
https://toolchains.bootlin.com/ ?

Some of us use this toolchain for cross-compilation.

@andreas-schwab
Author

I'm not cross compiling. You can try out the packages from https://download.opensuse.org/ports/riscv/tumbleweed/repo/oss/.

@avpatel
Collaborator

avpatel commented Feb 20, 2019

Do you have steps for setting up a SUSE RISC-V rootfs with a native compiler?

We can try at our end.

@andreas-schwab
Author

There are rootfs tarballs available in https://download.opensuse.org/ports/riscv/tumbleweed/images. The JeOS-devel image has a compiler pre-installed.

@atishp04
Collaborator

I tried with a Fedora rootfs and did not see any issue. I will try the openSUSE one tomorrow and report back.

@atishp04
Collaborator

atishp04 commented Feb 22, 2019

I did not see any issue with the native toolchain on openSUSE. This is what I did:

  1. Download the rootfs tarball.
    https://download.opensuse.org/ports/riscv/tumbleweed/images/openSUSE-Tumbleweed-RISC-V-JeOS.riscv64-rootfs.riscv64-2019.02.18-Build32.4.tar.xz

  2. Boot into Fedora and chroot into the openSUSE rootfs.

  3. Install gcc and make.

  4. Check the gcc version:

# gcc --version
gcc (SUSE Linux) 8.2.1 20190204 [gcc-8-branch revision 268513]
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

  5. Recompile OpenSBI on the board and copy fw_payload.bin to the correct partition of the SD card.

Can you please try with the above rootfs or gcc toolchain and OpenSBI master?

@lukasauer
Contributor

Did you use the reset button on the HiFive Unleashed board to reset the system?
I occasionally see the same errors, but only if I use the reset button instead of the power button to reset the board.

I am also seeing some other strange behavior when using the reset button. Quite often, OpenSBI will pass an unmodified device tree to the payload. Without the Linux SMP patches, this causes the kernel boot to hang as it waits for hart 0 to come up.

@andreas-schwab
Author

With a warm reset it often hangs inside opensbi. That looks like something is using uninitialized memory.

@andreas-schwab
Author

sbi_trap_error: hart3: trap handler failed (error -5)
sbi_trap_error: hart3: mcause=0x000000000000000d mtval=0x00000015557a1000
sbi_trap_error: hart3: mepc=0x0000000080003e58 mstatus=0x8000000a00027822
sbi_trap_error: hart3: ra=0x0000000080001f48 sp=0x000000008000fd60
sbi_trap_error: hart3: gp=0x0000002aaaccdd18 tp=0x000000155573cb30
sbi_trap_error: hart3: s0=0x000000008000fdc0 s1=0x000000000000005e
sbi_trap_error: hart3: a0=0x0000000000020000 a1=0x000000008000fdf0
sbi_trap_error: hart3: a2=0x00000015555eee96 a3=0x8000000a00006022
sbi_trap_error: hart3: a4=0x000000008000fd6b a5=0x00000015557a1001
sbi_trap_error: hart3: a6=0x0000000000000000 a7=0x00000015557a1001
sbi_trap_error: hart3: s2=0x00000000000000ee s3=0x0000000000000003
sbi_trap_error: hart3: s4=0x0000000000000096 s5=0x0000000000000020
sbi_trap_error: hart3: s6=0x0000000000000000 s7=0x00000015555eee96
sbi_trap_error: hart3: s8=0x0000000000000074 s9=0x0000000000014774
sbi_trap_error: hart3: s10=0x0000000058782b5b s11=0x0000000034fe3e52
sbi_trap_error: hart3: t0=0x0000000000000004 t1=0x00000015557a0ffd
sbi_trap_error: hart3: t2=0x0000000000000055 t3=0x000000000085a303
sbi_trap_error: hart3: t4=0x0000000000000000 t5=0x0000000000000000
sbi_trap_error: hart3: t6=0x0000000000000015

mepc=0x0000000080003e58 is at load_u8 in sbi_misaligned_load_handler. Shouldn't the trap handler first verify that the address is actually valid?
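The question above can be illustrated with a user-space model: instead of letting a byte load trap inside the handler, each byte access reports failure and the emulation returns the faulting address to its caller, which could then forward a page fault to the lower privilege level. This is a sketch under stated assumptions, not the patch that was later merged. The names `checked_load_u8`, `emulate_misaligned_load`, and the one-page `mapped_page` array are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u

/* Hypothetical stand-in for guest memory: only addresses below PAGE_SIZE
 * are "mapped"; everything above faults. */
static uint8_t mapped_page[PAGE_SIZE];

/* Load one byte. Returns 0 on success, or -1 with *fault_addr set when the
 * address is not mapped (where real hardware would raise a page fault). */
int checked_load_u8(uint64_t addr, uint8_t *out, uint64_t *fault_addr)
{
    if (addr >= PAGE_SIZE) {
        *fault_addr = addr;
        return -1;
    }
    *out = mapped_page[addr];
    return 0;
}

/* Emulate a little-endian load of len bytes (len <= 8) one byte at a time.
 * On a fault, stop and report the faulting address rather than failing the
 * whole handler. */
int emulate_misaligned_load(uint64_t addr, size_t len,
                            uint64_t *val, uint64_t *fault_addr)
{
    uint64_t result = 0;

    for (size_t i = 0; i < len; i++) {
        uint8_t byte;

        if (checked_load_u8(addr + i, &byte, fault_addr) != 0)
            return -1;
        result |= (uint64_t)byte << (8 * i);
    }
    *val = result;
    return 0;
}
```

In this model a 4-byte load at offset 4093 returns -1 with the fault address set to 4096, the first byte of the unmapped page, which mirrors the page-aligned mtval in the dump above.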

@andreas-schwab
Author

If the same kernel (modulo built-in initrd) is booted via bbl the issue does not occur.

@atishp04
Collaborator

@lukasauer : Yes, I am able to reproduce your use case with the reset button. I have always used an external power reset instead of the reset button. That's why I never saw this issue.

This is what I am seeing.

  1. If U-Boot is given as the payload, a reset results in a hang in OpenSBI.
  2. If a kernel image is given as the direct payload, an invalid DT is passed to the kernel, which results in a kernel hang.

I am looking into this.

@atishp04
Collaborator

@andreas-schwab : Do you see all these issues only when the reset button is used?

@andreas-schwab
Author

The initial issue also happens after a cold boot.

@atishp04
Collaborator

atishp04 commented Feb 26, 2019

@andreas-schwab I tried the openSUSE toolchain and couldn't reproduce the issue with a cold boot.
Can you please try with the versions I mentioned and confirm?

I am looking into the warm boot issue.

@andreas-schwab
Author

Try running git fetch over NFS.

@avpatel avpatel mentioned this issue Mar 6, 2019
@atishp04
Collaborator

atishp04 commented Mar 7, 2019

@andreas-schwab @lukasauer : Can you please try PR #84 on top of master?

Please save the objdump of OpenSBI in case you still see the exception.

@andreas-schwab
Author

Doesn't change anything.

@atishp04
Collaborator

atishp04 commented Mar 7, 2019

Well, it changed for us. We are not able to reproduce the issue on our end, at least with the master branch.

SiFive folks also suggested that the reset button on the HiFive Unleashed has some issues and is known to behave unexpectedly at times. It is better to use the power reset button (which just works) if the warm reset button still shows problems that we don't see.

@andreas-schwab
Author

This issue has nothing to do with resetting.

@andreas-schwab
Author

sbi_trap_error: hart2: trap handler failed (error -5)
sbi_trap_error: hart2: mcause=0x000000000000000d mtval=0xfffffff374736574
sbi_trap_error: hart2: mepc=0x00000000800043a8 mstatus=0x8000000a00027822
sbi_trap_error: hart2: ra=0x00000000800020f8 sp=0x0000000080011ca0
sbi_trap_error: hart2: gp=0x0000000000013800 tp=0x000000155591a620
sbi_trap_error: hart2: s0=0x0000000080011cc0 s1=0x0000000000000004
sbi_trap_error: hart2: a0=0x0000000080011ca8 a1=0x0000000080011cf0
sbi_trap_error: hart2: a2=0x8000000a00006022 a3=0x0000000000020000
sbi_trap_error: hart2: a4=0x0000000080011ca8 a5=0xfffffff374736575
sbi_trap_error: hart2: a6=0xfffffff37473657c a7=0xfffffff374736574
sbi_trap_error: hart2: s2=0x0000000080011e00 s3=0x0000000080011cf0
sbi_trap_error: hart2: s4=0x0000000000000002 s5=0x000000000000003f
sbi_trap_error: hart2: s6=0x0000000000000001 s7=0x0000000000029680
sbi_trap_error: hart2: s8=0x0000000000010f18 s9=0x0000003fffdeea50
sbi_trap_error: hart2: s10=0x0000000000010f85 s11=0x0000000000010f86
sbi_trap_error: hart2: t0=0x8000000a00006022 t1=0x0000000000000780
sbi_trap_error: hart2: t2=0x0000001555737390 t3=0x0000000000000000
sbi_trap_error: hart2: t4=0x0000000000000008 t5=0x0000000000000000
sbi_trap_error: hart2: t6=0x0000000000000000

@mickflemm
Contributor

#114

@andreas-schwab
Author

Looks good.

@andreas-schwab
Author

No, it is still broken.

sbi_trap_error: hart4: trap handler failed (error -5)
sbi_trap_error: hart4: mcause=0x000000000000000d mtval=0x0000001555a45000
sbi_trap_error: hart4: mepc=0x000000008000474a mstatus=0x8000000a00027822
sbi_trap_error: hart4: ra=0x00000000800021ca sp=0x000000008000eca0
sbi_trap_error: hart4: gp=0x0000002aaacd9df8 tp=0x0000001555741b30
sbi_trap_error: hart4: s0=0x000000008000ecc0 s1=0x000000008000ecf0
sbi_trap_error: hart4: a0=0x000000008000eca8 a1=0x000000008000ecf0
sbi_trap_error: hart4: a2=0x8000000a00006022 a3=0x0000000000020000
sbi_trap_error: hart4: a4=0x000000008000ecab a5=0x0000001555a45001
sbi_trap_error: hart4: a6=0x0000001555a45001 a7=0x0000001555a44ffd
sbi_trap_error: hart4: s2=0x0000000000000004 s3=0x000000008000ee00
sbi_trap_error: hart4: s4=0x0000000000000004 s5=0x0000003fffff9014
sbi_trap_error: hart4: s6=0x0000003fffff9ba8 s7=0x0000001555a35000
sbi_trap_error: hart4: s8=0x0000000000001000 s9=0x00000000000120eb
sbi_trap_error: hart4: s10=0x000000007d3f4c4b s11=0x000000001900ed5a
sbi_trap_error: hart4: t0=0x8000000a00006022 t1=0x000000000085a303
sbi_trap_error: hart4: t2=0xffffffffa7e98960 t3=0x0000000000000003
sbi_trap_error: hart4: t4=0x0000000000000004 t5=0x0000000000000020
sbi_trap_error: hart4: t6=0x0000000000000000

@andreas-schwab
Author

sbi_trap_error: hart4: trap handler failed (error -5)
sbi_trap_error: hart4: mcause=0x000000000000000d mtval=0x00000015557a6000
sbi_trap_error: hart4: mepc=0x00000000800047f2 mstatus=0x8000000a00027822
sbi_trap_error: hart4: ra=0x0000000080002272 sp=0x000000008000eca0
sbi_trap_error: hart4: gp=0x0000002aaacd9df8 tp=0x0000001555741b30
sbi_trap_error: hart4: s0=0x000000008000ecc0 s1=0x000000008000ecf0
sbi_trap_error: hart4: a0=0x000000008000eca8 a1=0x000000008000ecf0
sbi_trap_error: hart4: a2=0x8000000a00006022 a3=0x0000000000020000
sbi_trap_error: hart4: a4=0x000000008000ecab a5=0x00000015557a6001
sbi_trap_error: hart4: a6=0x00000015557a6001 a7=0x00000015557a5ffd
sbi_trap_error: hart4: s2=0x0000000000000004 s3=0x000000008000ee00
sbi_trap_error: hart4: s4=0x0000000000000004 s5=0x0000003fffbe0cf4
sbi_trap_error: hart4: s6=0x0000003fffbe1888 s7=0x0000001555796000
sbi_trap_error: hart4: s8=0x0000000000001000 s9=0x000000000001545f
sbi_trap_error: hart4: s10=0x000000004ae86add s11=0x0000000071e75d65
sbi_trap_error: hart4: t0=0x8000000a00006022 t1=0x000000000085a303
sbi_trap_error: hart4: t2=0x000000005d0d5ba0 t3=0x0000000000000003
sbi_trap_error: hart4: t4=0x0000000000000004 t5=0x0000000000000020
sbi_trap_error: hart4: t6=0x0000000000000000

    for (i = 0; i < len; i++)
            val.data_bytes[i] = load_u8((void *)(addr + i));
800047e2:   41178733                sub     a4,a5,a7
800047e6:   fe840513                addi    a0,s0,-24
800047ea:   972a                    add     a4,a4,a0
800047ec:   0785                    addi    a5,a5,1
800047ee:   3006a673                csrrs   a2,mstatus,a3
800047f2:   fff7c503                lbu     a0,-1(a5) # dfff <_fw_start-0x7fff2001>
800047f6:   30061073                csrw    mstatus,a2
800047fa:   00a70023                sb      a0,0(a4) # 2000 <_fw_start-0x7fffe000>  
    for (i = 0; i < len; i++)
800047fe:   fef812e3                bne     a6,a5,800047e2 <sbi_misaligned_load_handler+0x12c>
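The `csrrs a2,mstatus,a3` in the loop above sets the bits held in a3, and the dump shows a3=0x20000, which is the MPRV bit position in mstatus. As a reading aid, the sketch below splits an RV64 mstatus value into a few fields using the bit positions from the RISC-V privileged specification. The struct and function names are made up for this example.

```c
#include <stdint.h>

/* Selected RV64 mstatus fields; bit positions follow the RISC-V
 * privileged specification. Hypothetical helper types, not OpenSBI's. */
struct mstatus_fields {
    unsigned sie;   /* bit 1 */
    unsigned mie;   /* bit 3 */
    unsigned spie;  /* bit 5 */
    unsigned mpie;  /* bit 7 */
    unsigned mpp;   /* bits 12:11, previous privilege mode */
    unsigned fs;    /* bits 14:13, floating-point state */
    unsigned mprv;  /* bit 17, modify privilege for loads and stores */
    unsigned uxl;   /* bits 33:32 */
    unsigned sxl;   /* bits 35:34 */
    unsigned sd;    /* bit 63 */
};

/* Extract `width` bits of `v` starting at bit `lo`. */
static unsigned extract_bits(uint64_t v, unsigned lo, unsigned width)
{
    return (unsigned)((v >> lo) & ((UINT64_C(1) << width) - 1));
}

struct mstatus_fields decode_mstatus(uint64_t m)
{
    struct mstatus_fields f;

    f.sie  = extract_bits(m, 1, 1);
    f.mie  = extract_bits(m, 3, 1);
    f.spie = extract_bits(m, 5, 1);
    f.mpie = extract_bits(m, 7, 1);
    f.mpp  = extract_bits(m, 11, 2);
    f.fs   = extract_bits(m, 13, 2);
    f.mprv = extract_bits(m, 17, 1);
    f.uxl  = extract_bits(m, 32, 2);
    f.sxl  = extract_bits(m, 34, 2);
    f.sd   = extract_bits(m, 63, 1);
    return f;
}
```

Decoding the dumped mstatus=0x8000000a00027822 gives MPRV=1 and MPP=3, while the value in a2, 0x8000000a00006022, gives MPRV=0 and MPP=0. Since `csrrs` returns the old CSR value, a2 is mstatus as it was just before MPRV was set for the byte load.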

@andreas-schwab
Author

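#include <sys/mman.h>

int
main (void)
{
  /* Map two anonymous read-only pages.  */
  char *p = mmap (0, 8192, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED)
    return 1;
  /* 4-byte load starting 3 bytes before the page boundary: it is
     misaligned and spans both pages.  */
  return *(unsigned int *) (p + 4096 - 3);
}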
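The arithmetic behind this reproducer can be checked against the last register dump. A 4-byte load that starts 3 bytes before a page boundary puts its final byte on the following page. The helpers below are a small sketch assuming 4 KiB pages; the names `bytes_on_next_page` and `next_page_base` are invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE UINT64_C(4096)

/* Number of bytes of the access [addr, addr + len) that land on the page
 * after the one containing addr; 0 if the access does not cross a page.
 * Assumes len <= PAGE_SIZE. */
size_t bytes_on_next_page(uint64_t addr, size_t len)
{
    uint64_t room = PAGE_SIZE - (addr % PAGE_SIZE);

    return len > room ? (size_t)(len - room) : 0;
}

/* Base address of the page following the one that contains addr. */
uint64_t next_page_base(uint64_t addr)
{
    return (addr / PAGE_SIZE + 1) * PAGE_SIZE;
}
```

Taking a7=0x00000015557a5ffd from the dump as the start of the emulated load (an interpretation of the disassembly, where a7 appears to hold the base address), `bytes_on_next_page(0x15557a5ffd, 4)` is 1 and `next_page_base(0x15557a5ffd)` is 0x15557a6000, which matches the reported mtval.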

@avpatel
Collaborator

avpatel commented May 20, 2019

Thanks for the traces. It makes sense now.

I will soon send a patch to handle this case.

@avpatel
Collaborator

avpatel commented May 22, 2019

Andreas,

I tried your test code on QEMU but was not able to reproduce the issue. Could you try my latest patches at your end?

My latest patches are in the mprv_trap_v2 branch of https://github.com/avpatel/opensbi.git

Regards,
Anup

@andreas-schwab
Author

Looks good so far.

@avpatel
Collaborator

avpatel commented May 22, 2019

Thanks, I will wait one more day before merging this patch.

Meanwhile, can you provide a Tested-by for my OpenSBI patches?

Regards,
Anup

@avpatel
Collaborator

avpatel commented May 24, 2019

All required changes are merged in riscv/opensbi.

I am assuming you are not seeing this issue anymore.

If you see this issue again then please re-open.

Regards,
Anup

@avpatel avpatel closed this as completed May 24, 2019