Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression: RiscV FPU regs not saved in multithreaded applications #54205

Closed
pottendo opened this issue Jan 29, 2023 · 8 comments
Closed

Regression: RiscV FPU regs not saved in multithreaded applications #54205

pottendo opened this issue Jan 29, 2023 · 8 comments
Assignees
Labels
area: RISCV RISCV Architecture (32-bit & 64-bit) bug The issue is a bug, or the PR is fixing a bug priority: low Low impact/importance bug

Comments

@pottendo
Copy link

hi,
With reference to this thread I found a regression in a test which was working in 3.2.99:
FPU registers aren't saved in this multithreaded FPU-intense application.

Find the test in my tree here

the board is a FPGA (Lattice ECP5) loaded with a 'VexRiscV-SMP' cpu all assembled with the litex framework.
See here and here
The output CPU is supposed to feature an ISA: rv32i2p0_mafdc
The FPU is working, as the program is 100x faster than building the application without FPU instructions.

Zephyr config:

select CPU_HAS_FPU_DOUBLE_PRECISION
select REQUIRES_FULL_LIBCPP
select THREAD_NAME
select FPU_SHARING

and many more you find here: https://github.com/pottendo/zephyr-pottendo/blob/main/boards/riscv/orangecart/orangecart_fpu_defconfig
I've tried to select RISCV_ISA_EXT_C but this didn't make a difference.

The test-program produces part of a mandelbrot set on the console:
Bad case:
image

Good case - built without FPU instructions:
image

My program uses k_float_enable(k_current_get(), 0) to enable the FPU reg-saving; this was necessary in previous versions (3.2.99); from the code now in (3.3-rc1) I see that this may not be necessary any more for RiscV CPUs.

Debugging into arch/riscv/core/fpu.c I found that void z_riscv_fpu_trap(z_arch_esf_t *esf) is never being executed. It seems that the FPU instructions used by this CPU core won't trap as needed.

I use the SDK: zephyr-sdk-0.15.2 on an Ubuntu 22.10 host environment.

$ ./riscv64-zephyr-elf-gcc -v
Using built-in specs.
COLLECT_GCC=./riscv64-zephyr-elf-gcc
COLLECT_LTO_WRAPPER=/opt/zephyr-sdk-0.15.2/riscv64-zephyr-elf/bin/../libexec/gcc/riscv64-zephyr-elf/12.1.0/lto-wrapper
Target: riscv64-zephyr-elf
Configured with: /__w/_temp/workspace/build/.build/riscv64-zephyr-elf/src/gcc/configure --build=x86_64-build_pc-linux-gnu --host=x86_64-build_pc-linux-gnu --target=riscv64-zephyr-elf --prefix=/__w/_temp/workspace/output/riscv64-zephyr-elf --exec_prefix=/__w/_temp/workspace/output/riscv64-zephyr-elf --with-local-prefix=/__w/_temp/workspace/output/riscv64-zephyr-elf/riscv64-zephyr-elf --with-headers=/__w/_temp/workspace/output/riscv64-zephyr-elf/riscv64-zephyr-elf/include --with-newlib --enable-threads=no --disable-shared --with-arch=rv32ima_zicsr_zifencei --with-abi=ilp32 --with-pkgversion='Zephyr SDK 0.15.2' --with-bugurl=https://github.com/zephyrproject-rtos/sdk-ng/issues --enable-__cxa_atexit --disable-libgomp --disable-libmudflap --disable-libmpx --disable-libssp --disable-libquadmath --disable-libquadmath-support --disable-libstdcxx-verbose --with-gmp=/__w/_temp/workspace/build/.build/riscv64-zephyr-elf/buildtools --with-mpfr=/__w/_temp/workspace/build/.build/riscv64-zephyr-elf/buildtools --with-mpc=/__w/_temp/workspace/build/.build/riscv64-zephyr-elf/buildtools --with-isl=/__w/_temp/workspace/build/.build/riscv64-zephyr-elf/buildtools --enable-lto --enable-target-optspace --disable-nls --enable-multiarch --enable-languages=c,c++ --with-gnu-ld --with-gnu-as --enable-initfini-array
Thread model: single
Supported LTO compression algorithms: zlib
gcc version 12.1.0 (Zephyr SDK 0.15.2) 

The test-program should be working on other suitable boards, however the bug may be specific to VexRiscV-smp CPUs.

thanks for taking care,
pottendo

PS: Big thanks for the entire Zephyr system and the community - a great project! Keep the momentum!

@pottendo pottendo added the bug The issue is a bug, or the PR is fixing a bug label Jan 29, 2023
@henrikbrixandersen henrikbrixandersen added the area: RISCV RISCV Architecture (32-bit & 64-bit) label Jan 30, 2023
@stephanosio stephanosio added the priority: high High impact/importance bug label Jan 31, 2023
@stephanosio
Copy link
Member

cc @fkokosinski @kgugala

@npitre
Copy link
Collaborator

npitre commented Jan 31, 2023

Please re-test using the latest main branch making sure you have
commit 83f849e ("riscv: FPU trap: catch CSR access to fcsr, frm
and fflags") in your build.

Also the latest main branch contains a new test:
tests/arch/riscv/fpu_sharing/
Please execute that test on your target and report the result.

@pottendo
Copy link
Author

hi,
sorry, that I've posted to the wrong thread.
Yes, I've merged to latest main branch and executed the test - here (repost after re-running):

--============= Liftoff! ===============--
*** Booting Zephyr OS build v3.3.0-rc1-51-ga359d18a496f ***
Running TESTSUITE riscv_fpu_sharing
===================================================================
START - test_basics

    Assertion failed at WEST_TOPDIR-pottendo/tests/arch/riscv/fpu_sharing/src/main.c:52: riscv_fpu_sharing_test_basics: fpu_is_off() is false

 FAIL - test_basics in 0.005 seconds
===================================================================
START - test_fp_insn_trap

    Assertion failed at WEST_TOPDIR-pottendo/tests/arch/riscv/fpu_sharing/src/main.c:370: riscv_fpu_sharing_test_fp_insn_trap: fpu_is_off() is false

 FAIL - test_fp_insn_trap in 0.006 seconds
===================================================================
START - test_multi_thread_interaction

    Assertion failed at WEST_TOPDIR-pottendo/tests/arch/riscv/fpu_sharing/src/main.c:90: new_thread_check: (fpu_is_clean() is false)
FPU not clean after read
 FAIL - test_multi_thread_interaction in 0.007 seconds
===================================================================
START - test_thread_vs_exc_interaction

    Assertion failed at WEST_TOPDIR-pottendo/tests/arch/riscv/fpu_sharing/src/main.c:302: riscv_fpu_sharing_test_thread_vs_exc_interaction: fpu_is_dirty() is false

 FAIL - test_thread_vs_exc_interaction in 0.006 seconds
===================================================================
TESTSUITE riscv_fpu_sharing failed.

------ TESTSUITE SUMMARY START ------

SUITE FAIL -   0.00% [riscv_fpu_sharing]: pass = 0, fail = 4, skip = 0, total = 4 duration = 0.024 seconds
 - FAIL - [riscv_fpu_sharing.test_basics] duration = 0.005 seconds
 - FAIL - [riscv_fpu_sharing.test_fp_insn_trap] duration = 0.006 seconds
 - FAIL - [riscv_fpu_sharing.test_multi_thread_interaction] duration = 0.007 seconds
 - FAIL - [riscv_fpu_sharing.test_thread_vs_exc_interaction] duration = 0.006 seconds

------ TESTSUITE SUMMARY END ------

===================================================================
PROJECT EXECUTION FAILED

As said, if I can further support, let me know how - if this is board and/or CPU specific and therefore cannot be further investigated, also fine for me. I can revert to the old implementation and can try to find the reason of my port-flaw on my own.

Again, I very much appreciate the work here - great OS, nice to hack around for me.

bye, pottendo

@npitre
Copy link
Collaborator

npitre commented Jan 31, 2023

What RISC-V implementation is this?

Are the MSTATUS_FS bits properly implemented?

@pottendo
Copy link
Author

pottendo commented Jan 31, 2023 via email

@npitre
Copy link
Collaborator

npitre commented Jan 31, 2023

Alright. Since you do have access to the CPU implementation source code,
I think it would be far better for you to figure out how to support the
MSTATUS_FS bits in your project.

I don't pretend to understand anything that I'm seeing in
https://github.com/SpinalHDL/VexRiscv however there are those lines in
VexRiscv/src/main/scala/vexriscv/plugin/FpuPlugin.scala:

      val fs = Reg(Bits(2 bits)) init(1)
      val sd = fs === 3

      when(stages.last.arbitration.isFiring && stages.last.input(FPU_ENABLE) && stages.last.input(FPU_OPCODE) =/= FpuOpcode.STORE){
        fs := 3 //DIRTY
      }
      when(List(CSR.FRM, CSR.FCSR, CSR.FFLAGS).map(id => service.isWriting(id)).orR){
        fs := 3
      }

      service.rw(CSR.SSTATUS, 13, fs)
      service.rw(CSR.MSTATUS, 13, fs)

      service.r(CSR.SSTATUS, 31, sd)
      service.r(CSR.MSTATUS, 31, sd)

So the connection between the FPU and MSTATUS_FS + CSR.MSTATUS_SD appears
to be there somehow already.

@npitre npitre added priority: low Low impact/importance bug and removed priority: high High impact/importance bug labels Jan 31, 2023
@pottendo
Copy link
Author

pottendo commented Jan 31, 2023 via email

@npitre
Copy link
Collaborator

npitre commented Feb 8, 2023

The VexRiscv core design was updated and confirmed to work here:

SpinalHDL/VexRiscv#297 (comment)

@npitre npitre closed this as completed Feb 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: RISCV RISCV Architecture (32-bit & 64-bit) bug The issue is a bug, or the PR is fixing a bug priority: low Low impact/importance bug
Projects
None yet
Development

No branches or pull requests

4 participants