Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Various RISC-V FPU context switching fixes #54207

Merged
merged 4 commits into from Jan 30, 2023

Conversation

npitre
Copy link
Collaborator

@npitre npitre commented Jan 30, 2023

Fix various issues with the new FPU context switching code on RISC-V
and provide a comprehensive test for that code.

Fixed issues are:

  • IRQ state for the interrupted context corresponds to the PIE bit not
    the IE bit.

  • Restoring the saved FPU state should clear the entire field before
    or'ing wanted bits in.

  • Properly trap access to the fcsr, frm and fflags registers.

  • Work around platforms that don't capture faulting instructions in mtval.
    This is notably the case on QEMU with the previously mentioned registers.

Fixes #54208

Nicolas Pitre added 3 commits January 30, 2023 00:50
- IRQ state for the interrupted context corresponds to the PIE bit not
  the IE bit.

- Restoring the saved FPU state should clear the entire field before
  or'ing wanted bits in.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
The FRCSR, FSCSR, FRRM, FSRM, FSRMI, FRFLAGS, FSFLAGS and FSFLAGSI
are in fact CSR instructions targeting the fcsr, frm and fflags
registers. They should be caught as FPU instructions as well.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Some implementations may not capture the faulting instruction in mtval
and set it to zero when an illegal instruction fault is raised This is
notably the case with QEMU version 7.0.0 when a CSR instruction is
involved.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
The RISC-V FPU context switching code is intricate and sometimes subtle.
Here's a test that exercizes various code paths to ensure they work as
intended, and to confirm that the target hardware does behave as
expected too.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
@nordicjm nordicjm changed the title Various RUSC-V FPU context switching fixes Various RISC-V FPU context switching fixes Jan 30, 2023
@fabiobaltieri fabiobaltieri merged commit 373f8ac into zephyrproject-rtos:main Jan 30, 2023
@pottendo
Copy link

Hi fellow hackers,
seeing some improvements merged, I gave those a try. My testcase still fails, so I tried the new test-case. There seems to be something seriously wrong with my port:

--=============== SoC ==================--
CPU:		VexRiscv SMP-LINUX @ 80MHz
BUS:		WISHBONE 32-bit @ 4GiB
CSR:		32-bit data
ROM:		48KiB
SRAM:		16KiB
MAIN-RAM:	16384KiB 

--========== Initialization ============--
Memtest at 0x40000000 (2.0MiB)...
  Write: 0x40000000-0x40200000 2.0MiB
   Read: 0x40000000-0x40200000 2.0MiB
Memtest OK
Memspeed at 0x40000000 (Sequential, 2.0MiB)...
  Write speed: 21.8MiB/s
   Read speed: 13.0MiB/s

--============== Boot ==================--
Booting from serial...
Press Q or ESC to abort boot completely.
sL5DdSMmkekro
[LITEX-TERM] Received firmware download request from the device.
[LITEX-TERM] Uploading build/zephyr/zephyr.bin to 0x40000000 (100092 bytes)...
[LITEX-TERM] Upload calibration... (inter-frame: 10.00us, length: 64)
[LITEX-TERM] Upload complete (58.6KB/s).
[LITEX-TERM] Booting the device.
[LITEX-TERM] Done.
Executing booted program at 0x40000000

--============= Liftoff! ===============--
*** Booting Zephyr OS build v3.3.0-rc1-51-ga359d18a496f ***
Running TESTSUITE riscv_fpu_sharing
===================================================================
START - test_basics

    Assertion failed at WEST_TOPDIR-pottendo/tests/arch/riscv/fpu_sharing/src/main.c:52: riscv_fpu_sharing_test_basics: fpu_is_off() is false

 FAIL - test_basics in 0.005 seconds
===================================================================
START - test_fp_insn_trap

    Assertion failed at WEST_TOPDIR-pottendo/tests/arch/riscv/fpu_sharing/src/main.c:370: riscv_fpu_sharing_test_fp_insn_trap: fpu_is_off() is false

 FAIL - test_fp_insn_trap in 0.005 seconds
===================================================================
START - test_multi_thread_interaction

    Assertion failed at WEST_TOPDIR-pottendo/tests/arch/riscv/fpu_sharing/src/main.c:90: new_thread_check: (fpu_is_clean() is false)
FPU not clean after read
 FAIL - test_multi_thread_interaction in 0.006 seconds
===================================================================
START - test_thread_vs_exc_interaction

    Assertion failed at WEST_TOPDIR-pottendo/tests/arch/riscv/fpu_sharing/src/main.c:302: riscv_fpu_sharing_test_thread_vs_exc_interaction: fpu_is_dirty() is false

 FAIL - test_thread_vs_exc_interaction in 0.006 seconds
===================================================================
TESTSUITE riscv_fpu_sharing failed.

------ TESTSUITE SUMMARY START ------

SUITE FAIL -   0.00% [riscv_fpu_sharing]: pass = 0, fail = 4, skip = 0, total = 4 duration = 0.022 seconds
 - FAIL - [riscv_fpu_sharing.test_basics] duration = 0.005 seconds
 - FAIL - [riscv_fpu_sharing.test_fp_insn_trap] duration = 0.005 seconds
 - FAIL - [riscv_fpu_sharing.test_multi_thread_interaction] duration = 0.006 seconds
 - FAIL - [riscv_fpu_sharing.test_thread_vs_exc_interaction] duration = 0.006 seconds

------ TESTSUITE SUMMARY END ------

===================================================================
PROJECT EXECUTION FAILED

It seems that the mstatus is not behaving as expected, or the respective bits are organized differently on this CPU (VexRiscV-smp).
I did the Zephyr port to this board, so maybe I missed some specifics for FPU: See here for the dts + board configs.
The board `orangecart_fpu' is the one which uses the FPU.
Let me know if I can support further.

bye, pottendo

@npitre
Copy link
Collaborator Author

npitre commented Jan 31, 2023 via email

@carlocaione
Copy link
Collaborator

also it is really not nice to maintainers and project asking for support on downstream only boards and ports.

@pottendo
Copy link

also it is really not nice to maintainers and project asking for support on downstream only boards and ports.

Hi,
I apologize if I violated some rules; not sure what I should have done differently though.
My intention was to report a potential regression maybe hitting others as well. For me it wasn't obvious that this is board and/or port specific.
regards, pottendo

@npitre npitre deleted the rvfpufix branch February 20, 2023 21:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: RISCV RISCV Architecture (32-bit & 64-bit)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Various RISC-V FPU context switching issues
7 participants