RISC-V Vector Extension float32_t bugs/unsupported widening instructions #442

Open
OMaghiarIMG opened this issue Oct 12, 2023 · 8 comments
Labels
arch-riscv The RISC-V ISA bug sim-se gem5's Syscall Emulation Mode Stale

Comments

@OMaghiarIMG

Describe the bug
Hello, I tried running some RISC-V Vector (RVV) code in syscall emulation (SE) mode and ran into a couple of unsupported instructions/bugs.

Affects version
Develop branch 486916b

gem5 Modifications
No modifications.

To Reproduce

scons build/RISCV/gem5.fast -j8
$GEM5/build/RISCV/gem5.fast $GEM5/configs/example/gem5_library/riscv-se.py 

Script used for SE mode using RISCVMatchedBoard:

from gem5.components.boards.riscv_board import RiscvBoard
from gem5.prebuilt.riscvmatched.riscvmatched_board import RISCVMatchedBoard
from gem5.components.processors.cpu_types import CPUTypes
from gem5.isas import ISA
from gem5.utils.requires import requires
from gem5.resources.resource import obtain_resource, BinaryResource
from gem5.simulate.simulator import Simulator

# Run a check to ensure the right version of gem5 is being used.
requires(isa_required=ISA.RISCV)

# Setup the board.
board = RISCVMatchedBoard()
binary = BinaryResource("/path/to/gem5_example")
board.set_se_binary_workload(binary)

simulator = Simulator(board=board, full_system=False)
print("Beginning simulation!")
simulator.run()
print(
    f"Exiting @ tick {simulator.get_current_tick()} "
    f"because {simulator.get_last_exit_event_cause()}."
)

QEMU (v8.0) in user mode is run side by side to show the expected results.

1. Some widening instructions cause panic for LMUL=8:

The example below is a dot product built on vwredsum. Most widening instructions cannot be used with LMUL=8, since the result would have to be widened to LMUL=16, but some of them, such as vwredsum and vwmacc, should work correctly.

#include "riscv_vector.h"
#include "stdio.h"
#include "stdlib.h"
void rvv_dot_prod(int16_t *pSrcA, int16_t *pSrcB, uint32_t n, int64_t *result)
{
    size_t vl;
    vint16m4_t vSrcA, vSrcB;
    vint64m1_t vSum = __riscv_vmv_s_x_i64m1(0, 1);
    while (n > 0) {
        vl = __riscv_vsetvl_e16m4(n);
        vSrcA = __riscv_vle16_v_i16m4(pSrcA, vl);
        vSrcB = __riscv_vle16_v_i16m4(pSrcB, vl);
        vSum = __riscv_vwredsum_vs_i32m8_i64m1(__riscv_vwmul_vv_i32m8(vSrcA, vSrcB, vl), vSum, vl);
        pSrcA += vl;
        pSrcB += vl;
        n -= vl;
    }
    *result = __riscv_vmv_x_s_i64m1_i64(vSum);
}

void scalar_dot_prod(int16_t *pSrcA, int16_t *pSrcB, uint32_t n, int64_t *result)
{
    int64_t sum = 0;
    while (n > 0) {
        sum += ((int64_t)*pSrcA++ * *pSrcB++);
        n--;
    }
    *result = sum;
}

int main(int argc, char **argv)
{
    size_t vlen = 31;
    int16_t *input = (int16_t *)malloc(vlen * sizeof(int16_t));
    int16_t initial_value = 0;
    for (size_t i = 0; i < vlen; ++i) {
        input[i] = initial_value++;
    }
    int64_t scalar_result, vector_result;
    rvv_dot_prod(input, input, vlen, &vector_result);
    scalar_dot_prod(input, input, vlen, &scalar_result);
    printf("RVV: %ld\nScalar: %ld\n", vector_result, scalar_result);
    free(input);
    return 0;
}

$GEM5/build/RISCV/gem5.fast $GEM5/configs/example/gem5_library/riscv-se.py
gem5 Simulator System.  https://www.gem5.org
gem5 is copyrighted software; use the --copyright option for details.

gem5 version DEVELOP-FOR-23.1
gem5 compiled Oct 10 2023 08:30:21
gem5 started Oct 12 2023 10:20:16

Beginning simulation!
Global frequency set at 1000000000000 ticks per second
warn: No dot file generated. Please install pydot to generate the dot file and pdf.
src/arch/riscv/linux/se_workload.cc:60: warn: Unknown operating system; assuming Linux.
src/base/statistics.hh:279: warn: One of the stats is a legacy stat. Legacy stat is a stat that does not belong to any statistics::Group. Legacy stat is deprecated.
src/base/statistics.hh:279: warn: One of the stats is a legacy stat. Legacy stat is a stat that does not belong to any statistics::Group. Legacy stat is deprecated.
board.remote_gdb: Listening for connections on port 7000
src/sim/simulate.cc:194: info: Entering event queue @ 0.  Starting simulation...
build/RISCV/arch/riscv/generated/decoder-ns.hh.inc:40614: panic: panic condition vlmul == 3 occurred: LMUL=8 is illegal for widening inst
Memory Usage: 16917496 KBytes
Program aborted at tick 11200518
--- BEGIN LIBC BACKTRACE ---
/home//gem5//build/RISCV/gem5.fast(_ZN4gem515print_backtraceEv+0x30)[0x55e3aa721d90]
/home//gem5//build/RISCV/gem5.fast(_ZN4gem512abortHandlerEi+0x4e)[0x55e3aa73b37e]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x14420)[0x7f4117551420]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0xcb)[0x7f411671d00b]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x12b)[0x7f41166fc859]
/home//gem5//build/RISCV/gem5.fast(+0xb05885)[0x55e3a9f8d885]
/home//gem5//build/RISCV/gem5.fast(_ZN4gem512RiscvISAInst8Vwmul_vvIjEC2ENS_16bitfield_backend17BitUnionOperatorsINS_8RiscvISA36BitfieldUnderlyingClassesExtMachInstEEE+0x52d)[0x55e3aa3c3bfd]
/home//gem5//build/RISCV/gem5.fast(_ZN4gem58RiscvISA7Decoder10decodeInstENS_16bitfield_backend17BitUnionOperatorsINS0_36BitfieldUnderlyingClassesExtMachInstEEE+0x106a7)[0x55e3aa0fb4c7]
/home//gem5//build/RISCV/gem5.fast(_ZN4gem58RiscvISA7Decoder6decodeENS_16bitfield_backend17BitUnionOperatorsINS0_36BitfieldUnderlyingClassesExtMachInstEEEm+0xa7)[0x55e3a9fe6f37]
/home//gem5//build/RISCV/gem5.fast(_ZN4gem58RiscvISA7Decoder6decodeERNS_11PCStateBaseE+0x11b)[0x55e3a9fe708b]
/home//gem5//build/RISCV/gem5.fast(_ZN4gem55minor6Fetch28evaluateEv+0xda5)[0x55e3aad37715]
/home//gem5//build/RISCV/gem5.fast(_ZN4gem55minor8Pipeline8evaluateEv+0x1d2)[0x55e3aad49222]
/home//gem5//build/RISCV/gem5.fast(+0x12c6567)[0x55e3aa74e567]
/home//gem5//build/RISCV/gem5.fast(_ZN4gem510EventQueue10serviceOneEv+0xa8)[0x55e3aa72e9f8]
/home//gem5//build/RISCV/gem5.fast(_ZN4gem59doSimLoopEPNS_10EventQueueE+0x70)[0x55e3aa751970]
/home//gem5//build/RISCV/gem5.fast(_ZN4gem58simulateEm+0x263)[0x55e3aa751f03]
/home//gem5//build/RISCV/gem5.fast(+0xb347f2)[0x55e3a9fbc7f2]
/home//gem5//build/RISCV/gem5.fast(+0xad2203)[0x55e3a9f5a203]
/lib/x86_64-linux-gnu/libpython3.9.so.1.0(+0x26f0c3)[0x7f41177d10c3]
/lib/x86_64-linux-gnu/libpython3.9.so.1.0(_PyObject_Call+0x60)[0x7f4117818180]
/lib/x86_64-linux-gnu/libpython3.9.so.1.0(_PyEval_EvalFrameDefault+0x5a52)[0x7f41175d9c12]
/lib/x86_64-linux-gnu/libpython3.9.so.1.0(+0x1d4424)[0x7f4117736424]
/lib/x86_64-linux-gnu/libpython3.9.so.1.0(_PyFunction_Vectorcall+0x9e)[0x7f4117817cbe]
/lib/x86_64-linux-gnu/libpython3.9.so.1.0(+0x71d5d)[0x7f41175d3d5d]
/lib/x86_64-linux-gnu/libpython3.9.so.1.0(_PyEval_EvalFrameDefault+0x8401)[0x7f41175dc5c1]
/lib/x86_64-linux-gnu/libpython3.9.so.1.0(+0x1d4424)[0x7f4117736424]
/lib/x86_64-linux-gnu/libpython3.9.so.1.0(_PyFunction_Vectorcall+0x9e)[0x7f4117817cbe]
/lib/x86_64-linux-gnu/libpython3.9.so.1.0(+0x71d5d)[0x7f41175d3d5d]
/lib/x86_64-linux-gnu/libpython3.9.so.1.0(_PyEval_EvalFrameDefault+0x4198)[0x7f41175d8358]
/lib/x86_64-linux-gnu/libpython3.9.so.1.0(+0x1d4424)[0x7f4117736424]
/lib/x86_64-linux-gnu/libpython3.9.so.1.0(_PyEval_EvalCodeWithName+0x52)[0x7f4117736772]
/lib/x86_64-linux-gnu/libpython3.9.so.1.0(PyEval_EvalCodeEx+0x42)[0x7f41177367c2]
--- END LIBC BACKTRACE ---
For more info on how to address this issue, please visit https://www.gem5.org/documentation/general_docs/common-errors/ 

Aborted (core dumped)

$ qemu-riscv64 -cpu rv64,v=true,vext_spec=v1.0,vlen=256,elen=64 gem5_example 
RVV: 9455
Scalar: 9455
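
As a rough illustration of the rule described in item 1 (this is not gem5's decoder logic), the sketch below checks whether a widening operation is legal for a given LMUL. It assumes the RVV 1.0 rule that a widening op writing a vector group needs a destination EMUL of 2*LMUL, capped at 8, while a widening reduction such as vwredsum.vs writes a single scalar element and so places no such limit on the source LMUL. The helper name `widening_vtype_is_legal` and its parameters are made up for this sketch.

```python
from fractions import Fraction

# Largest register group size allowed by RVV 1.0.
MAX_EMUL = Fraction(8)


def widening_vtype_is_legal(lmul: Fraction, dest_is_vector_group: bool) -> bool:
    """Return True if a widening op may execute under the given LMUL.

    Hypothetical helper for illustration only.
    """
    if not dest_is_vector_group:
        # Widening reductions (e.g. vwredsum.vs) write one scalar element
        # into a single register, so only the source group uses LMUL.
        return lmul <= MAX_EMUL
    # Ordinary widening ops (e.g. vwmul.vv) write a group of 2*LMUL registers.
    return 2 * lmul <= MAX_EMUL


# vwmul.vv under e16,m4: destination EMUL = 8, legal.
print(widening_vtype_is_legal(Fraction(4), dest_is_vector_group=True))   # True
# vwmul.vv under LMUL=8: destination EMUL would be 16, reserved.
print(widening_vtype_is_legal(Fraction(8), dest_is_vector_group=True))   # False
# vwredsum.vs under e32,m8: scalar destination, legal.
print(widening_vtype_is_legal(Fraction(8), dest_is_vector_group=False))  # True
```

Under this reading, both instructions in the dot product loop (vwmul.vv under e16,m4 and vwredsum.vs under e32,m8) are legal, which matches the QEMU result.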

2. Some float32 vector instructions return NaNs

The example below is a vector sum using vfredusum. I also noticed similar behavior with vfredmax. The float64 variants seem to work correctly.

#include "riscv_vector.h"
#include "stdio.h"

void vector_sum(float *pSrc, uint32_t n, float *pResult)
{
    size_t vl;
    vfloat32m8_t vSrc;
    vfloat32m1_t vSum = __riscv_vfmv_s_f_f32m1(0.0f, 1);
    while (n > 0) {
        vl = __riscv_vsetvl_e32m8(n);
        vSrc = __riscv_vle32_v_f32m8(pSrc, vl);
        vSum = __riscv_vfredusum_vs_f32m8_f32m1(vSrc, vSum, vl);
        pSrc += vl;
        n -= vl;
    }
    *pResult = __riscv_vfmv_f_s_f32m1_f32(vSum);
}

void scalar_sum(float *pSrc, uint32_t n, float *pResult)
{
    float sum = 0.0f;
    while (n > 0U) {
        sum += *pSrc++;
        n--;
    }
    *pResult = sum;
}

int main(int argc, char **argv)
{
    float input[] = {1.0f, 2.0f, 3.0f, 10.0f, 42.0f};
    size_t vlen = 5;
    float scalar_result, vector_result;
    vector_sum(input, vlen, &vector_result);
    scalar_sum(input, vlen, &scalar_result);
    printf("RVV: %f \nScalar: %f\n", vector_result, scalar_result);
    return 0;
}

$GEM5/build/RISCV/gem5.fast $GEM5/configs/example/gem5_library/riscv-se.py
gem5 Simulator System.  https://www.gem5.org
gem5 is copyrighted software; use the --copyright option for details.

gem5 version DEVELOP-FOR-23.1
gem5 compiled Oct 10 2023 08:30:21
gem5 started Oct 12 2023 10:34:58

Beginning simulation!
Global frequency set at 1000000000000 ticks per second
warn: No dot file generated. Please install pydot to generate the dot file and pdf.
src/arch/riscv/linux/se_workload.cc:60: warn: Unknown operating system; assuming Linux.
src/base/statistics.hh:279: warn: One of the stats is a legacy stat. Legacy stat is a stat that does not belong to any statistics::Group. Legacy stat is deprecated.
src/base/statistics.hh:279: warn: One of the stats is a legacy stat. Legacy stat is a stat that does not belong to any statistics::Group. Legacy stat is deprecated.
board.remote_gdb: Listening for connections on port 7000
src/sim/simulate.cc:194: info: Entering event queue @ 0.  Starting simulation...
src/sim/mem_state.cc:448: info: Increasing stack size by one page.
src/sim/mem_state.cc:448: info: Increasing stack size by one page.
RVV: nan 
Scalar: 58.000000
Exiting @ tick 18640874 because exiting with last active thread context.

$ qemu-riscv64 -cpu rv64,v=true,vext_spec=v1.0,vlen=256,elen=64 gem5_example 
RVV: 58.000000 
Scalar: 58.000000

Host Operating System
Ubuntu 20.04

Host ISA
X86

Compiler used
gcc 9.4.0 to build gem5
Clang 16 for RVV code

clang -O3 -march=rv64gcv --target=riscv64-unknown-linux-musl -static -o gem5_example main.c
@powerjg powerjg added the arch-riscv The RISC-V ISA label Oct 12, 2023
@powerjg
Contributor

powerjg commented Oct 12, 2023

Thanks for the bug report! You've done a great job explaining the issue!

@BobbyRBruce BobbyRBruce added the sim-se gem5's Syscall Emulation Mode label Oct 16, 2023
@ivanaamit ivanaamit added the needs details Needs more information to reproduce or more details label Jan 18, 2024
@ivanaamit
Contributor

Hi @OMaghiarIMG, could you re-run this on develop and check if some of the issues have been resolved with the new version? Thank you.

@kourzanov

kourzanov commented Feb 8, 2024

We analyzed the problem, and the root cause is that gem5's RVV implementation does not conform to RISC-V's NaN-boxing requirements. [image]
Basically, some float32 instructions (the vfmv ones for sure, but possibly others as well) write values into 64-bit float registers without setting the upper 32 bits to all 1s. Other instructions then interpret the value as a NaN, even though the lower 32 bits are correct, because the value is not NaN-boxed.
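
To make the NaN-boxing point concrete, here is a minimal sketch of how a 64-bit float register is read as a float32 under the RISC-V rule: a narrower value must have all 1s in the upper bits, and anything else is treated as the canonical NaN. It models only the rule itself, not gem5's code; the function names (`nan_box`, `read_f32`) are invented for the example.

```python
import struct

CANONICAL_NAN_F32 = 0x7FC00000      # canonical quiet NaN for float32
BOX_MASK = 0xFFFFFFFF00000000       # upper 32 bits of a 64-bit FP register


def f32_bits(value: float) -> int:
    """Return the IEEE 754 single-precision bit pattern of value."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def bits_to_f32(bits: int) -> float:
    """Interpret a 32-bit pattern as a single-precision float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def nan_box(value: float) -> int:
    """Store a float32 in a 64-bit register with the upper bits set to 1s."""
    return BOX_MASK | f32_bits(value)


def read_f32(reg64: int) -> float:
    """Read a float32 operand; improperly boxed values become canonical NaN."""
    if (reg64 & BOX_MASK) != BOX_MASK:
        return bits_to_f32(CANONICAL_NAN_F32)
    return bits_to_f32(reg64 & 0xFFFFFFFF)


boxed = nan_box(58.0)        # what a conforming write would produce
unboxed = f32_bits(58.0)     # same low 32 bits, upper 32 bits left at zero

print(read_f32(boxed))       # 58.0
print(read_f32(unboxed))     # nan
```

The second read mirrors the symptom in the report: the low 32 bits hold the right sum, but the consumer still sees a NaN.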

@github-actions github-actions bot added the Stale label Mar 1, 2024

github-actions bot commented Mar 9, 2024

This issue is being closed because it has been inactive waiting for response for 30 days. If this is still an issue, please open a new issue and reference this one.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Mar 9, 2024
@ivanaamit ivanaamit reopened this Mar 11, 2024
@github-actions github-actions bot removed the Stale label Mar 12, 2024
@github-actions github-actions bot added the Stale label Apr 2, 2024
@ivanaamit
Contributor

Hi @OMaghiarIMG, I want to try to reproduce this issue on the current develop branch to see if it is still a bug. Would it be possible for you to share precompiled binaries of these programs to save me the trouble? Thanks.

@github-actions github-actions bot removed the Stale label Apr 6, 2024
@OMaghiarIMG
Author

> Hi @OMaghiarIMG, I want to try to reproduce this issue on the current develop branch to see if it is still a bug. Would it be possible for you to share precompiled binaries of these programs to save me the trouble? Thanks.

Hello @ivanaamit, I've attached binaries for the two examples. I tested them with QEMU, but it seems the SE script I previously used no longer works, and I'm not sure how to enable vector support now.
gem5_examples.zip

@ivanaamit
Contributor

ivanaamit commented Apr 9, 2024

You can use the following script to reproduce your errors. The problem with your script is that the RISCVMatchedBoard does not have RVV enabled.

from gem5.components.boards.simple_board import SimpleBoard
from gem5.components.cachehierarchies.classic.no_cache import NoCache
from gem5.components.memory import SingleChannelDDR3_1600
from gem5.components.processors.cpu_types import CPUTypes
from gem5.components.processors.simple_processor import SimpleProcessor
from gem5.isas import ISA
from gem5.resources.resource import obtain_resource, BinaryResource
from gem5.simulate.simulator import Simulator
from gem5.utils.requires import requires

requires(isa_required=ISA.RISCV)

cache_hierarchy = NoCache()

memory = SingleChannelDDR3_1600(size="124MB")

processor = SimpleProcessor(cpu_type=CPUTypes.TIMING, isa=ISA.RISCV, num_cores=1)

board = SimpleBoard(
    clk_freq="3GHz",
    processor=processor,
    memory=memory,
    cache_hierarchy=cache_hierarchy,
)

board.set_se_binary_workload(BinaryResource("/path/to/gem5_example"))

simulator = Simulator(board=board)
simulator.run()

I was able to reproduce your error for float. When running the script for widening, I am getting a different error than the one you reported, but in any case, it seems that there is a bug.

Global frequency set at 1000000000000 ticks per second
src/mem/dram_interface.cc:690: warn: DRAM device capacity (8192 Mbytes) does not match the address range assigned (128 Mbytes)
src/arch/riscv/isa.cc:275: info: RVV enabled, VLEN = 256 bits, ELEN = 64 bits
src/arch/riscv/linux/se_workload.cc:60: warn: Unknown operating system; assuming Linux.
src/base/statistics.hh:279: warn: One of the stats is a legacy stat. Legacy stat is a stat that does not belong to any statistics::Group. Legacy stat is deprecated.
board.remote_gdb: Listening for connections on port 7000
src/sim/simulate.cc:199: info: Entering event queue @ 0.  Starting simulation...
src/arch/riscv/faults.cc:213: panic: Illegal instruction 0x400020d2eec62857 at pc (0x103ec=>0x103f0).(0=>1): Unsupported overlap in Vs2 and Vd for Widening op
Memory Usage: 268180 KBytes
Program aborted at tick 140566959

If you have fixes for either, please feel free to contribute. Thanks.
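
For anyone looking into the "Unsupported overlap in Vs2 and Vd" panic, the sketch below decodes the register fields of the reported instruction and applies the RVV 1.0 overlap rule for widening ops: a wider destination may overlap a source only if the source EMUL is at least 1 and the overlap is in the highest-numbered part of the destination group. It rests on two assumptions that are not confirmed in this thread: that the low 32 bits of the value printed in the panic are the instruction encoding, and that the vtype in effect is e16,m4 as the C source requests. The helpers `decode_opv` and `widening_overlap_ok` are hypothetical, and register-group alignment is not checked.

```python
from fractions import Fraction
from math import ceil


def decode_opv(inst: int) -> dict:
    """Split a 32-bit OP-V instruction word into its standard fields."""
    return {
        "opcode": inst & 0x7F,
        "vd": (inst >> 7) & 0x1F,
        "funct3": (inst >> 12) & 0x7,
        "vs1": (inst >> 15) & 0x1F,
        "vs2": (inst >> 20) & 0x1F,
        "vm": (inst >> 25) & 0x1,
        "funct6": (inst >> 26) & 0x3F,
    }


def widening_overlap_ok(vd: int, vs: int, src_emul: Fraction) -> bool:
    """Check the destination/source overlap rule for a widening op."""
    dst_regs = max(1, ceil(2 * src_emul))   # destination EMUL = 2 * source EMUL
    src_regs = max(1, ceil(src_emul))
    dst = set(range(vd, vd + dst_regs))
    src = set(range(vs, vs + src_regs))
    if not dst & src:
        return True                          # no overlap at all
    if src_emul < 1:
        return False                         # fractional source: never legal
    # Overlap is only allowed in the highest-numbered part of the destination.
    return vs == vd + dst_regs - src_regs


# Assumption: low 32 bits of the value from the panic message.
fields = decode_opv(0x400020D2EEC62857 & 0xFFFFFFFF)
print(fields["vd"], fields["vs1"], fields["vs2"])   # 16 12 12

# Assumption: vtype is e16,m4, so each source group spans 4 registers.
print(widening_overlap_ok(fields["vd"], fields["vs2"], Fraction(4)))  # True
```

If both assumptions hold, the destination group (v16 to v23) and the source group (v12 to v15) do not overlap at all, which would point at the vtype that gem5 uses for this check rather than at the binary.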

@github-actions github-actions bot added the Stale label May 1, 2024

github-actions bot commented May 9, 2024

This issue is being closed because it has been inactive waiting for response for 30 days. If this is still an issue, please open a new issue and reference this one.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale May 9, 2024
@ivanaamit ivanaamit removed the needs details Needs more information to reproduce or more details label May 10, 2024
@ivanaamit ivanaamit reopened this May 10, 2024