How to update my riscv-gnu-toolchain to enable RVV support #353

Closed
schwa1z opened this issue Aug 24, 2022 · 54 comments

@schwa1z

schwa1z commented Aug 24, 2022

Hello community!

I want to use GCC to compile code for the RISC-V vector (RVV) extension. I installed gcc and riscv-gcc a long time ago; the current version information is below.

When I run gcc --version in a terminal:

$gcc --version
gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

As for riscv-gcc, this is what /riscv-gnu-toolchain/readme.md says:

The branch info has recorded in `.gitmodules` file, which can set or update via
`git submodule add -b` or `git submodule set-branch`.

However the only way to check which branch are using is to check `.gitmodules`
file, here is the example for `riscv-gcc`, it using riscv-gcc-10.2.0 branch, so
it will has a section named `riscv-gcc` and has a field `branch` is
`riscv-gcc-10.2.0`.

[submodule "riscv-gcc"]
        path = riscv-gcc
        url = ../riscv-gcc.git
        branch = riscv-gcc-10.2.0

Since I'm confused about the relationship between gcc and riscv-gcc, I have included all the information here. The RVV extension is not supported in this version.

Now I want to update my toolchain to enable RVV extension support. What should I do, and which commands should I run?

Thank you very much!

@zhongjuzhe
Collaborator

You should check out the branch 'riscv-gcc-rvv-next'.

@schwa1z
Author

schwa1z commented Aug 24, 2022

Thank you for your reply, @zhongjuzhe!
I have switched to the rvv-next branch and built the toolchain successfully, but I have another question.
The RISC-V core I use needs the following tools:

    RISCV_CC            := riscv32-unknown-elf-gcc
    RISCV_DUMP          := riscv32-unknown-elf-objdump
    RISCV_OBCP          := riscv32-unknown-elf-objcopy
    ARCH                := rv32imv
    LD_SCRIPT           := $(SW_DIR)/link.ld

But install_path/bin/ contains:
[image: WeChat screenshot 20220824184107]

It looks like I didn't build the toolchain correctly: the core needs the 32-bit tools, but I only have the 64-bit ones.

What should I do to meet the core's requirements?

@zhongjuzhe
Collaborator

You should specify --arch=rv32gcv

@schwa1z
Author

schwa1z commented Aug 24, 2022

Thanks! I reconfigured the toolchain like this:
[image: WeChat screenshot 20220824190458]

But this error occurs:
[image: WeChat screenshot 20220824190516]

Do you have any idea?

@pz9115

pz9115 commented Aug 24, 2022

I think you should try to use --with-arch=rv32gcv in configure.

@schwa1z
Author

schwa1z commented Aug 24, 2022

Hi! @pz9115 .
As you can see, I actually used --with-arch=rv32gcv in the first figure here. But there is an error in the second figure. Do you have any idea?

Thanks! I reconfigured the toolchain like this: [image: WeChat screenshot 20220824190458]

But this error occurs: [image: WeChat screenshot 20220824190516]

Do you have any idea?

@TommyMurphyTM1234

TommyMurphyTM1234 commented Aug 24, 2022

Edit: just noticed that you seem to have built a Linux toolchain (riscv64-unknown-linux-gnu prefixed) whereas you seem to need a bare metal/newlib toolchain (riscvXX-unknown-elf prefixed)? If that's the case then, after configuring the toolchain you should be doing make and not make linux.

Please note that riscv64-unknown-elf and riscv32-unknown-elf prefixed tools can both build 32 and 64 bit RISC-V software. The 32/64 in the names is a bit confusing. The only thing that it really reflects is the default XLEN supported.

The riscv64-unknown-elf toolchain that you built first will compile 32 bit software for your RV32 target if you pass the appropriate -march= and -mabi= options when compiling software for your target (presumably -march=rv32imv -mabi=ilp32?).

If you use the standard libs then, in order for the software to link without errors, you'll also need to specify --enable-multilib at configuration time before building the tools so that the toolchain includes RV32 multilibs and not just the default rv64gc libs. You may also need to change the default set of multilibs built as specified in gcc/config/riscv/t-elf-multilib. In particular, you presumably will want multilibs for rv32imv/ilp32? Although I'm not sure if the standard libs have any specific support for or use of V extension capabilities?

If your makefile/build scripts reference riscv32-unknown-elf prefixed tools then just change them to reference riscv64-unknown-elf prefixed tools.

What version of the V extension does your target implement?
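
Since the tool prefix only reflects the default XLEN, one way to confirm what a given -march/-mabi combination actually selects is to query the compiler's predefined macros. The sketch below assumes the `__riscv_xlen` and `__riscv_vector` macros from the RISC-V C API conventions are provided by the toolchain in use; the function names `target_xlen` and `target_has_vector` are hypothetical and not from this thread.

```c
/* Report what the compiler is targeting, based on predefined macros.
 * Assumption: the toolchain defines __riscv_xlen and __riscv_vector
 * as described in the RISC-V C API conventions. */

/* XLEN in bits for the current compilation, or 0 when not targeting RISC-V. */
int target_xlen(void)
{
#if defined(__riscv_xlen)
    return __riscv_xlen;
#else
    return 0;
#endif
}

/* 1 when the vector extension is enabled by -march, otherwise 0. */
int target_has_vector(void)
{
#if defined(__riscv_vector)
    return 1;
#else
    return 0;
#endif
}
```

Built with a riscv64-unknown-elf-gcc and -march=rv32imv -mabi=ilp32, `target_xlen()` would be expected to return 32 despite the riscv64 prefix; on a non-RISC-V host compiler both functions return 0.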

@TommyMurphyTM1234

Actually, on reflection, it's probably simpler for your purposes, if you're targeting rv32imv, to do...

./configure --with-arch=rv32imv --with-abi=ilp32
make

to build the toolchain just for your target architecture/abi.

@schwa1z
Author

schwa1z commented Aug 25, 2022

Thank you for your patient reply, @TommyMurphyTM1234. I benefited a lot from your answer.
After reading your first answer, I had already configured the toolchain the way your second answer suggests.
But I ran into another problem: the RISC-V core doesn't print any output.
When I compile vectorized C code (written with RVV intrinsics), no error occurs, but there is no output either.
However, when I compile normal C code (without RVV instructions), it works.

Does the rvv-next branch support intrinsics?
Or do I need to write inline assembly instead?

When I use LLVM as the compiler (which works) to compile the RVV intrinsic C code, the output looks like this:
[image: QQ screenshot 20220825182042]
When I use GCC to compile the same RVV intrinsic C code, the output looks like this:
[image]
There is no output, which confuses me. Could you help me?

@zhongjuzhe
Collaborator

rvv-next supports all RVV features, including intrinsics and auto-vectorization. Would you mind sharing your code? Also, please check whether you are using the latest "riscv-gcc-rvv-next" branch in riscv-gcc.

@schwa1z
Author

schwa1z commented Aug 25, 2022

Thank you @zhongjuzhe
I don't know how to check the version of my toolchain. I downloaded it yesterday. Could you tell me how to check the version?
I used this command to download it:

git clone https://github.com/riscv/riscv-gnu-toolchain -b rvv-next

and these commands to build it:

./configure --prefix=$RISCV --with-arch=rv32imv --with-abi=ilp32 --enable-multilib
make

@zhongjuzhe
Collaborator

cd riscv-gcc && git log

@zhongjuzhe
Collaborator

Thank you @zhongjuzhe , no problem, here is the main code:

    for (int n = sizeof(greydata); n > 0; n -= vl, src_B += vl, src_G += vl, src_R += vl, dst += vl, src_p151 += vl) {

        vl            = vsetvl_e8m1(n);
        vuint16m2_t vec_dst = vle16_v_u16m2(dst, vl);

        vuint8m1_t vec_B = vle8_v_u8m1(src_B, vl);
        vuint8m1_t vec_G = vle8_v_u8m1(src_G, vl);
        vuint8m1_t vec_R = vle8_v_u8m1(src_R, vl);

        vuint8m1_t vec_p = vle8_v_u8m1(src_p151, vl);
 
        vec_dst           = vwmaccu_vx_u16m2(vec_dst, 28, vec_B, vl);
        vec_dst           = vwmaccu_vv_u16m2(vec_dst, vec_p, vec_G, vl);
        vec_dst           = vwmaccu_vx_u16m2(vec_dst, 77, vec_R, vl);

        
        vec_dst           = vsrl_vx_u16m2(vec_dst, 8, vl);
        vse16_v_u16m2(dst, vec_dst, vl);
    }

Would you mind giving me the whole program, including the main function and the data initialization?
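
When the vector build produces no output, a plain scalar version of the same arithmetic can help separate toolchain problems from logic problems. The sketch below is a scalar reading of the intrinsic loop quoted above: three widening multiply-accumulates into a 16-bit accumulator, followed by a logical right shift by 8. The function name `grey_scalar` and its parameter names are hypothetical, and it assumes `dst` holds 16-bit elements and the four sources hold 8-bit elements, as the intrinsic types suggest.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar reference for the quoted RVV loop (names are illustrative only).
 * Each step mirrors one intrinsic: the accumulator is 16 bits wide and
 * wraps modulo 2^16, matching the u16 destination of vwmaccu. */
void grey_scalar(size_t n, uint16_t *dst,
                 const uint8_t *b, const uint8_t *g,
                 const uint8_t *r, const uint8_t *p)
{
    for (size_t i = 0; i < n; i++) {
        uint16_t acc = dst[i];                          /* vle16_v_u16m2       */
        acc = (uint16_t)(acc + 28u * b[i]);             /* vwmaccu_vx, x = 28  */
        acc = (uint16_t)(acc + (unsigned)p[i] * g[i]);  /* vwmaccu_vv, p * g   */
        acc = (uint16_t)(acc + 77u * r[i]);             /* vwmaccu_vx, x = 77  */
        dst[i] = (uint16_t)(acc >> 8);                  /* vsrl_vx by 8        */
    }
}
```

If `src_p151` holds the constant 151 (presumably, given its name), the three weights sum to 256, so a pixel with B = G = R = 255 and a zero-initialized `dst` yields 255. Comparing this function's results with the vector build on the same inputs is one way to check the intrinsic version.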

@schwa1z
Author

schwa1z commented Aug 25, 2022

OK, the git log:
image

@zhongjuzhe
Collaborator

zhongjuzhe commented Aug 25, 2022

No, please don't run git log in the riscv-gnu-toolchain directory. Run it in riscv-gcc directly: cd riscv-gcc, then git log.

@schwa1z
Author

schwa1z commented Aug 25, 2022

I'm sorry. Here it is:
image

@zhongjuzhe
Collaborator

This code is obsolete. Please run git pull in riscv-gcc and rebuild the toolchain. The latest commit of riscv-gcc-rvv-next should be commit 53fcb21. Then try your program again.

@schwa1z
Author

schwa1z commented Aug 26, 2022

That was the problem! @zhongjuzhe I pulled the latest commit and the code now compiles and runs successfully.
All clear! Thanks to everyone in this thread for your help. I'm very grateful!

@HamzaShabbir517

@schwa1z I hope you are doing well. Could you please share the steps you followed to compile the vector C code? I am also trying to configure the toolchain, for an RV32IMFCV target. I have gone through the whole conversation and, as mentioned, I also tried git log; it shows the same message you posted here. But when I try to run git pull it gives me an error, so maybe I am doing something wrong. Your steps may help me out. Thanks in advance.

For reference, here is a screenshot:
Screenshot from 2022-08-27 19-40-16

@schwa1z
Author

schwa1z commented Aug 28, 2022

@HamzaShabbir517 Yes, I'm glad to share my method to configure the toolchain.

git clone https://github.com/riscv/riscv-gnu-toolchain -b rvv-next

Then, in .../riscv-gnu-toolchain/, switch to the 'riscv-gcc-rvv-next' branch and pull the latest version:

cd riscv-gcc
git checkout -b riscv-gcc-rvv-next
git pull

After that, configure the toolchain (for you, maybe --with-arch=rv32imfcv? I'm not sure). Note that you need to set the environment variable in your bashrc. If you don't know how to set the environment variable, tell me.

./configure --prefix=$RISCV --with-arch=rv32imv --with-abi=ilp32 --enable-multilib

and make:

sudo make

@HamzaShabbir517

HamzaShabbir517 commented Aug 28, 2022

@schwa1z Thanks, I was able to build it.
Another question: where can I find the definitions of the vector intrinsic functions you used in your program?
Also, is there any way to tell the compiler not to generate specific instructions? The vector extension contains a lot of instructions, and I have implemented only a few of them.

@schwa1z
Author

schwa1z commented Aug 28, 2022

@HamzaShabbir517 There are two ways to use RISC-V vector instructions from C.
One is to use intrinsics (an API), as in my code above. You can refer to https://github.com/riscv-non-isa/rvv-intrinsic-doc/blob/master/rvv-intrinsic-rfc.md and riscv-v-spec-1.0 when writing intrinsic code.
The other is to use inline assembly in C code. To be honest, I'm not good at this approach, but you can find many references on the Internet.

I'm a little confused by your last question. When you write specific intrinsics or inline assembly in C to emit vector instructions, the compiler just generates the instructions you wrote rather than substituting other instructions.
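
As a small illustration of the inline-assembly route, the sketch below reads the `vlenb` CSR (the vector register length in bytes, which also appears as `csrr t1,vlenb` in disassembly later in this thread) using GCC extended asm. The function name `read_vlenb` is hypothetical. It assumes the `__riscv_vector` macro is defined when the V extension is enabled and that the vector unit is already enabled on the core when the function runs; on other targets it simply returns 0 so the file still compiles.

```c
#include <stddef.h>

/* Read the vector register length in bytes via inline assembly.
 * Assumption: __riscv_vector is defined when -march includes V.
 * Returns 0 when built for a target without the vector extension. */
size_t read_vlenb(void)
{
#if defined(__riscv_vector)
    size_t bytes;
    /* csrr copies a CSR into a general-purpose register. */
    __asm__ volatile ("csrr %0, vlenb" : "=r"(bytes));
    return bytes;
#else
    return 0;
#endif
}
```

With inline assembly the programmer names each instruction explicitly, whereas with intrinsics the compiler still handles register allocation and scheduling around the vector operations.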

@HamzaShabbir517

Below is a screenshot of the C code I wrote for a simple vector addition.
Screenshot from 2022-08-28 09-19-59

The assembly generated by the compiler also includes vl1re32.v instructions in some places. My hardware does not support this instruction, so I want the compiler not to generate it.

@schwa1z
Author

schwa1z commented Aug 28, 2022

I can't find this instruction in riscv-v-spec-1.0, and it looks strange. I don't know how to prevent the compiler from generating a specific instruction. Maybe you could open a new issue.

@zhongjuzhe
Collaborator

zhongjuzhe commented Aug 28, 2022

I have tried your code using RVV GCC
riscv64-unknown-elf-gcc -O3:
generated assembly:

foo:
beq a0,zero,.L1
.L3:
vsetvli a5,a0,e32,m1,ta,mu
slli a4,a5,2
vle32.v v24,(a2)
vle32.v v25,(a3)
add a2,a2,a4
vadd.vv v24,v24,v25
add a3,a3,a4
vse32.v v24,(a1)
add a1,a1,a4
j .L3
.L1:
ret

There are no vl1re32.v instructions when I compile with -O3.
If you compile with -O0, the compiler may generate vl1re32.v, because at that level
it compiles the C code conservatively.

Here is code-gen in LLVM:
https://godbolt.org/z/ecdfhreTW
GCC behaves the same as LLVM.

@HamzaShabbir517

thanks everyone for helping me out. Really appreciate it.

@HamzaShabbir517

HamzaShabbir517 commented Aug 28, 2022

@zhongjuzhe Can you try this code as well at your side.
#include <riscv_vector.h>
void vector_add(size_t n, int32_t a[n], const int32_t b[n], const int32_t c[n])
{
    while (n > 0)
    {
        size_t vl = vsetvl_e32m1(n);
        vint32m1_t vb = vle32_v_i32m1(b, vl);
        vint32m1_t vc = vle32_v_i32m1(c, vl);
        vint32m1_t va = vadd_vv_i32m1(vb, vc, vl);
        vse32_v_i32m1(a, va, vl);
        a += vl;
        b += vl;
        c += vl;
        n -= vl;
    }
}

void main()
{
    size_t n = 8;
    int32_t A[] = {1,2,3,4,5,6,7,8};
    int32_t B[] = {1,2,3,4,5,6,7,8};
    int32_t C[8];

    vector_add(n, C, A, B);
}

I am getting this assembly
ee000000 <vector_add>:
ee000000: c115 beqz a0,ee000024 <vector_add+0x24>
ee000002: 050577d7 vsetvli a5,a0,e32,m1,ta,mu
ee000006: 00279713 slli a4,a5,0x2
ee00000a: 02066c07 vle32.v v24,(a2)
ee00000e: 0206ec87 vle32.v v25,(a3)
ee000012: 8d1d sub a0,a0,a5
ee000014: 038c8c57 vadd.vv v24,v24,v25
ee000018: 0205ec27 vse32.v v24,(a1)
ee00001c: 963a add a2,a2,a4
ee00001e: 95ba add a1,a1,a4
ee000020: 96ba add a3,a3,a4
ee000022: f165 bnez a0,ee000002 <vector_add+0x2>
ee000024: 8082 ret

Disassembly of section .text.startup:

ee000026 <main>:
ee000026: f0040737 lui a4,0xf0040
ee00002a: 711d addi sp,sp,-96
ee00002c: 00070613 mv a2,a4
ee000030: 02000793 li a5,32
ee000034: 868a mv a3,sp
ee000036: 00070713 mv a4,a4
ee00003a: 0c37f5d7 vsetvli a1,a5,e8,m8,ta,ma
ee00003e: 02070c07 vle8.v v24,(a4)
ee000042: 8f8d sub a5,a5,a1
ee000044: 02068c27 vse8.v v24,(a3)
ee000048: 972e add a4,a4,a1
ee00004a: 96ae add a3,a3,a1
ee00004c: f7fd bnez a5,ee00003a <main+0x14>
ee00004e: 02000793 li a5,32
ee000052: 1018 addi a4,sp,32
ee000054: 0c37f6d7 vsetvli a3,a5,e8,m8,ta,ma
ee000058: 02060c07 vle8.v v24,(a2)
ee00005c: 8f95 sub a5,a5,a3
ee00005e: 02070c27 vse8.v v24,(a4)
ee000062: 9636 add a2,a2,a3
ee000064: 9736 add a4,a4,a3
ee000066: f7fd bnez a5,ee000054 <main+0x2e>
ee000068: 0094 addi a3,sp,64
ee00006a: 1008 addi a0,sp,32
ee00006c: 880a mv a6,sp
ee00006e: 4721 li a4,8
ee000070: 050777d7 vsetvli a5,a4,e32,m1,ta,mu
ee000074: 00279613 slli a2,a5,0x2
ee000078: 02086c07 vle32.v v24,(a6)
ee00007c: 02056c87 vle32.v v25,(a0)
ee000080: 8f1d sub a4,a4,a5
ee000082: 038c8c57 vadd.vv v24,v24,v25
ee000086: 0206ec27 vse32.v v24,(a3)
ee00008a: 9832 add a6,a6,a2
ee00008c: 96b2 add a3,a3,a2
ee00008e: 9532 add a0,a0,a2
ee000090: f365 bnez a4,ee000070 <main+0x4a>
ee000092: 6125 addi sp,sp,96
ee000094: 8082 ret
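
To check the strip-mined loop structure of vector_add without RVV hardware, the same control flow can be emulated in portable C. This is only a sketch: `EMU_VLMAX`, `emu_vsetvl` and `vector_add_emulated` are hypothetical names, and the emulated `vsetvl` simply returns min(n, EMU_VLMAX), whereas a real implementation may choose a different vl for some lengths.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical maximum number of 32-bit elements per vector operation. */
#define EMU_VLMAX 4

/* Stand-in for vsetvl_e32m1: grant at most EMU_VLMAX elements per pass. */
size_t emu_vsetvl(size_t avl)
{
    return avl < EMU_VLMAX ? avl : EMU_VLMAX;
}

/* Same loop shape as vector_add: process vl elements, advance, repeat. */
void vector_add_emulated(size_t n, int32_t *a, const int32_t *b, const int32_t *c)
{
    while (n > 0) {
        size_t vl = emu_vsetvl(n);
        /* The inner loop plays the role of vle32 + vadd.vv + vse32. */
        for (size_t i = 0; i < vl; i++) {
            a[i] = b[i] + c[i];
        }
        a += vl;
        b += vl;
        c += vl;
        n -= vl;
    }
}
```

With the inputs from main (both sources holding 1 through 8), this takes two passes of four elements and writes 2, 4, ..., 16, which is the result the vector version should also produce.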

@TommyMurphyTM1234

@zhongjuzhe Can you try this code as well at your side. Screenshot from 2022-08-28 20-42-09

You really should post the actual code and not a screenshot expecting the other person to manually type the code in.

@zhongjuzhe
Collaborator

You should use -fno-tree-vectorize to disable auto-vectorization

@HamzaShabbir517

HamzaShabbir517 commented Aug 29, 2022

I tried the flag you mentioned, but nothing changed; I got the same assembly as above. These are the CFLAGS I am using:

TEST_CFLAGS = -march=rv32imfcv -mabi=ilp32f -O3 -fno-tree-vectorize

@zhongjuzhe
Collaborator

You said :
ee000000 <vector_add>:
ee000000: c115 beqz a0,ee000024 <vector_add+0x24>
ee000002: 050577d7 vsetvli a5,a0,e32,m1,ta,mu
ee000006: 00279713 slli a4,a5,0x2
ee00000a: 02066c07 vle32.v v24,(a2)
ee00000e: 0206ec87 vle32.v v25,(a3)
ee000012: 8d1d sub a0,a0,a5
ee000014: 038c8c57 vadd.vv v24,v24,v25
ee000018: 0205ec27 vse32.v v24,(a1)
ee00001c: 963a add a2,a2,a4
ee00001e: 95ba add a1,a1,a4
ee000020: 96ba add a3,a3,a4
ee000022: f165 bnez a0,ee000002 <vector_add+0x2>
ee000024: 8082 ret

This assembly is correct. What's the problem?

@HamzaShabbir517

The problem is in the main loop:
ee000026 <main>:
ee000026: f0040737 lui a4,0xf0040
ee00002a: 7159 addi sp,sp,-112
ee00002c: 00070613 mv a2,a4
ee000030: 02400793 li a5,36
ee000034: 1034 addi a3,sp,40
ee000036: 00070713 mv a4,a4
ee00003a: 0c37f5d7 vsetvli a1,a5,e8,m8,ta,ma
ee00003e: 02070c07 vle8.v v24,(a4)
ee000042: 8f8d sub a5,a5,a1
ee000044: 02068c27 vse8.v v24,(a3)
ee000048: 972e add a4,a4,a1
ee00004a: 96ae add a3,a3,a1
ee00004c: f7fd bnez a5,ee00003a <main+0x14>

As you can see here, it is also using vector instructions to load scalar data. I want to stop that.

@zhongjuzhe
Collaborator

You can use -Os instead of -O3. It will stop optimizing "memcpy" using vector instructions.
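
The "memcpy" here is the implicit copy the compiler emits for the initialized local arrays in main. The sketch below spells that copy out in C, which may make the disassembly easier to read; `init_data` and `init_arrays` are hypothetical names, and the exact expansion is up to the compiler.

```c
#include <stdint.h>
#include <string.h>

/* The initializer values live in read-only data. A and B have identical
 * initializers, so one shared copy is enough (an assumption about how the
 * compiler lays this out; the listing loads both from the same address). */
static const int32_t init_data[8] = {1, 2, 3, 4, 5, 6, 7, 8};

/* Roughly what "int32_t A[] = {...}; int32_t B[] = {...};" turns into:
 * two 32-byte copies from read-only data to the stack arrays. */
void init_arrays(int32_t A[8], int32_t B[8])
{
    memcpy(A, init_data, sizeof init_data);  /* 8 * 4 = 32 bytes */
    memcpy(B, init_data, sizeof init_data);
}
```

At -O3 the rvv-next compiler in this thread expanded these copies inline as vle8.v/vse8.v loops (the `li a5,32` in the first listing matches the 32-byte size), while at -Os it leaves them as calls to memcpy.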

@HamzaShabbir517

I have already tried that; whenever I use -Os it gives me the following error:
Screenshot from 2022-08-29 12-06-42

@zhongjuzhe
Collaborator

This is a linker issue. If you want to use -march=rv32imfcv -mabi=ilp32f,
you should build an rv32 toolchain, which produces riscv32-unknown-elf-gcc.

In the meantime, you could use riscv64-unknown-elf-gcc -march=rv64imfcv -mabi=lp64f -Os to check whether my
suggestion works for you.

@TommyMurphyTM1234

TommyMurphyTM1234 commented Aug 29, 2022

Looks like your toolchain doesn't have the multilib for rv32imfcv/ilp32f. If you built a multilib toolchain, note that this arch/abi is not in the default list of multilibs built.

You might be better off building a toolchain for your specific target as I suggested previously.

  • ./configure --with-arch=rv32imfcv --with-abi=ilp32f --prefix=...

@TommyMurphyTM1234

TommyMurphyTM1234 commented Aug 29, 2022

I have already try that as whenever i use -Os it gives me the following error Screenshot from 2022-08-29 12-06-42

This is the issue of linker. If you want to use -march=rv32imfcv -mabi=ilp32f. You should build a rv32 toolchain which generate riscv32-unknown-elf-gcc.

Just being rv32 is not sufficient. It will need to be a toolchain that targets the specific arch/abi used here.

./configure --with-arch=rv32imfcv --with-abi=ilp32f --prefix=...

Note, also, that rv32 or rv64 toolchains can both compile 32 and 64 bit RISC-V code. But the default set of multilibs is only a selection of all possible multilibs and doesn't include any V multilibs.

In this case, rather than extending/tweaking the multilib list, it's probably simpler to just build an rv32imfcv/ilp32f toolchain.

@HamzaShabbir517

I have built toolchains for both 64-bit and 32-bit using the following commands:

For 64-bit:
../configure --prefix=/home/hshabbir/local/riscvv09/gnu/ --with-arch=rv64imfcv --enable-multilib

For 32-bit:
./configure --with-arch=rv32imfcv_zicsr_zifencei --with-abi=ilp32f --prefix=/home/hshabbir/local/rv32imfcv --enable-multilib

Still, I will build a toolchain for my specific target and try to compile again.

@TommyMurphyTM1234

for 64 bit ../configure --prefix=/home/hshabbir/local/riscvv09/gnu/ --with-arch=rv64imfcv --enable-multilib

You don't need this toolchain.

for 32 bit ./configure --with-arch=rv32imfcv_zicsr_zifencei --with-abi=ilp32f --prefix=/home/hshabbir/local/rv32imfcv --enable-multilib

You don't need multilib.

I forgot about _Zicsr_Zifencei and am not sure that you need to specify it here, but if it works and doesn't cause any error then stick with it.

@HamzaShabbir517

ok i will try it and get back to you

@HamzaShabbir517

@TommyMurphyTM1234 @zhongjuzhe I have rebuilt the toolchain using the following command:
./configure --with-arch=rv32imfcv --with-abi=ilp32f

Now I compile the code using the following CFLAGS:
TEST_CFLAGS = -march=rv32imfcv -mabi=ilp32f -Os

The compiled code now contains a memcpy function with a lot of instructions:
Disassembly of section .text:

ee000000 <vector_add>:
ee000000: e111 bnez a0,ee000004 <vector_add+0x4>
ee000002: 8082 ret
ee000004: 050577d7 vsetvli a5,a0,e32,m1,ta,mu
ee000008: 02066c07 vle32.v v24,(a2)
ee00000c: 0206ec87 vle32.v v25,(a3)
ee000010: 038c8c57 vadd.vv v24,v24,v25
ee000014: 0205ec27 vse32.v v24,(a1)
ee000018: 00279713 slli a4,a5,0x2
ee00001c: 95ba add a1,a1,a4
ee00001e: 963a add a2,a2,a4
ee000020: 96ba add a3,a3,a4
ee000022: 8d1d sub a0,a0,a5
ee000024: bff1 j ee000000 <vector_add>

Disassembly of section .text.startup:

ee000026 <main>:
ee000026: 7159 addi sp,sp,-112
ee000028: d4a2 sw s0,104(sp)
ee00002a: f0040437 lui s0,0xf0040
ee00002e: 00040593 mv a1,s0
ee000032: 02000613 li a2,32
ee000036: 850a mv a0,sp
ee000038: d686 sw ra,108(sp)
ee00003a: 026000ef jal ra,ee000060
ee00003e: 00040593 mv a1,s0
ee000042: 02000613 li a2,32
ee000046: 1008 addi a0,sp,32
ee000048: 018000ef jal ra,ee000060
ee00004c: 1014 addi a3,sp,32
ee00004e: 860a mv a2,sp
ee000050: 008c addi a1,sp,64
ee000052: 4521 li a0,8
ee000054: fadff0ef jal ra,ee000000 <vector_add>
ee000058: 50b6 lw ra,108(sp)
ee00005a: 5426 lw s0,104(sp)
ee00005c: 6165 addi sp,sp,112
ee00005e: 8082 ret

Disassembly of section .text.memcpy:

ee000060 <memcpy>:
ee000060: 00b547b3 xor a5,a0,a1
ee000064: 8b8d andi a5,a5,3
ee000066: 00c508b3 add a7,a0,a2
ee00006a: ebf9 bnez a5,ee000140 <memcpy+0xe0>
ee00006c: 478d li a5,3
ee00006e: 0cc7f963 bgeu a5,a2,ee000140 <memcpy+0xe0>
ee000072: 00357793 andi a5,a0,3
ee000076: 872a mv a4,a0
ee000078: eba9 bnez a5,ee0000ca <memcpy+0x6a>
ee00007a: ffc8f793 andi a5,a7,-4
ee00007e: 40e78633 sub a2,a5,a4
ee000082: 02000693 li a3,32
ee000086: 02000293 li t0,32
ee00008a: 06c6c363 blt a3,a2,ee0000f0 <memcpy+0x90>
ee00008e: 02f77b63 bgeu a4,a5,ee0000c4 <memcpy+0x64>
ee000092: 17fd addi a5,a5,-1
ee000094: 8f99 sub a5,a5,a4
ee000096: 8389 srli a5,a5,0x2
ee000098: 00178813 addi a6,a5,1
ee00009c: 86ba mv a3,a4
ee00009e: 862e mv a2,a1
ee0000a0: 87c2 mv a5,a6
ee0000a2: c2202373 csrr t1,vlenb
ee0000a6: 0507fe57 vsetvli t3,a5,e32,m1,ta,mu
ee0000aa: 02066c07 vle32.v v24,(a2)
ee0000ae: 41c787b3 sub a5,a5,t3
ee0000b2: 0206ec27 vse32.v v24,(a3)
ee0000b6: 961a add a2,a2,t1
ee0000b8: 969a add a3,a3,t1
ee0000ba: f7f5 bnez a5,ee0000a6 <memcpy+0x46>
ee0000bc: 00281793 slli a5,a6,0x2
ee0000c0: 973e add a4,a4,a5
ee0000c2: 95be add a1,a1,a5
ee0000c4: 09176163 bltu a4,a7,ee000146 <memcpy+0xe6>
ee0000c8: 8082 ret
ee0000ca: 0005c683 lbu a3,0(a1)
ee0000ce: 0705 addi a4,a4,1
ee0000d0: 00377793 andi a5,a4,3
ee0000d4: fed70fa3 sb a3,-1(a4)
ee0000d8: 0585 addi a1,a1,1
ee0000da: d3c5 beqz a5,ee00007a <memcpy+0x1a>
ee0000dc: 0005c683 lbu a3,0(a1)
ee0000e0: 0705 addi a4,a4,1
ee0000e2: 00377793 andi a5,a4,3
ee0000e6: fed70fa3 sb a3,-1(a4)
ee0000ea: 0585 addi a1,a1,1
ee0000ec: fff9 bnez a5,ee0000ca <memcpy+0x6a>
ee0000ee: b771 j ee00007a <memcpy+0x1a>
ee0000f0: 41d0 lw a2,4(a1)
ee0000f2: 4dd4 lw a3,28(a1)
ee0000f4: 0005af83 lw t6,0(a1)
ee0000f8: 0085af03 lw t5,8(a1)
ee0000fc: 00c5ae83 lw t4,12(a1)
ee000100: 0105ae03 lw t3,16(a1)
ee000104: 0145a303 lw t1,20(a1)
ee000108: 0185a803 lw a6,24(a1)
ee00010c: c350 sw a2,4(a4)
ee00010e: 5190 lw a2,32(a1)
ee000110: 01f72023 sw t6,0(a4)
ee000114: 01e72423 sw t5,8(a4)
ee000118: 01d72623 sw t4,12(a4)
ee00011c: 01c72823 sw t3,16(a4)
ee000120: 00672a23 sw t1,20(a4)
ee000124: 01072c23 sw a6,24(a4)
ee000128: cf54 sw a3,28(a4)
ee00012a: 02470713 addi a4,a4,36
ee00012e: 40e786b3 sub a3,a5,a4
ee000132: fec72e23 sw a2,-4(a4)
ee000136: 02458593 addi a1,a1,36
ee00013a: fad2cbe3 blt t0,a3,ee0000f0 <memcpy+0x90>
ee00013e: bf81 j ee00008e <memcpy+0x2e>
ee000140: 872a mv a4,a0
ee000142: 03157163 bgeu a0,a7,ee000164 <memcpy+0x104>
ee000146: 40e887b3 sub a5,a7,a4
ee00014a: c22026f3 csrr a3,vlenb
ee00014e: 0407f657 vsetvli a2,a5,e8,m1,ta,mu
ee000152: 02058c07 vle8.v v24,(a1)
ee000156: 8f91 sub a5,a5,a2
ee000158: 02070c27 vse8.v v24,(a4)
ee00015c: 95b6 add a1,a1,a3
ee00015e: 9736 add a4,a4,a3
ee000160: f7fd bnez a5,ee00014e <memcpy+0xee>
ee000162: 8082 ret
ee000164: 8082 ret

This is the whole code. When I compile the same code on godbolt.org, the output looks like this:
vector_add: # @vector_add
beqz a0, .LBB0_2
.LBB0_1: # =>This Inner Loop Header: Depth=1
vsetvli a4, a0, e32, m1, ta, mu
vle32.v v8, (a2)
vle32.v v9, (a3)
vadd.vv v8, v8, v9
vse32.v v8, (a1)
slli a5, a4, 2
add a1, a1, a5
add a2, a2, a5
sub a0, a0, a4
add a3, a3, a5
bnez a0, .LBB0_1
.LBB0_2:
ret
foo: # @foo
addi sp, sp, -96
li a0, 8
sw a0, 92(sp)
li a6, 7
sw a6, 88(sp)
li a7, 6
sw a7, 84(sp)
li a3, 5
sw a3, 80(sp)
li a4, 4
sw a4, 76(sp)
li a5, 3
sw a5, 72(sp)
li a1, 2
sw a1, 68(sp)
li a2, 1
sw a2, 64(sp)
sw a0, 60(sp)
sw a6, 56(sp)
sw a7, 52(sp)
sw a3, 48(sp)
sw a4, 44(sp)
sw a5, 40(sp)
sw a1, 36(sp)
sw a2, 32(sp)
addi a1, sp, 32
addi a2, sp, 64
mv a3, sp
.LBB1_1: # =>This Inner Loop Header: Depth=1
vsetvli a4, a0, e32, m1, ta, mu
vle32.v v8, (a2)
vle32.v v9, (a1)
vadd.vv v8, v8, v9
vse32.v v8, (a3)
slli a5, a4, 2
add a3, a3, a5
add a2, a2, a5
sub a0, a0, a4
add a1, a1, a5
bnez a0, .LBB1_1
addi sp, sp, 96
ret

What should I do now?

@HamzaShabbir517

HamzaShabbir517 commented Aug 29, 2022

This is the make file command i am using to build the code

ifneq (,$(wildcard $(TEST_DIR)/$(TEST).makefile))
program.hex:
	@echo Building $(TEST) via $(TEST_DIR)/$(TEST).makefile
	$(MAKE) -f $(TEST_DIR)/$(TEST).makefile
else
program.hex: $(OFILES) $(LINK)
	@echo Building $(TEST)
	$(GCC_PREFIX)-gcc $(ABI) -Wl,-Map=$(TEST).map -lgcc -T$(LINK) -o $(TEST).exe $(OFILES) -nostartfiles -lm $(TEST_LIBS)
	$(GCC_PREFIX)-objcopy -O verilog  $(TEST).exe program.hex
	$(GCC_PREFIX)-objdump -S $(TEST).exe > $(TEST).dis
	@echo Completed building $(TEST)

%.o : %.c ${BUILD_DIR}/defines.h
	$(GCC_PREFIX)-gcc ${includes} ${TEST_CFLAGS} -DCOMPILER_FLAGS="\"${TEST_CFLAGS}\"" ${ABI} -nostdlib -c $< -o $@

@TommyMurphyTM1234

Your makefile snippet is impossible to read.
You should put it inside code tags.

@HamzaShabbir517

HamzaShabbir517 commented Aug 29, 2022

@TommyMurphyTM1234 I have updated the code in the previous comment of mine

@zhongjuzhe
Collaborator

@TommyMurphyTM1234 I have updated the code in the previous comment of mine

I understand your problem now. You want to use scalar instructions to do the memcpy. However, current rvv-next enable optimizing memcpy using RVV instructions. I will add a compile option to disable this today. After I have done this, I will tell you to pull the latest codes.

Please pull the latest code; the latest commit should be:
42df346
Then rebuild the toolchain and use -O3 -fno-tree-vectorize to compile your program.

@HamzaShabbir517

OK, thanks. I will rebuild it and let you know.

@HamzaShabbir517

HamzaShabbir517 commented Aug 30, 2022

@TommyMurphyTM1234 @zhongjuzhe Thanks its working completely fine now

@TommyMurphyTM1234

@TommyMurphyTM1234 Thanks its working completely fine now

I presume you meant @zhongjuzhe ? 🙂

@TommyMurphyTM1234

@HamzaShabbir517 Yes, I'm glad to share my method to configure the toolchain.

git clone https://github.com/riscv/riscv-gnu-toolchain -b rvv-next

then, in .../riscv-gnu-toolchain/, switch to 'riscv-gcc-rvv-next'branch, and pull the latest version

cd riscv-gcc
git checkout -b riscv-gcc-rvv-next
git pull

After that, configure the toolchain:(for you maybe --with-arch=rv32imfcv? I'm not sure), note that you need to configure the environment varible in bashrc. If you don't know how to configure the environment varible, tell me.

./configure --prefix=$RISCV --with-arch=rv32imv --with-abi=ilp32 --enable-multilib

and make:

sudo make

Hi @HamzaShabbir517 - should these instructions still work? When I try them I get this:

user@hornbill:~/Downloads/rvv-toolchain$ git clone https://github.com/riscv/riscv-gnu-toolchain -b rvv-next
Cloning into 'riscv-gnu-toolchain'...
remote: Enumerating objects: 8099, done.
remote: Counting objects: 100% (26/26), done.
remote: Compressing objects: 100% (23/23), done.
remote: Total 8099 (delta 5), reused 15 (delta 2), pack-reused 8073
Receiving objects: 100% (8099/8099), 5.08 MiB | 7.63 MiB/s, done.
Resolving deltas: 100% (4023/4023), done.
user@hornbill:~/Downloads/rvv-toolchain$ cd riscv-gnu-toolchain/
user@hornbill:~/Downloads/rvv-toolchain/riscv-gnu-toolchain$ cd riscv-gcc/
user@hornbill:~/Downloads/rvv-toolchain/riscv-gnu-toolchain/riscv-gcc$ git checkout -b riscv-gcc-rvv-next
Switched to a new branch 'riscv-gcc-rvv-next'
user@hornbill:~/Downloads/rvv-toolchain/riscv-gnu-toolchain/riscv-gcc$ git pull
There is no tracking information for the current branch.
Please specify which branch you want to merge with.
See git-pull(1) for details.

    git pull <remote> <branch>

If you wish to set tracking information for this branch you can do so with:

    git branch --set-upstream-to=origin/<branch> riscv-gcc-rvv-next

user@hornbill:~/Downloads/rvv-toolchain/riscv-gnu-toolchain/riscv-gcc$ 

@HamzaShabbir517

@TommyMurphyTM1234 Yes, but I think you don't need to run git pull, because checking out the branch already gives you all the updates.
