Stuck at the compile flow make riscv_tests_simv #61

Closed
fantasysee opened this issue Aug 19, 2021 · 7 comments

@fantasysee

Hi, @mp-17 @suehtamacv
When I run make riscv_tests_simv as described in the README file, my terminal hangs with no new output for a long while, on the order of a few hours.

(base) ➜ hardware git:(main) ✗ make riscv_tests_simv
build/verilator/Vara_tb_verilator -l ram,/home/fantasysee/Projects/ara/apps/bin/rv64uv-ara-vadd,elf &> build/rv64uv-ara-vadd.trace

I checked the contents of build/rv64uv-ara-vadd.trace several times; they are listed below and also stayed unchanged the whole time.

Program header number 0 in `/home/fantasysee/Projects/ara/apps/bin/rv64uv-ara-vadd' low is 80000000
Program header number 0 in `/home/fantasysee/Projects/ara/apps/bin/rv64uv-ara-vadd' high is 80004179
Program header number 1 in `/home/fantasysee/Projects/ara/apps/bin/rv64uv-ara-vadd' high is 80004877
Program header number 2 in `/home/fantasysee/Projects/ara/apps/bin/rv64uv-ara-vadd' high is 80004b17
Program header number 3 in `/home/fantasysee/Projects/ara/apps/bin/rv64uv-ara-vadd' is not of type PT_LOAD; ignoring.
Set `ram TOP.ara_tb_verilator.dut.i_ara_soc.i_dram 10 0x80000000 0x80000 write with offset: 0x0 write with size: 0x4b18
Simulation of Ara
=================


Simulation running, end by pressing CTRL-c.

Note that my QuestaSim version is Mentor Graphics QuestaSim 10.6c, not Mentor Graphics QuestaSim 2020.1. I only created a soft link that makes it pass as version 2020.1, and did not modify hardware/Makefile.

Is this behavior normal? If so, could you please tell me roughly how long this step takes? If not, would you please help me check whether something is wrong with my environment?

Thanks in advance!!!

@suehtamacv
Contributor

Hi @fantasysee,

That is weird indeed; simulation usually takes no more than 2 min. And, as you can see in the CI, we can simulate this vadd kernel. How did you compile the binary? Did you compile it with the compiler provided with this repo?

From what you report, I think an exception happened. If the core raises an exception, it enters a spinlock phase and never leaves it (we should probably fix this). Can you attach the vadd binary you are trying to simulate here?
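
A hang like this can at least be bounded from the command line. The following is a minimal sketch, assuming GNU coreutils `timeout` is installed; the helper name `run_with_limit` and the 600-second limit are illustrative and not part of the repo:

```shell
# Hypothetical helper (not part of the Ara repo): run a command under a time
# limit so that a simulation stuck in a spin loop is reported, not left hanging.
run_with_limit() {
  local limit_s="$1"
  shift
  local status=0
  timeout "${limit_s}" "$@" || status=$?
  # GNU timeout exits with status 124 when the time limit was reached.
  if [ "${status}" -eq 124 ]; then
    echo "timed out after ${limit_s}s: $*" >&2
  fi
  return "${status}"
}

# Example (assumed limit; a healthy run reportedly takes about 2 min):
#   run_with_limit 600 make riscv_tests_simv
```

A status of 124 then distinguishes "the simulation never finished" from an ordinary test failure.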

Matheus

@suehtamacv suehtamacv self-assigned this Aug 20, 2021
@fantasysee
Author

fantasysee commented Aug 20, 2021

Hi Matheus @suehtamacv,

Thank you very much for your warm reply.

I compiled the binary following README.md, using the compiler provided with this repo:

  1. Build the LLVM toolchain & Spike: make toolchain-llvm; make riscv-isa-sim. (I also built the gcc toolchain: make toolchain-gcc.)
  2. Build Verilator: make verilator.
  3. Build Applications & RISC-V Tests: cd apps; make bin/hello_world; make riscv_tests.
  4. Simulate the unit tests:
# Go to the hardware folder
cd hardware
# Apply the patches (only need to run this once)
make apply-patches
# Verilate the design
make verilate
# Run the tests
make riscv_tests_simv

The rv64uv-ara-vadd binary and its dump are attached here.

Thanks again!!!

Best Regards,
Chao Fang

@suehtamacv
Contributor

Hi Chao,

I just compared your binary with the one I have on my machine, and while there were a few differences in the addresses, they are not significant. I then tried to run a simulation of your binary with a freshly-compiled verilator model, and it worked without issues.

I guess the problem is in your Verilator model, then. Are you sure you are using the version of Verilator shipped with this repo? (I guess you are, based on your list of commands, but just checking.) Did Verilator compile successfully? Did you get any warnings when compiling Verilator, or the Verilator model of Ara? Which version of LLVM did you use to compile both of them?

Matheus

@fantasysee
Author

Thank you, Matheus.

Based on your helpful advice and the fact that my binary file combined with your freshly-compiled Verilator model worked without issues, I guess there is something mismatched in my hardware toolchain.

First, I double-checked the versions of Verilator and LLVM.

The Verilator I first compiled was v4.211, whereas DEPENDENCIES.md lists v4.106 as the required version. I then modified the Makefile and recompiled Verilator, but hit the same issue. I also checked the CI workflow and found that it uses Verilator v4.211 as well, so I think that version is acceptable. I am sure that both versions of Verilator compiled without warnings or errors, since I checked the log of make verilator.

The LLVM version is 13.0.0.

However, make compile, which compiles the hardware without running the simulation, fails with an error.

(base) ➜ hardware git:(main) ✗ make compile
mkdir -p build/work-dpi
g++ -shared -fPIC -std=c++11 -Bsymbolic -c tb/dpi/elfloader.cc -I/home/fantasysee/Projects/ara/install/verilator/share/verilator/include/vltstd -I/home/fantasysee/Projects/ara/install/riscv-isa-sim/include -o build/work-dpi/elfloader.o
tb/dpi/elfloader.cc: In function ‘char read_section(long long int, svOpenArrayHandle)’:
tb/dpi/elfloader.cc:63:1: warning: control reaches end of non-void function [-Wreturn-type]
63 | }
| ^
mkdir -p build/work-dpi
g++ -shared -m64 -o build/work-dpi/ara_dpi.so build/work-dpi/elfloader.o
cd build && questa-2020.1 vlib work && questa-2020.1 vmap work work
Successfully installed bender 0.21.0 in '/home/fantasysee/Projects/ara/hardware'.
bender 0.21.0 available.
./bender script vsim --vlog-arg="-suppress vlog-2583 -suppress vlog-13314 -suppress vlog-13233 -work work" -t rtl -t asic -t ara_test -t cva6_test --define NR_LANES=4 --define VLEN=4096 --define RVV_ARIANE=1 > build/compile_default.tcl
echo "exit" >> build/compile_default.tcl
cd build && questa-2020.1 vsim -work work -c -do compile_default.tcl
make: *** [Makefile:115: build/compile_default.tcl] Error 101

At the same time, the QuestaSim instance that gets launched reports an error in its GUI, so the compile flow cannot continue.

Error (suppressible): (vsim-19) Failed to access library 'work' at "work".

I finally traced the issue to my QuestaSim installation and realized what the problem is.
I compile the ara project on WSL2, running Ubuntu 20.04.1 LTS. However, my QuestaSim is installed on Windows. It can be found and invoked from WSL2, but it is incompatible with the ara workflow. I observed the incompatibility in at least two ways. First, the Windows QuestaSim requires *.dll files to link against, not *.so files. Second, the Windows QuestaSim cannot access WSL2-style paths.
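
A mismatch like this can be caught before starting the flow. The sketch below rests on assumptions, not on anything in the repo: WSL kernels report "microsoft" in /proc/version, Windows drives are mounted under /mnt/<drive>/ (the WSL default), and the helper names are made up for illustration.

```shell
# Hypothetical helpers (not part of the Ara repo).

# Succeeds when the kernel version string mentions Microsoft, as it does on WSL.
# The optional argument overrides the file to inspect (default: /proc/version).
is_wsl() {
  grep -qi microsoft "${1:-/proc/version}"
}

# Succeeds when a resolved tool path looks like a Windows executable:
# either it ends in .exe or it lives under a /mnt/<drive>/ mount.
looks_like_windows_tool() {
  case "$1" in
    *.exe|*.EXE) return 0 ;;
    /mnt/[a-zA-Z]/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Report where a tool on PATH comes from; returns 1 for a Windows-side binary.
check_tool() {
  local tool_path
  tool_path="$(command -v "$1")" || { echo "$1: not found"; return 2; }
  if looks_like_windows_tool "${tool_path}"; then
    echo "$1: Windows binary at ${tool_path}"
    return 1
  fi
  echo "$1: ${tool_path}"
}

# Example: is_wsl && check_tool vsim
```

Here `vsim` is QuestaSim's simulator command; substitute whatever wrapper name your Makefile actually invokes.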

I am very sorry for taking up your valuable time. I should have described my build environment up front, before asking for help.

Next, I will try to find a machine running Ubuntu natively instead of a virtualized environment like WSL2, along with a Linux version of QuestaSim, and recompile the ara project there. I'll let you know if I make any progress.

I sincerely appreciate your response.

Best Regards,
Chao

@fantasysee
Author

Hi, Matheus.

I just managed to run the simulation of the RISC-V unit tests, but the simulation unexpectedly stops at rv64uv-ara-vnmsac. It seems that there is no rule to build the test for this instruction. Is this expected?

I first checked riscv-tests-simv in the CI, and found that it simulates normally there.

Then I checked the process that builds the unit tests for the vector instructions. The command I use to build the riscv_tests is make -C apps riscv_tests, run from the ara project root. I found that there is a segmentation fault when building the unit test for the vmacc instruction.

mkdir -p bin/
/home/cfang/proj_riscv/ara/install/riscv-gcc/bin/riscv64-unknown-elf-gcc -Iinclude -I/home/cfang/proj_riscv/ara/apps/riscv-tests/isa/macros/scalar -I/home/cfang/proj_riscv/ara/apps/riscv-tests/isa/macros/vector -mcmodel=medany -march=rv64gcv -mabi=lp64d -I/home/cfang/proj_riscv/ara/apps/common -static -std=gnu99 -O3 -ffast-math -fno-common -fno-builtin-printf -DNR_LANES=2 -Wunused-variable -Wall -Wextra -Wno-unused-command-line-argument -static -nostartfiles -lm -lgcc -mcmodel=medany -march=rv64gcv -mabi=lp64d -I/home/cfang/proj_riscv/ara/apps/common -static -std=gnu99 -O3 -ffast-math -fno-common -fno-builtin-printf -DNR_LANES=2 -Wunused-variable -Wall -Wextra -Wno-unused-command-line-argument -o bin/rv64uf-ara-fmadd /home/cfang/proj_riscv/ara/apps/riscv-tests/isa/rv64uf/fmadd.S common/crt0-gcc.S.o common/printf-gcc.c.o common/string-gcc.c.o common/serial-gcc.c.o -T/home/cfang/proj_riscv/ara/apps/common/link.ld
/home/cfang/proj_riscv/ara/install/riscv-llvm/bin/llvm-objdump --mattr=+experimental-v -D bin/rv64uf-ara-fmadd > bin/rv64uf-ara-fmadd.dump
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0. Program arguments: /home/cfang/proj_riscv/ara/install/riscv-llvm/bin/llvm-objdump --mattr=+experimental-v -D bin/rv64uf-ara-fmadd
bash: line 1: 3524 Segmentation fault (core dumped) /home/cfang/proj_riscv/ara/install/riscv-llvm/bin/llvm-objdump --mattr=+experimental-v -D bin/rv64uf-ara-fmadd > bin/rv64uf-ara-fmadd.dump
Makefile:83: recipe for target 'bin/rv64uf-ara-fmadd' failed
make: *** [bin/rv64uf-ara-fmadd] Error 139
rm common/string-gcc.c.o common/printf-gcc.c.o common/serial-gcc.c.o common/crt0-gcc.S.o
make: Leaving directory '/home/cfang/proj_riscv/ara/apps'

The LLVM version is 13.0.0.

What could be triggering the segmentation fault?
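
For reference, the `Error 139` that make reports in the log above is how a shell encodes death by signal: a status above 128 means 128 plus the signal number, and 139 - 128 = 11, which is SIGSEGV on Linux. A small sketch of that arithmetic (the helper name is illustrative):

```shell
# Hypothetical helper: map an exit status above 128 to the signal name.
signal_from_status() {
  local status="$1"
  if [ "${status}" -gt 128 ]; then
    # kill -l <number> prints the name of that signal.
    kill -l "$((status - 128))"
  else
    echo "none"
  fi
}

signal_from_status 139   # 139 - 128 = 11
```

So the failure is a crash inside llvm-objdump itself, not an error it reported about the input file.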

===
The previous issue was indeed caused by the platform on which QuestaSim was installed. The solution was to use Ubuntu with a Linux version of QuestaSim.

Your helpful advice and the CI helped me a lot. Thank you very much!!!

Regards,
Chao

@fantasysee
Author

Hi, Matheus.

I'm glad to tell you that I just managed to simulate all the unit tests.

I checked the CI several times, and confirmed that unit tests of these instructions can be compiled.

I found that the LLVM compiled on WSL2 doesn't throw the segmentation fault when compiling the unit tests. I then simulated the unit tests on the PC running Ubuntu, using the binaries compiled on WSL2, and it worked without issues.

This seemed a little strange to me. Something must differ between WSL2 and the Ubuntu PC when compiling toolchain-llvm.

I first checked the version of gcc used to compile toolchain-llvm. Aha! The gcc version is 7.5.0 on the Ubuntu PC, while it is 9.3.0 on WSL2. My guess is that gcc 7.5.0 is unable to build toolchain-llvm correctly. That would explain why all the provided ara applications passed, except for some RISC-V unit tests.

The solution is simple: upgrade gcc to version 9.3.0.
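
A quick way to check the host compiler before running make toolchain-llvm is sketched below. It assumes GNU `sort` with the `-V` (version sort) option; the threshold 9.3.0 is simply the version that worked in this thread, not a documented minimum, and the helper name is illustrative.

```shell
# Hypothetical helper: succeeds when version $1 is at least version $2.
# Relies on GNU sort's -V (version sort) option.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumed threshold: the gcc version that worked here, not an official minimum.
required="9.3.0"
if command -v gcc >/dev/null 2>&1; then
  current="$(gcc -dumpfullversion -dumpversion)"
  if version_ge "${current}" "${required}"; then
    echo "gcc ${current} is at least ${required}"
  else
    echo "gcc ${current} is older than ${required}; consider upgrading first"
  fi
fi
```

Version sort matters here because a plain string comparison would rank 10.2.0 below 9.3.0.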

I sincerely appreciate your help. Thank you very much!!!

Best regards,
Chao

@suehtamacv
Contributor

Hi @fantasysee,

That sounds like a very complex bug. So it boils down to the LLVM compiled on WSL2 not being able to correctly compile Verilator and the RISC-V apps of this repo? I do not have access to such an environment. An alternative would be for us to add CI support for this configuration here on GitHub...

Thanks for letting me know of this bug,
Matheus
