[HW] Draft PR for Implementing Ara on FPGA #146

Closed
wants to merge 8 commits into from

hossein1387
Contributor

@hossein1387 hossein1387 commented Sep 19, 2022

This is a draft PR for an FPGA implementation of Ara. Several issues need to be discussed before merging this PR into the main branch:

  • As can be seen in ara.core, most PULP IPs do not have an up-to-date core file (the FuseSoC-specific format), so I had to hand-pick files from different dependencies to get synthesis to run without errors. I am following up on this in the relevant PULP repositories (common_cells) and hope to fix it soon.
  • I tried to follow the structure of the FPGA implementation in CVA6, which requires adding an fpga folder to Ara's root directory.
  • Right now, this PR works for the xcvu9p FPGA; however, supporting new FPGA parts should be straightforward.
  • Initial synthesis results for xcvu9p are as follows:

| Num Lanes | L2 Size | LUT | BRAM | DSP | Clock | WNS |
|---|---|---|---|---|---|---|
| 16 | 8 MB | 1649392 (139.51%) | 569 (26.34%) | 812 (11.87%) | 10 MHz | +74.60 ns |
| 8 | 4 MB | 645894 (54.63%) | 313 (14.49%) | 420 (6.14%) | 10 MHz | +89.22 ns |
| 4 | 128 MB | 220583 (18.66%) | 4096 (189%) | 96 (1.4%) | 10 MHz | +87.80 ns |
| 4 | 2 MB | 306680 (25.94%) | 185 (8.56%) | 224 (3.27%) | 10 MHz | +90.93 ns |
| 2 | 1 MB | 153558 (12.99%) | 121 (5.6%) | 126 (1.84%) | 10 MHz | +90.96 ns |

  • Based on the timing results, synthesis can be pushed to a higher clock frequency. Here are the results for a 100 MHz timing constraint:

| Num Lanes | L2 Size | LUT | BRAM | DSP | Clock | WNS |
|---|---|---|---|---|---|---|
| 4 | 2 MB | 306680 (25.94%) | 185 (8.56%) | 224 (3.27%) | 10 MHz | +90.93 ns |
| 4 | 2 MB | 303111 (25.64%) | 84 (8.52%) | 224 (1.84%) | 100 MHz | +0.96 ns |
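
As a rough cross-check of the 10 MHz and 100 MHz rows, the WNS can be converted into an estimated maximum frequency by subtracting it from the clock period. The Python sketch below is only an illustration: the function name is hypothetical, it assumes a single clock domain and that each WNS was reported against the listed clock constraint, and real results will shift once the tools re-optimize under a tighter constraint.

```python
def estimated_fmax_mhz(clock_mhz: float, wns_ns: float) -> float:
    """Estimate the achievable clock from a constraint and its worst negative slack.

    Assumption (not from the PR): the critical path delay is approximately
    the constrained clock period minus the reported WNS.
    """
    period_ns = 1000.0 / clock_mhz          # 10 MHz -> 100 ns, 100 MHz -> 10 ns
    critical_path_ns = period_ns - wns_ns   # positive WNS means spare time
    if critical_path_ns <= 0:
        raise ValueError("WNS must be smaller than the clock period")
    return 1000.0 / critical_path_ns


# 4 lanes, 2 MB L2, values copied from the two rows of the table above
fmax_from_10mhz_run = estimated_fmax_mhz(10.0, 90.93)
fmax_from_100mhz_run = estimated_fmax_mhz(100.0, 0.96)
```

For the 4-lane, 2 MB configuration, both rows land at roughly 110 MHz, which is consistent with the 100 MHz run closing timing with only a small positive slack.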

Changelog

Added

  • Support for FPGA synthesis
  • Add FPGA folder in Ara's root directory (Following CVA6 FPGA implementation style)
  • Add support for Fusesoc
  • Add FPGA top-level module for Ara
  • Synthesis on xcvu9p (Alveo U200)

Checklist

  • Automated tests pass
  • Changelog updated
  • Code style guideline is observed

@hossein1387 hossein1387 changed the title Draft PR for Implementing Ara on FPGA [HW] Draft PR for Implementing Ara on FPGA Sep 19, 2022
@hossein1387 hossein1387 force-pushed the feat.fpga branch 3 times, most recently from cdf9593 to 1b014ba Compare September 20, 2022 00:32
@mp-17 mp-17 marked this pull request as draft September 20, 2022 07:46
@poldni

poldni commented Nov 10, 2022

@hossein1387 Could you please describe how the implementation can be reproduced? For example, I get a parsing error from FuseSoC (version 1.12.0, on Ubuntu 20.04) when I try to add the repository as a library or list the cores via "fusesoc --cores-root . list-cores".
I do the following:

  1. Clone your fork of the repo and switch to branch feat.fpga.
  2. Run "fusesoc --cores-root . list-cores", which gives me "::ara:0 : local" (no other libraries have been added to FuseSoC yet; this is the state right after install).
  3. Run "git submodule update --init --recursive".
  4. Run "fusesoc --cores-root . list-cores" again, which gives me a parsing error: "UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 153: invalid start byte", without any indication of which core file causes it.

I guess this issue is related to the first bullet (different dependencies....) above?
Thanks.
Edit:
At least I found the problem: there are also "*.core" files in toolchain/riscv-llvm/ that are not FuseSoC core files.
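
One way to locate such files before FuseSoC trips over them is to check the first line of every "*.core" file. The Python sketch below assumes that FuseSoC core descriptions begin with a "CAPI=" header line (for example "CAPI=2:"); the function name is hypothetical and the helper is not part of FuseSoC.

```python
from pathlib import Path


def find_foreign_core_files(root):
    """Return *.core files under `root` that do not look like FuseSoC cores.

    Assumption: a FuseSoC core file starts with a "CAPI=" header line.
    Files are opened in binary mode so that non-text files (such as the
    LLDB core dumps used as test inputs) cannot raise a UnicodeDecodeError.
    """
    foreign = []
    for path in sorted(Path(root).rglob("*.core")):
        if not path.is_file():
            continue
        with path.open("rb") as handle:
            first_line = handle.readline(64)
        if not first_line.lstrip().startswith(b"CAPI="):
            foreign.append(path)
    return foreign
```

Calling `find_foreign_core_files(".")` from the repository root after initializing the submodules should list the files that need to be kept out of FuseSoC's search path.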

@hossein1387
Contributor Author

hossein1387 commented Nov 11, 2022

Hi @poldni, you first need to add Ara to FuseSoC. You can do so with:

cd [PATH_TO_ARA_REPO]
fusesoc library add ara .

Then you can try running the FPGA implementation. Please let me know if this fixes the issue.

@poldni

poldni commented Nov 11, 2022

I tried this too, but the same problem appears as with the command "fusesoc --cores-root . list-cores". ara.core references the IPs from "hardware/deps", so the git submodules of the "ara" repository need to be initialized. As soon as you do this, the toolchain submodules are synchronized as well, and the riscv-llvm submodule contains files ending in "*.core" which are of course not regular FuseSoC files. FuseSoC searches subfolders recursively for core files, so it cannot parse these files and reports a parsing error. I therefore temporarily removed the toolchain folder in order to build the implementation with FuseSoC, but Vivado (2022.2) crashes when I use the currently referenced revisions of hardware/deps, including your commit on common_cells (3e78959d12173ab1061380de3c496858c72b8ebd).
Could you please provide a list of the commit IDs of hardware/deps that work (or worked) for you, as a starting point?

@hossein1387
Contributor Author

hossein1387 commented Nov 11, 2022

OK, so there are two issues: one with FuseSoC and one with common_cells. For FuseSoC, please follow the PR and issue I raised on the FuseSoC repository: olofk/fusesoc#580
Basically, we need to create a file called FUSESOC_IGNORE in the top-level directory that FuseSoC should ignore.
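
A minimal Python sketch of that step, assuming the directory to skip is the toolchain folder mentioned earlier in this thread and that the installed FuseSoC version already honors FUSESOC_IGNORE (see olofk/fusesoc#580). The helper name is hypothetical.

```python
from pathlib import Path


def mark_ignored(directory):
    """Create an empty FUSESOC_IGNORE marker file inside `directory`.

    The file name comes from the discussion above; whether FuseSoC skips
    the directory depends on the installed version supporting the marker.
    """
    marker = Path(directory) / "FUSESOC_IGNORE"
    marker.touch(exist_ok=True)  # harmless if the marker already exists
    return marker
```

Running `mark_ignored("toolchain")` from the Ara root would place the marker next to the riscv-llvm submodule, so the toolchain folder does not have to be deleted.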

For common_cells, yes, there are issues with the FPGA implementation. This is the commit I am using for common_cells: 015917ff33e5f944e866814f72f2074fb0f4220f
Also, follow this issue I raised on the common_cells repository: pulp-platform/common_cells#143

@lapnd

lapnd commented Feb 12, 2023

Hi @hossein1387
I'm trying this PR on an AU200.
These are the steps I followed:

fusesoc library add ara .
fusesoc core list
fusesoc run --target=synth ara

However, it seems that no Verilog code is generated and the build fails as follows:

Available cores:

Core                                    Cache status  Description
================================================================================
::ara:0                                :      local : Ara is a 64-bit Vector Unit, compatible with the RISC-V Vector Extension Version 0.10, working as a coprocessor to CORE-V's CVA6 core
::ariane:0                             :      local : <No description>
pulp-platform.org::axi:0.29.1          :      local : <No description>
pulp-platform.org::axi_mem_if:0        :      local : <No description>
pulp-platform.org::common_cells:1.24.1 :      local : <No description>
default:~/workspace/ara$ fusesoc run --target=synth ara
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/Shell/Register/Core/Inputs/x86-32-freebsd-multithread.core
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/Shell/Register/Core/Inputs/x86-32-linux-multithread.core
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/Shell/Register/Core/Inputs/aarch64-freebsd-multithread.core
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/Shell/Register/Core/Inputs/x86-64-netbsd-multithread.core
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/Shell/Register/Core/Inputs/x86-64-linux-multithread.core
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/Shell/Register/Core/Inputs/x86-64-linux.core
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/Shell/Register/Core/Inputs/x86-64-freebsd.core
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/Shell/Register/Core/Inputs/x86-32-netbsd-multithread.core
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/Shell/Register/Core/Inputs/x86-64-netbsd.core
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/Shell/Register/Core/Inputs/x86-64-freebsd-multithread.core
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/Shell/Register/Core/Inputs/x86-32-freebsd.core
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/Shell/Register/Core/Inputs/x86-32-linux.core
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/Shell/Register/Core/Inputs/x86-32-netbsd.core
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/Shell/ObjectFile/ELF/Inputs/netbsd-amd64.core
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/API/tools/lldb-vscode/coreFile/linux-x86_64.core
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/API/functionalities/postmortem/netbsd-core/2lwp_t2_SIGSEGV.amd64.core
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/API/functionalities/postmortem/netbsd-core/2lwp_process_SIGSEGV.aarch64.core
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/API/functionalities/postmortem/netbsd-core/1lwp_SIGSEGV.amd64.core
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/API/functionalities/postmortem/netbsd-core/2lwp_t2_SIGSEGV.aarch64.core
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/API/functionalities/postmortem/netbsd-core/2lwp_process_SIGSEGV.amd64.core
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/API/functionalities/postmortem/netbsd-core/1lwp_SIGSEGV.aarch64.core
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/API/functionalities/postmortem/elf-core/altmain.core
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/API/functionalities/postmortem/elf-core/linux-aarch64-sve-full.core
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/API/functionalities/postmortem/elf-core/linux-arm.core
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/API/functionalities/postmortem/elf-core/linux-s390x.core
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/API/functionalities/postmortem/elf-core/linux-ppc64le.core
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/API/functionalities/postmortem/elf-core/linux-x86_64.core
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/API/functionalities/postmortem/elf-core/linux-aarch64.core
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/API/functionalities/postmortem/elf-core/linux-fpr_sse_i386.core
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/API/functionalities/postmortem/elf-core/linux-aarch64-neon.core
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/API/functionalities/postmortem/elf-core/linux-aarch64-pac.core
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/API/functionalities/postmortem/elf-core/linux-i386.core
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/API/functionalities/postmortem/elf-core/linux-fpr_sse_x86_64.core
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/API/functionalities/postmortem/elf-core/linux-aarch64-sve-fpsimd.core
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/API/functionalities/postmortem/elf-core/thread_crash/linux-x86_64.core
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/API/functionalities/postmortem/elf-core/thread_crash/linux-i386.core
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/API/functionalities/postmortem/elf-core/gcore/linux-x86_64.core
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/API/functionalities/postmortem/elf-core/gcore/linux-i386.core
WARNING: Unable to determine CAPI version from core file /workspace/ara/toolchain/riscv-llvm/lldb/test/API/functionalities/unwind/noreturn/module-end/test.core
INFO: Preparing pulp-platform.org::common_cells:1.24.1
INFO: Preparing ::ara:0
INFO: Setting up project

INFO: Building
vivado -notrace -mode batch -source ara_0.tcl

****** Vivado v2021.1 (64-bit)
  **** SW Build 3247384 on Thu Jun 10 19:36:07 MDT 2021
  **** IP Build 3246043 on Fri Jun 11 00:30:35 MDT 2021
    ** Copyright 1986-2021 Xilinx, Inc. All Rights Reserved.

source ara_0.tcl -notrace
WARNING: [filemgmt 56-12] File '/workspace/ara/build/ara_0/synth-vivado/src/ara_0/hardware/deps/axi/src/axi_cut.sv' cannot be added to the project because it already exists in the project, skipping this file
WARNING: [filemgmt 56-12] File '/workspace/ara/build/ara_0/synth-vivado/src/ara_0/hardware/deps/axi/src/axi_dw_converter.sv' cannot be added to the project because it already exists in the project, skipping this file
INFO: [Common 17-206] Exiting Vivado at Sun Feb 12 11:04:44 2023...
vivado -notrace -mode batch -source ara_0_run.tcl ara_0.xpr

****** Vivado v2021.1 (64-bit)
  **** SW Build 3247384 on Thu Jun 10 19:36:07 MDT 2021
  **** IP Build 3246043 on Fri Jun 11 00:30:35 MDT 2021
    ** Copyright 1986-2021 Xilinx, Inc. All Rights Reserved.

open_project ara_0.xpr
WARNING: [filemgmt 56-3] Default IP Output Path : Could not find the directory '/workspace/ara/build/ara_0/synth-vivado/ara_0.gen/sources_1'.
Scanning sources...
Finished scanning sources
source ara_0_run.tcl -notrace
[Sun Feb 12 11:05:08 2023] Launched synth_1...
Run output will be captured here: /workspace/ara/build/ara_0/synth-vivado/ara_0.runs/synth_1/runme.log
[Sun Feb 12 11:05:08 2023] Launched impl_1...
Run output will be captured here: /workspace/ara/build/ara_0/synth-vivado/ara_0.runs/impl_1/runme.log
[Sun Feb 12 11:05:08 2023] Waiting for impl_1 to finish...
[Sun Feb 12 11:05:30 2023] impl_1 finished
WARNING: [Vivado 12-8222] Failed run(s) : 'synth_1'
wait_on_run: Time (s): cpu = 00:00:13 ; elapsed = 00:00:21 . Memory (MB): peak = 2562.031 ; gain = 0.000 ; free physical = 71898 ; free virtual = 119195
Bitstream generation completed
ERROR: Implementation and bitstream generation step failed.
INFO: [Common 17-206] Exiting Vivado at Sun Feb 12 11:05:30 2023...
Makefile:16: recipe for target 'ara_0.bit' failed
make: *** [ara_0.bit] Error 1
ERROR: Failed to build ::ara:0 : '['make']' exited with an error: 2

Can you please tell me if I missed any steps?
Thank you!

@lapnd

lapnd commented Feb 13, 2023

Never mind, my mistake on adding source code. Sorry for making noise

@hossein1387
Contributor Author

Great! Let us know if you run into any more issues.

@lapnd

lapnd commented Feb 14, 2023

Thank you @hossein1387
My board is an AU200 (xcu200-fsgd2104-2-e), which is a bit different from yours. I modified the fpga directory and ara.core to adapt them to the new board. The bitstream was generated successfully with a new ara.xdc that assigns pins to the input/output ports, such as:

# serial rx
set_property LOC BF18 [get_ports {rx_i}]
set_property IOSTANDARD LVCMOS12 [get_ports {rx_i}]
# serial tx
set_property LOC BB20 [get_ports {tx_o}]
set_property IOSTANDARD LVCMOS12 [get_ports {tx_o}]

However, I have not seen anything on the ttyUSB console yet.
I have a question:
How do we run a new app with Ara?
For example, if I have a new hello-world app, how do I load it into memory and trigger it to run?
Thank you!

@hossein1387
Contributor Author

We have not yet tested our design on FPGA. We are now in the process of testing Ara on the Alveo U200 board. That is why you do not see any constraints on the I/O pins.

@lapnd

lapnd commented Feb 14, 2023

Thanks for your confirmation.
LiteX also supports CVA6, see https://github.com/enjoy-digital/litex/tree/master/litex/soc/cores/cpu/cva6/cva6_wrapper
Maybe we can consider updating the wrapper to include the vector processing unit.
With LiteX, we could use its SoC and software framework to run the app easily.

@elisabethumblet

The synthesis results for xcvu9p from September 2022 did not correspond to the L2 sizes shown. After adjusting the L2 size computation, here are the results with the 100 MHz timing constraint (and with 10 MHz for comparison):

| Num Lanes | L2 Size | LUT | BRAM | DSP | Clock | WNS |
|---|---|---|---|---|---|---|
| 4 | 256 kB | 297300 (25.15%) | 160 (7.41%) | 224 (3.27%) | 100 MHz | +0.004 ns |
| 4 | 256 kB | 296371 (25.07%) | 160 (7.41%) | 224 (3.27%) | 10 MHz | +77.794 ns |

@mp-17
Collaborator

mp-17 commented Mar 1, 2023

Hello @elisabethumblet, thanks for the new data! Have you already tried to run the system on FPGA?

@elisabethumblet

Hi @mp-17, not yet, we are still looking for a board at the moment, but hopefully that will be done soon!

@poldni

poldni commented Apr 4, 2023

Hi, I'm also trying to run the system on an FPGA, in a Chipyard context (CVA6 tile wrapper), using this draft as a template. Unfortunately, I only have a ZCU104 at my disposal. I used the current minimum NrLanes of 2. In this configuration the required LUTs hit the limits of the ZCU104, and of course timing is not met at all. I tried reducing VLEN, but apparently this does not have a big effect on LUT usage, or should it? I get a reduction of roughly 5000 LUTs going from VLEN=2048 to VLEN=128 using Vivado 2022.2. Is there any other parameter that can be tuned?

@elisabethumblet

elisabethumblet commented Apr 12, 2023

Hi @poldni, to reduce the number of LUTs you use, you can reduce the size of the cache by tuning the L2NumWords parameter.
The actual size of the cache, in bytes, is NrLanes * 32 * L2NumWords * 1/8.
As for timing, to give you an idea, a maximum frequency of 75 MHz can be achieved with a 4-lane configuration and a 512 kB cache (L2NumWords = 2^15) on the Alveo U280 board.
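
The size formula above can be checked with a few lines of Python. This is only a sketch of the arithmetic as stated in this comment; the function names are hypothetical and do not exist in the Ara sources.

```python
def l2_size_bytes(nr_lanes: int, l2_num_words: int) -> int:
    """Cache size in bytes: NrLanes * 32 * L2NumWords * 1/8."""
    word_width_bits = 32 * nr_lanes          # each L2 word is 32 bits per lane
    return word_width_bits * l2_num_words // 8


def l2_num_words_for(nr_lanes: int, size_bytes: int) -> int:
    """Invert the formula to pick L2NumWords for a target size."""
    return size_bytes * 8 // (32 * nr_lanes)


# 4 lanes with L2NumWords = 2^15, the U280 example from the comment above
size_kib = l2_size_bytes(4, 2**15) // 1024   # 512
```

With 4 lanes and L2NumWords = 2^15 the formula yields 512 kB, matching the configuration quoted above; halving L2NumWords to 2^14 gives the 256 kB used in the updated xcvu9p results.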

@ckf104

ckf104 commented Apr 17, 2023

Hello @hossein1387, I tried to merge your PR with Ara locally, but I get an error during synthesis. Vivado complains that it cannot find the module tc_clk_gating:

ERROR: [Synth 8-439] module 'tc_clk_gating' not found [/home/ckf104/tmp/riscv-vector-ara/build/ara_0/synth-vivado/src/ara_0/hardware/deps/cva6/src/cache_subsystem/cva6_icache.sv:422]
ERROR: [Synth 8-196] conditional expression could not be resolved to a constant [/home/ckf104/tmp/riscv-vector-ara/build/ara_0/synth-vivado/src/ara_0/hardware/deps/cva6/src/cache_subsystem/cva6_icache.sv:415]
ERROR: [Synth 8-6156] failed synthesizing module 'cva6_icache' [/home/ckf104/tmp/riscv-vector-ara/build/ara_0/synth-vivado/src/ara_0/hardware/deps/cva6/src/cache_subsystem/cva6_icache.sv:28]
ERROR: [Synth 8-6156] failed synthesizing module 'wt_cache_subsystem' [/home/ckf104/tmp/riscv-vector-ara/build/ara_0/synth-vivado/src/ara_0/hardware/deps/cva6/src/cache_subsystem/wt_cache_subsystem.sv:22]
ERROR: [Synth 8-6156] failed synthesizing module 'ariane' [/home/ckf104/tmp/riscv-vector-ara/build/ara_0/synth-vivado/src/ara_0/hardware/deps/cva6/src/ariane.sv:26]
ERROR: [Synth 8-6156] failed synthesizing module 'ara_system' [/home/ckf104/tmp/riscv-vector-ara/build/ara_0/synth-vivado/src/ara_0/hardware/src/ara_system.sv:9]
ERROR: [Synth 8-6156] failed synthesizing module 'ara_soc' [/home/ckf104/tmp/riscv-vector-ara/build/ara_0/synth-vivado/src/ara_0/hardware/src/ara_soc.sv:9]
ERROR: [Synth 8-6156] failed synthesizing module 'xilinx_ara_soc' [/home/ckf104/tmp/riscv-vector-ara/build/ara_0/synth-vivado/src/ara_0/fpga/src/xilinx_ara_soc.sv:16]
---------------------------------------------------------------------------------
Finished RTL Elaboration : Time (s): cpu = 00:00:08 ; elapsed = 00:00:08 . Memory (MB): peak = 2795.797 ; gain = 466.184 ; free physical = 205 ; free virtual = 20934
Synthesis current peak Physical Memory [PSS] (MB): peak = 2166.562; parent = 1984.971; children = 181.592
Synthesis current peak Virtual Memory [VSS] (MB): peak = 3781.852; parent = 2795.801; children = 986.051
---------------------------------------------------------------------------------
RTL Elaboration failed
INFO: [Common 17-83] Releasing license: Synthesis
383 Infos, 148 Warnings, 29 Critical Warnings and 9 Errors encountered.
synth_design failed
ERROR: [Common 17-69] Command failed: Synthesis failed - please see the console or run log file for details
INFO: [Common 17-206] Exiting Vivado at Mon Apr 17 17:38:10 2023...
[Mon Apr 17 17:38:21 2023] synth_1 finished
WARNING: [Vivado 12-13638] Failed runs(s) : 'synth_1'
wait_on_runs: Time (s): cpu = 00:00:27 ; elapsed = 00:00:37 . Memory (MB): peak = 1331.492 ; gain = 0.000 ; free physical = 2159 ; free virtual = 22888
source ara_0_run.tcl -notrace
ERROR: [Common 17-70] Application Exception: Failed to launch run 'impl_1' due to failures in the following run(s):
synth_1
These failed run(s) need to be reset prior to launching 'impl_1' again.

INFO: [Common 17-206] Exiting Vivado at Mon Apr 17 17:38:21 2023...
make: *** [Makefile:16: ara_0.bit] Error 1
ERROR: Failed to build ::ara:0 : '['make']' exited with an error: 2

By searching, I found that the module tc_clk_gating is actually in the file hardware/deps/tech_cells_generic/src/fpga/tc_clk_xilinx.sv, but this file is not included in ara.core. Is this an error, or am I missing something?

Extra info: FuseSoC pulls common_cells dependency version 1.20.0, and I tried another part, xczu7ev-ffvf1517-1-i, instead of xcvu9p-flgb2104-2-e. (In fact, I cannot find xcvu9p-flgb2104-2-e in my Vivado. Should I reinstall Vivado with the Vivado ML Enterprise edition? I currently use the Vivado ML Standard edition.)

@hossein1387
Contributor Author

Hi @ckf104, regarding the merge: many things in Ara have changed since we opened this PR. We do have an updated version of this PR locally, and @elisabethumblet's results are based on that, using xcvu9p-flgb2104-2-e. She just updated the PR, so you should be able to merge your code.
Regarding the FPGA version, we are using Vivado ML Editions, version 2021.1. As you can see in the PR, we were able to synthesize for both the xcu280-fsvh2892-2l-e and xcvu9p-flgb2104-2-e part numbers.

@ckf104

ckf104 commented Apr 19, 2023

Hi @hossein1387, I tried to merge the new PR but still got the same error. Eventually I found the cause: the cva6 revision tracked by Ara is one commit ahead of the one in feat.fpga, and that commit introduces the module tc_clk_gating. After checking out the earlier version, the problem disappeared.

But when running the DRC stage of the Vivado implementation, I got the following errors:

ERROR: [DRC NSTD-1] Unspecified I/O Standard: 68 out of 68 logical ports use I/O standard (IOSTANDARD) value 'DEFAULT', instead of a user assigned specific value. This may cause I/O contention or incompatibility with the board power or connectivity affecting performance, signal integrity or in extreme cases cause damage to the device or the components to which it is connected. To correct this violation, specify all I/O standards. This design will fail to generate a bitstream unless all logical ports have a user specified I/O standard value defined. To allow bitstream creation with unspecified I/O standard values (not recommended), use this command: set_property SEVERITY {Warning} [get_drc_checks NSTD-1].  NOTE: When using the Vivado Runs infrastructure (e.g. launch_runs Tcl command), add this command to a .tcl file and add that file as a pre-hook for write_bitstream step for the implementation run. Problem ports: exit_o[63:0], clk_i, rst_ni, rx_i, and tx_o.
ERROR: [DRC UCIO-1] Unconstrained Logical Port: 68 out of 68 logical ports have no user assigned specific location constraint (LOC). This may cause I/O contention or incompatibility with the board power or connectivity affecting performance, signal integrity or in extreme cases cause damage to the device or the components to which it is connected. To correct this violation, specify all pin locations. This design will fail to generate a bitstream unless all logical ports have a user specified site LOC constraint defined.  To allow bitstream creation with unspecified pin locations (not recommended), use this command: set_property SEVERITY {Warning} [get_drc_checks UCIO-1].  NOTE: When using the Vivado Runs infrastructure (e.g. launch_runs Tcl command), add this command to a .tcl file and add that file as a pre-hook for write_bitstream step for the implementation run.  Problem ports: exit_o[63:0], clk_i, rst_ni, rx_i, and tx_o.

This may be expected, because there are no I/O pin constraints in the xdc file. It seems I should add set_property SEVERITY {Warning} [get_drc_checks UCIO-1] and set_property SEVERITY {Warning} [get_drc_checks NSTD-1] to a Tcl script. But where should I add them, maybe in fpga/scripts/run.tcl? I can't find instructions in the fusesoc docs on how to add Tcl commands to the Vivado build process. Could you enlighten me?

@hossein1387
Contributor Author

@ckf104 thanks for your reply. As you mentioned, the error you are getting occurs because there are no constraints on the I/Os. We will soon share a working constraint file. For fusesoc, you don't have to do much to add a constraint file. Have a look here.
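Until a working constraint file is shared, a skeleton can be generated from the port list in the DRC messages above. The following is only a rough Python sketch: the port names are the ones reported by NSTD-1 and UCIO-1, while the pin names (`PIN_A` and so on) and the `LVCMOS18` I/O standard are placeholders that have to be replaced with values from the board's schematic or master XDC. The helper names `expand_ports` and `make_xdc` are made up for this illustration.

```python
def expand_ports(spec):
    """Expand a bus like 'exit_o[63:0]' into single-bit ports; scalars pass through."""
    if "[" not in spec:
        return [spec]
    name, rng = spec[:-1].split("[")
    hi, lo = (int(x) for x in rng.split(":"))
    step = -1 if hi >= lo else 1
    return [f"{name}[{i}]" for i in range(hi, lo + step, step)]


def make_xdc(port_specs, pin_map, iostandard):
    """Return (xdc_text, unplaced_ports) for the given top-level ports."""
    lines = []
    unplaced = []
    for spec in port_specs:
        for port in expand_ports(spec):
            target = f"[get_ports {{{port}}}]"
            # Addresses NSTD-1: every port gets an explicit I/O standard.
            lines.append(f"set_property IOSTANDARD {iostandard} {target}")
            pin = pin_map.get(port)
            if pin is None:
                # Still violates UCIO-1 until a real pin is assigned.
                unplaced.append(port)
            else:
                lines.append(f"set_property PACKAGE_PIN {pin} {target}")
    return "\n".join(lines) + "\n", unplaced


# Ports taken from the DRC messages; pins and I/O standard are placeholders.
PORTS = ["exit_o[63:0]", "clk_i", "rst_ni", "rx_i", "tx_o"]
PIN_MAP = {"clk_i": "PIN_A", "rx_i": "PIN_B", "tx_o": "PIN_C"}

xdc_text, unplaced = make_xdc(PORTS, PIN_MAP, "LVCMOS18")
print(f"{len(unplaced)} ports still need a PACKAGE_PIN")
```

Ports left in `unplaced` (here the 64 `exit_o` bits and `rst_ni`) would still trigger UCIO-1, so they either need real pins or should not be routed to the top level at all.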

@ckf104

ckf104 commented Jun 13, 2023

Hello @hossein1387, I recently tried to synthesize this design for my Genesys 2 board. When I tried to add a new xdc file, I had some questions about the xcvu9p.svh and xcu280.svh header files. They look like global configuration files for each board. But in the ara.core file, these files are added to the RTL sources regardless of whether the target is xcvu9p or xcu280. And in run.tcl, they are marked as is_global_include. Vivado's documentation says:

The Vivado IDE supports designating one or more Verilog or Verilog Header source files as global `include files and processes those files before any other sources.

Although I don't fully understand what this means, I think it is something like implicitly adding `include "xcvu9p.svh" to each Verilog file. If my understanding is correct, why not place xcvu9p.svh in its own target, xcvu9p, to avoid configuration conflicts between different parts?

Edit: another question, about `define NrLanes 4 in xcu280.svh. I doubt that changing this value affects the actual number of lanes. The top module's NrLanes is a parameter (in module xilinx_ara_soc), so it seems to have nothing to do with the macro NrLanes.

@elisabethumblet

elisabethumblet commented Jun 14, 2023

Hi @ckf104, I am currently testing an ara.core file where both xcvu9p.svh and xcu280.svh files are added in their respective filesets. I also separated the run.tcl file into two distinct files, also added to the boards' filesets. Important note: the tcl file calling the svh file needs to be located after the svh file in the list.

The synthesis and implementation haven't finished yet, but for now it seems to work like this.
For your remark on the is_global_include property, from what I understand, it just means those files are compiled before any other file (I think that's what they mean by "processes before any other sources").

About the NrLanes macro, I'm not exactly sure why we put them there, since in the end it is the value set in the module xilinx_ara_soc that changes the configuration.

Once everything is checked, I will update the PR.
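The ordering note above (the tcl file must come after the svh file in the fileset list) is easy to get wrong when editing the core file by hand. A small Python check along these lines could catch it. This is only a sketch: the file paths are hypothetical examples, and the function works on a plain list of file names rather than parsing ara.core itself.

```python
def tcl_follows_svh(files):
    """Return True if no .tcl entry appears before the last .svh entry."""
    svh_positions = [i for i, f in enumerate(files) if f.endswith(".svh")]
    tcl_positions = [i for i, f in enumerate(files) if f.endswith(".tcl")]
    if not svh_positions or not tcl_positions:
        # Nothing to order against each other.
        return True
    return max(svh_positions) < min(tcl_positions)


# Hypothetical per-board fileset contents, in the order listed in the core file.
xcvu9p_files = ["fpga/src/xcvu9p.svh", "fpga/scripts/run_xcvu9p.tcl"]
swapped = list(reversed(xcvu9p_files))

print(tcl_follows_svh(xcvu9p_files), tcl_follows_svh(swapped))
```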

@ckf104

ckf104 commented Aug 14, 2023

Hi, I recently generated a bitstream for the Genesys 2 board successfully. If my understanding is correct, the current SoC only has a UART and a Xilinx SRAM in place of DDR memory, and Ariane's boot PC is the first byte of the SRAM. So if we initialize the Xilinx SRAM with one of the apps (the test applications in the project's apps directory), it should run successfully and write its output to the UART?

In detail, I converted the ELF file of an application (e.g., hello_world) into a mem file, then used Vivado's updatemem tool to update the generated bitstream. Unfortunately, I can't see any output in my minicom terminal. The Verilator simulation works properly, though (the only difference between the FPGA and the simulation is that the simulation replaces the Xilinx SRAM with a fake SRAM and initializes it through the Verilator DPI-C interface).

Based on this, I think the hardware connections are correct and my initialization of the Xilinx SRAM goes wrong somewhere. So I'm curious: have you ever tried to run the test applications on the FPGA? That might help me find what goes wrong.

Thanks in advance.
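For reference, the ELF-to-mem step can be sketched in Python as below. It assumes the ELF has first been flattened to a raw binary (for example with objcopy -O binary), that the image is little-endian, and that the memory is initialized in whole words. The function name `bin_to_mem`, the default word width, and the use of word-index addresses after `@` are assumptions for this illustration. Whether updatemem expects word or byte addresses, and how words are split across BRAM primitives, depends on the memory map description given to the tool, so that needs to be checked against the actual design.

```python
def bin_to_mem(data: bytes, word_bytes: int = 8, base_word_addr: int = 0) -> str:
    """Render a flat little-endian binary image as $readmemh-style hex text."""
    # Pad the image with zeros up to a whole number of words.
    pad = (-len(data)) % word_bytes
    data = data + b"\x00" * pad

    # Start address line, then one hex word per line.
    lines = [f"@{base_word_addr:08X}"]
    for off in range(0, len(data), word_bytes):
        # Byte 0 of each word is the least significant byte (little-endian),
        # so it ends up as the rightmost hex digits of the printed word.
        word = int.from_bytes(data[off:off + word_bytes], "little")
        lines.append(f"{word:0{2 * word_bytes}X}")
    return "\n".join(lines) + "\n"


# Tiny example image: two 32-bit little-endian words.
image = bytes([0x13, 0x00, 0x00, 0x00, 0xEF, 0xBE, 0xAD, 0xDE])
print(bin_to_mem(image, word_bytes=4), end="")
```

A mismatch between `word_bytes` and the real memory width, or a wrong byte order within a word, would produce a bitstream that loads without complaint but executes garbage, which matches the "no UART output" symptom. That makes those two settings worth checking first.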

@mp-17
Collaborator

mp-17 commented Jul 3, 2024

Bringing the discussion here. Let me know if you want to re-open this ;-)

@mp-17 mp-17 closed this Jul 3, 2024
@grigohas

grigohas commented Jul 16, 2024

Hello, does this PR implement only Ara for xcvu9p, or CVA6 as well? If not, how can I implement CVA6 together with Ara, for example on a Genesys 2?

@fulcrum34

Hi, I recently generated a bitstream for the Genesys 2 board successfully. If my understanding is correct, the current SoC only has a UART and a Xilinx SRAM in place of DDR memory, and Ariane's boot PC is the first byte of the SRAM. So if we initialize the Xilinx SRAM with one of the apps (the test applications in the project's apps directory), it should run successfully and write its output to the UART?

In detail, I converted the ELF file of an application (e.g., hello_world) into a mem file, then used Vivado's updatemem tool to update the generated bitstream. Unfortunately, I can't see any output in my minicom terminal. The Verilator simulation works properly, though (the only difference between the FPGA and the simulation is that the simulation replaces the Xilinx SRAM with a fake SRAM and initializes it through the Verilator DPI-C interface).

Based on this, I think the hardware connections are correct and my initialization of the Xilinx SRAM goes wrong somewhere. So I'm curious: have you ever tried to run the test applications on the FPGA? That might help me find what goes wrong.

Thanks in advance.

That's exactly the point where I am stuck right now.
The Core-V APU from CVA6 uses a boot ROM, and you can run bare-metal examples through JTAG (OpenOCD, GDB), which works fine. I am leaning towards adding a boot ROM and a DMI (which includes JTAG) to ara_soc and trying to run the apps that way.

@mp-17
Copy link
Collaborator

mp-17 commented Oct 17, 2024

The official PR for Ara on FPGA + Linux flow can be found here: pulp-platform/cheshire#160


8 participants