Add support for VCK5000 over PCIe #77

Open
LorenzoSun-V opened this issue Jan 18, 2022 · 34 comments
Labels: enhancement (New feature or request)

Comments

@LorenzoSun-V

I only have a VCK5000 board, but you built and tested the project on a VCK190. Will it work correctly if I follow your workflow on the VCK5000?

@stephenneuendorffer
Collaborator

Hi Lorenzo... Most of the flow would be the same for a VCK5000. The main difference would be the base system design, which would probably have to be modified slightly. Our intention is to do this at some point, but it hasn't happened yet.

@LorenzoSun-V
Author

LorenzoSun-V commented Jan 18, 2022

Thanks for your reply!
I have another question. If I develop a kernel with the mlir-aie tools on a VCK190 and it runs correctly there, will it also run correctly on a VCK5000 without redevelopment?
To summarize what I want to ask:

  1. Suppose I buy a VCK190, run your test files successfully by following your documents, and then develop my own kernel. Once that kernel runs on the VCK190, do I need to redevelop it for the VCK5000, or can I reuse what I developed and run it correctly on the VCK5000?
  2. If I instead develop on the VCK5000 by following your documents, what does "base system design" mean? I only use the VCK5000 as an accelerator card, like a U50. Can I flash an OS onto the VCK5000, such as the PetaLinux image the VCK190 uses, so that I can run your project's test files on it?

I'm looking forward to receiving your reply ASAP. :-D

@stephenneuendorffer
Collaborator

  1. Basically yes. They have the same device, so anything that is internal to the device works the same way. Interfacing is different: the VCK190 has 3 memory banks and a mix of DDR4 and LPDDR4 memory, while the VCK5000 has 4 memory banks, all DDR4. This can affect the design that you would want to build, and some VCK5000 designs architected for 4 memory banks would simply not work on the VCK190.
  2. We use a simple FPGA design to configure the NOC, enabling the AIEngine processors to access external memory. This design would have to be different for the VCK5000. Another difference is that the VCK5000 does not have an on-board SD card, as it is designed to be configured over PCIe. Today we use a Petalinux kernel with an ubuntu rootfs that boots from the SD card. For VCK5000, one approach would be to run the control software on the X86 host, instead of the on-board ARM processor.

@stephenneuendorffer
Collaborator

If you're interested in prototyping the PCIe path for VCK5000, we're happy to provide some support in getting things up and running.

@elliottbinder

Hello -- I also have a VCK5000 I'd like to target. How should the installation instructions change? I'm currently unsure what I should be doing to provide the sysroot directory when building MLIR-AIE.

@stephenneuendorffer
Collaborator

@elliottbinder Probably the easiest way to bring up the VCK5000 would be to avoid cross-compiling the mlir-aie tools. Instead, they would run on x86 and configure the AIE device over PCIe. This would remove the need for a sysroot. I believe with the new libXAIE v2 that this should work, but we have not tested it. We are currently bringing up libXAIEv2 on ARM first.

@elliottbinder

Is omitting the VitisSysroot definition in the build instructions sufficient? Building with ninja proceeded fine, but I'm getting this error when testing with check-aie:

[0/1] Running the aie regression tests
llvm-lit: /home/elliott/software/llvm-project/llvm/utils/lit/lit/TestingConfig.py:102: fatal: unable to parse config file '/home/elliott/software/mlir-aie/test/lit.cfg.py', traceback: Traceback (most recent call last):
  File "/home/elliott/software/llvm-project/llvm/utils/lit/lit/TestingConfig.py", line 91, in load_from_path
    exec(compile(data, path, 'exec'), cfg_globals, None)
  File "/home/elliott/software/mlir-aie/test/lit.cfg.py", line 84, in <module>
    result = subprocess.run([os.path.join(config.peano_tools_dir, 'llc'),'-mtriple=aie','--version'],stdout=subprocess.PIPE,stderr=subprocess.PIPE)
  File "/usr/lib/python3.8/subprocess.py", line 493, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/lib/python3.8/subprocess.py", line 858, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/lib/python3.8/subprocess.py", line 1704, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: '<unset>/bin/llc'

It looks like this could be because peano isn't installed, but I don't see that listed as a prerequisite.

@stephenneuendorffer
Collaborator

@elliottbinder Sorry about that... I've checked in a fix.

@elliottbinder

Your fix resolved that issue, thanks! I'm able to get through the building step, but cmake is saying

-- Could NOT find LibXAIE (missing: XILINX_XAIE_INCLUDE_DIR XILINX_XAIE_LIBS) 

which, according to another issue, looks like it might be related to the sysroot -- is that right? What can I do to locate or install LibXAIE?
I was still able to build and run through ninja check-aie. Here's a summary from the tests:

  Unsupported: 14
  Passed     : 85
  Unresolved : 14
  Failed     : 25
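
To compare runs as build flags change, the counts above can be totalled with a small shell helper. This is only a sketch: `summarize_lit` is a hypothetical name, and it assumes the summary lines keep the `Name : count` shape shown above.

```shell
# Hypothetical helper: total the per-category counts from a lit summary on stdin.
summarize_lit() {
  awk -F: '
    # Match only the four categories that appear in the summary above.
    $1 ~ /^[ \t]*(Unsupported|Passed|Unresolved|Failed)[ \t]*$/ {
      name = $1
      gsub(/[ \t]/, "", name)   # strip padding around the category name
      n = $2 + 0                # numeric value of the count
      count[name] = n
      total += n
    }
    END {
      printf "passed=%d failed=%d total=%d\n", count["Passed"], count["Failed"], total
    }
  '
}
```

Piping the four summary lines above through `summarize_lit` prints `passed=85 failed=25 total=138`, which gives a single line to diff between builds.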

@hanchenye
Contributor

Hi @elliottbinder, first, I believe the failed test cases are due to LibXAIE being missing. Basically you need to follow the instructions here to build the vck190_bare platform that @stephenneuendorffer mentioned. You will find the sysroot at platforms/vck190_bare/petalinux/sysroot/sysroots/aarch64-xilinx-linux once you have built the platform.

@stephenneuendorffer
Collaborator

stephenneuendorffer commented Jan 21, 2022

If you're cross-compiling, hanchen is correct. If you're just compiling to run on X86, then you can get the libXAIE sources here: https://github.com/Xilinx/embeddedsw/tree/master/XilinxProcessorIPLib/drivers/aiengine. After compiling them, you can point cmake to the right location with -DLibXAIE_ROOT=...
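
Before passing a directory as `-DLibXAIE_ROOT`, it may help to confirm it looks like an install prefix. The sketch below is an assumption-laden pre-flight check, not part of mlir-aie: `check_libxaie_root` is a hypothetical name, the `include/` and `lib/` layout is what CMake searches by default under a `<Package>_ROOT` prefix, and the file names `xaiengine.h` and `libxaiengine.{so,a}` are assumptions about what the libXAIE build produces.

```shell
# Hypothetical pre-flight check for the directory passed as -DLibXAIE_ROOT.
# Assumption: headers live in <root>/include and the library in <root>/lib.
check_libxaie_root() {
  root=$1
  # Header and library names are assumptions; adjust to what your build produced.
  if [ ! -f "$root/include/xaiengine.h" ]; then
    echo "missing header: $root/include/xaiengine.h" >&2
    return 1
  fi
  # Accept either a shared or a static library.
  if [ ! -f "$root/lib/libxaiengine.so" ] && [ ! -f "$root/lib/libxaiengine.a" ]; then
    echo "missing library under: $root/lib" >&2
    return 1
  fi
  echo "ok: $root"
}
```

If the check fails on a raw source checkout, that suggests the sources still need to be built and installed into a prefix before cmake can find them.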

@stephenneuendorffer stephenneuendorffer added the enhancement New feature or request label Jan 21, 2022
@stephenneuendorffer stephenneuendorffer changed the title Can I use the aie tools and build the platform on VCK5000? Add support for VCK5000 over PCIe Jan 21, 2022
@elliottbinder

I tried making aiengine, but got a missing file error on #include <metal/alloc.h>. Making aienginev2 proceeded without any errors.
Trying to build MLIR-AIE again, I passed in the src directory of aienginev2 to cmake with the define on LibXAIE_DIR as you suggested, but cmake is saying it wasn't used. I also tried passing in the aienginev2 directory via the variables cmake is saying are missing (XILINX_XAIE_INCLUDE_DIR, XILINX_XAIE_LIBS). Cmake no longer complains, but it doesn't seem to make a difference in the test results.

@LorenzoSun-V
Author

LorenzoSun-V commented Jan 24, 2022

Hi @elliottbinder, I'm also stuck on building the AIE tools. Would you be willing to share your workflow after cloning the mlir-aie project? I downloaded https://github.com/Xilinx/embeddedsw and changed the AIE tools build instructions, replacing "-DVitisSysroot=${SYSROOT}" with "-DLibXAIE_DIR=${aienginev2_path}", and I also hit a missing header file.
In addition, the VCK5000's fans are very loud. Do you have any solutions? I want to disconnect the 8-pin connector that powers the fans, but I'm not sure whether that will affect normal operation.

@LorenzoSun-V
Author

Hi @stephenneuendorffer, thank you very much for your support! I'm doing some initial research on the MLIR-AIE tools for my team, which will start prototype development once I can run the test files successfully on the VCK5000. Your answers give us confidence that this is feasible. Regarding your advice to avoid cross-compiling: should I replace "-DLLVM_TARGETS_TO_BUILD:STRING="X86;ARM;AArch64;" with "-DLLVM_TARGETS_TO_BUILD:STRING="X86" when building LLVM with cmake? I also downloaded the libXAIE sources you mentioned above and passed their path to cmake with -DLibXAIE_DIR, but the build failed.
Given the differences between the VCK190 and the VCK5000, could you please write a separate document for building the MLIR-AIE tools for the VCK5000?

@stephenneuendorffer
Collaborator

stephenneuendorffer commented Jan 24, 2022

I think you need this too: https://github.com/OpenAMP/libmetal.
You'll also need to build libxaie into a library.

@stephenneuendorffer
Collaborator

As your saying "avoid to cross-compile", should I replace "-DLLVM_TARGETS_TO_BUILD:STRING="X86;ARM;AArch64;" with "-DLLVM_TARGETS_TO_BUILD:STRING="X86" in the step of building llvm by cmake?

In general, when you run cmake on an x86 machine, you're going to build x86 binaries. By default you'll build a compiler (An X86 binary) capable of generating code for multiple architectures (X86, Arm32 and Arm64). If you make this change, the newly built compiler will only generate X86 code. This might be sufficient for a VCK5000 environment.

It's also possible to cross-compile the mlir-aie tools. We normally do this to build mlir-aie tools that run on ARM. Those tools will run on ARM and be able to generate code for multiple architectures (X86, Arm32 and Arm64). In this case, the X86 code generation is probably unnecessary, since we're mostly generating code for Arm64. (Note that in LLVM, AArch64 depends on ARM.)
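
The two choices of code-generation targets described above can be written down as a small helper. This is a sketch only: `llvm_targets_flag` and its mode names are hypothetical, while `LLVM_TARGETS_TO_BUILD` and the target lists are the ones quoted in this thread.

```shell
# Hypothetical helper: print the LLVM_TARGETS_TO_BUILD flag for a given setup.
#   x86-only -> the built compiler only generates X86 code
#   default  -> X86 plus Arm32 and Arm64 (AArch64 is listed together with ARM)
llvm_targets_flag() {
  case "$1" in
    x86-only) echo '-DLLVM_TARGETS_TO_BUILD:STRING=X86' ;;
    default)  echo '-DLLVM_TARGETS_TO_BUILD:STRING=X86;ARM;AArch64' ;;
    *)        echo "usage: llvm_targets_flag x86-only|default" >&2; return 2 ;;
  esac
}
```

When passing the result to cmake, keep it quoted, for example `"$(llvm_targets_flag default)"`, so the shell does not treat the semicolons as command separators.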

@stephenneuendorffer
Collaborator

I want to disconnect the 8-pin interface which supplies electricity for fans but I'm not sure if it will affect its normal work.

In general, I wouldn't recommend this. The fan is there for a reason. You can probably run without the fan for light workloads, but the device can generate quite a bit of heat when fully utilized. I'm used to the VCK190, where the fans slow down greatly after power-up. I'm surprised the VCK5000 doesn't have similar management.

@LorenzoSun-V
Author

I'm surprised the VCK5000 doesn't have similar management

As soon as I turn on my PC with the VCK5000 connected, the room fills with noise, and it bothers everyone in the office... So I have to remove the VCK5000 from the PC and only connect it when I need to use it.

@LorenzoSun-V
Author

I think you need this too: https://github.com/OpenAMP/libmetal.
You'll also need to build libxaie into a library.

Thanks Stephen, I'll try again.

@stephenneuendorffer
Collaborator

I'm surprised the VCK5000 doesn't have similar management

Once I turn on my PC after connecting VCK5000, the room is full of noise and it impacts greatly all people in the office... So I can only remove VCK5000 from PC and connect it when I need to use.

Can you verify what board version you have and what system controller firmware version?

@LorenzoSun-V
Author

I'm surprised the VCK5000 doesn't have similar management

Once I turn on my PC after connecting VCK5000, the room is full of noise and it impacts greatly all people in the office... So I can only remove VCK5000 from PC and connect it when I need to use.

Can you verify what board version you have and what system controller firmware version?

The noise from fans was there when I connected VCK5000 to my PC before I refreshed the firmware.
I use VCK5000 Production edition:
image
And I use lspci -vd 10ee: to check that the VCK5000 is connected correctly:
Processing accelerators: Xilinx Corporation Device 5044
  Subsystem: Xilinx Corporation Device 000e
  Flags: bus master, fast devsel, latency 0, IRQ 16, NUMA node 0
  Memory at 380030000000 (64-bit, prefetchable) [size=128M]
  ...
About firmware, I use "xilinx_vck5000_gen3x16_xdma_base_1" version:
image

Furthermore, to read the VCK5000's power consumption data, we used "xbmgmt examine" as described in the official document (ug1531-vck5000_WtMkX.pdf), but it showed N/A.

@elliottbinder

I built libmetal from the github without an issue, but then when building aiengine I get another missing header file (metal/shmem-provider.h). It looks like this is present in a different version of libmetal -- the one packaged with embeddedsw -- but when building that version of libmetal, I get an error partway through the build process:

/home/elliott/software/embeddedsw/ThirdParty/sw_services/libmetal/src/libmetal/test/system/linux/mutex.c: In function ‘mutex’:
/home/elliott/software/embeddedsw/ThirdParty/sw_services/libmetal/src/libmetal/test/system/linux/mutex.c:56:23: error: ‘METAL_MUTEX_INIT’ undeclared (first use in this function)
   56 |  metal_mutex_t lock = METAL_MUTEX_INIT;
      |                       ^~~~~~~~~~~~~~~~

It looks like this might have been fixed in a later version of libmetal (I was able to build from the github repo no problem), but the newer version isn't what aiengine is expecting to work with.

I'm not sure what would be best: getting aiengine working by fixing libmetal and any other issues that come up, or trying to get aienginev2 up and running with MLIR-AIE. I'm building MLIR-AIE with these additional flags:
-DXILINX_XAIE_INCLUDE_DIR=/home/elliott/software/embeddedsw/XilinxProcessorIPLib/drivers/aienginev2/include -DXILINX_XAIE_LIBS=/home/elliott/software/embeddedsw/XilinxProcessorIPLib/drivers/aienginev2/src
Does that look right? It doesn't seem to change the number of tests that pass.
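
One way to see whether flags like these actually took effect is to read them back from `CMakeCache.txt` in the build directory, where CMake records each cache entry as `NAME:TYPE=VALUE`. The helper below is a sketch; `cmake_cache_value` is a hypothetical name.

```shell
# Hypothetical helper: print the value CMake recorded for a cache variable.
# Usage: cmake_cache_value <path to CMakeCache.txt> <variable name>
# Returns non-zero if the variable is not in the cache at all.
cmake_cache_value() {
  cache_file=$1
  var=$2
  awk -v var="$var" '
    # Entries look like NAME:TYPE=VALUE; match lines that start with "NAME:".
    index($0, var ":") == 1 {
      print substr($0, index($0, "=") + 1)   # everything after the first "="
      found = 1
      exit
    }
    END { exit (found ? 0 : 1) }
  ' "$cache_file"
}
```

For example, `cmake_cache_value build/CMakeCache.txt XILINX_XAIE_LIBS` prints the recorded path. A value ending in `-NOTFOUND` means the find step failed, and a non-zero exit status means the variable was never recorded.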

I also ran into the fan issue, as others have: https://support.xilinx.com/s/question/0D52E00006khDFQSA2/vck5000-fan-noise-alternating?language=en_US

@stephenneuendorffer
Collaborator

I'm surprised the VCK5000 doesn't have similar management

Once I turn on my PC after connecting VCK5000, the room is full of noise and it impacts greatly all people in the office... So I can only remove VCK5000 from PC and connect it when I need to use.

This is apparently expected and you're not the only one who has complained. :) I'm still trying to find out whether this is something that might be changed in a future system controller firmware version.

@stephenneuendorffer
Collaborator

Note that the cmake 'hint' for locating libXAIE should be -DLibXAIE_ROOT=....
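
Putting the hint together with the other configure flags mentioned in this thread, an x86-native configure line might look like the output of the sketch below. `mlir_aie_configure_cmd` is a hypothetical dry-run helper that only prints the command; the flag names come from this thread, every path is a placeholder, and this has not been checked against a real VCK5000 setup.

```shell
# Hypothetical dry-run helper: print an x86-native mlir-aie configure command.
# Arguments: <LLVM build dir> <cmakeModules checkout> <libXAIE install prefix>
mlir_aie_configure_cmd() {
  llvm_build=$1
  cmake_modules=$2
  libxaie_root=$3
  # No -DVitisSysroot here: running the tools on x86 removes the need for a sysroot.
  printf 'cmake -GNinja'
  printf ' -DLLVM_DIR=%s/lib/cmake/llvm' "$llvm_build"
  printf ' -DMLIR_DIR=%s/lib/cmake/mlir' "$llvm_build"
  printf ' -DCMAKE_MODULE_PATH=%s/' "$cmake_modules"
  printf ' -DLibXAIE_ROOT=%s' "$libxaie_root"
  printf ' -DCMAKE_BUILD_TYPE=Debug ..\n'
}
```

Printing the command first makes it easy to review before running it from the build directory. The output is only safe to paste or `eval` when none of the paths contain spaces.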

@LorenzoSun-V
Author

I built libmetal from the github without an issue, but then when building aiengine I get another missing header file (metal/shmem-provider.h). It looks like this is present in a different version of libmetal -- the one packaged with embeddedsw -- but when building that version of libmetal, I get an error partway through the build process:

/home/elliott/software/embeddedsw/ThirdParty/sw_services/libmetal/src/libmetal/test/system/linux/mutex.c: In function ‘mutex’:
/home/elliott/software/embeddedsw/ThirdParty/sw_services/libmetal/src/libmetal/test/system/linux/mutex.c:56:23: error: ‘METAL_MUTEX_INIT’ undeclared (first use in this function)
   56 |  metal_mutex_t lock = METAL_MUTEX_INIT;
      |                       ^~~~~~~~~~~~~~~~

Hi @elliottbinder, I ran into the same error when I built libmetal from https://github.com/Xilinx/embeddedsw/tree/master/ThirdParty/sw_services/libmetal/src/libmetal, but I built libmetal successfully from https://github.com/OpenAMP/libmetal.
How do I get aiengine working after building libmetal? I don't even have the include folder in the aiengine driver at https://github.com/Xilinx/embeddedsw/tree/master/XilinxProcessorIPLib/drivers/aiengine/src.

@gyreflyr

gyreflyr commented May 23, 2022 via email

@HadXu

HadXu commented May 26, 2022

git clone https://github.com/Xilinx/cmakeModules
git clone https://github.com/Xilinx/mlir-aie
mkdir build; cd build
cmake -GNinja \
    -DLLVM_DIR=${absolute path to LLVMBUILD}/lib/cmake/llvm \
    -DMLIR_DIR=${absolute path to LLVMBUILD}/lib/cmake/mlir \
    -DLibXAIE_DIR=${absolute path to LibXAIE} \
    -DCMAKE_MODULE_PATH=${absolute path to cmakeModules}/ \
    -DVitisSysroot=${SYSROOT} \
    -DCMAKE_BUILD_TYPE=Debug \
    ..
ninja; ninja check-aie; ninja mlir-doc; ninja install

Woo, great! How do I compile it? How do I set the LibXAIE path? Thanks!

VCK5000 works great with a few tweaks. Thanks.

On Mon, May 23, 2022 at 2:16 AM ZhenLei Xu wrote:
have not tested it. We are currently bring
how is it now?

@gyreflyr

gyreflyr commented May 26, 2022 via email

@HadXu

HadXu commented May 27, 2022

Thanks, I'll try it!

@gyreflyr

gyreflyr commented May 27, 2022 via email

@HadXu

HadXu commented May 27, 2022

I tried it, but it failed. #106 @jgmelber

@gyreflyr

Using the fork at https://github.com/nqdtan/mlir-aie.git, set the following environment variables before running cmake:

export LIBXAIENGINEV1_PATH=/path/embeddedsw/XilinxProcessorIPLib/drivers/aiengine
export LIBMETAL_PATH=/path/embeddedsw/ThirdParty/sw_services/libmetal/src/libmetal/install/usr/local/lib
export LIBAMP_PATH=/path/embeddedsw/ThirdParty/sw_services/openamp/src/open-amp/install/usr/local/lib
export LIBXAIENGINEV2_PATH=/path/embeddedsw/XilinxProcessorIPLib/drivers/aienginev2
export XRT_PATH=/path/xrt
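
Since a single wrong path is easy to miss, a quick check before running cmake can confirm that all five variables are set and point at real directories. This is a sketch under assumptions: `check_vck5000_env` is a hypothetical name, and it assumes each variable is meant to hold a directory path, as in the exports above.

```shell
# Hypothetical pre-flight check: report which of the variables above are unset
# or do not point at an existing directory. Returns non-zero if any check fails.
check_vck5000_env() {
  status=0
  for name in LIBXAIENGINEV1_PATH LIBMETAL_PATH LIBAMP_PATH LIBXAIENGINEV2_PATH XRT_PATH; do
    # Indirect lookup of the variable named in $name (empty if unset).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "unset: $name" >&2
      status=1
    elif [ ! -d "$value" ]; then
      echo "not a directory: $name=$value" >&2
      status=1
    fi
  done
  return $status
}
```

Running `check_vck5000_env && cmake ...` stops early with a list of the problem variables instead of failing somewhere inside the configure step.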

@gabrielrodcanal
Contributor

https://github.com/nqdtan/vck5000_vivado_custom_ulp_design has excellent documentation. I needed to rename "data_mover_mm2mm" to "data_mover_mm2mm:data_mover_mm2mm" in line 42 of host.cpp. The xrt::ip call in line 991 in mlir-aie/runtime_lib/test_library.cpp needs the same change. Line 990 in test_library.cpp needs to point to the ulp.xclbin from the vck5000_vivado_custom_ulp_design/host_sw_with_aie folder.

Is this custom design necessary to use the AIEs on the VCK5000 through PCIe? Do we have official support for this nowadays? I've run into an issue because I only have access to the xdma platform, whereas this solution is based on qdma.

@jgmelber
Collaborator

Looking into this with #570 @eddierichter-amd

9 participants