This repository contains vector-add test cases in cmake and makefile, to test different oneAPI compilation environments targeting different FPGA devices.
The source code is from oneAPI-samples, with additional scripts added by me.
The goal here is to create a specific vector-add environment for each oneAPI version and test it to figure out what works and what doesn't when compiling the design for emulator, for report, for hardware, and running the hardware design.
Two things were noticed to be important: the version of the oneAPI environment being used and the node on which you are running. Certain nodes may not be properly configured, and some oneAPI environments may not function correctly for a specific device target.
TLDR : find the results for
The goal here is to create a specific vector-add environment for each oneAPI version and test it to figure out what works and what doesn't when compiling the design for emulator, for report, for hardware, and running the hardware design.
All available versions of oneAPI can be obtained from Intel's DevCloud using :
ls /glob/development-tools/versions/oneapi/*/oneapi/setvars.sh | sed -n 's/.*oneapi\/\(.*\)\/oneapi.*/\1/p'
Iterating over each oneAPI version, a new script file is created to generate a test-folder environment, load the specified oneAPI version, and attempt to compile the emulator, report, and hardware. If hardware compilation is successful, the design is executed.
The script generators can be found as scripts/gen_tests_<device>_hw.sh
.
- Generate scripts
cd scripts
./gen_tests_agilex_hw.sh
cd ..
- Run all scripts from the root folder, not the scripts folder.
for f in scripts/test_run_agilex*; do qsub $f; done
After testing all available oneAPI environments, we can see that those work.
The tables below show the success state by device type and oneAPI version for each target (emulator, report, hardware, and run_hardware).
oneapi_version | emulator | report | hardware | run_hardware |
---|---|---|---|---|
2022.3.1 | NO | NO | NO | - |
2023.0.1 | YES | NO | YES | YES |
2023.0 | YES | NO | YES | YES |
2023.1.0.4 | YES | NO | NO | - |
2023.1.1_cmake | YES | NO | NO | - |
2023.1.1 | NO | NO | NO | - |
2023.1.2 | NO | NO | NO | - |
2023.2.0 | NO | NO | NO | - |
2023.2.0.1 | NO | NO | NO | - |
2023.2.0.2_s | YES | YES | YES | YES |
2024.0.2.1 | YES | YES | YES | YES |
2024.0.2 | YES | NO | YES | YES |
2024.0 | NO | NO | NO | - |
test | YES | NO | NO | - |
oneapi_version | emulator | report | hardware | run_hardware |
---|---|---|---|---|
2022.3.1 | NO | NO | NO | - |
2023.0.1 | YES | YES | YES | NO -42 sycl::_V1::runtime_error |
2023.0 | YES | YES | YES | NO -42 sycl::_V1::runtime_error |
2023.1.0.4 | YES | YES | YES | NO -42 sycl::_V1::runtime_error |
2023.1.1_cmake | YES | YES | YES | NO -42 sycl::_V1::runtime_error |
2023.1.1 | YES | YES | YES | YES |
2023.1.2 | YES | YES | NO | - |
2023.2.0 | YES | YES | YES | YES |
2023.2.0.1 | YES | YES | YES | YES |
2023.2.0.2_s | YES | YES | YES | YES |
2024.0.2.1 | YES | NO | YES | YES |
2024.0.2 | YES | NO | YES | - |
2024.0 | YES | NO | YES | YES |
test | YES | NO | YES | - |
oneapi_version | emulator | report | hardware | run_hardware |
---|---|---|---|---|
2022.3.1 | NO | NO | NO | - |
2023.0.1 | YES | YES | NO | - |
2023.0 | YES | YES | NO | - |
2023.1.0.4 | YES | YES | NO | - |
2023.1.1 | YES | YES | NO | - |
2023.1.1_cmake | YES | YES | NO | - |
2023.1.2 | YES | YES | NO | - |
2023.2.0.2_s | YES | YES | NO | - |
2023.2.0 | YES | YES | NO | - |
2023.2.0.1 | YES | YES | YES | YES |
2024.0.2.1 | YES | YES | NO | - |
2024.0.2 | YES | YES | YES | YES |
2024.0 | YES | YES | YES | YES |
test | YES | YES | NO | - |
Is suggested to run source /glob/development-tools/versions/oneapi/<oneAPI_version>/oneapi/setvars.sh --force
in your PBS script to ensure a valid working oneAPI compilation environment
All failing reports generates the static_reports_creation_log.temp
file, which content is for each and everyone of them :
/glob/development-tools/versions/oneapi/${oneapi_version}/oneapi/compiler/${oneapi_version}/opt/oclfpga/share/lib/reports/standalone_report/fpga_report: error while loading shared libraries: libnss3.so: cannot open shared object file: No such file or directory
TLDR if testing on Intel's devcloud : Some nodes are not properly configured, find all properly configured nodes here
On Internet, this errors doesn't yield much information. A post from the intel community forums indicate about an incorrect correct board variant initialised, but no more information.
So I set a small experiment :
I had already managed to compile the vector-add for Arria10 using oneAPI version 2024.0.2.1, so I already had the executable at ./build-arria-2024.0.2.1/vector-add-buffers.fpga
I then started two terminals, connected both to the devcloud, sourced oneAPI version 2024.0.2.1 and then qsub an interactive arria10 partition.
ssh devcloud
cd fpga-vector-add/Makefile
source /glob/development-tools/versions/oneapi/2024.0.2.1/oneapi/setvars.sh --force
./build-arria-2024.0.2.1/vector-add-buffers.fpga
- first interactive session gives me the
s001-n081
node
@s001-n081 Makefile ❯ ./build-arria-2024.0.2.1/vector-add-buffers.fpga
Running on device: Intel(R) Xeon(R) Gold 6128 CPU @ 3.40GHz
Vector size: 10000
An exception is caught for vector add.
terminate called after throwing an instance of 'sycl::_V1::runtime_error'
what(): Native API failed. Native API returns: -42 (PI_ERROR_INVALID_BINARY) -42 (PI_ERROR_INVALID_BINARY)
Aborted
- second interactive session gives me the
s001-n083
node
@s001-n083 Makefile ❯ ./build-arria-2024.0.2.1/vector-add-buffers.fpga
Running on device: pac_a10 : Intel PAC Platform (pac_ee00000)
Vector size: 10000
[0]: 0 + 0 = 0
[1]: 1 + 1 = 2
[2]: 2 + 2 = 4
...
[9999]: 9999 + 9999 = 19998
Vector add successfully completed on device.
Has the executable been compiled using node s001-n083
? Is it node-dependent? What are the differences between the two nodes?
Let's compare the two environments to see if we can identify a cause.
- first node
s001-n081
:
@s001-n081 Makefile ❯ aoc -list-boards
...
pac_a10 (default)
Board Package: /opt/intel/oneapi/intel_a10gx_pac
...
- second node
s001-n083
:
@s001-n083 Makefile ❯ aoc -list-boards
...
pac_a10 (default)
Board Package: /opt/intel_2023.1/oneapi/intel_a10gx_pac
...
a diff -r /opt/intel/oneapi/intel_a10gx_pac /opt/intel_2023.1/oneapi/intel_a10gx_pac
reveals nothing. There are the same folders
- first node
s001-n081
:
@s001-n081 Makefile ❯ sycl-ls
[opencl:cpu:0] Intel(R) OpenCL, Intel(R) Xeon(R) Gold 6128 CPU @ 3.40GHz OpenCL 3.0 (Build 0) [2023.16.11.0.22_160000]
- first node
s001-n083
:
@s001-n081 Makefile ❯ sycl-ls
[opencl:cpu:0] Intel(R) OpenCL, Intel(R) Xeon(R) Gold 6128 CPU @ 3.40GHz OpenCL 3.0 (Build 0) [2023.16.11.0.22_160000]
[opencl:acc:1] Intel(R) FPGA SDK for OpenCL(TM), pac_a10 : Intel PAC Platform (pac_ee00000) OpenCL 1.0 Intel(R) FPGA SDK for OpenCL(TM), Version 2024.0 [2024.0]
[opencl:acc:2] Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device OpenCL 1.2 [2023.16.11.0.22_160000]
There is definitely something going on here. The first node doesn't seem to be configured well.
So I opened a new terminal and did the same steps. Got node s001-n084
.
@s001-n084 Makefile ❯ ./build-arria-2024.0.2.1/vector-add-buffers.fpga
Running on device: pac_a10 : Intel PAC Platform (pac_ee00000)
Vector size: 10000
[0]: 0 + 0 = 0
[1]: 1 + 1 = 2
[2]: 2 + 2 = 4
...
[9999]: 9999 + 9999 = 19998
Vector add successfully completed on device.
@s001-n084 Makefile ❯ sycl-ls
[opencl:cpu:0] Intel(R) OpenCL, Intel(R) Xeon(R) Gold 6128 CPU @ 3.40GHz OpenCL 3.0 (Build 0) [2023.16.11.0.22_160000]
[opencl:acc:1] Intel(R) FPGA SDK for OpenCL(TM), pac_a10 : Intel PAC Platform (pac_ee00000) OpenCL 1.0 Intel(R) FPGA SDK for OpenCL(TM), Version 2024.0 [2024.0]
[opencl:acc:2] Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device OpenCL 1.2 [2023.16.11.0.22_160000]
@s001-n084 Makefile ❯ aoc -list-boards
...
pac_a10 (default)
Board Package: /opt/intel/oneapi/intel_a10gx_pac
...
Nodes s001-n081
, s001-n083
and s001-n084
show the exact same properties when using pbsnodes
. So it shouldn't be this...
Looking at the aocl diagnose all
report, the non-working node (s001-n081
) reports :
WARNING: Skipping ICD validation since there is no ICD entry at this location
...
ERROR: No ICD entry and all files specified by OCL_ICD_FILENAMES are invalid
...
ICD diagnostics FAILED
while the working ones report ICD diagnostics PASSED
Tho, both non-working and working node reports DIAGNOSTIC_PASSED
Knowing that some environments are not properly configured, I set up some scripts to check which nodes are working.
I used those commands to get the nodelist, then run a small script that sources the oneAPI environment, print some environement stuff then tries executing the fpga design.
pbsnodes | grep arria -B4 | grep s001 # Arria10 nodes
pbsnodes | grep stratix -B4 | grep s001 # Stratix10 nodes
pbsnodes | grep agilex -B4 | grep s006 # Agilex nodes
The scripts can be found at scripts/devcloud_working_<device>_nodes.sh
where device is arria, stratix or agilex. These scripts will run for the test_node_<device>.sh
script for all nodes, which will tell if the node is configured correctly or not.
Arria10 nodes | Working state |
---|---|
s001-n081 | NO |
s001-n082 | |
s001-n083 | YES |
s001-n084 | YES |
s001-n085 | |
s001-n086 | NO |
s001-n087 | YES |
s001-n088 | |
s001-n089 | NO |
s001-n090 | |
s001-n091 | |
s001-n092 |
Stratix10 nodes | Working state |
---|---|
s001-n010 | |
s001-n142 | NO |
s001-n143 | YES |
s001-n144 | NO |
Agilex nodes | Working state |
---|---|
s006-n001 | |
s006-n002 | YES |
s006-n003 | YES |
Bad news it that once the working nodes were spotted, attempting to re-compile the vector-add using the oneAPI versions that result int the -42 sycl::_V1::runtime_error
error, still allow for a working compilation for hardware design, it still couldn't be runned on the node, and returns the same error.
Source code from Intel's oneAPI-samples
Scripts are mine