This repository has been archived by the owner on Jan 26, 2024. It is now read-only.

Building devprogram.cpp.o fails #64

Open
FinnStokes opened this issue Jan 4, 2019 · 4 comments

Comments

@FinnStokes

Since 184c0ef, I have been having issues building the OpenCL runtime on master. With the rock-dkms package installed and ROCm compiled from source, I run

~/bin/repo init -u https://github.com/RadeonOpenCompute/ROCm-OpenCL-Runtime.git -b master -m opencl.xml
~/bin/repo sync
cd opencl
mkdir build
cd build
cmake3 -DCMAKE_INSTALL_PREFIX=/opt/rocm/opencl -DCMAKE_PREFIX_PATH=/opt/rocm -DCMAKE_BUILD_TYPE=RelWithDebInfo ..
make

The compilation runs fine all through LLVM and Clang but fails when it reaches oclruntime, with the error

[ 96%] Building CXX object runtime/CMakeFiles/oclruntime.dir/device/devprogram.cpp.o
In file included from /home/fstokes/OpenCL2/opencl/runtime/device/devprogram.cpp:16:0:
/home/fstokes/OpenCL2/opencl/build/runtime/device/rocm/libraries.amdgcn.inc:2:10: fatal error: oclc_correctly_rounded_sqrt_off.amdgcn.inc: No such file or directory
 #include "oclc_correctly_rounded_sqrt_off.amdgcn.inc"
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

It seems like there is an issue with the dependencies computed by cmake. runtime/device/devprogram.cpp (which was added in f6629e2 / 184c0ef) includes libraries.amdgcn.inc, which in turn includes a number of dynamically generated amdgcn.inc files. These files have not yet been generated when make tries to compile devprogram.cpp.
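
One way to check this diagnosis is to list which files named in libraries.amdgcn.inc are absent from the build tree at the moment the compile fails. The sketch below assumes the file consists of quoted `#include "name"` lines resolved relative to its own directory, as the error output suggests. `missing_includes` is a hypothetical helper name, not part of the ROCm build.

```shell
# missing_includes FILE
# Print every file named by an  #include "..."  line in FILE that does not
# exist in FILE's own directory.
# Hypothetical helper: assumes quoted includes relative to FILE's directory.
missing_includes() {
    inc_file=$1
    inc_dir=$(dirname "$inc_file")
    # Extract the quoted name from each #include line, one per output line.
    sed -n 's/^[[:space:]]*#[[:space:]]*include[[:space:]]*"\([^"]*\)".*/\1/p' "$inc_file" |
    while IFS= read -r name; do
        # Report only the names that are not present on disk.
        [ -e "$inc_dir/$name" ] || printf '%s\n' "$name"
    done
}
```

Run from the build directory, for example as `missing_includes runtime/device/rocm/libraries.amdgcn.inc`. Empty output would mean every listed file is already present.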

My current workaround is running

pushd runtime/device/rocm
make
popd
make

after make fails for the first time. This generates the missing include files before continuing. However, I do not know enough about cmake to determine what the actual fix should be to get this dependency ordering correct.
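
The same workaround can be wrapped in a small helper so that a failed first pass triggers the subdirectory build and a retry automatically. This is only a sketch: `build_with_retry` is a hypothetical name, and it assumes, as in the steps above, that running the build command inside runtime/device/rocm generates the missing headers.

```shell
# build_with_retry GEN_DIR CMD [ARGS...]
# Run CMD in the current directory. If it fails, run the same CMD inside
# GEN_DIR (to generate the missing files), then run CMD once more here.
# Hypothetical helper, not part of the ROCm build scripts.
build_with_retry() {
    gen_dir=$1
    shift
    # First attempt: nothing more to do if the build already succeeds.
    "$@" && return 0
    # Build inside GEN_DIR in a subshell so the working directory is restored.
    ( cd "$gen_dir" && "$@" ) || return 1
    # Second attempt, now that the generated files should exist.
    "$@"
}
```

From the build directory this could be invoked as `build_with_retry runtime/device/rocm make`.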

I've attached my cmake and make output in case it is relevant, but I think this is a bug in the cmake configuration that should not be specific to my setup.

@jlgreathouse

Hi @FinnStokes

The problem you're running into here is that our master branch repo manifest.xml currently pulls from master on many of the sub-projects. This is incorrect, because this means that it will try to pull in changes that have not actually been tested against our OpenCL runtime. The correct thing has been pushed to our roc-2.0.x branch, where we have "pinned" the manifest to the correct commits in other projects.

If you're still interested in building the ROCm OpenCL runtime, you might want to check out our Experimental ROC project and use the component build scripts for your distro to build it. For example, if you are on Ubuntu 18.04, you could run Experimental_ROC/distro_install_scripts/Ubuntu/Ubuntu_18.04/src_install/component_scripts/01_07_opencl.sh. You can see the arguments to these scripts in the README files. The branches in Experimental ROC correspond to particular ROCm releases.

@FinnStokes
Author

Hi @jlgreathouse

I tried building the roc-2.0.x branch. Because the 2.0.0 tag does not include the fix to ROCm/ROCm-OpenCL-Driver#76, I had to patch the relevant CMakeLists.txt. When it got to compiling oclruntime, it once again had the same error:

~/bin/repo init -u https://github.com/RadeonOpenCompute/ROCm-OpenCL-Runtime.git -b roc-2.0.x -m opencl.xml
~/bin/repo sync
cd opencl/
sed -i -e 's/link_directories(${binary_dir}\/googletest)/link_directories(${binary_dir}\/lib)/' compiler/driver/src/unittest/CMakeLists.txt
mkdir build
cd build
scl enable devtoolset-7 bash
cmake3 -DCMAKE_INSTALL_PREFIX=/opt/rocm/opencl -DCMAKE_PREFIX_PATH=/opt/rocm -DCMAKE_BUILD_TYPE=RelWithDebInfo ..
make
...
[ 94%] Building CXX object runtime/CMakeFiles/oclruntime.dir/device/devprogram.cpp.o
In file included from /home/fstokes/OpenCL2/opencl/runtime/device/devprogram.cpp:16:0:
/home/fstokes/OpenCL2/opencl/build/runtime/device/rocm/libraries.amdgcn.inc:2:10: fatal error: oclc_correctly_rounded_sqrt_off.amdgcn.inc: No such file or directory
 #include "oclc_correctly_rounded_sqrt_off.amdgcn.inc"
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.

On the other hand, the Experimental ROC script builds fine. It turns out that the only difference that seems to matter is the use of the -j flag to parallelise the build across multiple cores. With -j 12, the build runs fine for master, the roc-2.0.x branch, or via the Experimental ROC script, whereas building single-threaded fails on all three.

@ulyssesrr

ulyssesrr commented Mar 16, 2019

Hi @jlgreathouse, I'm maintaining a rocm-opencl-runtime AUR package and some users are reporting this issue (even after the repo pinning). These users often claim that building in parallel solves the issue. Since I always built with -j8, I was never hit by it.

After some debugging, it seems that the target that generates the missing header is added as a dependency of the oclrocm target in runtime/device/rocm/CMakeLists.txt:

add_custom_target(${header}_target ALL DEPENDS ${CMAKE_CURRENT_BINARY_DIR}/${header})
add_dependencies(oclrocm  ${header}_target)

devprogram.cpp is part of the oclruntime target, and I couldn't find any direct or indirect dependency between oclruntime and oclrocm that would ensure the proper build order, so I added one in runtime/CMakeLists.txt:
https://aur.archlinux.org/cgit/aur.git/tree/fix_rocm_opencl_build_order.patch?h=rocm-opencl-runtime&id=ad3d0221a63257abe0cc92f527d152156e6aefc5

Now the build seems to be stable. Could you verify please?

@acowley

acowley commented Mar 16, 2019

I did something related for NixOS for the same issue.
