Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cmake hip module restrict to cxx #2280

Closed

Conversation

tomsang
Copy link
Contributor

@tomsang tomsang commented May 27, 2021

Fix path issue in #2190

Robert Maynard added 2 commits December 7, 2020 10:16
Previous to these any target that has mixed languages ( C++ && Fortran )
would break as the HIP specific flags such as '-x hip' would be past
to the Fortran compiler.
@ax3l
Copy link

ax3l commented May 27, 2021

Hi @tomsang, I created a reproducer for you under
https://github.com/ax3l/hip_cxxf_cmake

Simply build with:

CXX=$(which clang++) FC=$(which gfortran) cmake -S . -B build
cmake --build build

The original error that you will see is:

gfortran: error: unrecognized command line option ‘-mllvm’
clang-12: warning: argument unused during compilation: '-amdgpu-function-calls=false' [-Wunused-command-line-argument]
gfortran: error: unrecognized command line option ‘-amdgpu-early-inline-all=true’
gfortran: error: unrecognized command line option ‘-amdgpu-function-calls=false’
gfortran: error: unrecognized command line option ‘--offload-arch=gfx900’; did you mean ‘--offload-abi=ilp32’?
gfortran: error: unrecognized command line option ‘--offload-arch=gfx906’; did you mean ‘--offload-abi=ilp32’?
gfortran: error: unrecognized command line option ‘--offload-arch=gfx908’; did you mean ‘--offload-abi=ilp32’?

@tomsang
Copy link
Contributor Author

tomsang commented May 27, 2021

@yxsamliu Do you know the issues @ax3l reported ?
Looks like they are related to
INTERFACE_COMPILE_OPTIONS "$<$<COMPILE_LANGUAGE:CXX>:SHELL:-mllvm;-amdgpu-early-inline-all=true;-mllvm;-amdgpu-function-calls=false>"

@ax3l
Copy link

ax3l commented May 27, 2021

I just went ahead, used an existing ROCm 4.1.0 install on OLCF Spock, copied and modified a local hip-config.cmake and tested with the reproducer repo I posted above.

https://github.com/ax3l/hip_cxxf_cmake/tree/test-hipconfig

This PR fixes the problem 🎉

@ax3l
Copy link

ax3l commented May 27, 2021

INTERFACE_COMPILE_OPTIONS

Yes, this and all other changes in this PR need to be added.
The problem was caused by missing CXX specifications for flags and missing SHELL guarding for flags.

The CXX generator expressions make sure things only land on the CXX compiler & linker.
And the SHELL specifiers make sure that duplicate flags, like the -mllvms in -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false don't get deduplicated to -mllvm -amdgpu-early-inline-all=true -amdgpu-function-calls=false

Copy link

@ax3l ax3l left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for readying this up.
I appreciate it :)

@tomsang tomsang requested a review from mangupta May 28, 2021 01:03
set_property(TARGET hip::device APPEND PROPERTY
INTERFACE_COMPILE_OPTIONS -xhip
set_property(TARGET hip::device APPEND PROPERTY
INTERFACE_COMPILE_OPTIONS "$<$<COMPILE_LANGUAGE:CXX>:SHELL:-x hip>"
Copy link

@ax3l ax3l May 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did another test (replicating the logic elsewhere) and this line was causing trouble unless I wrote it as:

Suggested change
INTERFACE_COMPILE_OPTIONS "$<$<COMPILE_LANGUAGE:CXX>:SHELL:-x hip>"
INTERFACE_COMPILE_OPTIONS "$<$<COMPILE_LANGUAGE:CXX>:SHELL:-xhip>"

With ... "-x hip" ...:

clang-12: error: language not recognized: ' hip'

Seen on Spock with HIP 4.1.0

I am not sure why this flag is added in quotes on the command line, maybe that's also causing it. Anyway, removing the space solves it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if doing "$<$<COMPILE_LANGUAGE:CXX>:SHELL:-x> $<$<COMPILE_LANGUAGE:CXX>:SHELL:hip>" would work as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe $<$<COMPILE_LANGUAGE:CXX>:SHELL:-x;hip> like what is done for the other flags.

Copy link

@ax3l ax3l May 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem in my test was, that I had both -xhip and "-x hip" on my command line. Not entirely sure where the "" came from, I am playing if I can find a work around until the PR herein lands:

       target_compile_options(amrex PUBLIC
           "$<$<COMPILE_LANGUAGE:CXX>:SHELL:-mllvm;-amdgpu-early-inline-all=true;-mllvm;-amdgpu-function-calls=false;-xhip>

If I added ...;-x hip my CLI looked like this: clang++ ... "-x hip" -xhip. Without the space it worked. Probably related to list-handing in a SHELL section.
AMReX-Codes/amrex#2031

$ clang++ -x hip
# ok
$ clang++ -xhip
# ok
$ clang++ "-x hip" -xhip
clang-12: error: language not recognized: ' hip'
$ clang++ "-x hip"
clang-12: error: language not recognized: ' hip'
$ clang++ -x hip
# ok

Anyway, I think that might just be from my work-around. The stand-alone test I did worked, as reported earlier.

Copy link

@ax3l ax3l May 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe $&lt;$&lt;COMPILE_LANGUAGE:CXX>:SHELL:-x;hip> like what is done for the other flags.

That might be a bit risky, the idea of SHELL is to keep options that belong together also in lists together. The list ; would tokenize them and potentially shuffle them later on.
https://cmake.org/cmake/help/git-master/command/target_compile_options.html#option-de-duplication

Copy link

@ax3l ax3l May 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I take this back, I think the current patch is ok. We should trust on @robertmaynard's expertise in CMake things :D

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thus I don't need update it .

@mangupta will help us merge in weekend or next monday

Copy link

@ax3l ax3l May 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed. The problem was on my side, writing

       target_compile_options(amrex PUBLIC
           "$<$<COMPILE_LANGUAGE:CXX>:SHELL:-mllvm;-amdgpu-early-inline-all=true;-mllvm;-amdgpu-function-calls=false;-x hip>")

where I should have written

       target_compile_options(amrex PUBLIC
           "$<$<COMPILE_LANGUAGE:CXX>:SHELL:-mllvm;-amdgpu-early-inline-all=true;-mllvm;-amdgpu-function-calls=false>")
       target_compile_options(amrex PUBLIC
           "$<$<COMPILE_LANGUAGE:CXX>:SHELL:-x hip>")

Current diff in this PR is good to go :)

yxsamliu
yxsamliu previously approved these changes May 28, 2021
Copy link
Contributor

@yxsamliu yxsamliu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the future, if HIP is recognized by cmake as a language, we just need to change COMPILE_LANGUAGE:CXX to COMPILE_LANGUAGE:HIP or add similar options for COMPILE_LANGUAGE:HIP.

@pfultz2
Copy link
Contributor

pfultz2 commented May 28, 2021

In the future, if HIP is recognized by cmake as a language, we just need to change COMPILE_LANGUAGE:CXX to COMPILE_LANGUAGE:HIP or add similar options for COMPILE_LANGUAGE:HIP.

No, this target is to use for CXX and switching it to COMPILE_LANGUAGE:HIP could break builds that do not have source files set to HIP language in cmake(such as .cpp files).

For the HIP language, there is a hip-lang-config cmake file and corresponding hip-lang::device target:

#2230

CMake will use this hip-lang-config file and hip-lang::device when using the HIP language. The reason its seperate is not only because it needs to apply the flags to a different language but also some of the flags are actually different. CMake will pass the --offload-arch and -x hip for the hip language so we shouldn't add those flags.

So the hip::device target is meant for CXX, and the hip-lang::device is meant for HIP. Of course, we could look at unifying how we set the more common flags, so there is not so much repetition.(especially since hip-lang-config.cmake picks the wrong compiler path and should use the same mechanism in hip-config.cmake).

ax3l
ax3l previously approved these changes May 28, 2021
@ax3l
Copy link

ax3l commented Jun 2, 2021

Hi @pfultz2 @yxsamliu @tomsang, is this one ready to go in? :)

@tomsang
Copy link
Contributor Author

tomsang commented Jun 7, 2021

The patch has been merged internally. Will be publicly released to github in Rocm4.4 .

@ax3l
Copy link

ax3l commented Jun 8, 2021

Thanks for the update, that's great!
Is the merge-window for this bugfix for ROCm 4.3 already closed?

@tomsang
Copy link
Contributor Author

tomsang commented Jun 9, 2021

Yes, the merge-window for this bugfix for ROCm 4.3 is already closed.

@tomsang
Copy link
Contributor Author

tomsang commented Jun 11, 2021

@ax3l After that patch is merged internally, we experienced building failure in some math libs.
I have some questions:

  1. What's the minimum cmake version for that change? To my knowledge, it's 3.18.0.
  2. Is there any other dependent change ?

@pfultz2
Copy link
Contributor

pfultz2 commented Jun 12, 2021

What's the minimum cmake version for that change? To my knowledge, it's 3.18.0.

I think its 3.13 because it relies on the SHELL: selector which I think was added in 3.13. We should not require such a newer version in hip-config.cmake, unless we plan to drop support for ubuntu 18.04 since it only has 3.10. Instead, we should conditionally use SHELL when its available. We could try setting a cmake variable:

set(_HIP_SHELL "SHELL:")
if(CMAKE_VERSION VERSION_LESS 3.13)
    set(_HIP_SHELL "")
endif()
set_property(TARGET hip::device APPEND PROPERTY
    INTERFACE_COMPILE_OPTIONS 
    "$<$<COMPILE_LANGUAGE:CXX>:${_HIP_SHELL}-mllvm;-amdgpu-early-inline-all=true;-mllvm;-amdgpu-function-calls=false>"
)

I haven't tested this to know if this works. So we might need to only conditionally enabled the CXX restriction when using the newer cmake.

@ax3l
Copy link

ax3l commented Jun 12, 2021

Hi,

the CXX restrictions and SHELL variable are two independent things.

The SHELL prefix makes sure that - independent of the restriction to the proper language targets - CMake does not deduplicate your -mllvm <something> and -mllvm <somethingElse> flags into -mllvm <something> <somethignElse>, which is an existing bug for all targets.

The :CXX if-s in the generator expressions make sure you don't propagate flags that only apply to HIP C++ targets to other CMake objects in languages like Fortran, Objective-C, etc.

CMake features in this PR by @robertmaynard (#2190), according to docs:

If you like to be backwards compatible, you can do the if(CMAKE_VERSION VERSION_LESS 3.13) switches that @pfultz2 suggests.

You could alternatively, like Intel oneAPI, build & ship a recent CMake package with the ROCm packages. Or document to install a recent CMake for older Ubuntus from Kitware's apt repo: https://apt.kitware.com

@tomsang
Copy link
Contributor Author

tomsang commented Jun 14, 2021

Thus the minimum cmake version is 3.18.6 in terms of last comment .
rocBlas install.sh can automatically update cmake to 3.16.8 if the existing cmake version is lower(for example, cmake 3.10 on ubuntu 18.04).
See https://github.com/ROCmSoftwarePlatform/rocBLAS/blob/develop/install.sh#L581

rocBlas tester just confirmed 3.20.2 can work. But I think it's too high.
I'm asking him to test with cmake 3.18.6.
I'm also asking the rocBlas owner whether it's acceptable to update cmake to 3.18.6 .

@pfultz2
Copy link
Contributor

pfultz2 commented Jun 14, 2021

rocBlas tester just confirmed 3.20.2 can work. But I think it's too high.

3.20 is whats in ubuntu 20.04, so if we want to bump the version we should just require ubuntu 20.04.

I'm also asking the rocBlas owner whether it's acceptable to update cmake to 3.18.6 .

There is more than just rocblas. It would be best to fallback on the old way in older versions of cmake, and then use CXX, and SHELL: when the cmake version is greater than 3.18. This wont break any projects building with hip as most libraries just use whats in ubuntu.

@ax3l
Copy link

ax3l commented Jun 14, 2021

I think that's do-able.

Note that personally, I would recommend to just bump your CMake requirement and document how to acquire CMake on older systems.

Besides the apt-repos, there are tons of resources to quickly get a pre-compiled CMake binary for legacy systems (apt/ppa, conda, pip, Spack, conan, vcpkg, Nuget, Chocolatey, MSI packages, ZIP files, ...) and as a last resort can quickly be bootstrapped from source.
That way, the logic that you have to maintain across the ROCm eco-system will stay maintainable. I think moving fast(er) on bugs like this and adding features (like the upcoming HIP language feature) trumps the need to support old CMake versions - especially since it's trivial to upgrade for users to newer CMake releases (i.e. let's simply document how).

If you want a look at other GPU vendor software products: Intel oneAPI and Nvidia RAPIDS ship the latest CMake to all platforms in order to move fast. CMake is a fully self-contained binary alongside text-based scripts. I would personally join the club and use recent releases, otherwise you will constantly have to maintain legacy work-arounds and bugs filed against them.

@tomsang
Copy link
Contributor Author

tomsang commented Jun 16, 2021

After updating cmake to 3.18.6 we still experience failures in a few mathlibs projects which cannot deal with something like "$<LINK_LANGUAGE:CXX>" and "SHELL:" . I will have to rework patch to be backward compatible.

@tomsang tomsang dismissed stale reviews from ax3l and yxsamliu via b1354bf June 17, 2021 16:08
@tomsang
Copy link
Contributor Author

tomsang commented Jun 17, 2021

I've updated the PR to check cmake version for SHELL and LINK_LANGUAGE.
I ignore checking cmake 3.3 for COMPILE_LANGUAGE because nobody will use such a low version, as I know.
If he really uses 3.3, we can ask him to update cmake.

@tomsang
Copy link
Contributor Author

tomsang commented Jun 18, 2021

@ax3l could you verify whether the new update can still work for your projects?

@ax3l
Copy link

ax3l commented Jun 21, 2021

Thanks for the ping! Visually, the patch looks good to me.

I am a bit busy with deadlines right now, please feel free to test with the self-contained reproducer here:
#2280 (comment)

After updating cmake to 3.18.6 we still experience failures in a few mathlibs projects which cannot deal with something like "$<LINK_LANGUAGE:CXX>" and "SHELL:" . I will have to rework patch to be backward compatible.

That's curious, which exact syntax did not work?

@ax3l
Copy link

ax3l commented Jun 30, 2021

@tomsang just to confirm. I now used my example in https://github.com/ax3l/hip_cxxf_cmake, patched my local ROCm 4.2.0 install with the latest state of this PR, and can confirm that with CMake 3.20.5 the demonstrator now compiles. (And it fails with e.g., CMake 3.16 as we patch out the fixes.)

Do you want to add the test I linked maybe to your integration tests, so we can be sure multi-language CMake projects will continue to build with HIP targets included?

Any news on potentially shipping a CMake package with your ROCm packages (as Intel oneAPI and NVIDIA RAPIDS do as well)?

@tomsang
Copy link
Contributor Author

tomsang commented Jul 12, 2021

This feature will be available in Rocm4.4. But the change is in another format in order to fix mathlib build issues.
A CXX+Fortran compiling sample is added as a test case.

@tomsang
Copy link
Contributor Author

tomsang commented Jul 28, 2021

LINK_LANGUAGE has issue in CMake version from 3.18.0 to 3.19.8 when complex lib dependence exists. This PR failed building
some projects such as rocHPCG and rocHPL. Thus I have to make it jump to 3.20.0 that is known
to work well with LINK_LANGUAGE. The change will look like

if(CMAKE_VERSION VERSION_LESS 3.20)
set_property(TARGET hip::device APPEND PROPERTY
INTERFACE_LINK_LIBRARIES --hip-link
)
else()
set_property(TARGET hip::device APPEND PROPERTY
INTERFACE_LINK_LIBRARIES "$<$<LINK_LANGUAGE:CXX>:--hip-link>"
)
endif()

@mangupta mangupta closed this Aug 5, 2021
@mangupta mangupta reopened this Aug 5, 2021
@tomsang
Copy link
Contributor Author

tomsang commented Aug 24, 2021

Close it as the file was removed and another PR in a new project(hipamd) has been merged for the same purpose.

@tomsang tomsang closed this Aug 24, 2021
@tomsang tomsang deleted the cmake_hip_module_restrict_to_cxx branch August 24, 2021 00:31
@ax3l
Copy link

ax3l commented Jan 20, 2022

It looks to me that at least the missing SHELL: escapes are also in that project:
https://github.com/ROCm-Developer-Tools/hipamd/blob/d2d2cacfe210307ec10c77400e1dafdeafefbc0f/hip-config.cmake.in#L225

I see issues with that using ROCm 4.5.2 for CXX targets, I fear.
(Seen with Cray CC compiler wrappers on OLCF Crusher today.)

Migrated issue to ROCm/hipamd#12

WeiqunZhang pushed a commit to AMReX-Codes/amrex that referenced this pull request Jan 21, 2022
## Summary

By now (HIP 4.5), we should be able to link also against Fortran targets.

Let's remove work-arounds and ditch old HIP releases, so improve maintainability and avoid surprises.

## Additional background

X-ref:
- ROCm/HIP#2280 (ROCm/HIP#2190)
- ROCm/HIP#2275
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants