
[LLVM] Add LLVM_<proj>_RUNTIME_TARGETS to set targets per-project #81557

Closed
wants to merge 1 commit

Conversation

jhuber6
Contributor

@jhuber6 jhuber6 commented Feb 13, 2024

Summary:
Currently, the runtimes build allows users to optionally build for
multiple target architectures via the LLVM_RUNTIME_TARGETS option.
One problem is that this applies to every single runtime the user has
enabled, when it's possible that we may wish to enable a different set
for just one target.

The motivating example here is handling GPU targets as
cross-compiling. We want to be able to pass something like
LLVM_RUNTIME_TARGETS=amdgcn-amd-amdhsa;nvptx64-nvidia-cuda to build
runtimes targeting the GPU. However, the current support would force the
other runtimes to be built for the GPU as well, which definitely will not
work.

This patch attempts to work around this in a generic way by letting
individual runtime projects override the set of target triples they are
built for. The expected use would be to enable something like the
following for targeting the GPU libc and, in the future, the GPU libcxx.

-DLLVM_ENABLE_RUNTIMES='openmp;libcxx;libcxxabi;libc'
-DLLVM_LIBC_RUNTIME_TARGETS='nvptx64-nvidia-cuda;amdgcn-amd-amdhsa'
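
For context, a complete configure invocation using the proposed option might look roughly like the following (the generator, build type, and the clang project are illustrative choices, not part of this patch):

cmake -G Ninja -S llvm -B build \
  -DCMAKE_BUILD_TYPE=Release \
  -DLLVM_ENABLE_PROJECTS='clang' \
  -DLLVM_ENABLE_RUNTIMES='openmp;libcxx;libcxxabi;libc' \
  -DLLVM_LIBC_RUNTIME_TARGETS='nvptx64-nvidia-cuda;amdgcn-amd-amdhsa'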

@petrhosek
Member

petrhosek commented Feb 13, 2024

You can already do the following:

-DLLVM_RUNTIME_TARGETS='nvptx64-nvidia-cuda;amdgcn-amd-amdhsa'
-DRUNTIMES_nvptx64-nvidia-cuda_LLVM_ENABLE_RUNTIMES='libc'
-DRUNTIMES_amdgcn-amd-amdhsa_LLVM_ENABLE_RUNTIMES='libc'

This is already used in existing builds. Is that sufficient for your use case?

@jhuber6
Contributor Author

jhuber6 commented Feb 13, 2024

You can already do the following:

-DLLVM_RUNTIME_TARGETS='nvptx64-nvidia-cuda;amdgcn-amd-amdhsa'
-DRUNTIMES_nvptx64-nvidia-cuda_LLVM_ENABLE_RUNTIMES='libc'
-DRUNTIMES_amdgcn-amd-amdhsa_LLVM_ENABLE_RUNTIMES='libc'

This is already used in existing builds. Is that sufficient for your use case?

Will that still allow the other targets to be built? For example, as in the main comment, where I have openmp among others that still use the default target. The approach introduced in this patch seems more straightforward to me personally, but if that's sufficient I could make it work.

@jhuber6
Contributor Author

jhuber6 commented Feb 13, 2024

So, I tried it out and it doesn't seem to have the semantics that I need. I want to be able to specify that only a specific project is built for the desired architecture, which means omitting it from the standard targets if present. That doesn't happen with the -DRUNTIMES_nvptx64-nvidia-cuda_LLVM_ENABLE_RUNTIMES='libc' method above, because it will then cause other runtimes like openmp or libunwind to be built for the GPU, which is not desired. Additionally, that syntax is a lot more difficult to read and not as simple to put in a default build.

@petrhosek
Member

petrhosek commented Feb 13, 2024

I'm not sure I follow, why are other runtimes like openmp and libunwind even being built if they're not listed in RUNTIMES_nvptx64-nvidia-cuda_LLVM_ENABLE_RUNTIMES='libc'? Can you post the full set of options you're using?

The reason I'm not a fan of the proposed solution is that it seems to be specifically tailored to libc and I'm worried that it won't generalize. Specifically, runtimes are typically layered on top of each other, that is libunwind is built on top of libc, libcxxabi is built on top of libunwind, libcxx is built on top of libcxxabi, compiler-rt is built on top of libcxx. Whereas LLVM_LIBC_RUNTIME_TARGETS='nvptx64-nvidia-cuda;amdgcn-amd-amdhsa' makes sense since libc is the bottommost layer, something like LLVM_LIBCXXABI_RUNTIME_TARGETS='nvptx64-nvidia-cuda;amdgcn-amd-amdhsa' likely won't ever be of any use.

In your description, you also mentioned LLVM_ENABLE_PROJECTS='openmp;libcxx;libcxx-abi;libc' but that's not even a supported configuration, in particular libcxx and libcxxabi only support LLVM_ENABLE_RUNTIMES, putting them in LLVM_ENABLE_PROJECTS will result in an error (that will soon be the case for all runtimes). Can you explain what the intended goal there is? I'd like to see if there's another way we can support this.

@jhuber6
Contributor Author

jhuber6 commented Feb 13, 2024

I'm not sure I follow, why are other runtimes like openmp and libunwind even being built if they're not listed in RUNTIMES_nvptx64-nvidia-cuda_LLVM_ENABLE_RUNTIMES='libc'? Can you post the full set of options you're using?

What I meant is that it required passing LLVM_RUNTIME_TARGETS, which would then apply to all my builds, when I want them all to be default. The options were just like this to test:

-DLLVM_ENABLE_RUNTIMES='libc;libcxx;libunwind;openmp'
-DLLVM_RUNTIME_TARGETS='default;x86_64-pc-linux-gnu'
-DRUNTIMES_x86_64-pc-linux-gnu_LLVM_ENABLE_RUNTIMES=libc

And the issue was that it was still building default for libc.

The reason I'm not a fan of the proposed solution is that it seems to be specifically tailored to libc and I'm worried that it won't generalize. Specifically, runtimes are typically layered on top of each other, that is libunwind is built on top of libc, libcxxabi is built on top of libunwind, libcxx is built on top of libcxxabi, compiler-rt is built on top of libcxx. Whereas LLVM_LIBC_RUNTIME_TARGETS='nvptx64-nvidia-cuda;amdgcn-amd-amdhsa' makes sense since libc is the bottommost layer, something like LLVM_LIBCXXABI_RUNTIME_TARGETS='nvptx64-nvidia-cuda;amdgcn-amd-amdhsa' likely won't ever be of any use.

That's easily doable if the user manually configures the override to include default or the other values from LLVM_RUNTIME_TARGETS. I pretty much just need a way to enable additional targets on a per-project basis. It needs to be able to prevent the default build as well. If I were doing this and wanted the others to be built on top of it, I would just do

-DLLVM_RUNTIME_TARGETS=default
-DLLVM_LIBC_RUNTIME_TARGETS='default;amdgcn-amd-amdhsa'

I think the semantics there are straightforward enough.
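
As a rough sketch of the resolution those semantics imply (illustrative only, not the actual patch; helper names like canon_name and RUNTIMES_<triple>_enabled_projects are made up here), the per-project target lists could be folded into per-target runtime sets roughly like this:

# Illustrative sketch: fold per-project target lists into per-target runtime
# lists. LLVM_<PROJ>_RUNTIME_TARGETS falls back to LLVM_RUNTIME_TARGETS when
# unset, so a project builds for the default set unless overridden.
foreach(proj IN LISTS LLVM_ENABLE_RUNTIMES)          # e.g. libc;libcxx;openmp
  string(TOUPPER "${proj}" canon_name)
  string(REPLACE "-" "_" canon_name "${canon_name}")
  if(NOT DEFINED LLVM_${canon_name}_RUNTIME_TARGETS)
    set(LLVM_${canon_name}_RUNTIME_TARGETS ${LLVM_RUNTIME_TARGETS})
  endif()
  foreach(triple IN LISTS LLVM_${canon_name}_RUNTIME_TARGETS)
    list(APPEND RUNTIMES_${triple}_enabled_projects ${proj})
  endforeach()
endforeach()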

In your description, you also mentioned LLVM_ENABLE_PROJECTS='openmp;libcxx;libcxx-abi;libc' but that's not even a supported configuration, in particular libcxx and libcxxabi only support LLVM_ENABLE_RUNTIMES, putting them in LLVM_ENABLE_PROJECTS will result in an error (that will soon be the case for all runtimes). Can you explain what the intended goal there is? I'd like to see if there's another way we can support this.

That's a typo, I meant LLVM_ENABLE_RUNTIMES but my brain typed the other thing.

@jhuber6
Contributor Author

jhuber6 commented Feb 14, 2024

Also, like you said, this might not be explicitly useful beyond libc for now. I could simply make it a specific LLVM_LIBC_RUNTIME_TARGETS instead? But I'm planning ahead, because I think in the future we will want a libc++ for the GPU that builds on top of the libc for the GPU.

@jhuber6
Contributor Author

jhuber6 commented Feb 15, 2024

Would it be possible to make this move along faster if I just implemented it as LLVM_LIBC_RUNTIME_TARGETS and handled it specially there? I have the GPU libc rewrite done, so I'm hoping to land it and get to work updating everything else, since that patch solves pretty much every remaining issue I had before I could say the GPU libc was "done".

jhuber6 added a commit to jhuber6/llvm-project that referenced this pull request Feb 15, 2024
Summary:
This is a massive patch because it reworks the entire build and
everything that depends on it. This is not split up because various bots
would fail otherwise. I will attempt to describe the necessary changes
here.

This patch completely reworks how the GPU build is built and targeted.
Previously, we used a standard runtimes build and handled both NVPTX and
AMDGPU in a single build via multi-targeting. This added a lot of
divergence in the build system and prevented us from doing various
things like building for the CPU / GPU at the same time, or exporting
the startup libraries or running tests without a full rebuild.

The new approach is to handle the GPU builds as strict cross-compiling
runtimes. The first step required
llvm#81557 to allow the `LIBC`
target to build for the GPU without touching the other targets. This
means that the GPU uses all the same handling as the other builds in
`libc`.

The new expected way to build the GPU libc is with
`LLVM_LIBC_RUNTIME_TARGETS=amdgcn-amd-amdhsa;nvptx64-nvidia-cuda`.

The second step was reworking how we generated the embedded GPU library
by moving it into the library install step. Where we previously had one
`libcgpu.a` we now have `libcgpu-amdgpu.a` and `libcgpu-nvptx.a`. This
patch includes the necessary clang / OpenMP changes to make that not
break the bots when this lands.

We unfortunately still require that the NVPTX target has an `internal`
target for tests. This is because the NVPTX target needs to do LTO for
the provided version (The offloading toolchain can handle it) but cannot
use it for the native toolchain which is used for making tests.

This approach is vastly superior in every way, allowing us to treat the
GPU as a standard cross-compiling target. We can now install the GPU
utilities to do things like use the offload tests and other fun things.

Depends on llvm#81557
@jhuber6
Contributor Author

jhuber6 commented Feb 15, 2024

Also @petrhosek, I've noticed that the non-default targets don't use PASSTHROUGH_PREFIXES. Is there a reason for this? It's important to be able to pass arguments to the runtimes. IMHO we should always pass the ${canon_name}_ prefixes. I also have a need for a CUDA prefix so that users can pick up the CUDA Toolkit directory.
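
For reference, a minimal sketch of the kind of prefix passthrough being asked about (the prefix list and the target_cmake_args accumulator are illustrative, not the existing implementation): any cache variable whose name starts with a runtime's canonical prefix would be forwarded into that target's sub-build arguments.

# Illustrative sketch: forward variables whose names start with a runtime's
# canonical prefix (e.g. LIBC_, LIBCXX_) into the sub-build's CMake arguments.
set(passthrough_prefixes LIBC LIBCXX CUDA)           # hypothetical prefix list
get_cmake_property(cache_vars CACHE_VARIABLES)
foreach(var IN LISTS cache_vars)
  foreach(prefix IN LISTS passthrough_prefixes)
    if(var MATCHES "^${prefix}_")
      list(APPEND target_cmake_args "-D${var}=${${var}}")
    endif()
  endforeach()
endforeach()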

jhuber6 added a commit to jhuber6/llvm-project that referenced this pull request Feb 16, 2024
jhuber6 added a commit to jhuber6/llvm-project that referenced this pull request Feb 16, 2024
Collaborator

@JonChesterfield JonChesterfield left a comment


I'm unconvinced by the runtime layer argument. Specifically, I'm interested in building libcxx for amdgpu without building libc for it, and it's plausible that libcxx on GPUs will not involve libunwind at all.

Likewise building compiler-rt for x64 and not nvptx, while building libc for nvptx and not x64 is probably a default configuration.

I would very much prefer cmake let us specify specific combinations instead of allowing a few blessed stacks.

GPU targets as cross compiling things, instead of as magic weird things with their own configuration, is the right direction.

@JonChesterfield
Collaborator

@petrhosek please shout out if you want to block this in the current form - I have no doubt that the cmake could be better, but that's always true, and this would really help the gpu project. When someone sees a better way to express the same capability we'll be delighted to adopt it.

include(${LLVM_BINARY_DIR}/runtimes/${name}/Components.cmake OPTIONAL)
set_property(DIRECTORY APPEND PROPERTY CMAKE_CONFIGURE_DEPENDS ${LLVM_BINARY_DIR}/runtimes/${name}/Components.cmake)

set(runtime_names ${RUNTIME_NAMES})
set(LLVM_ENABLE_RUNTIMES ${ARG_RUNTIMES})
Member

@petrhosek petrhosek Feb 16, 2024


It's confusing to set this variable here since it may imply that it is somehow related to the loop below while it isn't; this should be moved to line 343.

Contributor Author


Good point, done.

@jhuber6 jhuber6 force-pushed the RuntimesOverload branch 2 times, most recently from a70563f to c0495ee on February 16, 2024 16:29
@@ -236,6 +236,7 @@ function(runtime_default_target)
endif()

set_enable_per_target_runtime_dir()
set(LLVM_ENABLE_RUNTIMES ${ARG_RUNTIMES})
Member


Previously, we would pass through LLVM_ENABLE_RUNTIMES as a global variable, but now we always pass in RUNTIMES as a function argument, and then set LLVM_ENABLE_RUNTIMES as a local variable shadowing the global one and rely on the passthrough mechanism.

That's quite obscure and potentially error-prone; I think we should stop using the passthrough machinery and instead pass LLVM_ENABLE_RUNTIMES as an extra argument.

Contributor Author


I think I tried adding it to CMAKE_ARGS but it was being overridden by the passthrough version like you said. Changing PASSTHROUGH was a much bigger change, so I tried to keep it small. Maybe I could set LLVM_ENABLE_RUNTIMES to "" to make it more obvious.
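
A minimal sketch of the alternative being discussed (illustrative only, not the patch; extra_cmake_args is a made-up accumulator): pass the runtimes list as an explicit -D argument, escaping CMake's ';' list separator so the value survives as a single argument into the nested configure.

# Illustrative sketch: pass the runtimes list explicitly rather than relying on
# the passthrough machinery. The ';' separator is escaped (here with '|') and
# the sub-build is configured with a matching LIST_SEPARATOR.
string(REPLACE ";" "|" runtimes_arg "${ARG_RUNTIMES}")
list(APPEND extra_cmake_args "-DLLVM_ENABLE_RUNTIMES=${runtimes_arg}")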

@petrhosek
Member

petrhosek commented Feb 16, 2024

@jhuber6 @JonChesterfield Can you give me a more concrete example of how you intend to build the full set of runtimes (including compiler-rt, libc and libcxx) for GPU using this mechanism? Maybe that would help me better understand the motivation.

The reason I'm asking is that I'm still unconvinced by the simplicity argument. The reason for pushing this feature is to simplify the runtimes build compared to the current implementation, which requires multiple variables. But in my experience, building runtimes typically requires a number of variables regardless, so saving one variable for setting the list of runtimes makes it difficult for me to justify more complexity in the runtimes build, and I'm worried we'll be adding a mechanism that in practice will only be useful in limited cases.

For reference, here's an example from our runtimes build for Linux (which includes compiler-rt, libunwind, libcxxabi and libcxx):

# Set the per-target builtins options.
list(APPEND BUILTIN_TARGETS "${target}")
set(BUILTINS_${target}_CMAKE_SYSTEM_NAME Linux CACHE STRING "")
set(BUILTINS_${target}_CMAKE_BUILD_TYPE RelWithDebInfo CACHE STRING "")
set(BUILTINS_${target}_CMAKE_C_FLAGS "--target=${target}" CACHE STRING "")
set(BUILTINS_${target}_CMAKE_CXX_FLAGS "--target=${target}" CACHE STRING "")
set(BUILTINS_${target}_CMAKE_ASM_FLAGS "--target=${target}" CACHE STRING "")
set(BUILTINS_${target}_CMAKE_SYSROOT ${LINUX_${target}_SYSROOT} CACHE STRING "")
set(BUILTINS_${target}_CMAKE_SHARED_LINKER_FLAGS "-fuse-ld=lld" CACHE STRING "")
set(BUILTINS_${target}_CMAKE_MODULE_LINKER_FLAGS "-fuse-ld=lld" CACHE STRING "")
set(BUILTINS_${target}_CMAKE_EXE_LINKER_FLAG "-fuse-ld=lld" CACHE STRING "")
set(BUILTINS_${target}_COMPILER_RT_BUILD_STANDALONE_LIBATOMIC ON CACHE BOOL "")
# Set the per-target runtimes options.
list(APPEND RUNTIME_TARGETS "${target}")
set(RUNTIMES_${target}_CMAKE_SYSTEM_NAME Linux CACHE STRING "")
set(RUNTIMES_${target}_CMAKE_BUILD_TYPE RelWithDebInfo CACHE STRING "")
set(RUNTIMES_${target}_CMAKE_C_FLAGS "--target=${target}" CACHE STRING "")
set(RUNTIMES_${target}_CMAKE_CXX_FLAGS "--target=${target}" CACHE STRING "")
set(RUNTIMES_${target}_CMAKE_ASM_FLAGS "--target=${target}" CACHE STRING "")
set(RUNTIMES_${target}_CMAKE_SYSROOT ${LINUX_${target}_SYSROOT} CACHE STRING "")
set(RUNTIMES_${target}_CMAKE_SHARED_LINKER_FLAGS "-fuse-ld=lld" CACHE STRING "")
set(RUNTIMES_${target}_CMAKE_MODULE_LINKER_FLAGS "-fuse-ld=lld" CACHE STRING "")
set(RUNTIMES_${target}_CMAKE_EXE_LINKER_FLAGS "-fuse-ld=lld" CACHE STRING "")
set(RUNTIMES_${target}_COMPILER_RT_CXX_LIBRARY "libcxx" CACHE STRING "")
set(RUNTIMES_${target}_COMPILER_RT_USE_BUILTINS_LIBRARY ON CACHE BOOL "")
set(RUNTIMES_${target}_COMPILER_RT_USE_LLVM_UNWINDER ON CACHE BOOL "")
set(RUNTIMES_${target}_COMPILER_RT_CAN_EXECUTE_TESTS ON CACHE BOOL "")
set(RUNTIMES_${target}_COMPILER_RT_BUILD_STANDALONE_LIBATOMIC ON CACHE BOOL "")
set(RUNTIMES_${target}_LIBUNWIND_ENABLE_SHARED OFF CACHE BOOL "")
set(RUNTIMES_${target}_LIBUNWIND_USE_COMPILER_RT ON CACHE BOOL "")
set(RUNTIMES_${target}_LIBCXXABI_USE_COMPILER_RT ON CACHE BOOL "")
set(RUNTIMES_${target}_LIBCXXABI_ENABLE_SHARED OFF CACHE BOOL "")
set(RUNTIMES_${target}_LIBCXXABI_USE_LLVM_UNWINDER ON CACHE BOOL "")
set(RUNTIMES_${target}_LIBCXXABI_INSTALL_LIBRARY OFF CACHE BOOL "")
set(RUNTIMES_${target}_LIBCXX_USE_COMPILER_RT ON CACHE BOOL "")
set(RUNTIMES_${target}_LIBCXX_ENABLE_SHARED OFF CACHE BOOL "")
set(RUNTIMES_${target}_LIBCXX_ENABLE_STATIC_ABI_LIBRARY ON CACHE BOOL "")
set(RUNTIMES_${target}_LIBCXX_ABI_VERSION 2 CACHE STRING "")
set(RUNTIMES_${target}_LLVM_ENABLE_ASSERTIONS OFF CACHE BOOL "")
set(RUNTIMES_${target}_SANITIZER_CXX_ABI "libc++" CACHE STRING "")
set(RUNTIMES_${target}_SANITIZER_CXX_ABI_INTREE ON CACHE BOOL "")
set(RUNTIMES_${target}_SANITIZER_TEST_CXX "libc++" CACHE STRING "")
set(RUNTIMES_${target}_SANITIZER_TEST_CXX_INTREE ON CACHE BOOL "")
set(RUNTIMES_${target}_LLVM_TOOLS_DIR "${CMAKE_BINARY_DIR}/bin" CACHE BOOL "")
set(RUNTIMES_${target}_LLVM_ENABLE_RUNTIMES "compiler-rt;libcxx;libcxxabi;libunwind" CACHE STRING "")

This inherent complexity is why we use cache files in practice; see for example https://github.com/llvm/llvm-project/tree/ea226d6693022b707f6acc6e399c12da8261de11/libcxx/cmake/caches.
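
As a usage note (the path is illustrative), such cache files are consumed by passing them to the initial configure with -C, which pre-loads the set(... CACHE ...) entries into the build's cache:

cmake -G Ninja -S llvm -B build \
  -C path/to/MyRuntimesCache.cmake \
  -DLLVM_ENABLE_PROJECTS='clang' \
  -DLLVM_ENABLE_RUNTIMES='compiler-rt;libcxx;libcxxabi;libunwind'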

@jhuber6
Contributor Author

jhuber6 commented Feb 16, 2024

@jhuber6 @JonChesterfield Can you give me a more concrete example of how you intend to build the full set of runtimes (including compiler-rt, libc and libcxx) for GPU using this mechanism? Maybe that would help me better understand the motivation.


The issue with the current support is that it does not provide a way to disable a project for one of the runtime targets. Right now I would say that a majority of users do not want the LLVM CPU libc, but let's say they want the GPU libc and the CPU libc++.

Right now the suggested method would be:

-DLLVM_ENABLE_RUNTIMES="libc;libcxx;openmp"
-DLLVM_RUNTIME_TARGETS="default;amdgcn-amd-amdhsa"
-DRUNTIMES_amdgcn-amd-amdhsa_LLVM_ENABLE_RUNTIMES="libc"

This would have the desired behavior of making the libc project compile for the AMDGPU triple, but it would still be present under the default target, which is not what I want. The syntax is also more difficult to understand, though we could possibly just add another user-facing variable that sets that up automatically. So, what I wanted was the ability to remove some of the other projects on a per-project basis. The above would then be

-DLLVM_ENABLE_RUNTIMES="libc;libc++;openmp"
-DLLVM_LIBC_RUNTIMES_TARGETS="amdgcn-amd-amdhsa"

If there were some alternate way to specify "I want this project to be built only for these targets," then I would also take that.

I should also add that, in my expected use-case, if the user wanted CPU libc builds they would add default or whatever triple to that list. Then, in the future, if we begin working on a libc++, it would be

-DLLVM_LIBCXX_RUNTIME_TARGETS="default;amdgcn-amd-amdhsa"
-DLLVM_LIBC_RUNTIME_TARGETS="amdgcn-amd-amdhsa"

Or something.

@petrhosek
Member

Right now the suggested method would be:

-DLLVM_ENABLE_RUNTIMES="libc;libcxx;openmp"
-DLLVM_RUNTIME_TARGETS="default;amdgcn-amd-amdhsa"
-DRUNTIMES_amdgcn-amd-amdhsa_LLVM_ENABLE_RUNTIMES="libc"

This would have the desired behavior of making the libc project compile for the AMDGPU triple, but it would still be present under the default target, which is not what I want.

If you don't want libc being present in the default target, why not simply omit it from LLVM_ENABLE_RUNTIMES?

-DLLVM_ENABLE_RUNTIMES="libc++;openmp"
-DLLVM_RUNTIMES_TARGETS="default;amdgcn-amd-amdhsa"
-DRUNTIMES_amdgcn-amd-amdhsa_ENABLE_RUNTIMES="libc"

@jhuber6
Contributor Author

jhuber6 commented Feb 16, 2024

Today I have learned about the joys of ; versus | in CMake.

@petrhosek I somewhat understand the difficulties here. It seems that the existing support tries to do triples=projects while this does projects=triples. I think both can coexist, and have somewhat different use-cases. Do you have any more qualms about this patch? I tried getting the alternative format you provided to work but it didn't seem to do what I wanted even after patching up libc.
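
For concreteness, the two spellings of the same intent from this thread, side by side (roughly equivalent for the case discussed above; the second form is the one this patch proposes):

# target => runtimes (existing spelling)
-DLLVM_ENABLE_RUNTIMES="libcxx;openmp"
-DLLVM_RUNTIME_TARGETS="default;amdgcn-amd-amdhsa"
-DRUNTIMES_amdgcn-amd-amdhsa_LLVM_ENABLE_RUNTIMES="libc"

# runtime => targets (spelling proposed in this patch)
-DLLVM_ENABLE_RUNTIMES="libc;libcxx;openmp"
-DLLVM_LIBC_RUNTIME_TARGETS="amdgcn-amd-amdhsa"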

jhuber6 added a commit to jhuber6/llvm-project that referenced this pull request Feb 17, 2024
jhuber6 added a commit to jhuber6/llvm-project that referenced this pull request Feb 17, 2024
jhuber6 added a commit to jhuber6/llvm-project that referenced this pull request Feb 19, 2024
jhuber6 added a commit to jhuber6/llvm-project that referenced this pull request Feb 19, 2024
@jhuber6
Contributor Author

jhuber6 commented Feb 19, 2024

Any remaining concerns @petrhosek? I'd like to land this but would like to wait for your blessing.

jhuber6 added a commit to jhuber6/llvm-project that referenced this pull request Feb 19, 2024
jhuber6 added a commit to jhuber6/llvm-project that referenced this pull request Feb 20, 2024
@JonChesterfield
Collaborator

This is a prerequisite for pull/81921, which is a huge decrease in the overall complexity of our build scripts. Hopefully @petrhosek has some more ideas for cleaning up this area that we can apply post-commit.

@petrhosek
Member

I looked at all of the "libc" IN_LISTS LLVM_ENABLE_RUNTIMES cases, but neither of them seemed like a blocker, and they could be worked around by setting LIBC_HDRGEN_ONLY=ON and LLVM_ENABLE_PROJECTS=libc, which IMHO is the correct solution rather than complicating the existing logic by iterating over all variables to check "libc" IN_LISTS RUNTIMES_<target>_LLVM_ENABLE_RUNTIMES for every <target> in LLVM_RUNTIME_TARGETS.
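
For illustration, the two shapes of check being contrasted here might look like this (a sketch, not the actual CMake in question):

# Simple check against the top-level list:
if("libc" IN_LISTS LLVM_ENABLE_RUNTIMES)
  # ... libc-specific top-level setup ...
endif()

# Versus scanning every per-target list, which is the complication being avoided:
foreach(target IN LISTS LLVM_RUNTIME_TARGETS)
  if("libc" IN_LISTS RUNTIMES_${target}_LLVM_ENABLE_RUNTIMES)
    # ... libc-specific setup for ${target} ...
  endif()
endforeach()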

I'm still not especially enthused about this feature: it adds additional complexity to the runtimes build, and from what I can tell so far you can achieve the same result with the existing implementation; although that might require setting more options, I'm not sure it warrants the additional complexity. If there are specific things you believe aren't possible, it'd be great to provide an example so I can see if we could support those with the existing features.

If we decide to go forward with this change regardless, I'd like you to set up a builder that exercises it, ideally both presubmit and postsubmit. The problem we already have is that the runtimes build has too many ways to spell the same thing, which is making it frustratingly difficult to make changes. For example, in the case of libc I have to verify that LLVM_ENABLE_PROJECTS=libc, LLVM_ENABLE_RUNTIMES=libc, and RUNTIMES_<target>_LLVM_ENABLE_RUNTIMES=libc still work; with this change there's yet another variation, and I don't have the resources to try every one of them every time I make some change to the build. I'd like to avoid a situation where I make a change and then hear back some time later that I broke a downstream user.

I'd also be curious to hear from @smeenai and @llvm-beanz as other reviewers of the runtimes build changes if they have other thoughts.

@JonChesterfield
Collaborator

Have you looked at pull/81921? That is a huge decrease in spurious complexity that we can't have without something functionally equivalent to this PR. I don't mind what horrendous cmake invocation is required to build things, but I do want 81921; it lets me get rid of some really fragile infra for working with libc.

@petrhosek
Member

AFAICT pull/81921 is not dependent on this change; this change just adds syntax sugar on top of the existing functionality provided by the runtimes build, and so it shouldn't be considered a blocker.

@jhuber6
Contributor Author

jhuber6 commented Feb 20, 2024

I looked at all of the "libc" IN_LISTS LLVM_ENABLE_RUNTIMES cases, but neither of them seemed like a blocker, and they could be worked around by setting LIBC_HDRGEN_ONLY=ON and LLVM_ENABLE_PROJECTS=libc, which IMHO is the correct solution rather than complicating the existing logic by iterating over all variables to check "libc" IN_LISTS RUNTIMES_<target>_LLVM_ENABLE_RUNTIMES for every <target> in LLVM_RUNTIME_TARGETS.

Personally, I don't like splitting up the logic required to check if a runtime is enabled.

I'm still not especially enthused about this feature: it adds additional complexity to the runtimes build, and from what I can tell so far you can achieve the same result with the existing implementation; although that might require setting more options, I'm not sure it warrants the additional complexity. If there are specific things you believe aren't possible, it'd be great to provide an example so I can see if we could support those with the existing features.

Fundamentally, this comes down to the question of whether runtime=targets or target=runtimes is the more natural way to set these things. My perspective is that both have cases where they are the most convenient way to get the idea across. You can write each in terms of the other with enough effort and some fixes, but this seems to come down to the former being easier for my use-case and the latter being easier for your use-case. I personally would like to support both, as they are the easiest solution in different cases.

If we decide to go forward with this change regardless, I'd like you to set up a builder that exercises it, ideally both presubmit and postsubmit. The problem we already have is that the runtimes build has too many ways to spell the same thing, which is making it frustratingly difficult to make changes. For example, in the case of libc I have to verify that LLVM_ENABLE_PROJECTS=libc, LLVM_ENABLE_RUNTIMES=libc, and RUNTIMES_<target>_LLVM_ENABLE_RUNTIMES=libc still work; with this change there's yet another variation, and I don't have the resources to try every one of them every time I make some change to the build. I'd like to avoid a situation where I make a change and then hear back some time later that I broke a downstream user.

We will have an AMDGPU and an NVPTX builder that will use it, at a minimum. I think there was some talk with @jplehr about having some AMD presubmit support, but I don't think that's set up yet. This patch also somewhat reduces the complexity here by explicitly setting up a list of runtimes per target, I believe.

I'd also be curious to hear from @smeenai and @llvm-beanz as other reviewers of the runtimes build changes if they have other thoughts.

@JonChesterfield
Collaborator

Both runtime=targets and target=runtimes syntaxes look identically good/bad to me. I have to copy&paste cmake invocations from our documentation to get llvm to build anyway.

Keeping the current syntax and shipping the better behaviour today is much better than spending another week trying to make the syntax look different while the behavioural change bitrots.

@jhuber6
Contributor Author

jhuber6 commented Feb 20, 2024

So, I think my main problem with the other support is that it forces explicitly adding default to the list of enabled targets. As far as I'm aware, the vast majority of users want the default target when building. Maybe we could just have an alternate way to set them that implies default, so users don't need to change as much if they want to copy-paste something.

Maybe LLVM_EXTRA_RUNTIME_TARGETS?

@petrhosek
Member

So, I think my main problem with the other support is that it forces explicitly adding default to the list of enabled targets. As far as I'm aware, the vast majority of users want the default target when building. Maybe we could just have an alternate way to set them that implies default, so users don't need to change as much if they want to copy-paste something.

I'm already working on a change for that, since this is something I've wanted to address for a while; I should have a PR ready later today.

@jhuber6
Contributor Author

jhuber6 commented Feb 20, 2024

So, I think my main problem with the other support is that it forces explicitly adding default to the list of enabled targets. As far as I'm aware, the vast majority of users want the default target when building. Maybe we could just have an alternate way to set them that implies default, so users don't need to change as much if they want to copy-paste something.

I'm already working on a change for that, since this is something I've wanted to address for a while; I should have a PR ready later today.

Okay, I will await that one eagerly. If we can get something that makes it just require something like the following, I could wrap this into the existing LIBC_GPU_BUILD=ON to simply pre-populate it, as it would then no longer have an effect on the rest of the user's build.

-DRUNTIMES_nvptx64-nvidia-cuda_LLVM_ENABLE_RUNTIMES=libc
-DRUNTIMES_amdgcn-amd-amdhsa_LLVM_ENABLE_RUNTIMES=libc
-DLLVM_EXTRA_RUNTIME_TARGETS='amdgcn-amd-amdhsa;nvptx64-nvidia-cuda'
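
A minimal sketch of that pre-population idea, assuming the hypothetical LLVM_EXTRA_RUNTIME_TARGETS option discussed above actually existed (names and values illustrative):

# Hypothetical: what LIBC_GPU_BUILD=ON could pre-populate, so the GPU targets
# no longer affect the rest of the user's build.
if(LIBC_GPU_BUILD)
  list(APPEND LLVM_EXTRA_RUNTIME_TARGETS amdgcn-amd-amdhsa nvptx64-nvidia-cuda)
  set(RUNTIMES_amdgcn-amd-amdhsa_LLVM_ENABLE_RUNTIMES libc CACHE STRING "")
  set(RUNTIMES_nvptx64-nvidia-cuda_LLVM_ENABLE_RUNTIMES libc CACHE STRING "")
endif()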

@JonChesterfield JonChesterfield dismissed their stale review February 20, 2024 20:30

uninterested in contentious invocation syntax

jhuber6 added a commit to jhuber6/llvm-project that referenced this pull request Feb 22, 2024
jhuber6 added a commit to jhuber6/llvm-project that referenced this pull request Feb 22, 2024
jhuber6 added a commit that referenced this pull request Feb 22, 2024
Summary:
This is a massive patch because it reworks the entire build and
everything that depends on it. This is not split up because various bots
would fail otherwise. I will attempt to describe the necessary changes
here.

This patch completely reworks how the GPU build is built and targeted.
Previously, we used a standard runtimes build and handled both NVPTX and
AMDGPU in a single build via multi-targeting. This added a lot of
divergence in the build system and prevented us from doing various
things like building for the CPU / GPU at the same time, or exporting
the startup libraries or running tests without a full rebuild.

The new approach is to handle the GPU builds as strict cross-compiling
runtimes. The first step required
#81557 to allow the `LIBC`
target to build for the GPU without touching the other targets. This
means that the GPU uses all the same handling as the other builds in
`libc`.

The new expected way to build the GPU libc is with
`LLVM_LIBC_RUNTIME_TARGETS=amdgcn-amd-amdhsa;nvptx64-nvidia-cuda`.

The second step was reworking how we generated the embedded GPU library
by moving it into the library install step. Where we previously had one
`libcgpu.a` we now have `libcgpu-amdgpu.a` and `libcgpu-nvptx.a`. This
patch includes the necessary clang / OpenMP changes to make that not
break the bots when this lands.

We unfortunately still require that the NVPTX target has an `internal`
target for tests. This is because the NVPTX target needs to do LTO for
the provided version (The offloading toolchain can handle it) but cannot
use it for the native toolchain which is used for making tests.

This approach is vastly superior in every way, allowing us to treat the
GPU as a standard cross-compiling target. We can now install the GPU
utilities to do things like use the offload tests and other fun things.

Certain utilities need to be built with `--target=${LLVM_HOST_TRIPLE}` as
well. I think this is a fine workaround, as we will always assume that the
GPU `libc` is a cross-build with a functioning host.

Depends on #81557
@smeenai
Collaborator

smeenai commented Feb 23, 2024

Apologies for the radio silence here; I'm about to go on an extended leave soon and have had lots to wrap up before then. I generally agree with Petr that adding another configuration option (and in particular one that goes from project -> target, rather than target -> project) adds an extra testing burden that we'd ideally find a way to avoid. I haven't been able to look at the other diffs to weigh that against the simplifications permitted by this, though.

@jhuber6 jhuber6 closed this Apr 2, 2024