Build and include C++ standard library #1603

Open
wants to merge 184 commits into main

Conversation

@bettinaheim (Collaborator) commented May 2, 2024

This PR changes the CUDA-Q assets to build and use the LLVM C++ standard library (libc++) instead of the GNU C++ standard library (libstdc++). The assets are currently only used to build the installer; the Docker image and Python wheels are built as before.

RFC: #1819

  • As can be seen in the updated installer validation, with this change there is no longer a need to install any C++ tools in addition to CUDA-Q; the CUDA-Q installer now provides a self-contained installation, except for glibc, which has to be picked up from the operating system.
  • The changes should also make it easier to build Mac wheels in the future. I haven't tried it yet, but I've added links that can serve as a good starting point to look into that.
  • I am now also building OpenMP support in the custom-built toolchain, finally addressing Clang builds and OpenMP #175 (not yet done: building a GPU-offload-capable toolchain).
  • The PR fixes Execution hangs on OpenSUSE when installed via installer #1874 (discovered as part of validating this PR)
  • The PR also fixes Nlopt tests are failing when building the cross-plat binaries #1103 and reenables these tests.
  • Both static and dynamic libraries are built for the runtime, and the static libc++ is configured to be fully self-contained (all dependencies are included) and hermetically closed (symbols are not exposed, if LLVM is to be trusted).
  • This approach gives us much more freedom to update the LLVM version and C++ standard independently of their support within CUDA; we don't necessarily have to support older C++ standards in the future, since we ship all necessary C++ tools with CUDA-Q.

There are a number of edits I needed to make across the board to make this work, including:

  • splitting out the CUDA + thrust code into its own library that is compiled with nvcc; we use gcc as the host compiler for that one (intentionally, since the llvm/clang version we use for the other CUDA-Q libs will not be supported by CUDA once we update the LLVM commit). This requires a proper C-like interface to call into it, along with the proper data conversions, e.g. for vectors (see the first sketch after this list).
  • being careful with exceptions; I opted to statically link the CUDA-Q libraries in the installer against libc++. This gives us a nice clean setup where source code can link against these even if it is built with gcc (the libraries are not exactly small, but not horribly big either, so I think that's fine). However, since the setup intentionally does not expose any symbols, exceptions within each library are specific to that library and won't be caught via type matching across the library boundary. Such exceptions can be caught via a catch-all, and information about their type is retrieved via __cxxabi calls when necessary (see the second sketch after this list). Exception handling within each library works as before (exceptions are enabled in the build).
  • a small LLVM patch (details are in the README under tpls/customizations/llvm)
  • a small Crow patch (details are in the README under tpls/customizations/Crow)
  • using a clang++.cfg file so that the shipped clang++ links its own libc++ by default and adds the necessary paths as rpath to built binaries
  • manually invoking the cmake_install.cmake for the builtins; this looks like a bug in the LLVM build setup - the builtins get installed perfectly fine if compiler-rt is defined as a project, but not if it is built as a runtime
  • building OpenMP as a project (building it as a runtime won't build libomp.so) - OpenMP doesn't seem to define distribution components; I hence chose to build the omp target and then install the necessary components manually by invoking the corresponding cmake_install
  • disabling the build of any of the sanitizer or profiling runtime libraries - more work may be needed to build these without accidentally picking up a gcc_s dependency
  • removing include(HandleLLVMOptions) from the top-level CMakeLists.txt; this handles options for the LLVM build and should only be included when directly referencing the MLIR or LLVM headers. In particular, it potentially adds C++ compiler and linker flags that are plain wrong for some projects in the build (e.g. for the CUDA host compiler)
  • using llvm-config.h to identify the correct LLVM_TARGET_TRIPLE, which is needed to fix the include paths for cudaq-quake
  • building the CUDA kernels used for the statevector simulators separately, according to the guidance I added to the docs; the scripts are set up such that any CUDA-compatible host compiler, or the clang version included with CUDA-Q (unless it is too new...), can be used as the host compiler.
  • building OpenSSL with zlib support (we depend on zlib anyway, so I figured I might as well enable it; no strong preference, though)
  • making minor updates where libc++ doesn't support some of the constructs used in the code
  • fixing some includes
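
To illustrate the C-like interface mentioned in the first bullet above, here is a minimal sketch of how the boundary between the libc++-built CUDA-Q code and the nvcc/gcc-built CUDA + thrust library could look; the names and signatures are hypothetical and not taken from the actual sources.

```cpp
// Hypothetical boundary between the libc++-built CUDA-Q libraries and the
// nvcc/gcc-built CUDA + thrust library: only C-compatible data (raw pointer
// plus size) crosses it, so the two C++ standard libraries never need to
// agree on the layout of std::vector.
#include <complex>
#include <cstddef>
#include <stdexcept>
#include <vector>

extern "C" {
// Implemented in the nvcc-compiled library (gcc host compiler, libstdc++).
// Scales the given state data in place and returns 0 on success.
int cudaq_scale_state(std::complex<double> *data, std::size_t size, double factor);
}

// Thin wrapper on the libc++ side: converts the std::vector into a
// pointer/size pair before calling across the boundary.
inline void scale_state(std::vector<std::complex<double>> &state, double factor) {
  if (cudaq_scale_state(state.data(), state.size(), factor) != 0)
    throw std::runtime_error("failed to scale the state vector");
}

// Reference implementation so this sketch is self-contained; in the real
// setup this would live in the separately compiled nvcc library.
extern "C" int cudaq_scale_state(std::complex<double> *data, std::size_t size, double factor) {
  for (std::size_t i = 0; i < size; ++i)
    data[i] *= factor;
  return 0;
}

int main() {
  std::vector<std::complex<double>> state{{1.0, 0.0}, {0.0, 1.0}};
  scale_state(state, 0.5);
}
```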
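
For the exception handling described in the second bullet, here is a small self-contained sketch of the catch-all pattern: a typed catch clause would not match an exception whose type_info is private to a hidden-symbol library, but the type name can still be recovered via the __cxxabi interface. The throwing function below is a stand-in, not CUDA-Q code.

```cpp
#include <cxxabi.h>
#include <cstdlib>
#include <iostream>
#include <memory>
#include <stdexcept>
#include <typeinfo>

// Stand-in for a call into a library that was statically linked against
// libc++ with hidden symbols; in that setup the exception's type_info is
// not shared with the caller, so a typed catch clause would not match.
void call_into_hidden_library() { throw std::runtime_error("simulated failure"); }

int main() {
  try {
    call_into_hidden_library();
  } catch (...) {
    // A catch-all still works; recover the dynamic type via the C++ ABI.
    if (auto *type = abi::__cxa_current_exception_type()) {
      int status = 0;
      std::unique_ptr<char, void (*)(void *)> name(
          abi::__cxa_demangle(type->name(), nullptr, nullptr, &status), std::free);
      std::cerr << "caught exception of type "
                << (status == 0 ? name.get() : type->name()) << "\n";
    }
  }
}
```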

Other changes made:

  • Updated the docs to reflect the changes. The guidance given for building CUDA libraries that can be used with CUDA-Q also applies when building against libstdc++.
  • Split out the LLVM options into a CMake cache file for better maintainability
  • Enabled building tests, specifically the mlir tests, when building Python wheels (additional work is needed to properly run these tests in all relevant pipelines, see also [Tests] [Python] Follow-up for 'pycudaq-mlir' tests #1532)
  • Added a check during the build to confirm that the built libraries do not depend on GCC
  • Cleaned up the devdeps Docker image; there is not really a reason to manually build bits and pieces rather than using the script to build all prerequisites at once
  • Removed the CUDAQ_BUILD_RELOCATABLE_PACKAGE option; this was never fully functional and needs to be revised in any case
  • Updated the build scripts to work with dnf as well as apt
  • Updated the Grover example, since it seems to have been misconfigured such that it produced garbage
  • Made OpenMP required in the build by default, with an opt-out via the CUDAQ_DISABLE_OPENMP environment variable

A couple of other comments:

  • I configured the build such that the entire toolchain we build has no gcc dependency, not even gcc_s/libatomic. LLVM has its own implementation of that logic in the builtins and we are using that instead. However, while using CUDA and cuQuantum libraries works perfectly fine, most of them do depend on gcc_s (the exception being cudart itself).
  • The PR removes the LLVM pipeline from the dev environment and Docker image builds. This is now redundant and fully replaced by the new setup already used in the installer.
  • A couple of C++ llvm-lit tests are still failing, since updates to the bridge need to be made to properly support libc++. Most tests pass, however, including those for GPU-based components. The failing tests are tracked in Failing tests when building against libc++ #1712.
  • This change effectively removes (no longer runs) a good part of the C++17 tests, since C++20 can now be used even in environments that otherwise do not have C++20 support available.
  • While checking the installer, I realized we had this pre-existing bug: Installer not working as documented with CUDA 12 #1718. I thought I had tested this manually prior to 0.6.0, but I can't be sure anymore. I re-checked the 0.6.0 release, and the issue exists there as well (0.6.0 was the first release of the installer).

Pipelines:



github-actions bot commented Jul 2, 2024

CUDA Quantum Docs Bot: A preview of the documentation can be found here.

@bmhowe23 (Collaborator) left a comment

Since these changes touch so many parts of the build process, I would suggest we consider doing some additional testing on top of the standard per-PR CI testing. Here's what I was thinking (below). Maybe you have already done some of these.

  1. Go through the full deployment and publishing pipeline with this branch to verify everything works across all the GPU backends, perhaps creating an experimental branch to help facilitate the process.
  2. Perform some sort of throughput testing to verify that our throughput doesn't significantly change.
  3. Verify that this is compatible with PR Fix LLVM aarch64 relocation overflow #1444 (speaking of which - which PR do you think should be merged first? 1444 or this one?)
  4. Probably some standalone developer testing on Gorby.
  5. Potentially some hardware provider testing.

Labels: breaking change (Change breaks backwards compatibility)