[libc] Initial support for exhaustive math tests on the GPU #73720

jhuber6 · 2023-11-29T00:05:08Z

Summary:
We want to perform exhaustive math tests for the GPU implementations of
standard math functions to ensure that they are correct within some
bounds. The problem is that the current GPU test suite relies on using
RPC calls to perform host services. All the math implementations we can
compare against are implemented on the host CPU, so we cannot link
against MPFR or call the libc CPU math.

Due to the extreme specificity of this problem, I found it prudent to
make an entirely separate facility for exhaustive testing on the GPU.
This will use OpenMP to do host / device offloading so we can compute
on the GPU and copy it back to compare it against MPFR.

This works by manually inserting all the libc compiled dependncies
into the device portion of the compilation via -Xoffload-linker. The
downside is that this doesn't work currently on Nvidia because I cannot
use -Wl, due to default arguments, and it would need to be renamed
from foo.o to foo.cubin.

llvmbot · 2023-11-29T00:05:37Z

@llvm/pr-subscribers-libc

Author: Joseph Huber (jhuber6)

Changes

Summary:
We want to perform exhaustive math tests for the GPU implementations of
standard math functions to ensure that they are correct within some
bounds. The problem is that the current GPU test suite relies on using
RPC calls to perform host services. All the math implementations we can
compare against are implemented on the host CPU, so we cannot link
against MPFR or call the libc CPU math.

Due to the extreme specificity of this problem, I found it prudent to
make an entirely separate facility for exhaustive testing on the GPU.
This will use OpenMP to due host / device offloading so we can compute
on the GPU and copy it back to compare it against MPFR.

This works by manually inserting all the libc compiled dependncies
into the device portion of the compilation via -Xoffload-linker. The
downside is that this doesn't work currently on Nvidia because I cannot
use -Wl, due to default arguments, and it would need to be renamed
from foo.o to foo.cubin.

This is a WIP because the only test doesn't really do anything. Just
putting this up so the libc people can look at it.

Full diff: https://github.com/llvm/llvm-project/pull/73720.diff

8 Files Affected:

(modified) libc/cmake/modules/LLVMLibCCheckMPFR.cmake (-2)
(modified) libc/test/src/CMakeLists.txt (+1-1)
(modified) libc/test/src/math/CMakeLists.txt (+5)
(added) libc/test/src/math/gpu/CMakeLists.txt (+134)
(added) libc/test/src/math/gpu/Test.cpp (+15)
(modified) libc/utils/MPFRWrapper/CMakeLists.txt (+1-1)
(modified) openmp/libomptarget/DeviceRTL/CMakeLists.txt (+4-1)
(modified) openmp/libomptarget/test/lit.cfg (+2-2)

diff --git a/libc/cmake/modules/LLVMLibCCheckMPFR.cmake b/libc/cmake/modules/LLVMLibCCheckMPFR.cmake
index 9e361f5fd811289..46f679f1330d3ba 100644
--- a/libc/cmake/modules/LLVMLibCCheckMPFR.cmake
+++ b/libc/cmake/modules/LLVMLibCCheckMPFR.cmake
@@ -2,8 +2,6 @@ set(LLVM_LIBC_MPFR_INSTALL_PATH "" CACHE PATH "Path to where MPFR is installed (
 
 if(LLVM_LIBC_MPFR_INSTALL_PATH)
   set(LIBC_TESTS_CAN_USE_MPFR TRUE)
-elseif(LIBC_TARGET_ARCHITECTURE_IS_GPU)
-  set(LIBC_TESTS_CAN_USE_MPFR FALSE)
 else()
   try_compile(
     LIBC_TESTS_CAN_USE_MPFR
diff --git a/libc/test/src/CMakeLists.txt b/libc/test/src/CMakeLists.txt
index 52452cd1037dbfb..ae434e94632c405 100644
--- a/libc/test/src/CMakeLists.txt
+++ b/libc/test/src/CMakeLists.txt
@@ -8,7 +8,7 @@ function(add_fp_unittest name)
   )
 
   if(MATH_UNITTEST_NEED_MPFR)
-    if(NOT LIBC_TESTS_CAN_USE_MPFR)
+    if(NOT LIBC_TESTS_CAN_USE_MPFR OR LIBC_TARGET_ARCHITECTURE_IS_GPU)
       message(VERBOSE "Math test ${name} will be skipped as MPFR library is not available.")
       return()
     endif()
diff --git a/libc/test/src/math/CMakeLists.txt b/libc/test/src/math/CMakeLists.txt
index fcb47449748dcac..87869538653f58f 100644
--- a/libc/test/src/math/CMakeLists.txt
+++ b/libc/test/src/math/CMakeLists.txt
@@ -1722,3 +1722,8 @@ if(NOT LLVM_LIBC_FULL_BUILD)
   add_subdirectory(exhaustive)
   add_subdirectory(differential_testing)
 endif()
+
+# The GPU build uses special case exhaustive math tests.
+if(LIBC_TARGET_ARCHITECTURE_IS_GPU)
+  add_subdirectory(gpu)
+endif()
diff --git a/libc/test/src/math/gpu/CMakeLists.txt b/libc/test/src/math/gpu/CMakeLists.txt
new file mode 100644
index 000000000000000..19a02b2c2894327
--- /dev/null
+++ b/libc/test/src/math/gpu/CMakeLists.txt
@@ -0,0 +1,134 @@
+add_libc_exhaustive_testsuite(libc-math-gpu-exhaustive-tests)
+
+if(LIBC_GPU_TARGET_ARCHITECTURE_IS_NVPTX)
+  message(WARNING "Exhaustive GPU tests are not currently supported on NVPTX")
+  return()
+endif()
+
+if((NOT "openmp" IN_LIST LLVM_ENABLE_RUNTIMES) AND
+   (NOT "openmp" IN_LIST LLVM_ENABLE_PROJECTS))
+  message(STATUS "The 'openmp' runtime must be enabled to run exhaustive "
+                 "GPU tests.")
+  return()
+endif()
+
+# Attempt to locate the libraries required for offloading.
+if(TARGET omptarget.devicertl AND TARGET omptarget AND TARGET omp)
+  set(LIBC_OPENMP_RUNTIME omptarget.devicertl omptarget omp)
+else()
+  find_library(omptarget.devicertl NAMES omptarget.devicertl
+               PATHS ${LLVM_LIBRARY_OUTPUT_INTDIR} ${LLVM_LIBRARY_DIR}
+               NO_DEFAULT_PATH
+  )
+  find_library(omptarget NAMES omptarget
+               PATHS ${LLVM_LIBRARY_OUTPUT_INTDIR} ${LLVM_LIBRARY_DIR}
+               NO_DEFAULT_PATH
+  )
+  find_library(omp NAMES omp
+               PATHS ${LLVM_LIBRARY_OUTPUT_INTDIR} ${LLVM_LIBRARY_DIR}
+               NO_DEFAULT_PATH
+  )
+  if(NOT omptarget.devicertl OR NOT omptarget OR NOT omp)
+    message(WARNING "Could not find the OpenMP runtime for exhaustive tests")
+    return()
+  endif()
+  set(LIBC_OPENMP_RUNTIME ${omptarget.devicertl} ${omptarget} ${omp})
+endif()
+
+if(${CMAKE_HOST_SYSTEM_NAME} MATCHES "Linux")
+  list(APPEND LIBC_OPENMP_RUNTIME "-Wl,-rpath,${LLVM_LIBRARY_DIR}"
+                                  "-Wl,-rpath,${LLVM_LIBRARY_OUTPUT_INTDIR}")
+endif()
+
+function(add_gpu_exhaustive_test name)
+  cmake_parse_arguments(
+    "MATH_GPU_EXHAUSTIVE_TEST"
+    "NEED_MPFR" # Optional arguments
+    "" # Single value arguments
+    "SRCS;HDRS;DEPENDS;ARGS;ENV;COMPILE_OPTIONS;LINK_LIBRARIES" # Multi-value arguments
+    ${ARGN}
+  )
+
+  if(NOT MATH_GPU_EXHAUSTIVE_TEST_SRCS)
+    message(FATAL_ERROR "'add_gpu_exhaustive_test' target requires a SRCS list "
+                        "of .cpp files.")
+  endif()
+  if(NOT MATH_GPU_EXHAUSTIVE_TEST_DEPENDS)
+    message(FATAL_ERROR "'add_gpu_exhaustive_test' target requires a DEPENDS "
+                        "list of 'add_entrypoint_object' targets.")
+  endif()
+
+  if(MATH_GPU_EXHAUSTIVE_TEST_NEED_MPFR)
+    if(NOT LIBC_TESTS_CAN_USE_MPFR)
+      message(VERBOSE "Math test ${name} will be skipped as MPFR library is "
+                      "not available.")
+      return()
+    endif()
+  endif()
+
+
+  if(MATH_GPU_EXHAUSTIVE_TEST_NEED_MPFR)
+    list(APPEND MATH_GPU_EXHAUSTIVE_TEST_LINK_LIBRARIES libcMPFRWrapper)
+  endif()
+
+  get_fq_target_name(${name} fq_target_name)
+  get_fq_deps_list(fq_deps_list ${MATH_GPU_EXHAUSTIVE_TEST_DEPENDS})
+  list(REMOVE_DUPLICATES fq_deps_list)
+
+  get_object_files_for_test(
+      link_object_files skipped_entrypoints_list ${fq_deps_list})
+  if(skipped_entrypoints_list)
+    message(STATUS "Skipping unittest ${fq_target_name} as it has missing deps:"
+                   " ${skipped_entrypoints_list}.")
+    return()
+  endif()
+
+  foreach(link_object_file ${link_object_files})
+    list(APPEND MATH_GPU_EXHAUSTIVE_TEST_LINK_LIBRARIES 
+                "-Xoffload-linker ${link_object_file}")
+  endforeach()
+
+  set(fq_exhaustive_target_name ${fq_target_name}.__exhaustive__)
+  add_executable(
+    ${fq_exhaustive_target_name}
+    EXCLUDE_FROM_ALL
+    ${MATH_GPU_EXHAUSTIVE_TEST_SRCS}
+    ${MATH_GPU_EXHAUSTIVE_TEST_HDRS}
+  )
+  set(LIBC_GPU_TEST_OPTIONS -fopenmp -nogpulib -nogpuinc -foffload-lto
+                            -fopenmp-offload-mandatory
+                            --offload-arch=${LIBC_GPU_TARGET_ARCHITECTURE})
+
+  target_compile_options(
+    ${fq_exhaustive_target_name} 
+    PRIVATE ${LIBC_GPU_TEST_OPTIONS} -fpie -fno-exceptions -fno-rtti
+  )
+  target_link_libraries(
+    ${fq_exhaustive_target_name}
+    PRIVATE 
+    ${LIBC_OPENMP_RUNTIME} ${MATH_GPU_EXHAUSTIVE_TEST_LINK_LIBRARIES}
+    "--offload-link"
+  )
+
+  add_dependencies(${fq_exhaustive_target_name} ${fq_deps_list})
+  set_target_properties(${fq_exhaustive_target_name}
+    PROPERTIES RUNTIME_OUTPUT_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR})
+
+  add_custom_target(
+    ${fq_target_name}
+    COMMAND OMP_TARGET_OFFLOAD=MANDATORY 
+            $<TARGET_FILE:${fq_exhaustive_target_name}>
+    COMMAND_EXPAND_LISTS
+    COMMENT "Running exhaustive GPU test ${fq_target_name}"
+  )
+  add_dependencies(libc-math-gpu-exhaustive-tests ${fq_target_name})
+endfunction()
+
+add_gpu_exhaustive_test(
+  test
+  NEED_MPFR
+  SRCS
+    Test.cpp
+  DEPENDS
+    libc.src.math.truncf
+)
diff --git a/libc/test/src/math/gpu/Test.cpp b/libc/test/src/math/gpu/Test.cpp
new file mode 100644
index 000000000000000..3008840f7498bde
--- /dev/null
+++ b/libc/test/src/math/gpu/Test.cpp
@@ -0,0 +1,15 @@
+#include "src/math/truncf.h"
+#include <mpfr.h>
+#include <iostream>
+
+int main() {
+  mpfr_t x;
+  mpfr_init(x);
+  mpfr_clear(x);
+
+  float f = 1.5f;
+#pragma omp target map(tofrom : f)
+  { f = LIBC_NAMESPACE::truncf(f); }
+  std::cerr << f << "\n";
+  return 0;
+}
diff --git a/libc/utils/MPFRWrapper/CMakeLists.txt b/libc/utils/MPFRWrapper/CMakeLists.txt
index 416307a20d7d181..7d3b46df369ca20 100644
--- a/libc/utils/MPFRWrapper/CMakeLists.txt
+++ b/libc/utils/MPFRWrapper/CMakeLists.txt
@@ -21,6 +21,6 @@ if(LIBC_TESTS_CAN_USE_MPFR)
     target_link_directories(libcMPFRWrapper PUBLIC ${LLVM_LIBC_MPFR_INSTALL_PATH}/lib)
   endif()
   target_link_libraries(libcMPFRWrapper LibcFPTestHelpers.unit LibcTest.unit mpfr gmp)
-elseif(NOT LIBC_TARGET_ARCHITECTURE_IS_GPU)
+else()
   message(WARNING "Math tests using MPFR will be skipped.")
 endif()
diff --git a/openmp/libomptarget/DeviceRTL/CMakeLists.txt b/openmp/libomptarget/DeviceRTL/CMakeLists.txt
index df8e4a5126fd443..5a5b882b3d08330 100644
--- a/openmp/libomptarget/DeviceRTL/CMakeLists.txt
+++ b/openmp/libomptarget/DeviceRTL/CMakeLists.txt
@@ -301,7 +301,10 @@ endforeach()
 
 # Archive all the object files generated above into a static library
 add_library(omptarget.devicertl STATIC)
-set_target_properties(omptarget.devicertl PROPERTIES LINKER_LANGUAGE CXX)
+set_target_properties(omptarget.devicertl PROPERTIES
+                      LINKER_LANGUAGE CXX
+                      ARCHIVE_OUTPUT_DIRECTORY ${LLVM_LIBRARY_OUTPUT_INTDIR}
+)
 target_link_libraries(omptarget.devicertl PRIVATE omptarget.devicertl.all_objs)
 
 install(TARGETS omptarget.devicertl ARCHIVE DESTINATION ${OPENMP_INSTALL_LIBDIR})
diff --git a/openmp/libomptarget/test/lit.cfg b/openmp/libomptarget/test/lit.cfg
index 6dab31bd35a9f31..fb221041b4be5a1 100644
--- a/openmp/libomptarget/test/lit.cfg
+++ b/openmp/libomptarget/test/lit.cfg
@@ -156,8 +156,8 @@ def remove_suffix_if_present(name):
 def add_libraries(source):
     if config.libomptarget_has_libc:
         return source + " " + config.llvm_library_dir + "/libcgpu.a " + \
-               config.library_dir + "/libomptarget.devicertl.a"
-    return source + " " + config.library_dir + "/libomptarget.devicertl.a"
+               config.llvm_library_dir + "/libomptarget.devicertl.a"
+    return source + " " + config.llvm_library_dir + "/libomptarget.devicertl.a"
 
 # substitutions
 # - for targets that exist in the system create the actual command.

github-actions · 2023-11-29T00:07:26Z

:white_check_mark: With the latest revision this PR passed the C/C++ code formatter.

Summary: We want to perform exhaustive math tests for the GPU implementations of standard math functions to ensure that they are correct within some bounds. The problem is that the current GPU test suite relies on using RPC calls to perform host services. All the math implementations we can compare against are implemented on the host CPU, so we cannot link against MPFR or call the `libc` CPU math. Due to the extreme specificity of this problem, I found it prudent to make an entirely separate facility for exhaustive testing on the GPU. This will use OpenMP to due host / device offloading so we can compute on the GPU and copy it back to compare it against MPFR. This works by manually inserting all the `libc` compiled dependncies into the device portion of the compilation via `-Xoffload-linker`. The downside is that this doesn't work currently on Nvidia because I cannot use `-Wl,` due to default arguments, and it would need to be renamed from `foo.o` to `foo.cubin`.

arsenm

I assume this needs a rebase at this point

arsenm · 2024-02-06T13:29:20Z

libc/test/src/math/gpu/exhaustive_test.h

+
+      FloatType *ptr = data.data();
+      // Fill the buffer with the computed results from the GPU.
+#pragma omp target teams distribute parallel for map(from : ptr[0 : BLOCK_SIZE])


loop pragma all the way to the left is weird looking

I think that's just how clang-format treats pragmas.

and it's just wrong

jhuber6 requested a review from lntue November 29, 2023 00:05

llvmbot added libc openmp:libomptarget OpenMP offload runtime labels Nov 29, 2023

jhuber6 force-pushed the ExhaustiveGPUMath branch 5 times, most recently from 91c4f18 to 2872d35 Compare November 29, 2023 21:31

jhuber6 changed the title ~~[libc][WIP] Initial support for exhaustive math tests on the GPU~~ [libc] Initial support for exhaustive math tests on the GPU Nov 29, 2023

jhuber6 requested review from arsenm, Artem-B, jdoerfert, shiltian, JonChesterfield and michaelrj-google November 29, 2023 21:32

jhuber6 force-pushed the ExhaustiveGPUMath branch 2 times, most recently from 8486911 to 28d049a Compare November 30, 2023 14:36

jhuber6 force-pushed the ExhaustiveGPUMath branch from 28d049a to 4984049 Compare December 11, 2023 13:57

arsenm requested changes Feb 6, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[libc] Initial support for exhaustive math tests on the GPU #73720

[libc] Initial support for exhaustive math tests on the GPU #73720

jhuber6 commented Nov 29, 2023 •

edited by arsenm

Loading

llvmbot commented Nov 29, 2023

github-actions bot commented Nov 29, 2023 •

edited

Loading

arsenm left a comment

arsenm Feb 6, 2024

jhuber6 Feb 6, 2024

arsenm Feb 6, 2024

[libc] Initial support for exhaustive math tests on the GPU #73720

Are you sure you want to change the base?

[libc] Initial support for exhaustive math tests on the GPU #73720

Conversation

jhuber6 commented Nov 29, 2023 • edited by arsenm Loading

llvmbot commented Nov 29, 2023

github-actions bot commented Nov 29, 2023 • edited Loading

arsenm left a comment

Choose a reason for hiding this comment

arsenm Feb 6, 2024

Choose a reason for hiding this comment

jhuber6 Feb 6, 2024

Choose a reason for hiding this comment

arsenm Feb 6, 2024

Choose a reason for hiding this comment

jhuber6 commented Nov 29, 2023 •

edited by arsenm

Loading

github-actions bot commented Nov 29, 2023 •

edited

Loading