Fix the HIP CMAKE issues to run on Frontier #3931

Merged
merged 1 commit into from
Nov 28, 2023

Conversation

anagainaru
Contributor

Using the crusher script, everything builds well, but the HIP architecture is not detected correctly.

Building with ZeroMQ (module load libzmq/4.3.4 or module load libzmq/4.3.3) gives errors:

 Enabled Kokkos devices: SERIAL;HIP
 CMake Error: Could not find cmake module file: CMakeDetermineHIPCompiler.cmake
 CMake Error: Error required internal CMake variable not set, cmake may not be built correctly.
 Missing variable is:
 CMAKE_HIP_COMPILER_ENV_VAR
 CMake Error: Could not find cmake module file:
 /lustre/orion/csc303/proj-shared/againaru/ADIOS2/build-kokkos-frontier/adios2/CMakeFiles/3.20.4/CMakeHIPCompiler.cmake
 CMake Error: Could not find cmake module file: CMakeHIPInformation.cmake
 CMake Error: Could not find cmake module file: CMakeTestHIPCompiler.cmake
 Found ZeroMQ: /sw/frontier/spack-envs/base/opt/linux-sles15-x86_64/gcc-7.5.0/libzmq-4.3.3-zcvpozwnnvjwplgaqootwhpcbgi3uhwc/lib/libzmq.so (found suitable version "4.3.3", minimum required is "4.1")
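
One detail in the log above is the path component CMakeFiles/3.20.4/, which suggests that this configure step ran with CMake 3.20.4. HIP became a first-class CMake language in 3.21, so missing CMakeDetermineHIPCompiler.cmake and CMakeHIPInformation.cmake modules would be consistent with an older cmake being picked up from PATH. A minimal sketch for checking this, assuming GNU sort (for -V) is available; the helper name cmake_supports_hip is hypothetical and not part of ADIOS2 or the crusher script:

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) when the given CMake version is
# at least 3.21, the first release that ships the HIP language modules.
cmake_supports_hip() {
    min="3.21"
    # sort -V orders version strings; if $min sorts first, then $1 >= $min.
    lowest=$(printf '%s\n%s\n' "$min" "$1" | sort -V | head -n 1)
    [ "$lowest" = "$min" ]
}

# Check the cmake that is actually first on PATH (skipped if none is found).
if command -v cmake >/dev/null 2>&1; then
    ver=$(cmake --version | head -n 1 | awk '{print $3}')
    if cmake_supports_hip "$ver"; then
        echo "cmake $ver at $(command -v cmake): HIP language modules expected"
    else
        echo "cmake $ver at $(command -v cmake): too old for enable_language(HIP)"
    fi
fi
```

Running this once before and once after the libzmq module is loaded would show whether the module changes which cmake is found first.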

I will update the PR when I figure this out (@vicentebolea do you have any idea?).

@vicentebolea
Collaborator

Is rocm loaded? If so, maybe update its version. Also, what cmake version are we running?

@vicentebolea
Collaborator

Also try this: cmake --system-information | grep -i hip

@anagainaru
Contributor Author

anagainaru commented Nov 28, 2023

@vicentebolea I didn't see your answer, sorry.

I am using the crusher script, so I am loading the following modules:

module load rocm/5.4.0
module load craype-accel-amd-gfx90a
module load gcc/11.2.0
module load cmake/3.23.2

Everything seems fine if I use only these modules, but if I also add module load libzmq/4.3.4, I can no longer compile the HIP backend or the examples.
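
One way to narrow down what the extra module changes is to snapshot the environment before and after loading it and compare the two. The following is only a sketch: env_changes is a hypothetical helper, the snapshot file names are placeholders, and the module commands are left as comments because they only exist on the cluster.

```shell
#!/bin/sh
# Hypothetical helper: print the lines that appear only in the second
# environment snapshot, i.e. variables that are new or whose value changed.
env_changes() {
    sort "$1" > "$1.sorted"
    sort "$2" > "$2.sorted"
    # comm -13 suppresses lines unique to file 1 and lines common to both.
    comm -13 "$1.sorted" "$2.sorted"
    rm -f "$1.sorted" "$2.sorted"
}

# Intended use on the build host (commented out; needs the module system):
#   env > before.txt
#   module load libzmq/4.3.4
#   env > after.txt
#   env_changes before.txt after.txt | grep -E '^(PATH|CMAKE_PREFIX_PATH|CRAY_)'
```

Changes to PATH or CMAKE_PREFIX_PATH in the output would be the first things to look at, since they can alter which cmake and which compilers are found.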

$ cmake --system-information | grep -i hip
CMake Error at /autofs/nccs-svm1_sw/frontier/spack-envs/base/opt/linux-sles15-x86_64/gcc-7.5.0/cmake-3.23.2-4r4mpiba7cwdw2hlakh5i7tchi64s3qd/share/cmake-3.23/Modules/CMakeTestCCompiler.cmake:69 (message):
  The C compiler

    "/opt/cray/pe/craype/2.7.19/bin/cc"

  is not able to compile a simple test program.

  It fails with the following output:

    Change Dir: /lustre/orion/csc303/proj-shared/againaru/ADIOS2/__cmake_systeminformation/CMakeFiles/CMakeTmp

    Run Build Command(s):/usr/bin/gmake -f Makefile cmTC_e7089/fast && /usr/bin/gmake  -f CMakeFiles/cmTC_e7089.dir/build.make CMakeFiles/cmTC_e7089.dir/build
    gmake[1]: Entering directory '/lustre/orion/csc303/proj-shared/againaru/ADIOS2/__cmake_systeminformation/CMakeFiles/CMakeTmp'
    Building C object CMakeFiles/cmTC_e7089.dir/testCCompiler.c.o
    /opt/cray/pe/craype/2.7.19/bin/cc    -o CMakeFiles/cmTC_e7089.dir/testCCompiler.c.o -c /lustre/orion/csc303/proj-shared/againaru/ADIOS2/__cmake_systeminformation/CMakeFiles/CMakeTmp/testCCompiler.c
    Error:
    Unable to determine compiler version.
    Make sure that a cray module is loaded and that CRAY_CC_VERSION is defined
    gmake[1]: *** [CMakeFiles/cmTC_e7089.dir/build.make:78: CMakeFiles/cmTC_e7089.dir/testCCompiler.c.o] Error 255
    gmake[1]: Leaving directory '/lustre/orion/csc303/proj-shared/againaru/ADIOS2/__cmake_systeminformation/CMakeFiles/CMakeTmp'
    gmake: *** [Makefile:127: cmTC_e7089/fast] Error 2





  CMake will not be able to correctly generate this project.
Call Stack (most recent call first):
  CMakeLists.txt:6 (project)


Error: --system-information failed on internal CMake!
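
The compiler wrapper error above says that CRAY_CC_VERSION must be defined, so cmake --system-information fails before it gets as far as reporting anything about HIP. A small pre-flight check can surface this earlier; this is a sketch only, and require_env is a hypothetical helper name rather than anything provided by the Cray environment or ADIOS2:

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if every named variable is set and
# non-empty; names of missing variables are reported on stderr.
require_env() {
    missing=0
    for name in "$@"; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# The variable name comes from the error message in the log above.
if require_env CRAY_CC_VERSION; then
    echo "Cray compiler environment looks set; safe to run cmake"
else
    echo "load a cray module before running cmake" >&2
fi
```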

@eisenhauer eisenhauer merged commit 49158d0 into ornladios:master Nov 28, 2023
34 checks passed
@anagainaru anagainaru deleted the hip-bug branch December 2, 2023 16:18
pnorbert added a commit to pnorbert/ADIOS2 that referenced this pull request Dec 7, 2023
* master:
  Update readme for heat transfer example with new location and build instructions
  Ignore tests with defects for now
  Adapt libfabric dataplane of SST to Cray CXI provider (ornladios#3672)
  ci: fix path to lsan suppressions, fix broken gh status post
  Use adios2_mode_readRandomAccess in matlab open to make it work for BP5 (ornladios#3956)
  Add Global Array Capabilities and Limitations
  Add Section for Anatomy of an ADIOS Program
  Enable Shell-Check for gh-actions scripts
  Enable Shell-Check for circle CI scripts
  Enable Shell-Check for tau contract scripts
  Enable Shell-Check for scorpio contract scripts
  Enable Shell-Check for lammps contract scripts
  Delete VTK code in examples
  Fix MATLAB bindings for MacOS (ornladios#3950)
  Set the compiler for the Kokkos DataMan example to what is used to build Kokkos
  Fix the HIP architecture CMAKE variable (ornladios#3931)
  perfstubs 2023-11-27 (845d0702) (ornladios#3944)
  Revert "Only rank 0 should print the initialization message in perfstub"
3 participants