Add Memory Selection to SST, and prototype a mechanism for running st… #3823
@eisenhauer Here is what I use to reproduce github actions builds locally. Please feel free to reach out if you run into trouble.

Steps

Make a root directory where you'll put the source trees and do the builds, this will be mounted as

    mkdir <path-to-some-working-dir>

Check out the code you want to test

Source is checked out twice in ci, once from your PR changes, once from

    cd <path-to-some-working-dir>
    git clone <path-to-your-adios2-source> gha
    git clone <path-to-your-adios2-source> source

Run the docker container

Run the container where the name matches the compiler you want to test. Note that images for the different gcc/clang compilers are based on the same underlying image where all three spack environments are available. So using the

    docker run --rm -v <path-to-some-working-dir>:/builds -ti adios2:ci-spack-ubuntu20.04-gcc10

Once you're inside the container with the compiler you want to test, you have to set a bunch of variables as github actions would do based on the yaml.

Set the parallel value

Pick one of these based on what you want to test:

    export GH_YML_MATRIX_PARALLEL=serial
    export GH_YML_MATRIX_PARALLEL=mpi
    export GH_YML_MATRIX_PARALLEL=mpich

Set the compiler

Here, you should choose from these, and make sure to match the container you're running:

    export GH_YML_MATRIX_COMPILER=gcc8
    export GH_YML_MATRIX_COMPILER=gcc9
    export GH_YML_MATRIX_COMPILER=gcc10
    export GH_YML_MATRIX_COMPILER=gcc11
    export GH_YML_MATRIX_COMPILER=clang6
    export GH_YML_MATRIX_COMPILER=clang10
    export GH_YML_MATRIX_COMPILER=oneapi
    export GH_YML_MATRIX_COMPILER=icc

Set a branch name

This build/test will get reported to cdash, so pick a branch name that can help you identify what you were testing when you look at the results in cdash, e.g.:

    export GITHUB_REF_NAME="mpich_perf_testing_ch4_ofi"

Do the build/test

The remaining steps set a bunch of variables, and eventually do the configure/build/test steps as they would be done in gha:

    cd /builds
    export GITHUB_WORKSPACE=/builds
    export GITHUB_PATH=/dev/null
    export GITHUB_JOB=test
    export GITHUB_EVENT_NAME="test_pull_request"
    export GH_YML_BASE_OS=ubuntu
    export RUNNER_TEMP="/"
    export GH_YML_MATRIX_OS=ubuntu20.04
    export GH_YML_JOBNAME=${GH_YML_MATRIX_OS}-${GH_YML_MATRIX_COMPILER}-${GH_YML_MATRIX_PARALLEL}
    gha/scripts/ci/gh-actions/linux-setup.sh
    cp /.local/bin/ninja /usr/bin/ninja
    gha/scripts/ci/gh-actions/run.sh update
    gha/scripts/ci/gh-actions/run.sh configure
    gha/scripts/ci/gh-actions/run.sh build
    gha/scripts/ci/gh-actions/run.sh test
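Since a typo in the parallel or compiler value only shows up once the CI scripts run, it may help to validate both before composing the job name. The following is a minimal sketch, not part of the ADIOS2 CI scripts: the function names `is_one_of` and `make_jobname` are hypothetical, and the accepted values are simply copied from the lists above.

```shell
#!/usr/bin/env bash
# Sketch: validate the matrix values before composing GH_YML_JOBNAME.
# is_one_of and make_jobname are hypothetical helpers, not ADIOS2 scripts.

is_one_of() {
    # usage: is_one_of VALUE CANDIDATE...
    # Succeeds if VALUE equals one of the candidates.
    local value
    local candidate
    value="$1"
    shift
    for candidate in "$@"; do
        if [ "$value" = "$candidate" ]; then
            return 0
        fi
    done
    return 1
}

make_jobname() {
    # usage: make_jobname OS COMPILER PARALLEL
    # Prints OS-COMPILER-PARALLEL, or fails if a value is not in the lists above.
    if ! is_one_of "$3" serial mpi mpich; then
        echo "unknown parallel value: $3" >&2
        return 1
    fi
    if ! is_one_of "$2" gcc8 gcc9 gcc10 gcc11 clang6 clang10 oneapi icc; then
        echo "unknown compiler value: $2" >&2
        return 1
    fi
    printf '%s-%s-%s\n' "$1" "$2" "$3"
}
```

Inside the container this could replace the plain assignment, for example `export GH_YML_JOBNAME=$(make_jobname "$GH_YML_MATRIX_OS" "$GH_YML_MATRIX_COMPILER" "$GH_YML_MATRIX_PARALLEL")`. It does not check that the compiler matches the container image you started.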
@scottwittenburg So, we've sorted out the problem and I'm going to merge this PR, but the nature of the problem may point to other issues.

When we have an MPI-enabled build, we build both serial and MPI versions of the ADIOS library (and of many of the tests in testing/engines/bp). However, we only ever build one version of the SST runtime (the bulk of the SST engine), which is its own library largely for historical reasons. So even if we have a serial application, built and linked with a serial version of the ADIOS library, SST doesn't know that. Specifically, in this circumstance SST is built with and depends upon MPI, and it also considers @vicentebolea's MPI data plane a viable possibility. The deadlock I was seeing happened because the MPI data plane was trying to initialize MPI deep in the data transport, having noticed that it hadn't been initialized at the application level (this didn't go well).

The fix for this PR was to always use the MPI version of the test if it had been built, even if we were only doing a 1-to-1 SST test. But I'm wondering if this is sufficient. Because ADIOS always depends upon SST (unless disabled) and SST always depends upon MPI if it's present, building a non-MPI version of the higher-level ADIOS library to avoid an MPI dependency seems moot, because it'll inherit that dependency from SST.

Should we be building serial and MPI versions of SST as well? Is it time to abandon the SST-is-a-separate-library thing (there's no good reason to keep it that way)? Or is there some advantage in the current situation, because a one-rank application might use the MPI data plane to connect to an MPI application? (If so, we should test this and make sure it works rather than deadlocking on some platforms.) Anyhow, something to discuss, perhaps when @vicentebolea is back.

I believe that Chuck did a lot of the work behind building serial and MPI versions of ADIOS, and I'm not sure of the reasoning behind not extending that to SST (and the MPI data plane was not a wrinkle that existed at the time).
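One way to check the inherited-dependency concern on a given build is to look at what the dynamic linker reports for the serial library. The sketch below is only a heuristic under stated assumptions: `links_mpi` is a hypothetical helper, it matches dependency names starting with `libmpi` (which also covers `libmpich`), and it only applies to shared-library builds inspected with `ldd`.

```shell
#!/usr/bin/env bash
# Sketch: decide from `ldd`-style output whether a library depends on MPI.
# links_mpi is a hypothetical helper; the name match is a heuristic, not a guarantee.

links_mpi() {
    # Reads `ldd` output on stdin.
    # Succeeds if any listed dependency name starts with "libmpi" (includes libmpich).
    grep -Eq '(^|[[:space:]/])libmpi[^[:space:]]*\.so'
}
```

For example, `ldd <path-to-serial-adios2-library> | links_mpi && echo "MPI dependency present"`, where the library path is whatever your install tree contains. Because `ldd` resolves transitive dependencies, an MPI library pulled in only through the SST runtime would still be listed.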
@vicentebolea this is the PR we're talking about
Add Memory Selection to SST, and prototype a mechanism for running st… (cherry picked from commit c503940)
Merge pull request #3823 from eisenhauer/SstMemSel
* release_29: (29 commits) Bump version to v2.9.2 ci: update number of task for mpich build clang-format: Correct format to old style Merge pull request #3878 from anagainaru/test-null-blocks Merge pull request #3588 from vicentebolea/fix-mpi-dp bp5: make RecMap an static anon namespaced var Replace LookupWriterRec's linear search on RecList with an unordered_map. For 250k variables, time goes from 21sec to ~1sec in WSL. The order of entries in RecList was not necessary for the serializer to work correctly. (#3877) Fix data length calculation for hash (#3875) Merge pull request #3823 from eisenhauer/SstMemSel gha,ci: update checkout to v4 Blosc2 USE ON: Fix Module Fallback cmake: correct prefer_shared_blosc behavior cmake: correct info.h installation path ci: disable MGARD static build operators: fix module library ci: add downloads readthedocs cmake: Add Blosc2 2.10.1 compatibility. Fix destdir install test (#3850) cmake: update minimum cmake to 3.12 (#3849) MPI: add timeout for conf test for MPI_DP (#3848) ...
…andard BP file tests in staging-common with SST