Closed
…-gpu into improve-2.0-dev-mpi
…<ConnKeyT, ConnStructT> derived classes, with specialized template class ConnectionTemplate<conn12b_key, conn12b_struct> 12 byte connections
…struct conn16b_struct (8 bytes)
…truct conn16b_struct (8 bytes)
…ak CUDA memory usage of each MPI process, total CUDA memory available for all MPI processes, free CUDA memory available for all MPI processes. Adapted MPI connections CUDA memory check scripts to run on terminal and to check used CUDA memory against theoretical prediction automatically
…-tidy. Prepared wrappers for clang-format and clang-tidy compatibility with CUDA / CUB headers and scripts to run formatting and checks on all source c++/cuda files.
…o find automatically CUDA and MPI default header paths or to accept user-defined path lists for include files, CUDA or MPI headers
…les with clang-format and check them with clang-tidy
…ce for connection structure is
… connection memory check through script benchmark_terminal.sh. Writing some comments on connection-related code.
…g on netgpu class parameters
gmtiddia
reviewed
Mar 25, 2024
Collaborator
This file should be removed
lucapontisso
reviewed
May 3, 2024
Collaborator
At line 39, the warning reads:
"Warning: number of bits representing node index is fixed to 32 and cannot be modified with conn16b connection type"
But at line 127:
`max_node_nbits_ = 31;`
lucapontisso
approved these changes
May 3, 2024
Implemented an abstract base class and a derived template class for connections, with two template specializations (12-byte and 16-byte connections). Reduced GPU memory overhead for remote connection creation. Improved the MPI remote connection memory checker with an automatic check and a summary of the results.
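For readers unfamiliar with the pattern, the following is a minimal host-side sketch of an abstract base class with a derived class template and two specializations. It is not the code from this pull request. Only the names `ConnectionTemplate`, `conn12b_key`, `conn12b_struct` and `conn16b_struct` appear in the commit messages; the base class name `Connection`, the type `conn16b_key`, all field layouts, and the helpers `bytesPerConnection`, `name` and `predictedBytes` are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical layouts: the real key/struct definitions may differ.
struct conn12b_key    { std::uint64_t bits; };               // assumed 8-byte key
struct conn12b_struct { std::uint32_t bits; };               // assumed 4-byte payload
struct conn16b_key    { std::uint64_t bits; };               // assumed 8-byte key
struct conn16b_struct { std::uint32_t a; std::uint32_t b; }; // 8-byte payload

// Abstract interface (hypothetical name), so calling code does not
// need to know which connection layout is in use.
class Connection {
public:
  virtual ~Connection() = default;
  virtual std::size_t bytesPerConnection() const = 0;
  virtual const char* name() const = 0;
};

// Derived class template: one instantiation per key/struct pair.
// Keys and payloads are assumed to live in separate arrays, so the
// per-connection footprint is the sum of the two sizes.
template <class ConnKeyT, class ConnStructT>
class ConnectionTemplate : public Connection {
public:
  std::size_t bytesPerConnection() const override {
    return sizeof(ConnKeyT) + sizeof(ConnStructT);
  }
  const char* name() const override; // defined per specialization below
};

// Explicit member specializations for the two connection types.
template <>
inline const char*
ConnectionTemplate<conn12b_key, conn12b_struct>::name() const {
  return "conn12b";
}

template <>
inline const char*
ConnectionTemplate<conn16b_key, conn16b_struct>::name() const {
  return "conn16b";
}

// Hypothetical helper: theoretical memory for n connections, the kind
// of prediction a memory check could compare against measured usage.
inline std::size_t predictedBytes(const Connection& c, std::size_t n) {
  return n * c.bytesPerConnection();
}
```

Under these assumed layouts, `ConnectionTemplate<conn12b_key, conn12b_struct>` reports 12 bytes per connection and the `conn16b` variant reports 16, so `predictedBytes` for one million connections gives 12 MB and 16 MB respectively. The virtual interface keeps the layout choice in one place, while the template avoids duplicating the shared logic for each connection type.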