Improve documentation of the coll tuned component #12641

Closed
burlen opened this issue Jun 26, 2024 · 0 comments · Fixed by #12642

Comments

burlen commented Jun 26, 2024

Is your feature request related to a problem? Please describe.
The coll tuned component has sophisticated tuning capabilities, yet these are scarcely documented. The thresholds used to select a collective algorithm are based on experience with existing systems, so users often need to tune when running on new systems. Tuning may also be beneficial in general, since the thresholds reflect average experience across a variety of systems and may not be the best for an individual system. Understanding how to use the tuning features required a lot of digging through the source code and issue tracker, plus trial and error.

Describe the solution you'd like
A small amount of documentation in the user guide that summarizes key information can make tuning far easier for those new to it.

Describe alternatives you've considered
The current lack of documentation makes this important feature difficult to use.

Additional context
See also #8157, #7672, #12589, #12547, #12453 among others

burlen pushed a commit to burlen/ompi that referenced this issue Jun 26, 2024
resolves open-mpi#12641

Signed-off-by: Burlen Loring <bloring@nvidia.com>
burlen pushed a commit to burlen/ompi that referenced this issue Jun 27, 2024
resolves open-mpi#12641

Signed-off-by: Burlen Loring <bloring@nvidia.com>
jsquyres pushed a commit to jsquyres/ompi that referenced this issue Jul 10, 2024
resolves open-mpi#12641

Signed-off-by: Burlen Loring <bloring@nvidia.com>
(cherry picked from commit 55c9dda)