[release-3.7] [integ-tests] Use a fixed time MPI job to test job cancellation of MPI processes #5665
…special timing conditions. Signed-off-by: Eddy Mwiti <eddmwiti@amazon.com>
Codecov Report
@@            Coverage Diff             @@
##           release-3.7    #5665   +/-   ##
============================================
  Coverage        89.89%   89.89%
============================================
  Files              179      179
  Lines            15366    15366
============================================
  Hits             13813    13813
  Misses            1553     1553
Flags with carried forward coverage won't be shown.
  module load intelmpi
- mpirun -n 6 IMB-MPI1 Alltoall -npmin 2
+ mpirun -n 6 bash -c 'sleep 300' -npmin 2
If we do this we are not using MPI at all because each node will launch a process that will sleep for a fixed amount of time, but no interaction will happen between the nodes.
Let's sync up about this.
The test checks that cancelling an MPI process does not leave any stray processes. The message exchange may not be needed, but as discussed, this test "implicitly" covers IntelMPI tests, so this PR has been opened pending follow-up on the timing issue on Ubuntu2204: #5666
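A minimal Python sketch of the kind of check described here (no stray processes left after cancellation), under stated assumptions. The helper name `find_stray_pids`, the sample `ps` output, and the match pattern are hypothetical and are not taken from the integration test; the actual `_test_mpi_job_termination` may verify this differently.

```python
import re
from typing import List


def find_stray_pids(ps_output: str, command_pattern: str) -> List[int]:
    """Return PIDs from `ps -eo pid,args`-style output whose command matches the pattern."""
    stray = []
    for line in ps_output.splitlines():
        parts = line.strip().split(None, 1)
        # Skip blank lines and the header row (its first field is not a number).
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        pid, args = int(parts[0]), parts[1]
        if re.search(command_pattern, args):
            stray.append(pid)
    return stray


# Hypothetical process listing, as it might look if the job's processes survived cancellation.
sample = """\
  PID COMMAND
    1 /sbin/init
 4242 mpirun -n 6 bash -c sleep 300
 4250 sleep 300
 4300 sshd: ubuntu
"""

# Assumed pattern: anything that looks like the launcher or the fixed-time payload.
stray = find_stray_pids(sample, r"\b(mpirun|sleep 300)\b")
print(stray)
```

In a real test, the listing would presumably be collected from each compute node after the job is cancelled, and the test would assert that the returned list is empty. Because the payload is a fixed 300-second sleep, any match found after cancellation cannot be explained by the job having finished on its own.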
Closed in preference of: #5665
…special timing conditions. (aws#5665) Signed-off-by: Eddy Mwiti <eddmwiti@amazon.com>
Description of changes
_test_mpi_job_termination) checks if cancellation of an MPI process does not leave any stray processes
Tests
test_slurm::test_slurm) with Ubuntu2204
References
Checklist
develop add the branch name as prefix in the PR title (e.g. [release-3.6]).
Please review the guidelines for contributing and Pull Request Instructions.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.