Test mpi_ring_async_sender_receiver with all MPI completion modes #1151

Merged
msimberg merged 11 commits into pika-org:main from mpi-ring-test-all-modes on Jul 2, 2024

Contributor

@msimberg msimberg commented Jun 7, 2024

Fixes #1142, except for testing the MPI continuations modes. Those will be enabled separately (#1150).

This:

  • Adds a --pika:mpi-completion-mode flag for setting the mode on the command line. This isn't intended as the primary way for users to change the mode; it exists mostly for the unit test. I originally intended to set an environment variable through CMake's ENVIRONMENT property on tests, but the variables are not exported, so sub-processes don't see the value. Since the tests are run through mpirun/mpiexec, the variable never reaches the test executable.
  • Decrements MPI counters before calling the user-provided callback. This fixes a race in the test's check for zero MPI requests in flight. The assertion now uses PIKA_TEST_EQ so that it runs with all build types and prints the offending value if it fails.
  • Adds a --recv-before-send flag to the unit test and makes send-before-recv the default, to avoid deadlocks when requests are completed inline. Sends will never block indefinitely, so doing the sends first avoids hangs.
  • Makes some minor additional cleanups in the test (see the individual commit messages for descriptions).
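
The ordering fix in the second bullet can be illustrated with a minimal sketch. This is not pika's implementation: `requests_in_flight`, `register_request`, `complete_request`, and `in_flight_seen_by_callback` are hypothetical names chosen for illustration, and a plain `std::atomic` counter stands in for the library's MPI counters.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical stand-in for the library's count of MPI requests in flight.
std::atomic<std::size_t> requests_in_flight{0};

// Called when a request is handed over to MPI.
void register_request() { requests_in_flight.fetch_add(1); }

// Called by the polling loop once a request has completed.
// The counter is decremented *before* the user callback runs, so anything
// the callback triggers (for example a test that checks the counter) already
// sees this request as no longer in flight.
void complete_request(std::function<void()> const& user_callback)
{
    [[maybe_unused]] std::size_t const previous = requests_in_flight.fetch_sub(1);
    assert(previous > 0);    // guards against unsigned underflow in this sketch
    user_callback();
}

// Registers and completes a single request and returns the in-flight count
// that the user callback observed.
std::size_t in_flight_seen_by_callback()
{
    std::size_t seen = 1;    // overwritten inside the callback
    register_request();
    complete_request([&seen] { seen = requests_in_flight.load(); });
    return seen;
}
```

With this ordering, `in_flight_seen_by_callback()` returns 0. If `user_callback()` were called before the `fetch_sub`, the callback would still observe one request in flight, which is the kind of window a check for zero requests in flight can trip over.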

@msimberg msimberg self-assigned this Jun 7, 2024
@msimberg msimberg force-pushed the mpi-ring-test-all-modes branch 2 times, most recently from 6907577 to b7b5615 Compare June 7, 2024 12:23
@pika-bot
Collaborator

pika-bot commented Jun 7, 2024

Performance test report

pika Performance

Comparison

| BENCHMARK | RESULT |
| --- | --- |
| Task Overhead - Create Thread Hierarchical - Latch | - |

Info

| Property | Before | After |
| --- | --- | --- |
| pika Commit | 0abc084 | 01d7ab0 |
| pika Datetime | 2024-02-19T15:15:15+00:00 | 2024-06-07T12:17:59+00:00 |
| Compiler | /apps/daint/SSL/pika/spack/lib/spack/env/clang/clang++ 11.0.1 | /apps/daint/SSL/pika/spack/lib/spack/env/clang/clang++ 11.0.1 |
| Envfile | | |
| Datetime | 2024-02-19T16:26:16.072067+01:00 | 2024-06-07T14:23:52.647339+02:00 |
| Clustername | daint | daint |
| Hostname | nid00025 | nid00025 |

Explanation of Symbols

| Symbol | MEANING |
| --- | --- |
| = | No performance change (confidence interval within ±1%) |
| (=) | Probably no performance change (confidence interval within ±2%) |
| (+)/(-) | Very small performance improvement/degradation (≤1%) |
| +/- | Small performance improvement/degradation (>10%) |
| ++/-- | Large performance improvement/degradation (>10%) |
| +++/--- | Very large performance improvement/degradation (>10%) |
| ? | Probably no change, but quite large uncertainty (confidence interval within ±5%) |
| ?? | Unclear result, very large uncertainty (±10%) |
| ??? | Something unexpected… |

@msimberg msimberg added this to the 0.26.0 milestone Jun 13, 2024
@biddisco biddisco mentioned this pull request Jun 18, 2024
@msimberg
Contributor Author

Not to be merged yet. The issue supposedly fixed by cf856d2 still happens occasionally. Thanks to @biddisco for noticing this.

@msimberg
Contributor Author

@biddisco I cherry-picked 1986abe from your PR (#1180). Thank you for looking into that! Was that the only commit required to fix the remaining failures?


codacy-production bot commented Jun 21, 2024

Coverage summary from Codacy

See diff coverage on Codacy

| Coverage variation | Diff coverage |
| --- | --- |
| +0.06% (target: -1.00%) | (target: 90.00%) |

Coverage variation details

| | Coverable lines | Covered lines | Coverage |
| --- | --- | --- | --- |
| Common ancestor commit (b7987ba) | 18009 | 13584 | 75.43% |
| Head commit (61555e4) | 18009 (+0) | 13594 (+10) | 75.48% (+0.06%) |

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details

| | Coverable lines | Covered lines | Diff coverage |
| --- | --- | --- | --- |
| Pull request (#1151) | 0 | 0 | ∅ (not applicable) |

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%

@biddisco
Contributor

> @biddisco I cherry-picked 1986abe from your PR (#1180). Thank you for looking into that! Was that the only commit required to fix the remaining failures?

That should be all that is needed, and it is safe to merge now.

Contributor

@biddisco biddisco left a comment

OK. Fortunately, we're not really using the polling size now, but I'll make sure that gets checked next time I change it.

This avoids data races between writing (set_max_polling_size) and reading (within the MPI polling
loop) the max polling size. Reads are relaxed since we don't care much about stale values being
used.
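
A minimal sketch of the pattern this commit message describes, assuming the max polling size is held in a `std::atomic`. Only the name `set_max_polling_size` comes from the message above; `max_polling_size`, `get_max_polling_size`, `poll_once`, the initial value of 8, and the relaxed store are assumptions made for illustration and do not reflect pika's actual code.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the polling configuration. Making the value
// atomic removes the data race between a writer thread and the polling loop.
std::atomic<std::size_t> max_polling_size{8};

// Writer side. The relaxed store is an assumption of this sketch; the commit
// message only states that the reads are relaxed.
void set_max_polling_size(std::size_t n)
{
    max_polling_size.store(n, std::memory_order_relaxed);
}

// Reader side. A relaxed load is enough here because a slightly stale limit
// only changes how many requests are checked in one pass.
std::size_t get_max_polling_size()
{
    return max_polling_size.load(std::memory_order_relaxed);
}

// One pass of a simplified polling loop: checks at most the current max
// polling size of the pending requests and returns how many were checked.
std::size_t poll_once(std::size_t pending)
{
    std::size_t const limit = get_max_polling_size();
    std::size_t const checked = pending < limit ? pending : limit;
    assert(checked <= limit);
    return checked;
}
```

Relaxed ordering guarantees atomicity of each access but no ordering relative to other memory operations, which matches the stated intent that stale values are acceptable.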
@biddisco
Contributor

Is this PR good to go now?

@msimberg
Contributor Author

> Is this PR good to go now?

Yes, should be. I was hoping to get CSCS CI back for testing before merging it.

@msimberg msimberg enabled auto-merge July 2, 2024 07:35
@msimberg msimberg added this pull request to the merge queue Jul 2, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to no response for status checks Jul 2, 2024
@msimberg msimberg merged commit a151bbe into pika-org:main Jul 2, 2024
36 of 38 checks passed
@msimberg msimberg deleted the mpi-ring-test-all-modes branch July 2, 2024 20:29

Successfully merging this pull request may close these issues.

Test all MPI completion modes in unit tests
3 participants