Skip to content

Conversation

@artpol84
Copy link
Contributor

Force only procs that are participating in the ne Comm to decide what
CID is appropriate. This will have 2 advantages:
* Speedup Comm creation for small communicators: non-participating procs
  will not interfere
* Reduce CID fragmentation: non-overlaping groups will be allowed to use
  same CID.

Signed-off-by: Artem Polyakov artpol84@gmail.com

@artpol84
Copy link
Contributor Author

Algorithm efficiency verification

I've applied the following patch to verify that algorithm provides performance improvement:

diff --git a/ompi/communicator/comm_cid.c b/ompi/communicator/comm_cid.c                                                                                                
index 6f67d99..35d978f 100644                                                                                                                                           
--- a/ompi/communicator/comm_cid.c                                                                                                                                      
+++ b/ompi/communicator/comm_cid.c                                                                                                                                      
@@ -411,6 +411,15 @@ static int ompi_comm_nextcid_check_flag (ompi_comm_request_t *request)                                                                             
     }                                                                                                                                                                  
.                                                                                                                                                                       
     if (1 == context->rflag) {                                                                                                                                         
+       {                                                                                                                                                               
+           static int comm_count = 0;                                                                                                                                  
+           int rank;                                                                                                                                                   
+           MPI_Comm_rank(MPI_COMM_WORLD, &rank);                                                                                                                       
+           if( 0 == rank ){                                                                                                                                            
+               printf("Comm #%d: CID=%d, iters=%d\n", ++comm_count, context->nextcid, context->iter);                                                                  
+           }                                                                                                                                                           
+      }                                                                                                                                                               
+                                                                                                                                                                    
         if( !participate ) {                                                                                                                                           
             /* we need to provide something sane here                                                                                                                  
              * but we cannot use `nextcid` as we may have it                                                                                                           

This debug output demonstrates significant improvement for my comm_create benchmark (https://github.com/artpol84/poc/tree/master/benchmarks/comm_create) that I derived from Amber app :

  • old_cid_algo.txt
    Here number of iterations is constantly growing up to ~1000 and max CID is 1082
  • new_cid_algo.txt
    In this case max number of iterations is 13 (avg is 3) and max CID is 26. This confirms that we have more compact CID distribution with less fragmentation, because CID's are reused by non-overlapping groups of processes.

Overall new algo is about 80 times faster than the old one.

@artpol84
Copy link
Contributor Author

MTT is clean, I started one MTT with enabled reporter to mtt.open-mpi.org. Will post the link once it's finished.

@artpol84
Copy link
Contributor Author

@hjelmn I also fixed some error handling paths. Could you please double-check me?

    Force only procs that are participating in the ne Comm to decide what
    CID is appropriate. This will have 2 advantages:
    * Speedup Comm creation for small communicators: non-participating procs
      will not interfere
    * Reduce CID fragmentation: non-overlaping groups will be allowed to use
      same CID.

Signed-off-by: Artem Polyakov <artpol84@gmail.com>
@artpol84 artpol84 force-pushed the comm_create/master branch from f1f7f20 to 68167ec Compare April 21, 2017 01:33
@bwbarrett-bot
Copy link

Test Passed

@artpol84
Copy link
Contributor Author

artpol84 commented Apr 21, 2017

I performed MTTs for each branches internally with mpirun launches.
But I also tested comm_create/v2.x from PR #3376 branch with our SLURM/pmix MTT that is reported outside to mtt.open-mpi.org. For all of them all tests was successful.
Here are the publicly available results with SLURM/pmix srun launch: https://mtt.open-mpi.org/index.php?do_redir=2417.
Set of tests was the same for both mpirun and srun MTTs.

@jladd-mlnx
Copy link
Member

👍

@jsquyres
Copy link
Member

@bosilca @ggouaillardet please review

@jsquyres
Copy link
Member

Please fix typo in commit message:

Force only procs that are participating in the ne Comm to decide what

I assume "ne" should be "new".

Copy link
Contributor

@ggouaillardet ggouaillardet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please refer to my comments in #3388

@artpol84 artpol84 merged commit 858d8cd into open-mpi:master May 6, 2017
@artpol84
Copy link
Contributor Author

artpol84 commented May 6, 2017

I saw that it was already merged into 3.x and 1.10.
So merge into master.

@rhc54
Copy link
Contributor

rhc54 commented May 6, 2017

Sigh - I thought after this much time it had already been put in master. In future, please be advised that we don't work this way - it must be put into master before opening PRs on release branches. Otherwise, it leads to this kind of confusion.

I won't revert it out of 1.10 - let's see what happens on master and then we'll decide.

@artpol84
Copy link
Contributor Author

artpol84 commented May 6, 2017

I don't mind - All PRs was there and ready to go. I usually don't merge my own PRs.

@rhc54
Copy link
Contributor

rhc54 commented May 6, 2017

Ummm...why would you be expecting someone else to merge your PRs to master? I can't think of anyone else who does that, or any reason for it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants