- 
                Notifications
    You must be signed in to change notification settings 
- Fork 929
Topic/get element fix #4883
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Topic/get element fix #4883
Conversation
The loop should be updated on each internal iteration. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
| The IBM CI (GNU Compiler) build failed! Please review the log, linked below. Gist: https://gist.github.com/2f5b1e23de66e06b8982acd00e2c336f | 
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
| @jladd-mlnx @Di0gen @yosefe | 
| I pushed a missing include file fix to this PR. If it passes CI, I'll squash it back down to George's original commit and then we can merge. | 
| I compile in debug mode, and the output was available there. I'll squash. | 
| @bosilca Cool -- thanks! | 
fd4dd7f    to
    999de13      
    Compare
  
    | bot:mellanox:retest | 
    
      
        1 similar comment
      
    
  
    | bot:mellanox:retest | 
| @alinask thanks! | 
| Are we good to merge this pr, i.e. has the squash been already done? | 
| @edgargabriel Yes, squash has been done -- we're good to go. | 
| @bosilca This PR changed the default value of the MCA param  Was that intended? That doesn't seem to have anything to do with this GET_ELEMENT fix. | 
| @bosilca Is this fix also applicable to v2.1.x? The cherry picks did not apply anywhere close to clean for me. | 
| @jsquyres the opal_cuda_verbose was not supposed to be part of this. Fortunately, it has no impact on the code. I will change back in master, I have another cleanup pending. The bug is present in 2.1, and as you noticed this patch will not apply as is. One will have to search for the lines in this patch and apply them by hand. Got few minutes before my plane takes of, here is the patch for 2.1 (please test I haven't tried it yet). diff --git a/opal/datatype/opal_datatype_get_count.c b/opal/datatype/opal_datatype_get_count.c
index 7b539fbec8..7b33b73480 100644
--- a/opal/datatype/opal_datatype_get_count.c
+++ b/opal/datatype/opal_datatype_get_count.c
@@ -1,6 +1,6 @@
 /* -*- Mode: C; c-basic-offset:4 ; -*- */
 /*
- * Copyright (c) 2004-2009 The University of Tennessee and The University
+ * Copyright (c) 2004-2018 The University of Tennessee and The University
  *                         of Tennessee Research Foundation.  All rights
  *                         reserved.
  * Copyright (c) 2009      Oak Ridge National Labs.  All rights reserved.
@@ -58,9 +58,8 @@ ssize_t opal_datatype_get_element_count( const opal_datatype_t* datatype, size_t
             continue;
         }
         if( OPAL_DATATYPE_LOOP == pElems[pos_desc].elem.common.type ) {
-            ddt_loop_desc_t* loop = &(pElems[pos_desc].loop);
             do {
-                PUSH_STACK( pStack, stack_pos, pos_desc, OPAL_DATATYPE_LOOP, loop->loops, 0 );
+                PUSH_STACK( pStack, stack_pos, pos_desc, OPAL_DATATYPE_LOOP, pElems[pos_desc].loop.loops, 0 );
                 pos_desc++;
             } while( OPAL_DATATYPE_LOOP == pElems[pos_desc].elem.common.type ); /* let's start another loop */
             DDT_DUMP_STACK( pStack, stack_pos, pElems, "advance loops" );
@@ -121,9 +120,8 @@ int32_t opal_datatype_set_element_count( const opal_datatype_t* datatype, size_t
             continue;
         }
         if( OPAL_DATATYPE_LOOP == pElems[pos_desc].elem.common.type ) {
-            ddt_loop_desc_t* loop = &(pElems[pos_desc].loop);
             do {
-                PUSH_STACK( pStack, stack_pos, pos_desc, OPAL_DATATYPE_LOOP, loop->loops, 0 );
+                PUSH_STACK( pStack, stack_pos, pos_desc, OPAL_DATATYPE_LOOP, pElems[pos_desc].loop.loops, 0 );
                 pos_desc++;
             } while( OPAL_DATATYPE_LOOP == pElems[pos_desc].elem.common.type ); /* let's start another loop */
             DDT_DUMP_STACK( pStack, stack_pos, pElems, "advance loops" ); | 
Fix the calculation of the number of elements in the datatype. As the loop variable has been set outside the internal loop, we were not correctly updating the loop stats, and we were counting more elements than actually existing in the datatype representation.