jjhursey
Member

  • Negative priority cleanup
  • Fixes a segfault during cleanup in hcoll that occurs when it is given a negative priority and has to clean up.
  • Improves the coll/base verbose messages, making it easier to see which collective components are queried and which are selected (some may be rejected because of a negative priority).
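
A minimal sketch of the selection behavior described above, for readers who do not know the coll/base code. Everything here is an assumption made for illustration: `sketch_component_t`, `sketch_select`, and the message text are hypothetical and are not Open MPI types, functions, or log output. It only shows the idea: a component that reports a negative priority is disqualified with a verbose message, any module it handed back is released, and the remaining components are reported lowest priority first.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a queried collective component. */
typedef struct {
    const char *name;
    int priority;    /* negative means "do not select this component" */
    int has_module;  /* nonzero if the query handed back a module */
} sketch_component_t;

/* qsort comparator: ascending priority (lowest first). */
static int sketch_by_priority_asc(const void *a, const void *b)
{
    const sketch_component_t *x = a;
    const sketch_component_t *y = b;
    return (x->priority > y->priority) - (x->priority < y->priority);
}

/*
 * Drops components with a negative priority, "releasing" any module they
 * provided, then sorts the survivors lowest priority first and prints them.
 * Survivors are compacted to the front of comps; the return value is how
 * many survived. *released counts the modules that were released.
 */
static size_t sketch_select(sketch_component_t *comps, size_t n, int *released)
{
    size_t kept = 0;

    assert(NULL != released);
    assert(NULL != comps || 0 == n);
    *released = 0;

    for (size_t i = 0; i < n; ++i) {
        if (comps[i].priority < 0) {
            fprintf(stderr, "sketch: component %s disqualified (priority %d)\n",
                    comps[i].name, comps[i].priority);
            if (comps[i].has_module) {
                comps[i].has_module = 0;  /* stands in for releasing the module */
                ++*released;
            }
            continue;
        }
        comps[kept++] = comps[i];
    }

    if (kept > 0) {
        qsort(comps, kept, sizeof(comps[0]), sketch_by_priority_asc);
    }
    for (size_t i = 0; i < kept; ++i) {
        fprintf(stderr, "sketch: using component %s (priority %d)\n",
                comps[i].name, comps[i].priority);
    }
    return kept;
}
```

Called on three components with made-up priorities 30, -1, and 10, the sketch would keep two of them, release one module, and list the priority-10 component before the priority-30 one.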

@jjhursey jjhursey added this to the v2.0.1 milestone Jun 30, 2016
@jjhursey jjhursey self-assigned this Jun 30, 2016
@ibm-ompi

Build Failed with GNU compiler! Please review the log, and get in touch if you have questions.

@ibm-ompi

Build Failed with XL compiler! Please review the log, and get in touch if you have questions.

@jjhursey
Member Author

(IBM Jenkins) Per #1833 the cluster seems to be having problems - I'm disabling the IBM tests for now while I diagnose.

@jjhursey
Member Author

bot:retest

@ibm-ompi

Build Failed with XL compiler! Please review the log, and get in touch if you have questions.
https://gist.github.com/e130fac462df6a9a255ffd233ac73fef

@jjhursey
Member Author

bot:ibm:retest

@jjhursey
Member Author

bot:ibm:retest

@hjelmn
Member

hjelmn commented Jul 1, 2016

:bot:retest

@jjhursey
Member Author

jjhursey commented Jul 1, 2016

Just testing our Jenkins setup:
bot:ibm:retest

jjhursey added 2 commits July 1, 2016 13:41
 * Print a verbose message if the component was disqualified because of
   a negative priority.
 * If a disqualified component provided a module, release it.
 * Display list of selected components in priority order
   - During the process of volunteering collective functions for a
     communicator, print the component name and priority. This will
     cause the verbose messages to be displayed in reverse priority
     order (lowest priority first, up to highest). This is helpful
     when determining which collective components are active in which
     order for a given communicator.
     To see the messages you need the following MCA parameter set to 9
     or higher: `-mca coll_base_verbose 9`
 * Lower the verbosity level required for commonly needed verbose
   output from 10 to 9 to make this information easier to access.
 * If hcoll is given a negative priority, but not enabled=0, then
   the module is constructed but destructed before its query() is
   called, so the previous pointers are never initialized.
   If we try to OBJ_RELEASE them in a debug build, an assert will fire.
   This commit adds some protection against that and initializes
   the _module pointers to NULL.
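
The protection described in the last commit can be sketched roughly as follows. This is an illustration under stated assumptions, not the hcoll code: `sketch_obj_t`, `sketch_module_t`, the `prev_*_module` field names, and all functions below are hypothetical, and the reference counting is a simplified stand-in for Open MPI's object system. The point is only the pattern: the constructor sets the saved pointers to NULL, and the destructor releases a pointer only when it is non-NULL, so a module that is destructed before its query ever ran does not trip the debug assert.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical reference-counted object. */
typedef struct {
    int refcount;
} sketch_obj_t;

/* Allocates an object holding one reference; returns NULL on failure. */
static sketch_obj_t *sketch_obj_new(void)
{
    sketch_obj_t *obj = calloc(1, sizeof(*obj));
    if (NULL != obj) {
        obj->refcount = 1;
    }
    return obj;
}

/* Drops one reference and clears the slot. Like a debug build, this
 * asserts if it is handed a NULL object. */
static void sketch_obj_release(sketch_obj_t **slot)
{
    assert(NULL != slot && NULL != *slot);
    if (0 == --(*slot)->refcount) {
        free(*slot);
    }
    *slot = NULL;
}

/* Hypothetical module that remembers previously selected modules. */
typedef struct {
    sketch_obj_t *prev_barrier_module;
    sketch_obj_t *prev_bcast_module;
} sketch_module_t;

/* Constructor: start from a known state so the destructor is always safe. */
static void sketch_module_construct(sketch_module_t *m)
{
    m->prev_barrier_module = NULL;
    m->prev_bcast_module = NULL;
}

/* Destructor: release only what was actually saved. Returns how many
 * pointers were released. */
static int sketch_module_destruct(sketch_module_t *m)
{
    int released = 0;

    if (NULL != m->prev_barrier_module) {
        sketch_obj_release(&m->prev_barrier_module);
        ++released;
    }
    if (NULL != m->prev_bcast_module) {
        sketch_obj_release(&m->prev_bcast_module);
        ++released;
    }
    return released;
}
```

In this sketch, constructing a module and destructing it immediately releases nothing and does not assert, which mirrors the negative-priority path where the query step is never reached.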
@jjhursey jjhursey force-pushed the topic/coll-base-verbose branch from 950b8a5 to 0a09f8b Compare July 1, 2016 18:41
@jjhursey
Member Author

jjhursey commented Jul 1, 2016

Mellanox failure is unrelated. It is the:

[jenkins01:09635] listen_thread: accept() failed: Invalid argument (22).

failure that we have been seeing in a lot of other PRs - I think this is a problem with the master branch.

@jjhursey jjhursey merged commit 8634a63 into open-mpi:master Jul 1, 2016
@jjhursey jjhursey deleted the topic/coll-base-verbose branch July 1, 2016 20:21
@jsquyres
Member

jsquyres commented Jul 1, 2016

@jjhursey Has an issue been filed on master about that listen_thread error?

@jjhursey
Member Author

jjhursey commented Jul 1, 2016

No, but it should be...

4 participants