
Conversation

awlauria
Contributor

@awlauria awlauria commented Feb 10, 2021

  • 'userdata' is not available when the topology is distributed via
    shared memory. Fix some areas where it wasn't protected.
  • Allow users to pass in topo files via --mca opal_hwloc_base_topo_file.
  • Consistency: taking an xml file would not filter cpus, so don't
    filter cpus from PMIx either.
  • Make sure the cache line is set wherever the topology is taken from.

Signed-off-by: Austen Lauria <awlauria@us.ibm.com>
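
Relating to the --mca opal_hwloc_base_topo_file and cache-line bullets above, here is a minimal sketch (not the PR's actual code) of loading a topology from a user-supplied XML file and reading a cache line size out of it with the public hwloc API. It assumes hwloc 2.x; the 128-byte fallback and the command-line handling are illustrative only.

/* Minimal sketch, not the PR's implementation: load a topology from a
 * user-supplied XML file (the kind of file the new
 * --mca opal_hwloc_base_topo_file parameter points at) and read a cache
 * line size out of it.  Assumes hwloc 2.x; error handling trimmed. */
#include <stdio.h>
#include <hwloc.h>

int main(int argc, char **argv)
{
    hwloc_topology_t topo;
    hwloc_obj_t cache;
    unsigned linesize = 128;                /* illustrative fallback only */

    hwloc_topology_init(&topo);
    if (argc > 1) {
        /* e.g. the file a user passed via --mca opal_hwloc_base_topo_file */
        hwloc_topology_set_xml(topo, argv[1]);
    }
    hwloc_topology_load(topo);

    /* Take the line size reported by the first L1 data cache, if any. */
    cache = hwloc_get_next_obj_by_type(topo, HWLOC_OBJ_L1CACHE, NULL);
    if (NULL != cache && cache->attr->cache.linesize > 0) {
        linesize = cache->attr->cache.linesize;
    }
    printf("cache line size: %u\n", linesize);

    hwloc_topology_destroy(topo);
    return 0;
}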

@awlauria
Contributor Author

bot:aws:retest

@awlauria
Contributor Author

bot:aws:retest

@jsquyres
Member

I don't really grok what was done in the opal_hwloc_base_*() functions, so I'm adding @rhc54 to the review.

@awlauria
Contributor Author

@jsquyres those changes guard the accesses to userdata so that it is only touched when shmem was not used to share the topology.

Technically I don't think we need to access userdata anymore, since the bindings are propagated via PMIx to opal_process_info.cpuset now anyway.
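
To illustrate the kind of protection described above (a hedged sketch, not the PR's actual diff): dereference the cached userdata summary only when the topology was loaded locally, and fall back to asking hwloc directly otherwise. The topo_in_shmem flag and topo_summary_t type are hypothetical stand-ins for Open MPI's internal bookkeeping.

/* Hedged sketch only, not the PR's code.  topo_in_shmem and topo_summary_t
 * are hypothetical stand-ins for Open MPI's internal bookkeeping. */
#include <stdbool.h>
#include <hwloc.h>

typedef struct {
    hwloc_cpuset_t available;     /* cached "available cpus" summary */
} topo_summary_t;                 /* hypothetical userdata payload */

extern bool topo_in_shmem;        /* hypothetical: true when the topology
                                   * was attached read-only from PMIx shmem */

static hwloc_const_cpuset_t available_cpus(hwloc_topology_t topo,
                                           hwloc_obj_t obj)
{
    /* userdata is only populated when we loaded and filtered the topology
     * ourselves; a shmem-attached topology never carries it. */
    if (!topo_in_shmem && NULL != obj->userdata) {
        return ((topo_summary_t *) obj->userdata)->available;
    }
    /* Otherwise fall back to what hwloc itself reports. */
    return hwloc_topology_get_allowed_cpuset(topo);
}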

@awlauria
Contributor Author

Reviewing with @gpaulsen revealed some more gotchas. Going to push up a new change shortly.

@awlauria awlauria force-pushed the fix_mpiext_segv branch 4 times, most recently from fed8d6b to f4da47d Compare February 12, 2021 18:58
@awlauria
Contributor Author

Fixes this segv:

(gdb) bt
#0  0x00002000006a6218 in opal_hwloc_base_cset2str (str=0x7fffe30e49d8 "", len=1024, topo=0x386c1670, cpuset=0x38a9d550)
    at base/hwloc_base_util.c:1184
#1  0x000020000026c860 in get_rsrc_current_binding (str=0x7fffe30e49d8 "") at mpiext_affinity_str.c:163
#2  0x000020000026c47c in OMPI_Affinity_str (fmt_type=OMPI_AFFINITY_RSRC_STRING_FMT, 
    ompi_bound=0x7fffe30e45d8 "Not bound (i.e., bound to all processors)", current_binding=0x7fffe30e49d8 "", exists=0x7fffe30e4dd8 "")
    at mpiext_affinity_str.c:75
#3  0x000000001000096c in main (argc=1, argv=0x7fffe30e5608) at str.c:19
(gdb) 

@awlauria awlauria force-pushed the fix_mpiext_segv branch 2 times, most recently from f91fc71 to 0a06f12 Compare February 12, 2021 19:38
@@ -389,12 +401,7 @@ int opal_hwloc_base_get_topology(void)
return OPAL_ERROR;
}
free(val);
/* filter the cpus thru any default cpu set */
Contributor


We do need to pass thru "filter_cpus" in the case where we get the topology from somewhere other than shmem - was there some reason to remove it? I can see not doing so if you pull the topology from a file, and we obviously cannot do it if the topology is in shmem, but it is necessary otherwise.

Contributor Author

@awlauria awlauria Feb 13, 2021


This is the case where it gets the topology from the XML file created by PMIx, which should be the same topology as the one in shmem. If there's a reason it's needed here I can add it, though.

The other case, where the user provides their own file, also didn't filter cpus.

The case where the topology is loaded manually is covered.

Contributor


We do need it in this case (where the topology comes from the xml file) as the file is generally written by the daemon, which is not subject to the user's cpu specification. We don't do it in the case where the user provides the file because that is most commonly a developer debug use-case, not something we see done in practice, though one could make an argument that we should do it there too (so a developer could use one known topology and still test a variety of cgroup specifications against it).

We technically should do it even when we get the topology from shmem, but we can't due to the read-only nature of the shmem connection versus the way we chose to cache the results of the filter. This is something we'll have to address at some point (probably need a "shadow" tree that contains just the filtered information we used to store under "userdata").

Contributor Author


Understood. I updated the PR to put that block of code back. I might have time to take a stab at addressing the shmem issue in the coming week.
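
For illustration of the filtering discussed in this thread (a hedged sketch, not the actual filter_cpus code mentioned above): a locally loaded topology can be pruned to an allowed cpuset with hwloc_topology_restrict, which is exactly the kind of modification a read-only shmem attachment cannot accept. The function name and the example cpu list are illustrative.

/* Hedged sketch of cpuset filtering, not the actual filter_cpus code.
 * Restrict a locally loaded topology to the cpus named in a list string
 * such as "0-3,8-11" (e.g. a cgroup / default cpu-set specification). */
#include <hwloc.h>

static int restrict_to_cpulist(hwloc_topology_t topo, const char *cpulist)
{
    int rc;
    hwloc_bitmap_t allowed = hwloc_bitmap_alloc();

    if (NULL == allowed) {
        return -1;
    }
    /* Parse the list string into a bitmap of allowed cpus. */
    if (0 != hwloc_bitmap_list_sscanf(allowed, cpulist)) {
        hwloc_bitmap_free(allowed);
        return -1;
    }
    /* Prune objects outside the allowed set.  This modifies the topology,
     * which is why it cannot be applied to a read-only shmem attachment. */
    rc = hwloc_topology_restrict(topo, allowed, 0);
    hwloc_bitmap_free(allowed);
    return rc;
}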

- 'userdata' is not available when the topology is distributed via
  shared memory. Fix some areas where it wasn't protected.
- Allow users to pass in topo files via --mca opal_hwloc_base_topo_file.
- Make sure the cache line is set wherever the topology is taken from.

Signed-off-by: Austen Lauria <awlauria@us.ibm.com>
@awlauria
Contributor Author

bot:ibm:retest

@open-mpi open-mpi deleted 6 comments from ibm-ompi Feb 13, 2021
@awlauria
Contributor Author

bot:ibm:retest

@ibm-ompi

The IBM CI (GNU/Scale) build failed! Please review the log, linked below.

Gist: https://gist.github.com/ibm-ompi/88b480fb98dfd2bdac939f023e1f8fe6

@ibm-ompi

The IBM CI (PGI) build failed! Please review the log, linked below.

Gist: https://gist.github.com/ibm-ompi/88b480fb98dfd2bdac939f023e1f8fe6

@ibm-ompi

The IBM CI (XL) build failed! Please review the log, linked below.

Gist: https://gist.github.com/ibm-ompi/88b480fb98dfd2bdac939f023e1f8fe6

@awlauria
Contributor Author

bot:ibm:retest

@awlauria awlauria merged commit e534c57 into open-mpi:master Feb 15, 2021
@awlauria awlauria deleted the fix_mpiext_segv branch February 15, 2021 22:59