Add RankMap and use it to get some more diagnostic capability for partitioning and fix a bug in MemoryUsageReporter closes idaholab#12629
Showing 20 changed files with 418 additions and 80 deletions.
Binary file added (BIN, +420 KB): framework/doc/content/media/vectorpostprocessors/work_balance_hardware_id.png
@@ -0,0 +1,47 @@
# HardwareIDAux

!syntax description /AuxKernels/HardwareIDAux

## Description

One of the main purposes of this object is to aid in diagnosing mesh partitioners. One metric to look at for mesh partitioners is how well they limit inter-node (compute node) communication. `HardwareIDAux` lets you see the mapping of elements to compute nodes in your job.

This is particularly interesting in the case of the [PetscExternalPartitioner](PetscExternalPartitioner.md), which has the capability to do "hierarchical" partitioning. Hierarchical partitioning makes it possible to partition over compute nodes first, then within compute nodes, in order to better respect the physical topology of the compute cluster.

One important aspect of this is that how you launch your parallel job can matter quite a bit to partitioning. In general, it's better for partitioners if the ranks of your job are assigned contiguously to each compute node. Here are four different ways to launch a job using a 100x100 generated mesh on 16 processes and 4 nodes with two different partitioners, along with the outcome as shown by `HardwareIDAux`:

Top left (METIS):

```
mpiexec -n 16 -host lemhi0002,lemhi0003,lemhi0004,lemhi0005 ../../../moose_test-opt -i hardware_id_aux.i
```

Top right (Hierarchic):

```
mpiexec -n 16 -host lemhi0002,lemhi0003,lemhi0004,lemhi0005 ../../../moose_test-opt -i hardware_id_aux.i -mat_partitioning_hierarchical_nfineparts 4
```

Bottom left (METIS):

```
mpiexec -n 16 -host lemhi0002,lemhi0003,lemhi0004,lemhi0005 -ppn 4 ../../../moose_test-opt -i hardware_id_aux.i
```

Bottom right (Hierarchic):

```
mpiexec -n 16 -host lemhi0002,lemhi0003,lemhi0004,lemhi0005 -ppn 4 ../../../moose_test-opt -i hardware_id_aux.i -mat_partitioning_hierarchical_nfineparts 4
```

It should be immediately apparent that the bottom right partitioning is best (it will reduce the amount of inter-node communication). That result was achieved by using hierarchical partitioning and using `-ppn 4` to tell `mpiexec` to put `4` processes on each compute node, which causes those four processes to be contiguous on each node. The top two examples, which omit the `-ppn` option, end up with "striped" MPI processes (one process is placed on each node and then placement wraps around), causing a jumbled partitioning that increases the communication cost for the job (and decreases scalability).

!media media/auxkernels/hardware_id_aux.png style=width:75%

!syntax parameters /AuxKernels/HardwareIDAux

!syntax inputs /AuxKernels/HardwareIDAux

!syntax children /AuxKernels/HardwareIDAux

!bibtex bibliography
@@ -0,0 +1,36 @@
//* This file is part of the MOOSE framework
//* https://www.mooseframework.org
//*
//* All rights reserved, see COPYRIGHT for full restrictions
//* https://github.com/idaholab/moose/blob/master/COPYRIGHT
//*
//* Licensed under LGPL 2.1, please see LICENSE for details
//* https://www.gnu.org/licenses/lgpl-2.1.html

#ifndef HARDWAREIDAUX_H
#define HARDWAREIDAUX_H

#include "AuxKernel.h"

// Forward Declarations
class HardwareIDAux;

template <>
InputParameters validParams<HardwareIDAux>();

/**
 * "Paints" the ID of the physical "node" in the cluster the element
 * is located on. Useful for examining partition schemes.
 */
class HardwareIDAux : public AuxKernel
{
public:
  HardwareIDAux(const InputParameters & parameters);

protected:
  virtual Real computeValue() override;

  const RankMap & _rank_map;
};

#endif // HARDWAREIDAUX_H
@@ -0,0 +1,65 @@
//* This file is part of the MOOSE framework
//* https://www.mooseframework.org
//*
//* All rights reserved, see COPYRIGHT for full restrictions
//* https://github.com/idaholab/moose/blob/master/COPYRIGHT
//*
//* Licensed under LGPL 2.1, please see LICENSE for details
//* https://www.gnu.org/licenses/lgpl-2.1.html

#ifndef RANKMAP_H
#define RANKMAP_H

#include "PerfGraphInterface.h"

#include "libmesh/parallel_object.h"

/**
 * Builds lists and maps that help in knowing which physical hardware nodes each rank is on.
 *
 * Note: large chunks of this code were originally committed by @dschwen in PR #12351
 *
 * https://github.com/idaholab/moose/pull/12351
 */
class RankMap : ParallelObject, PerfGraphInterface
{
public:
  /**
   * Constructs and fills the map
   */
  RankMap(const Parallel::Communicator & comm, PerfGraph & perf_graph);

  /**
   * Returns the "hardware ID" (a unique ID given to each physical compute node in the job)
   * for a given processor ID (rank)
   */
  unsigned int hardwareID(processor_id_type pid) const { return _rank_to_hardware_id[pid]; }

  /**
   * Returns the ranks that are on the given hardwareID (physical node in the job)
   */
  const std::vector<processor_id_type> & ranks(unsigned int hardware_id) const
  {
    auto item = _hardware_id_to_ranks.find(hardware_id);
    if (item == _hardware_id_to_ranks.end())
      mooseError("hardware_id not found");

    return item->second;
  }

  /**
   * Vector containing the hardware ID for each PID
   */
  const std::vector<unsigned int> & rankHardwareIds() const { return _rank_to_hardware_id; }

protected:
  PerfID _construct_timer;

  /// Map of hardware_id -> ranks on that node
  std::map<unsigned int, std::vector<processor_id_type>> _hardware_id_to_ranks;

  /// Each entry corresponds to the hardware_id for that PID
  std::vector<unsigned int> _rank_to_hardware_id;
};

#endif
@@ -0,0 +1,36 @@
//* This file is part of the MOOSE framework
//* https://www.mooseframework.org
//*
//* All rights reserved, see COPYRIGHT for full restrictions
//* https://github.com/idaholab/moose/blob/master/COPYRIGHT
//*
//* Licensed under LGPL 2.1, please see LICENSE for details
//* https://www.gnu.org/licenses/lgpl-2.1.html

#include "HardwareIDAux.h"

registerMooseObject("MooseApp", HardwareIDAux);

template <>
InputParameters
validParams<HardwareIDAux>()
{
  InputParameters params = validParams<AuxKernel>();
  params.addClassDescription(
      "Creates a field showing the assignment of partitions to physical nodes in the cluster.");
  return params;
}

HardwareIDAux::HardwareIDAux(const InputParameters & parameters)
  : AuxKernel(parameters), _rank_map(_app.rankMap())
{
}

Real
HardwareIDAux::computeValue()
{
  if (isNodal())
    return _rank_map.hardwareID(_current_node->processor_id());
  else
    return _rank_map.hardwareID(_current_elem->processor_id());
}