Closed
22 commits
f27a623
Create a new topology framework using the TreeMatch library developped
bosilca Mar 5, 2015
f74bb2c
Add a monitoring PML. This PML track all data exchanges by the processes
bosilca Mar 5, 2015
2cb27fc
Fix few typos.
bosilca Mar 27, 2015
1b8b998
Thanks Jeff.
bosilca Mar 27, 2015
d2d02a1
ckpt
Mar 28, 2015
d07dc36
Ensure we can authenticate when crossing security domains by includin…
Mar 29, 2015
0c553c2
Merge pull request #502 from nkogteva/master
hppritcha Mar 30, 2015
79b90a5
Remove stale and unused component
Mar 30, 2015
bc01661
Merge pull request #501 from rhc54/topic/sec2
Mar 30, 2015
91e9788
Create a new topology framework using the TreeMatch library developped
bosilca Mar 5, 2015
e1ecea1
Add a monitoring PML. This PML track all data exchanges by the processes
bosilca Mar 5, 2015
6846fa0
Fix few typos.
bosilca Mar 27, 2015
6927acf
Thanks Jeff.
bosilca Mar 27, 2015
cafd758
Don't include the monitoring test in 'make check'.
bosilca Mar 30, 2015
c28a01a
Whitespace cleanup.
bosilca Mar 30, 2015
e2b99ce
Merge branch 'treematch' of github.com:ICLDisco/ompi into treematch
bosilca Mar 30, 2015
9d6353d
Fix the indentation and protect with __DEBUG__ one fprintf.
bosilca Mar 30, 2015
7c127af
Add the Cecill-B license to the imported library.
bosilca Mar 30, 2015
5f70b2f
Per Brice suggestion make all data count and message length be
bosilca Apr 2, 2015
ce8ce66
Fix a compiler warning.
bosilca Apr 2, 2015
a60e046
Restrict the TreeMatch dependencies.
bosilca Apr 3, 2015
090061f
The TreeMatch software is released under BSD3 (as indicated by their
bosilca Apr 13, 2015
3 changes: 2 additions & 1 deletion configure.ac
@@ -3,7 +3,7 @@
# Copyright (c) 2004-2009 The Trustees of Indiana University and Indiana
# University Research and Technology
# Corporation. All rights reserved.
-# Copyright (c) 2004-2014 The University of Tennessee and The University
+# Copyright (c) 2004-2015 The University of Tennessee and The University
# of Tennessee Research Foundation. All rights
# reserved.
# Copyright (c) 2004-2007 High Performance Computing Center Stuttgart,
@@ -1460,6 +1460,7 @@ AC_CONFIG_FILES([
test/support/Makefile
test/threads/Makefile
test/util/Makefile
+test/monitoring/Makefile
])
AC_CONFIG_FILES([contrib/dist/mofed/debian/rules],
[chmod +x contrib/dist/mofed/debian/rules])
41 changes: 41 additions & 0 deletions ompi/mca/pml/monitoring/Makefile.am
@@ -0,0 +1,41 @@
#
# Copyright (c) 2013-2015 The University of Tennessee and The University
# of Tennessee Research Foundation. All rights
# reserved.
# Copyright (c) 2013-2015 Inria. All rights reserved.
# $COPYRIGHT$
#
# Additional copyrights may follow
#
# $HEADER$
#

monitoring_sources = \
pml_monitoring.c \
pml_monitoring.h \
pml_monitoring_comm.c \
pml_monitoring_comm.h \
pml_monitoring_component.c \
pml_monitoring_component.h \
pml_monitoring_hdr.h \
pml_monitoring_iprobe.c \
pml_monitoring_irecv.c \
pml_monitoring_isend.c \
pml_monitoring_start.c

if MCA_BUILD_ompi_pml_monitoring_DSO
component_noinst =
component_install = mca_pml_monitoring.la
else
component_noinst = libmca_pml_monitoring.la
component_install =
endif

mcacomponentdir = $(pkglibdir)
mcacomponent_LTLIBRARIES = $(component_install)
mca_pml_monitoring_la_SOURCES = $(monitoring_sources)
mca_pml_monitoring_la_LDFLAGS = -module -avoid-version

noinst_LTLIBRARIES = $(component_noinst)
libmca_pml_monitoring_la_SOURCES = $(monitoring_sources)
libmca_pml_monitoring_la_LDFLAGS = -module -avoid-version
181 changes: 181 additions & 0 deletions ompi/mca/pml/monitoring/README
@@ -0,0 +1,181 @@

Copyright (c) 2013-2015 The University of Tennessee and The University
of Tennessee Research Foundation. All rights
reserved.
Copyright (c) 2013-2015 Inria. All rights reserved.
$COPYRIGHT$

Additional copyrights may follow

$HEADER$

===========================================================================

Low level communication monitoring interface in Open MPI

Introduction
------------
This interface traces and monitors all messages sent by MPI before they reach the
communication channels. At that level all communication is point-to-point:
collectives have already been decomposed into send and receive calls.

The monitoring data is stored internally by each process and written to stderr at the end of
the application (during MPI_Finalize()).


Enabling the monitoring
-----------------------
To enable the monitoring, add --mca pml_monitoring_enable x to the mpirun command line.
If x = 1 it monitors internal and external tags indifferently and aggregates everything.
If x = 2 it monitors internal tags and external tags separately.
If x = 0 the monitoring is disabled.
Other values of x are not supported.

Internal tags are tags < 0. They are used to tag sends and receives coming from
collective operations or from protocol communications.

External tags are tags >= 0. They are used by the application in point-to-point communication.

Therefore, distinguishing external and internal tags helps to distinguish between point-to-point
and other communication (mainly collectives).
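
The rules above can be summarized in a few lines of Python. This is only an
illustrative sketch and not the component's code: the function name classify_tag and
its mode argument (standing for the value x of pml_monitoring_enable) are hypothetical.

```python
def classify_tag(tag: int, mode: int):
    """Return the category a message would be counted under, or None.

    Hypothetical helper: `mode` stands for the value of pml_monitoring_enable.
    """
    if mode == 0:
        return None                      # monitoring disabled
    if mode == 1:
        return "I"                       # everything aggregated, reported under I
    if mode == 2:
        return "I" if tag < 0 else "E"   # internal tags are < 0, external are >= 0
    raise ValueError("unsupported pml_monitoring_enable value: %r" % mode)


if __name__ == "__main__":
    for tag in (-12, 0, 42):
        print(tag, classify_tag(tag, 2))
```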

Output format
-------------
The output of the monitoring looks like (with --mca pml_monitoring_enable 2):
I 0 1 108 bytes 27 msgs sent
E 0 1 1012 bytes 30 msgs sent
E 0 2 23052 bytes 61 msgs sent
I 1 2 104 bytes 26 msgs sent
I 1 3 208 bytes 52 msgs sent
E 1 0 860 bytes 24 msgs sent
E 1 3 2552 bytes 56 msgs sent
I 2 3 104 bytes 26 msgs sent
E 2 0 22804 bytes 49 msgs sent
E 2 3 860 bytes 24 msgs sent
I 3 0 104 bytes 26 msgs sent
I 3 1 204 bytes 51 msgs sent
E 3 1 2304 bytes 44 msgs sent
E 3 2 860 bytes 24 msgs sent

Where:
- the first column distinguishes internal (I) and external (E) tags.
- the second column is the sender rank
- the third column is the receiver rank
- the fourth column is the number of bytes sent
- the last column is the number of messages.
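
For post-processing, the five columns can be read with a small parser. The sketch
below is in Python rather than the Perl of the provided scripts, and the names
Record, LINE_RE and parse_line are made up for this example. It assumes the columns
are separated by arbitrary whitespace, as in the sample above.

```python
import re
from typing import NamedTuple, Optional


class Record(NamedTuple):
    kind: str       # "I" (internal tags) or "E" (external tags)
    sender: int     # sender rank
    receiver: int   # receiver rank
    nbytes: int     # number of bytes sent
    nmsgs: int      # number of messages


# Assumption: fields are separated by spaces or tabs.
LINE_RE = re.compile(
    r"^([IE])\s+(\d+)\s+(\d+)\s+(\d+)\s+bytes\s+(\d+)\s+msgs\s+sent$"
)


def parse_line(line: str) -> Optional[Record]:
    """Parse one monitoring line; return None for anything else."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    kind, sender, receiver, nbytes, nmsgs = match.groups()
    return Record(kind, int(sender), int(receiver), int(nbytes), int(nmsgs))


if __name__ == "__main__":
    print(parse_line("E 0 2 23052 bytes 61 msgs sent"))
```

Lines that do not match (for instance other text printed on stderr) yield None, so
they can be skipped when filtering a mixed log.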

In this example process 0 has sent 27 messages (108 bytes) to process 1 through collective
and protocol-related communication (internal tags) and 30 messages (1012 bytes) to process 1
through point-to-point calls (external tags).

If the monitoring was enabled with --mca pml_monitoring_enable 1, everything is aggregated
under the internal tags. With the above example, you have:
I 0 1 1120 bytes 57 msgs sent
I 0 2 23052 bytes 61 msgs sent
I 1 0 860 bytes 24 msgs sent
I 1 2 104 bytes 26 msgs sent
I 1 3 2760 bytes 108 msgs sent
I 2 0 22804 bytes 49 msgs sent
I 2 3 964 bytes 50 msgs sent
I 3 0 104 bytes 26 msgs sent
I 3 1 2508 bytes 95 msgs sent
I 3 2 860 bytes 24 msgs sent
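
The relation between the two listings is a plain sum per (sender, receiver) pair. As
a hedged illustration (the function name aggregate is hypothetical and this is not
how the PML itself is implemented), merging the I and E records of the first listing
reproduces the aggregated figures:

```python
from collections import defaultdict


def aggregate(records):
    """Sum bytes and messages per (sender, receiver), ignoring the I/E column."""
    totals = defaultdict(lambda: [0, 0])
    for _kind, sender, receiver, nbytes, nmsgs in records:
        totals[(sender, receiver)][0] += nbytes
        totals[(sender, receiver)][1] += nmsgs
    return {pair: tuple(values) for pair, values in sorted(totals.items())}


# A few records taken from the pml_monitoring_enable 2 listing.
records = [
    ("I", 0, 1, 108, 27),
    ("E", 0, 1, 1012, 30),
    ("I", 1, 3, 208, 52),
    ("E", 1, 3, 2552, 56),
]
merged = aggregate(records)

if __name__ == "__main__":
    for (sender, receiver), (nbytes, nmsgs) in merged.items():
        print("I %d %d %d bytes %d msgs sent" % (sender, receiver, nbytes, nmsgs))
```

For the pair (0, 1) this gives 108 + 1012 = 1120 bytes and 27 + 30 = 57 messages,
matching the first line of the aggregated listing.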

Monitoring phases
-----------------
If one wants to monitor phases of the application, it is possible to flush the monitoring
at the application level. In this case all the monitoring since the last flush is stored
by every process in a file.

An example of how to flush such monitoring is given in test/monitoring/monitoring_test.c

Moreover, all the flushed phases are aggregated at runtime and output at the end
of the application as described above.

Example
-------
A working example is given in test/monitoring/monitoring_test.c
It features MPI_COMM_WORLD monitoring, sub-communicator monitoring, collective and
point-to-point communication monitoring, and phase monitoring.

To compile:
> make monitoring_test

Helper scripts
--------------
Two Perl scripts are provided in test/monitoring:
- aggregate_profile.pl is for aggregating monitoring phases of different processes.
This script aggregates the profiles generated by the flush_monitoring function.
The files need to be in the given format: name_<phase_id>_<process_id>
They are then aggregated by phase.
If one needs the profile of all the phases, one can concatenate the different files
or use the output of the monitoring system produced at MPI_Finalize.
In the example it should be called as:
./aggregate_profile.pl prof/phase to generate
prof/phase_1.prof
prof/phase_2.prof

- profile2mat.pl is for transforming the monitoring output into a communication matrix.
It takes a profile file and aggregates all the recorded communications into matrices.
It generates one matrix for the number of messages (msg),
one for the total bytes transmitted (size) and
one for the average number of bytes per message (avg).

The output matrix is symmetric
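
To make the three matrices concrete, here is a minimal Python sketch. It is not a
port of profile2mat.pl: the function name build_matrices is hypothetical, and the
way symmetry is obtained (adding each record to both (i, j) and (j, i)) is an
assumption made for illustration, since this README does not say how the script
symmetrizes its output.

```python
def build_matrices(records, nprocs):
    """Build size, msg and avg matrices from (sender, receiver, bytes, msgs)."""
    size = [[0] * nprocs for _ in range(nprocs)]
    msg = [[0] * nprocs for _ in range(nprocs)]
    for sender, receiver, nbytes, nmsgs in records:
        # Assumption: count the exchange in both directions so that the
        # resulting matrices are symmetric; a set avoids double counting
        # when sender == receiver.
        for i, j in {(sender, receiver), (receiver, sender)}:
            size[i][j] += nbytes
            msg[i][j] += nmsgs
    # Average bytes per message, 0.0 where no message was exchanged.
    avg = [
        [size[i][j] / msg[i][j] if msg[i][j] else 0.0 for j in range(nprocs)]
        for i in range(nprocs)
    ]
    return size, msg, avg


if __name__ == "__main__":
    sample = [(0, 1, 1012, 30), (1, 0, 860, 24)]
    size, msg, avg = build_matrices(sample, 2)
    for name, matrix in (("size", size), ("msg", msg), ("avg", avg)):
        print(name, matrix)
```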

Do not forget to make these scripts executable.

For instance, the provided examples store phases output in ./prof

If you type:
> mpirun -np 4 --mca pml_monitoring_enable 2 ./monitoring_test
you should have the following output:
Proc 3 flushing monitoring to: ./prof/phase_1_3.prof
Proc 0 flushing monitoring to: ./prof/phase_1_0.prof
Proc 2 flushing monitoring to: ./prof/phase_1_2.prof
Proc 1 flushing monitoring to: ./prof/phase_1_1.prof
Proc 1 flushing monitoring to: ./prof/phase_2_1.prof
Proc 3 flushing monitoring to: ./prof/phase_2_3.prof
Proc 0 flushing monitoring to: ./prof/phase_2_0.prof
Proc 2 flushing monitoring to: ./prof/phase_2_2.prof
I 2 3 104 bytes 26 msgs sent
E 2 0 22804 bytes 49 msgs sent
E 2 3 860 bytes 24 msgs sent
I 3 0 104 bytes 26 msgs sent
I 3 1 204 bytes 51 msgs sent
E 3 1 2304 bytes 44 msgs sent
E 3 2 860 bytes 24 msgs sent
I 0 1 108 bytes 27 msgs sent
E 0 1 1012 bytes 30 msgs sent
E 0 2 23052 bytes 61 msgs sent
I 1 2 104 bytes 26 msgs sent
I 1 3 208 bytes 52 msgs sent
E 1 0 860 bytes 24 msgs sent
E 1 3 2552 bytes 56 msgs sent

You can parse the phases with:
> ./aggregate_profile.pl prof/phase
Building prof/phase_1.prof
Building prof/phase_2.prof
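
The grouping step performed here can be pictured as follows. This Python sketch only
illustrates the naming convention and does not read or merge any file. The names
PHASE_FILE_RE and group_by_phase are hypothetical, and the .prof suffix is taken
from the file names shown in the output above.

```python
import re
from collections import defaultdict

# Assumption: per-process files are named <name>_<phase_id>_<process_id>.prof
PHASE_FILE_RE = re.compile(r"^(?P<name>.+)_(?P<phase>\d+)_(?P<proc>\d+)\.prof$")


def group_by_phase(paths):
    """Map each per-phase output file to the per-process files feeding it."""
    groups = defaultdict(list)
    for path in paths:
        match = PHASE_FILE_RE.match(path)
        if match is None:
            continue  # not a per-process phase file, ignore it
        target = "%s_%s.prof" % (match.group("name"), match.group("phase"))
        groups[target].append(path)
    return {target: sorted(files) for target, files in sorted(groups.items())}


if __name__ == "__main__":
    flushed = [
        "prof/phase_1_3.prof", "prof/phase_1_0.prof",
        "prof/phase_2_1.prof", "prof/phase_2_3.prof",
    ]
    for target, files in group_by_phase(flushed).items():
        print("Building", target, "from", files)
```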

And you can build the different communication matrices of phase 1 with:
> ./profile2mat.pl prof/phase_1.prof
prof/phase_1.prof -> all
prof/phase_1_size_all.mat
prof/phase_1_msg_all.mat
prof/phase_1_avg_all.mat

prof/phase_1.prof -> external
prof/phase_1_size_external.mat
prof/phase_1_msg_external.mat
prof/phase_1_avg_external.mat

prof/phase_1.prof -> internal
prof/phase_1_size_internal.mat
prof/phase_1_msg_internal.mat
prof/phase_1_avg_internal.mat

Credit
------
Designed by George Bosilca <bosilca@icl.utk.edu> and
Emmanuel Jeannot <emmanuel.jeannot@inria.fr>