GEOPM - Global Extensible Open Power Manager
SEE COPYING FILE FOR LICENSE INFORMATION.
2018 December 20
Christopher Cantalupo email@example.com
The Global Extensible Open Power Manager (GEOPM) is a framework for exploring power and energy optimizations targeting high performance computing. The GEOPM package provides many built-in features. A simple use case is reading hardware counters and setting hardware controls with platform independent syntax using a command line tool on a particular compute node. An advanced use case is dynamically coordinating hardware settings across all compute nodes used by an application in response to the application's behavior and requests from the resource manager. The dynamic coordination is implemented as a hierarchical control system for scalable communication and decentralized control. The hierarchical control system can optimize for various objective functions including maximizing global application performance within a power bound or minimizing energy consumption with marginal degradation of application performance. The root of the control hierarchy tree can communicate with the system resource manager to extend the hierarchy above the individual MPI application and enable the management of system power resources for multiple MPI jobs and multiple users by the system resource manager.
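As a sketch of the simple use case, the geopmread(1) and geopmwrite(1) command line tools query signals and adjust controls by name. The signal and control names below are illustrative only; the set available on a given node depends on the platform and the loaded IOGroups:

geopmread
geopmread POWER_PACKAGE package 0
geopmwrite FREQUENCY package 0 2.1e9

Running geopmread with no arguments prints the names of the available signals.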
The GEOPM package provides two libraries: libgeopm for use with MPI applications, and libgeopmpolicy for use with applications that do not link to MPI. There are several command line tools included in GEOPM which have dedicated manual pages. The geopmlaunch(1) command line tool is used to launch an MPI application while enabling the GEOPM runtime to create a GEOPM Controller thread on each compute node. The Controller loads plugins and executes the Agent algorithm to control the compute application. The geopmlaunch(1) command is part of the geopmpy python package that is included in the GEOPM installation. See the GEOPM overview man page for further documentation and links: geopm(7).
The GEOPM runtime is extended through three plugin classes: Agent, IOGroup, and Comm. New implementations of these classes can be dynamically loaded at runtime by the GEOPM Controller. The Agent class defines which data are collected, how control decisions are made, and what messages are communicated between Agents in the tree hierarchy. The reading of data and writing of controls from within a compute node is abstracted from the Agent through the PlatformIO interface. This interface provides access to the IOGroup implementations that provide a variety of signals and controls. IOGroup plugins can be developed independently of the Agents to extend the read and write capabilities provided by GEOPM. The PlatformIO abstraction enables Agent implementations to be ported to different hardware platforms without modification. Messaging between Agents running on different compute nodes is encapsulated in the Comm class. New implementations of the Comm class make it possible to port inter-node communication used by the GEOPM runtime to different underlying communication protocols and hardware without modifying the Agent implementations.
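For example, assuming a plugin has been compiled into a shared object and installed into a directory of your choosing (the path below is hypothetical), the Controller's plugin search path can be extended through the GEOPM_PLUGIN_PATH environment variable before launch:

export GEOPM_PLUGIN_PATH=$HOME/build/geopm-plugins/lib

Shared objects found in this path are loaded by the GEOPM runtime at startup so that the Agent, IOGroup, and Comm implementations they register become available.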
The libgeopm library can be called directly or indirectly within MPI applications to enable application feedback for informing the control decisions. The indirect calls are facilitated by GEOPM's integration with MPI and OpenMP through their profiling decorators, and the direct calls are made through the geopm_prof_c(3) or geopm_fortran(3) interfaces. Marking up a compute application with profiling information through these interfaces can enable better integration of the GEOPM runtime with the compute application and more precise control.
The GEOPM public GitHub project has been integrated with Travis continuous integration.
All pull requests will be built and tested automatically by Travis.
The OpenHPC project provides the most robust way to install GEOPM.
The GEOPM project was first packaged with OpenHPC version 1.3.6. The OpenHPC install guide contains documentation on how to install GEOPM and its dependencies and can be found on the OpenHPC download page.
The OpenHPC packages are distributed from the OpenHPC OBS build server.
The OpenHPC project packages all of the dependencies required by GEOPM that are not part of a standard Linux distribution. This includes the msr-safe kernel driver and MSR save/restore functionality built into the Slurm resource manager to enable robust reset of hardware controls when returning compute nodes to the general pool available to other users.
The GEOPM python tools are packaged in the RPMs described above, but they are also available from PyPI as the geopmpy package. For example, to install the geopmpy package into your home directory, run the following command:
pip install --user geopmpy
Note this installs only the GEOPM python tools and does not install the full GEOPM runtime.
In order to build the GEOPM package from source, the following requirements must be met.
The GEOPM package requires a compiler that supports the MPI 2.2 and C++11 standards. These requirements can be met by using GCC version 4.7 or greater and installing the openmpi-devel package version 1.7 or greater on RHEL or SLES Linux. Creating the documentation, including the man pages, additionally requires the rubygems and ruby-devel packages.
yum install openmpi-devel ruby-devel rubygems
zypper install openmpi-devel ruby-devel rubygems
Alternatively these requirements can be installed from source, and an MPI implementation other than OpenMPI can be selected. See the output of "./configure --help" for details on how to use non-standard install locations for build requirements through the configure command line options.
The source code can be rebuilt from the source RPMs available from OpenHPC. To build from the git repository follow the instructions below.
To build all targets and install them into the "build/geopm" subdirectory of your home directory, run the following commands:
./autogen.sh
./configure --prefix=$HOME/build/geopm
make
make install
If building with the Intel toolchain the following environment variables must be set prior to running configure:
export CC=icc
export CXX=icpc
export FC=ifort
export F77=ifort
export MPICC=mpicc
export MPICXX=mpicxx
An RPM can be created on a RHEL or SUSE system with the "make rpm" target. Note that the --with-mpi-bin option may be required to inform configure about the location of the MPI compiler wrappers. The following command may be sufficient to determine the location:
dirname $(find /usr -name mpicc)
To build in an environment without support for OpenMP (e.g. clang on Mac OS X) use the --disable-openmp configure option. The --disable-mpi option can be used to build only targets which do not require MPI. By default MPI targets are built.
We are targeting SLES12 and RHEL7 distributions for functional runtime support. There is a single runtime requirement that can be obtained from these distributions: the OpenMPI implementation of MPI. To install it, follow the instructions below for your Linux distribution.
yum install openmpi
zypper install openmpi
Alternatively the MPI requirement can be met by using OpenHPC packages.
In order for GEOPM to properly use shared memory to communicate between the Controller and the application, it may be necessary to alter the configuration for systemd. The default behavior of systemd is to clean up all inter-process communication for non-system users, which causes issues with GEOPM's initialization routines for shared memory. This can be disabled by ensuring that RemoveIPC=no is set in /etc/systemd/logind.conf. Most Linux distributions change the default setting to disable this behavior. More information can be found in the logind.conf(5) man page.
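As a minimal sketch, the relevant stanza in /etc/systemd/logind.conf and the command to apply it might look like the following; the change takes effect after systemd-logind is restarted or the node is rebooted:

[Login]
RemoveIPC=no

systemctl restart systemd-logind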
The msr-safe kernel driver must be loaded at runtime to support user-level read and write of white-listed MSRs. The msr-safe kernel driver is distributed with OpenHPC and can be installed using the RPMs distributed there (see INSTALL section above).
The source code for the msr-safe kernel driver is available from the LLNL msr-safe repository on GitHub.
Alternately, you can run GEOPM as root with the standard msr driver loaded:
modprobe msr
LINUX POWER MANAGEMENT
Note that other Linux mechanisms for power management can interfere with GEOPM, and these must be disabled. We suggest disabling the intel_pstate kernel driver by modifying the kernel command line through grub2 or the boot loader on your system by adding:
intel_pstate=disable
The cpufreq driver will be enabled when the intel_pstate driver is disabled. The cpufreq driver has several modes controlled by the scaling_governor sysfs entry. When the performance mode is selected, the driver will not interfere with GEOPM. For SLURM based systems the GEOPM launch wrappers will attempt to set the scaling governor to "performance". This alleviates the need to manually set the governor. Older versions of SLURM require the desired governors to be explicitly listed in /etc/slurm.conf. In particular, SLURM 15.x requires the following option:
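As an illustrative sketch only, the allowed governors might be listed in /etc/slurm.conf as shown below; the exact option name and governor list should be confirmed against the slurm.conf documentation for the SLURM version in use, and "Performance" must be among the allowed governors:

CpuFreqGovernors=OnDemand,Performance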
More information on the slurm.conf file can be found in the slurm.conf(5) man page. Non-SLURM systems must still set the scaling governor through some other mechanism to ensure proper GEOPM behavior. The following command will set the governor to performance:
echo performance | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
See the Linux kernel cpufreq documentation for more information.
GEOPM MPI LAUNCH WRAPPER
The GEOPM package installs the command "geopmlaunch". This is a wrapper around MPI launch commands such as "mpiexec.hydra", "srun", and "aprun" that enables the GEOPM runtime. The "geopmlaunch" command supports exactly the same command line interface as the underlying launch command, but the wrapper extends the interface with GEOPM specific options. When the "--geopm-ctl" option is passed to the wrapper, it will launch the GEOPM runtime with your application while enforcing the GEOPM requirements by manipulating the options passed to the underlying launch command. This includes handling the CPU affinity requirements and getting the GEOPM control process on each node up and ready to connect to the main compute application. The wrapper is documented in the geopmlaunch(1) man page.
There are several underlying MPI application launchers that "geopmlaunch" wrapper supports. See the geopmlaunch(1) man page for information on available launchers and how to select them. If the launch mechanism for your system is not supported, then affinity requirements must be enforced by the user and all options to the GEOPM runtime must be passed through environment variables. Please consult the geopm(7) man page for documentation of the environment variables used by the GEOPM runtime that are otherwise controlled by the wrapper script.
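As a hypothetical invocation on a SLURM based system (the node and rank counts, report file name, and option values are examples; see geopmlaunch(1) for the authoritative option list):

geopmlaunch srun -N 2 -n 64 --geopm-ctl=process --geopm-report=myapp.report -- ./myapp

Here the wrapper forwards the srun options unchanged and adds the GEOPM specific behavior: a Controller process is started on each of the two nodes and a summary report is written when the application exits.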
CPU AFFINITY REQUIREMENTS
The GEOPM runtime requires that each MPI process of the application under control is affinitized to distinct CPUs. This is a strict requirement for the runtime and must be enforced by the MPI launch command.
Affinitizing the GEOPM control thread to a CPU that is distinct from the application CPUs may improve performance of the application, but this is not a requirement. On systems where an application achieves highest performance when leaving a CPU unused by the application so that this CPU can be dedicated to the operating system, it is usually best to affinitize the GEOPM control thread to this CPU designated for system threads.
There are many ways to launch an MPI application, and there is no single uniform way of enforcing MPI rank CPU affinities across different job launch mechanisms. Additionally, OpenMP runtimes, which are associated with the compiler choice, have different mechanisms for affinitizing OpenMP threads within the CPUs available to each MPI process. To complicate things further, the GEOPM control thread can be launched as an application thread or a process that may be part of the primary MPI application or a completely separate MPI application. For these reasons it is difficult to document how to correctly affinitize processes in all configurations. Please refer to your site documentation about CPU affinity for the best solution on the system you are using, and consider extending the geopmlaunch wrapper to support your system configuration (please see the CONTRIBUTING.md file for information about how to share these implementations with the community).
From within the source code directory, unit tests can be executed with the "make check" target.
This software is production quality. We will be enforcing semantic versioning for all releases following version 1.0. We are very interested in feedback from the community. Refer to the ChangeLog for a high level history of changes in each release, and see the GitHub issues page for information about ongoing work. Test coverage by unit tests is lacking for some files and will continue to be improved. The line coverage results from gcov, as reported by gcovr, are available for the latest release.
Development of the GEOPM software package has been partially funded through contract B609815 with Argonne National Laboratory.