The pertool program provides multiple helpful operations for managing
Perturbo input and data files when conducting large simulations and other
complex tests.
This program solves two main problems:
To support scaling the calculations, Perturbo can generate distinct pools of k-points (along with their corresponding q-points and other values), one pool per MPI task. The problem is that once these files have been generated for a given number of pools (i.e. MPI tasks), they cannot be reused with a different number of pools. This can be very frustrating when trying to find the right number of MPI tasks for a given simulation: you generate the data files, then you try running Perturbo - but it crashes. So you generate the data files again for a different number of pools, and you keep doing this until you find a pool size that seems to work well enough.
No longer! The pertool program can take an existing set of pool data
files and regenerate them for a different number of pools. This tends to
run much faster than computing them from scratch.
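Conceptually, reshaping redistributes the same underlying items across a different number of pools rather than recomputing them. The toy sketch below illustrates the idea using plain Python lists in place of HDF5 data; the round-robin assignment is an assumption for illustration, and Perturbo's actual distribution scheme may differ.

```python
def reshape_pools(pools, new_num_pools):
    """Redistribute items from an existing set of pools into a new pool count.

    This is a conceptual sketch only: real pool files hold HDF5 datasets,
    and the round-robin scheme here is illustrative, not Perturbo's actual
    assignment logic.
    """
    # Flatten the existing pools back into one ordered list of items...
    items = [item for pool in pools for item in pool]
    # ...then redistribute them round-robin across the new pool count.
    new_pools = [[] for _ in range(new_num_pools)]
    for i, item in enumerate(items):
        new_pools[i % new_num_pools].append(item)
    return new_pools
```

Because the expensive per-item values are reused as-is, only the bookkeeping of which item lives in which pool changes.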
Coming soon...
- The `analyze` command reports details of the `eph_g2` pool data files generated by Perturbo into the `./tmp` directory.
- The `reshape` command allows the `eph_g2` pool data files to be "reshaped" to a different number of pools.
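As a rough illustration of the kind of inspection `analyze` performs, the structure of an HDF5 file can be walked directly with `h5py`. The file and dataset names below are stand-ins created for the example; Perturbo's actual pool file layout may differ.

```python
import h5py

# Create a small stand-in HDF5 file so the example is self-contained;
# a real run would open one of Perturbo's eph_g2 pool files instead.
with h5py.File("example_pool.h5", "w") as f:
    f.create_dataset("g2", data=[[1.0, 2.0], [3.0, 4.0]])

# Walk the file and print every group and dataset it contains.
with h5py.File("example_pool.h5", "r") as f:
    f.visititems(lambda name, obj: print(name, obj))
```

The same `visititems` traversal works on any HDF5 file, which makes it handy for checking what a pool file actually contains before reshaping.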
Currently the installation process is very basic. Clone the repository, and run:
```shell
pip install -r requirements.txt
```
The pertool program has a pretty small number of dependencies:
- `h5py` for HDF5 access from Python (which has `numpy` as a dependency)
- `progressbar2` for providing a helpful progress bar at the command prompt
To run `reshape` to regenerate pool data files, invoke it like this:
```shell
python -m pertool reshape \
    -f <path to source tmp directory> \
    -t <path to target tmp directory> \
    -p <number of pools to generate> \
    [--mp]
```
The source tmp directory is scanned, and `reshape` will determine the
number of pools in the source directory automatically.
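The auto-detection step might look something like the following sketch. The `eph_g2_p<N>.h5` filename pattern is an assumption made for illustration; pertool's actual scanning logic and Perturbo's real file naming may differ.

```python
import re
from pathlib import Path

def count_source_pools(tmp_dir: str) -> int:
    """Count distinct pool indices by scanning file names in tmp_dir.

    The ``eph_g2_p<N>.h5`` pattern is hypothetical; substitute the
    naming convention your Perturbo version actually uses.
    """
    pattern = re.compile(r"eph_g2_p(\d+)\.h5$")
    pools = set()
    for path in Path(tmp_dir).iterdir():
        match = pattern.search(path.name)
        if match:
            pools.add(int(match.group(1)))
    return len(pools)
```

Collecting the indices into a set means duplicate or unrelated files in the directory don't skew the count.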
If the target tmp directory is not empty, the program will report an error.
The --mp flag can be used to enable multi-process parallelism, which often
yields a substantial performance improvement.
The HDF5 library is not designed to be used in highly concurrent settings.
However, a certain amount of performance improvement is possible through the
use of multi-process programming and the multiprocessing Python standard
library. The pertool program supports using multiprocessing via the
`--mp` flag, which parallelizes the `analyze` and `reshape` operations in these ways:
- The source HDF5 files are scanned in parallel, with one subprocess started for each source file. This yields a significant performance improvement in the typical case, since the source files are completely independent; the scan operation is typically I/O-bound, so having multiple concurrent scans is a big win.
- The target HDF5 files are generated in parallel, with one subprocess generating each target file. This also yields a significant performance improvement, but each subprocess must access all source HDF5 files, and the HDF5 library may use file locking to ensure exclusive access, even in read-only cases. (See above link for details.)
You should test your operations to see whether the multi-process code is actually faster than the serial code, but in the limited tests done on NERSC Perlmutter, an order-of-magnitude performance improvement is typical.
NOTE: For large simulations with many pool files, running a reshape operation that spawns many subprocesses can massively degrade the machine's performance while the operation is running. Therefore, pertool also includes a `--max-processes` argument that limits the maximum number of subprocesses spawned. The default value is 20, but it can be changed as appropriate.
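The bounded-parallelism pattern described above can be sketched with Python's `multiprocessing` standard library. The `scan_file` worker here is a placeholder; pertool's actual workers perform real HDF5 I/O, and its internal structure may differ from this sketch.

```python
import multiprocessing as mp

def scan_file(path):
    # Placeholder worker: a real implementation would open the HDF5
    # file at `path` and collect its pool metadata.
    return path.upper()

def scan_in_parallel(paths, max_processes=20):
    # Cap the worker count, mirroring the --max-processes argument,
    # so a run with hundreds of pool files doesn't swamp the machine.
    n = min(max_processes, len(paths))
    with mp.Pool(processes=n) as pool:
        return pool.map(scan_file, paths)
```

Each path is handed to its own worker, but never more than `max_processes` workers run at once; `pool.map` returns the results in the same order as the input paths.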