Evan Weinberg edited this page Aug 11, 2023 · 5 revisions

QIO and QMP

QUDA has CMake options to automatically download and build QIO and QMP. QMP is a QCD message passing library, and QIO is a QCD IO library. QUDA uses QIO to read and write gauge fields and color-spinor fields and has QMP as a dependency.

To automatically download, build, and use QMP/QIO, add the following flags to your cmake command:

cmake [...] -DQUDA_DOWNLOAD_USQCD=ON -DQUDA_QMP=ON -DQUDA_QIO=ON [...]

Alternatively, QUDA can be pointed to a pre-existing cmake build of QMP and QIO. Assuming QMP is in the directory /usr/local/qmp and QIO is in the directory /usr/local/qio, this can be accomplished by:

cmake [...] -DQUDA_QMP=ON -DQMP_DIR=/usr/local/qmp/lib/cmake/QMP -DQUDA_QIO=ON -DQIO_DIR=/usr/local/qio/lib/cmake/QIO [...]
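For build scripts that need to support both configurations, the choice of flags can be wrapped in a small helper. The following is only a sketch: the function name usqcd_cmake_flags and the environment variables QMP_PREFIX and QIO_PREFIX are hypothetical, and it assumes the installs use the lib/cmake/QMP and lib/cmake/QIO layout shown above.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of QUDA): print the QMP/QIO flags for a
# QUDA cmake command.
# If both QMP_PREFIX and QIO_PREFIX are set, point QUDA at those installs;
# otherwise ask QUDA to download and build QMP and QIO itself.
usqcd_cmake_flags() {
  if [ -n "${QMP_PREFIX:-}" ] && [ -n "${QIO_PREFIX:-}" ]; then
    printf '%s\n' "-DQUDA_QMP=ON -DQMP_DIR=${QMP_PREFIX}/lib/cmake/QMP -DQUDA_QIO=ON -DQIO_DIR=${QIO_PREFIX}/lib/cmake/QIO"
  else
    printf '%s\n' "-DQUDA_DOWNLOAD_USQCD=ON -DQUDA_QMP=ON -DQUDA_QIO=ON"
  fi
}
```

The output can be spliced into a configure line as cmake [...] $(usqcd_cmake_flags) [...]; the expansion is left unquoted so that each flag becomes a separate argument.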

Loading and Saving Gauge fields

Gauge fields can be loaded and saved from the test executables via the flags --load-gauge [filename] and --save-gauge [filename], where [filename] can be either a relative or an absolute path. At this time the files must be in the SCIDAC format, but there are plans to support at least loading MILC and NERSC archive formatted files in the future. The code relevant for loading and saving gauge fields is in qio_field.cpp.

Loading and Saving Vector fields

ColorSpinorFields can be loaded and saved in the context of eigenvalue/eigenvector deflation, as well as near-null vector generation when MG is being used. Vector saving can also take advantage of QIO's support for PARTFILE formats as an optimization for filesystem performance. Note that QIO will automatically look for both single file and partfile formats while trying to load files, which is why there is no need to have a partfile flag for loading.

| Object | Field name | Description | Flag |
|---|---|---|---|
| QudaEigParam | char vec_infile[256] | Input filename for eigenvectors | --eig-load-vec |
| QudaEigParam | char vec_outfile[256] | Output filename for eigenvectors | --eig-save-vec |
| QudaEigParam | QudaBoolean vec_partfile | Whether or not to save in partfile | --eig-save-partfile |
| QudaMultigridParam | char mg_vec_infile[i][256] | Base of input filename | --mg-load-vec |
| QudaMultigridParam | char mg_vec_outfile[i][256] | Base of output filename | --mg-save-vec |
| QudaMultigridParam | QudaBoolean mg_vec_partfile[i] | Whether or not to save in partfile | --mg-partfile-vec |

Note that for MG, vec_load[i] and vec_save[i], which are both QudaBoolean, must also be set to QUDA_BOOLEAN_TRUE.

Notes on PARTFILE

When vectors are saved in PARTFILE format, each MPI rank saves its own contribution to a vector in a separate file. If the specified output filename is, for example, foo, each individual file is named foo.volNNNN, where NNNN is the zero-padded rank as produced by the printf format %04d (foo.vol0003 for rank 3).
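The naming rule can be reproduced in a shell script with printf. The helper below is a minimal sketch; the function name partfile_name is illustrative and not part of QUDA or QIO.

```shell
#!/usr/bin/env bash

# Illustrative helper (not part of QUDA): compose the per-rank PARTFILE name
# from a base filename and an MPI rank, following the base.volNNNN pattern.
partfile_name() {
  local base=$1 rank=$2
  printf '%s.vol%04d\n' "$base" "$rank"
}

partfile_name foo 3   # prints foo.vol0003
```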

One possible workflow for this is to save PARTFILE to per-node scratch disks during a run, copy them back to a network filesystem after a run, and then in subsequent runs copy the PARTFILEs back to per-node scratch disks. An example script that could handle this is:

#! /usr/bin/bash

BASE_FILE=$1
FROM_TO=$2 # 1 for from, 0 for to

# compose my filename
if [ -n "$OMPI_COMM_WORLD_RANK" ]
then
  MY_RANK=$OMPI_COMM_WORLD_RANK
elif [ -n "$SLURM_PROCID" ]
then
  MY_RANK=$SLURM_PROCID
else
  # without a rank there is no file to copy, so stop here
  echo "No MPI rank variable detected" >&2
  exit 1
fi

NUMBER=$(echo $MY_RANK | awk ' { printf "%04d",$1; } ')
FILENAME="${BASE_FILE}.vol${NUMBER}"

if [ $FROM_TO -eq 1 ]
then
  echo "copy from scratch"
  cp /raid/scratch/${FILENAME} ${FILENAME}
else
  echo "copy to scratch"
  cp ${FILENAME} /raid/scratch/${FILENAME}
fi

This script assumes that the per-node scratch disk resides at /raid/scratch. The script is called, for example, as mpirun -np 8 ./copy_partfile.sh filename 0, where filename is the filename that is passed to one of the --load-* or --save-* flags, 0 corresponds to copying to the scratch drive, and 1 corresponds to copying from the scratch drive. This script works with both OpenMPI and within a SLURM task, and it should be easy to generalize to other launchers.

A fuller workflow, for saving:

# generate eigenvectors, saving them to on-node scratch storage
mpirun -np 8 ./staggered_eigensolve_test [...] --eig-save-vec /raid/scratch/l32 --eig-save-partfile true

# copy from the scratch storage
mpirun -np 8 ./copy_partfile.sh l32 1

And for loading, using them for a solve:

# copy to the scratch storage
mpirun -np 8 ./copy_partfile.sh l32 0

# Run a deflated solve
mpirun -np 8 ./staggered_invert_test [...] --inv-deflate true --eig-load-vec /raid/scratch/l32
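Before launching a deflated solve it can be worth confirming that every per-rank file is actually in place. The following is a hedged sketch of such a pre-flight check; the function name missing_partfiles is hypothetical, and it assumes that all files are visible from where it runs (for example a single-node job, or a check against the network filesystem copy rather than per-node scratch).

```shell
#!/usr/bin/env bash

# Hypothetical pre-flight check (not part of QUDA): list the PARTFILEs that
# are missing from a directory for a given base filename and rank count.
# Prints one missing path per line; returns non-zero if any are missing.
missing_partfiles() {
  local dir=$1 base=$2 nranks=$3
  local rank file status=0
  for ((rank = 0; rank < nranks; rank++)); do
    file=$(printf '%s/%s.vol%04d' "$dir" "$base" "$rank")
    if [ ! -f "$file" ]; then
      echo "$file"
      status=1
    fi
  done
  return "$status"
}
```

For the example above this could be called as missing_partfiles /raid/scratch l32 8 before the mpirun line, aborting the job script if it returns non-zero.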

Notes on MG

The input/output filenames for MG solves have extra metadata appended within the multigrid solve that needs to be added to the argument to --mg-{load/save}-vec [#] [filename]. This is important for copying PARTFILE from/to scratch disks as described above.

The format for near-nulls is given by [filename]_level_[mg level]_nvec_[number of near-nulls]; for coarsest-level eigenvectors replace nvec with defl.
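As an illustration of the naming rule, the helper below composes these names in a shell script. It is only a sketch: the function name mg_vec_name is hypothetical, and the level and vector counts in the examples are made up. When copying PARTFILEs, the resulting name would be the base filename to which the .volNNNN suffix is added; that ordering is an assumption based on the description above, not something this page states explicitly.

```shell
#!/usr/bin/env bash

# Illustrative helper (not part of QUDA): compose the on-disk base name for
# MG vectors from the filename given to --mg-{load/save}-vec.
#   kind "nvec" -> near-null vectors (default)
#   kind "defl" -> coarsest-level eigenvectors
mg_vec_name() {
  local base=$1 level=$2 count=$3 kind=${4:-nvec}
  printf '%s_level_%d_%s_%d\n' "$base" "$level" "$kind" "$count"
}

mg_vec_name l32 0 24        # prints l32_level_0_nvec_24
mg_vec_name l32 2 64 defl   # prints l32_level_2_defl_64
```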
