File utilities designed for scalability and performance.

mpiFileUtils

mpiFileUtils is a suite of MPI-based tools for managing large datasets, which may range from large directory trees to large files. High-performance computing users often generate large datasets with parallel applications that run with many processes (millions in some cases). However, those users are then stuck with single-process tools like cp and rm to manage their datasets. This suite provides MPI-based tools to handle typical jobs like copy, remove, and compare for such datasets, providing speedups of up to 20-30x.

Documentation for each tool is available on ReadTheDocs and in this repo:

  • dbcast - Broadcast files to compute nodes.
  • dchmod - Change permissions and group access on files.
  • dcmp - Compare files.
  • dcp - Copy files.
  • dfilemaker - Generate random files.
  • drm - Remove files.
  • dstripe - Restripe files.
  • dwalk - List files.
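As a rough illustration of how these tools stand in for cp and rm, the sketch below prints the launch lines for dcp and drm rather than running them. The `mfu_cmd` helper is hypothetical, and the `mpirun` launcher, the process count, and the paths are all assumptions; your site may use a different launcher such as srun.

```shell
# Hypothetical helper: print the launch line for an mpiFileUtils tool
# instead of executing it, so the command can be reviewed first.
# Assumes an mpirun-style launcher that takes -np <process count>.
mfu_cmd() {
    nprocs=$1   # number of MPI processes (illustrative value)
    shift       # remaining arguments: tool name and its arguments
    echo "mpirun -np $nprocs $*"
}

# dcp in place of cp (placeholder paths):
mfu_cmd 4 dcp /path/to/src /path/to/dest

# drm in place of rm (placeholder path):
mfu_cmd 4 drm /path/to/old
```

Once a printed line looks right, it can be run directly from the shell or a batch script.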

Experimental Utilities

To enable the experimental utilities, pass the --enable-experimental option to configure:

./configure --enable-experimental

  • dfind - Find files by path name (experimental).
  • dgrep - Search contents of files (experimental).
  • dparallel - Perform commands in parallel (experimental).
  • dtar - Create file tape archives (experimental).

Build

mpiFileUtils depends on several libraries. To simplify builds, there are two scripts: buildme_dependencies and buildme. The buildme_dependencies script downloads and installs all the necessary libraries. The buildme script then builds mpiFileUtils, assuming those libraries have been installed. Both scripts require that mpicc is in your path and that it belongs to an MPI library supporting at least version 2.2 of the MPI standard. Please review each script and edit it if necessary. Then run them in sequence:

./buildme_dependencies

./buildme
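Since both scripts expect mpicc in your path, a small wrapper can check for it before starting the build. The following is a minimal sketch, not part of the repository: the function names `require_mpicc` and `build_all` are hypothetical, and it does not verify the MPI standard version.

```shell
# Hypothetical wrapper (not shipped with mpiFileUtils).

# Return non-zero with a message if mpicc cannot be found on PATH.
require_mpicc() {
    if ! command -v mpicc >/dev/null 2>&1; then
        echo "mpicc not found in PATH; set up an MPI library first" >&2
        return 1
    fi
    return 0
}

# Run the two build scripts in order, stopping at the first failure.
build_all() {
    require_mpicc &&
        ./buildme_dependencies &&   # downloads and installs the libraries
        ./buildme                   # builds mpiFileUtils against them
}
```

After sourcing this from the top of the source tree, calling `build_all` runs the same two steps as above, but fails early with a clear message if mpicc is missing.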

Alternatively, mpiFileUtils is available in Spack, which simplifies the install to just:

spack install mpifileutils

or to enable all features:

spack install mpifileutils +lustre +experimental

Build Status

The Build Status badge reports the current state of the mpiFileUtils master branch.