libhio is a library intended for writing data to hierarchical data store systems.

# -*- Mode: sh; sh-basic-offset:2 ; indent-tabs-mode:nil -*-
#
# Copyright (c) 2014-2016 Los Alamos National Security, LLC.  All rights
#                         reserved.
# $COPYRIGHT$
#
# Additional copyrights may follow
#
# $HEADER$
#

HIO Readme
==========
Last updated 2016-04-06

See file NEWS for a description of changes to hio.

Note that this README refers to various LANL clusters that have been used for
testing HIO.  Using HIO in other environments may require some adjustments.

Building
--------

HIO builds via a standard autoconf/automake build.  So, to build:

1) Untar
2) cd to the root of the tarball
3) module load the needed compiler or MPI environment
4) ./configure
5) make

Additional generally useful make targets include clean and docs.  make docs
will build the HIO API document, but it requires doxygen and various latex
packages to run, so you may prefer to use the document distributed in file
design/libhio_api.pdf.

Our target build environments include gcc with OpenMPI on Mac OSX for unit
test, and gcc, Intel and Cray compilers on LANL Cray systems and TOSS
clusters with OpenMPI or Cray MPI.

Included with HIO is a build script named hiobuild.  It will perform all of
the above steps in one invocation.  The HIO development team uses it to
launch builds on remote systems.  You may find it useful; a typical
invocation might look like:

  ./hiobuild -c -s PrgEnv-intel PrgEnv-gnu

hiobuild will also create a small script named hiobuild.modules.bash that can
be sourced to recreate the module environment used for the build.

Testing
-------

HIO's tests are in the test subdirectory.  There is a simple API test named
test01 which can also serve as a coding example.  Additionally, other tests
are named run02, run03, etc.  These tests are able to run in a variety of
environments:

1) On Mac OSX for unit testing
2) On a non-DataWarp cluster in interactive or batch mode
3) On one of the Trinity systems with DataWarp in interactive or batch mode

run02 and run03 are N-N and N-1 tests (respectively).  Options help can be
displayed by invoking with the -h option.  These tests use a common script
named run_setup to process options and establish the testing environment.
They invoke hio using a program named xexec which is driven by command
strings contained in the runxx test scripts.

A typical usage to submit a test DataWarp batch job on the small LANL test
system named buffy might look like:

  cd <tarball>/test
  ./run02 -s m -r 32 -n 2 -b

Options used:
  -s m  ---> Size medium (200 MB per rank)
  -r 32 ---> Use 32 ranks
  -n 2  ---> Use 2 nodes
  -b    ---> Submit a batch job

The runxx tests will use the hiobuild.modules.bash files saved by hiobuild
(if available) to reestablish the same module environment used at build time.

A multi-job submission script to facilitate running a large number of tests
with one command is available.  A typical usage for a fairly thorough test on
a large system like Trinity might look like:

  run_combo -t ./run02 ./run03 ./run12 -s x y z -n 32 64 128 256 512 1024 -p 32 -b

This will submit 54 jobs (3 x 3 x 6) with all combinations of the specified
tests and parameters.  The job scripts and output will be in the test/run
subdirectory.

Simple DataWarp Test Job
------------------------

The HIO source contains a script test/dw_simple_sub.sh that will submit a
simple, small scale test job on a system with Moab/DataWarp integration.  See
the comments in the file for instructions and a more detailed description.
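
Combined Build and Test Example
-------------------------------

For convenience, the Building and Testing steps above can be combined into a
single sequence.  The sketch below is illustrative only; the tarball name and
the module names are placeholders, so substitute your site's compiler and MPI
modules (or use hiobuild as described above).

  # Build libhio from a release tarball and submit one small batch test.
  tar xzf libhio-<version>.tar.gz
  cd libhio-<version>
  module load gcc openmpi         # example modules only; use your site's
  ./configure
  make
  make docs                       # optional; requires doxygen and latex
  cd test
  ./run02 -s m -r 32 -n 2 -b      # N-N test, medium size, 32 ranks, 2 nodes, batch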

Step by step procedure for building and running HIO tests on LANL system Trinity
---------------------------------------------------------------------------------

This procedure is accurate as of 2016-03-02 with HIO.1.2.0.4.

1) Get the distribution tarball libhio-1.2.0.1.tar.gz from one of the following:
   a) tr-login1:~cornell/hio/libhio-1.2.0.1.tar.gz
   b) yellow /usr/projects/hio/user/rel/libhio-1.2.0.1.tar.gz
   c) By request from Cornell Wright - cornell@lanl.gov
   d) Download from Github << need location >>

2) Untar

3) cd <dir>/libhio-1.2   ( <dir> is where you untarred HIO )

4) ./hiobuild -cf -s PrgEnv-cray,PrgEnv-gnu -l craype-haswell

   At the end of the build you will see:

     nid00070 ====[HIOBUILD_RESULT_START]===()===========================================
     nid00070 buildhio : Checking /cray_home/cornell/libhio/libhio-1.2/hiobuild.out for build problems
     41:configure: WARNING: using cross tools not prefixed with host triplet
     268:Warning:
     nid00070 buildhio : Checking for build target files
     nid00070 buildhio : Build errors found, see above.
     nid00070 ====[HIOBUILD_RESULT_END]===()=============================================

   Ideally, the two warning messages would not be present, but at the moment,
   they can be ignored.

5) cd test

6) ./run_combo -t ./run02 ./run03 ./run12 ./run20 -s z y x -n 1024 512 256 128 64 32 16 -p 32 -b

   This will create 84 job scripts in the libhio-1.2/test/run directory and
   submit the jobs.  Msub messages are in the corresponding .jobid files in
   the same directory.  Job output is directed to corresponding .out files.
   The number and mix of jobs is controlled by the parameters.  Issue
   run_combo -h for more information.

7) After the jobs complete, issue the following:

     grep -c "RESULT: SUCCESS" run/*.out

   If all jobs ran OK, grep should show 84 files with a count of 1, like this:

     cornell@tr-login1:~/pgm/hio/tr-gnu/libhio-1.2/test> grep -c "RESULT: SUCCESS" run/*.out
     run/job.20160108.080917.out:1
     run/job.20160108.080927.out:1
     run/job.20160108.080936.out:1
     run/job.20160108.081422.out:1
     . . . .
     run/job.20160108.082133.out:1
     run/job.20160108.082141.out:1

   Investigate any missing job output or counts of 0.

8) Resources for better understanding and/or modifying these procedures:

     libhio-1.2/README
     libhio-1.2/hiobuild -h
     libhio-1.2/test/run_combo -h
     libhio-1.2/test/run_setup -h
     libhio-1.2/test/run02, run03, run12, run20
     libhio-1.2/test/xexec -h
     libhio-1.2/design/libhio_api.pdf

9) Additional test commands; check the results the same way as above.

   Very simple, small, single-job Moab/DataWarp test:

     ./run02 -s t -n 1 -r 1 -b

   Alternate multi-job test suitable for the small test system Gadget:

     ./run_combo -t ./run02 ./run03 ./run12 ./run20 -s t s m l -n 1 2 4 1 2 4 -p 32 -b

   Additional many-job submission contention test:

     ./run90 -p 5 -s t -n 1 -b

   This test submits two jobs that each submit two additional jobs.  Job
   submission continues until the -p parameter is exhausted, so the total
   number of jobs is given by (2^p) - 2.  Be cautious about increasing the -p
   parameter.  Since this is only a job submission test, the normal scan for
   RESULT: SUCCESS is not applicable.  Simply wait for the queue to empty and
   look for the expected number of .sh and .out files in the run directory.
   If there are any .sh files without corresponding .out files, look for
   errors via checkjob -v on the job IDs in the .jobid file.
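
Checking Test Results with a Script
-----------------------------------

The result checks in steps 7 and 9 above can be scripted.  The fragment below
is only a sketch and is not part of the HIO distribution; it assumes the
test/run layout described above (each job script run/*.sh has corresponding
.out and .jobid files) and that checkjob is available, as on the Moab systems
mentioned in this README.

  #!/bin/bash
  # check_run.sh (hypothetical helper) -- summarize results in a run directory
  dir=${1:-run}                  # default to ./run, as created by run_combo

  # Count job outputs that reported success (see step 7)
  echo -n "Jobs reporting RESULT: SUCCESS: "
  grep -ls "RESULT: SUCCESS" "$dir"/*.out | wc -l

  # Flag job scripts with no output file (see step 9); the matching .jobid
  # file holds the msub-reported job ID to pass to checkjob -v
  for sh in "$dir"/*.sh; do
    [ -e "$sh" ] || continue
    [ -f "${sh%.sh}.out" ] || \
      echo "No output for $sh (see ${sh%.sh}.jobid, then checkjob -v <jobid>)"
  done

--- End of README ---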