Grigori Fursin edited this page Dec 9, 2018 · 55 revisions

Workflow authors and contributors

  • Grigori Fursin, dividiti/cTuning foundation (CK workflow implementation)
  • Flavio Vella, Free University of Bozen-Bolzano (CK workflow implementation)
  • Stephen Herbein, LLNL (testing and Flux)
  • Todd Gamblin, LLNL (spack feedback)
  • Kenneth Hoste, Ghent University (EasyBuild feedback)
  • Damian Alvarez, Jülich Supercomputing Centre (testing and EasyBuild feedback)
  • Carsten Uphoff, TUM (testing and feedback)
  • Michael Bader, TUM (testing and feedback)

Introduction

This repository contains a proof-of-concept Collective Knowledge workflow to automate the installation, execution and customization of the SeisSol application from the SC18 Student Cluster Competition Reproducibility Challenge across different platforms, environments and datasets. See the CK motivation for our concept of automatically generating reproducible articles with portable workflows and reusable research components.

Note that this is an ongoing and evolving project!

Related resources

Installing and customizing CK

First, you need to install the Collective Knowledge framework (CK) as described here.

If you have never used CK, we also suggest checking this CK getting started guide.

You may also want to check how to customize your CK installation. For example, you can force CK to install all packages (code, data sets and models) to your scratch file system instead of the default ${HOME}/CK-TOOLS by specifying the path in the environment variable "CK_TOOLS". You can also specify where to install all CK repositories instead of the default ${HOME}/CK using the environment variable "CK_REPOS".
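As a minimal sketch of the customization above, both variables can be exported before running any `ck` command. The scratch location (`SCRATCH_DIR` and its fallback `/tmp/ck-scratch`) is a hypothetical placeholder, not a path taken from this guide; only the names `CK_TOOLS` and `CK_REPOS` come from CK.

```shell
# Hypothetical scratch location; replace with your site's scratch file system.
SCRATCH_DIR="${SCRATCH_DIR:-/tmp/ck-scratch}"

# CK_TOOLS: where CK installs packages (default: ${HOME}/CK-TOOLS).
export CK_TOOLS="${SCRATCH_DIR}/CK-TOOLS"
# CK_REPOS: where CK stores repositories (default: ${HOME}/CK).
export CK_REPOS="${SCRATCH_DIR}/CK"

# Create the directories up front (harmless if they already exist).
mkdir -p "${CK_TOOLS}" "${CK_REPOS}"
```

Adding these lines to your shell start-up file would keep the setting across sessions.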

Note that CK is a continuously evolving community project similar to Wikipedia, so if you don't like something or something is not working, please do not hesitate to send your feedback to the public mailing list, open tickets in related CK GitHub repositories, or even contribute patches, updates, new workflows and research components!

Installing CK workflow for SeisSol

$ ck pull repo:ck-scc18

Note that since we strongly encourage reuse of shared code and data, CK will automatically check dependencies on other CK repositories based on this .ckr.json, and will also install them in the user environment. In this way, a user can take advantage of all other CK modules, software detection plugins and packages shared by the community!

You can find where CK stores all such repositories as follows:

$ ck find repo:ck-scc18

Updating this workflow and dependencies

You can update all CK components (including CK framework if you installed it from GitHub) at any time as follows:

$ ck pull all --kernel

If you want to re-detect and re-install all software after updating CK, just remove the CK environment as follows:

$ ck clean env:*

You can turn off interactive mode by adding the '-f' flag:

$ ck clean env:* -f

Testing SeisSol proxy application via CK

Now you can try to automatically build and run a simple 1-node SeisSol proxy application to understand how a portable CK program workflow works (select "seissol-proxy" from the list of available command lines):

$ ck run program:seissol-proxy

The CK concept of portable workflows is to let users describe the software dependencies (libraries, frameworks, data sets, models) required to build and run their applications using simple semantic tags and version ranges. See an example of such dependencies in this CK meta.json file for the above program workflow.

CK will then attempt to automatically detect all dependencies using shared CK software plugins and by performing an exhaustive search in system and user directories. Whenever multiple acceptable versions of a given dependency are found, CK will ask you to choose which one to use. Based on your selection, CK will then register this version in the CK "local" repository (CK scratch pad) using an "env" entry, and will create a simple batch "env.sh" file with pre-set PATH, LD_LIBRARY_PATH and other environment variables.

You can see all detected and registered dependencies as follows:

$ ck show env

You can also prune this search using semantic tags:

$ ck show env --tags=compiler

Next time you run the same workflow, CK will attempt to resolve dependencies using "env" entries instead of searching in system and user paths. Furthermore, other CK workflows can also reuse the same dependencies.

The advantage of this approach is that you can automatically adapt your workflow to any environment. This is particularly important for research workflows, which should be able to run on continuously evolving software and hardware.

The disadvantage is that such workflows may be prepared with untested dependencies and thus fail. In that case, the CK concept is to let the community collaboratively improve the shared workflows and all related components in the spirit of Wikipedia, similar to agile development in open source projects.

For example, our LLNL colleagues noticed some unexpected behavior in the python detection plugin and opened this issue (https://github.com/ctuning/ck-env/issues/85), which we collaboratively resolved shortly afterwards.

Furthermore, you can use Spack and EasyBuild to pre-install software dependencies on your platform. You can then force CK to search for software dependencies only in specific paths as follows:

$ ck set kernel var.soft_search_dirs="{path1 to already installed dependencies},{path2}..."
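When several installation prefixes are involved, the comma-separated value can be assembled with a small helper. This is only a sketch: `join_dirs` and the two example prefixes are hypothetical, and the resulting `ck set kernel` line is printed for review rather than executed.

```shell
# join_dirs: hypothetical helper that joins its arguments with commas.
join_dirs() {
    joined=""
    for dir in "$@"; do
        if [ -z "${joined}" ]; then
            joined="${dir}"
        else
            joined="${joined},${dir}"
        fi
    done
    printf '%s\n' "${joined}"
}

# Example prefixes (assumptions); replace with your real install locations.
SEARCH_DIRS="$(join_dirs /opt/spack/opt /opt/easybuild/software)"

# Print the command instead of running it.
echo "ck set kernel var.soft_search_dirs=\"${SEARCH_DIRS}\""
```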

Finally, if the required software is still not found, CK will attempt to automatically install it via CK packages, which are just wrappers with a unified API to other build tools and package managers including spack, easybuild, make, cmake, scons, etc.

For example, see the meta.json of the SeisSol Proxy Library with other software sub-dependencies.

If you encounter any problems during the build and fix them, you can just restart the same workflow until it detects or rebuilds all dependencies and runs this code:

$ ck run program:seissol-proxy

You can also customize the execution of the proxy app as follows:

$ ck run program:seissol-proxy --cmd_key=seissol-proxy \
                               --env.CELLS={number of cells} \
                               --env.TIMESTEP={time step} \
                               --env.KERNELS={all|local|neigh|ader|localwoader|neigh_dr|godunov_dr}

If you still experience problems or don't understand something, do not hesitate to open tickets or get in touch with the CK community using this public Google group!

Installing SeisSol proxy library manually

When troubleshooting workflows, you may want to install dependencies manually. For example, you can install the SeisSol proxy library using the CK package before running workflows as follows:

$ ck install package:lib-seissol-proxy-scc18 --reuse_deps

You can also restart installation without downloading this library as follows:

$ ck install package:lib-seissol-proxy-scc18 --reuse_deps --rebuild

After you manage to successfully install it, you can run the workflow:

$ ck run program:seissol-proxy

Note that by default we use OpenMPI and older versions of other dependencies, which do not give the best performance but serve as a stable proof-of-concept of experiment automation.

Our future work includes adding dependencies and optimization parameters to obtain the best performance across different supercomputers based on SCC18 submissions.

Preparing and running SeisSol MPI workflow

Installing and parameterizing SeisSol library

Since the MPI version of SeisSol requires more dependencies and parameterization, we suggest installing and parameterizing this library via CK as follows:

$ ck install package:lib-seissol-scc18 --reuse_deps \
             --env.CK_SEISSOL_TARGET_ARCH={d|s}{noarch|wsm|snb|knc|hsw|knl} \
             --env.CK_SEISSOL_COMPILE_MODE={release|debug} \
             --env.CK_SEISSOL_ORDER=6 \
             --env.CK_SEISSOL_LOG_LEVEL=error \
             --env.CK_SEISSOL_LOG_LEVEL0=info

or to install it with default values:

$ ck install package:lib-seissol-scc18 --reuse_deps
  • CK_SEISSOL_TARGET_ARCH=dsnb
  • CK_SEISSOL_COMPILE_MODE=release
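The CK_SEISSOL_TARGET_ARCH value combines a one-letter prefix (`d` or `s`) with one of the listed architecture names. A small guard such as the sketch below could catch typos before a long build starts. The function `is_valid_target_arch` is hypothetical and not part of CK; the install command is only printed.

```shell
# is_valid_target_arch: hypothetical check that $1 matches
# {d|s}{noarch|wsm|snb|knc|hsw|knl} as listed in the install command above.
is_valid_target_arch() {
    case "$1" in
        [ds]noarch|[ds]wsm|[ds]snb|[ds]knc|[ds]hsw|[ds]knl) return 0 ;;
        *) return 1 ;;
    esac
}

target_arch="dsnb"   # the default value listed above
if is_valid_target_arch "${target_arch}"; then
    echo "ck install package:lib-seissol-scc18 --reuse_deps --env.CK_SEISSOL_TARGET_ARCH=${target_arch}"
else
    echo "unsupported CK_SEISSOL_TARGET_ARCH: ${target_arch}" >&2
fi
```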

You can also install several libraries with different parameters at the same time. Just add the --extra_version flag with some identifier to the above command line, for example:

$ ck install package:lib-seissol-scc18 --reuse_deps \
             --env.CK_SEISSOL_COMPILE_MODE=debug \
             --env.CK_SEISSOL_ORDER=4 \
             --extra_version=my-debug-cfg-order-4

Running SeisSol MPI

You can now try to run the SeisSol program (see related CK entry). If you don't use any batch system, you can run it via CK as follows:

$ ck run program:seissol-netcdf --cmd_key=mpi \
             --env.MPI_NUM_PROCESSES={number of processes (must be multiple of 20 when used with the default data set)} \
             --env.OMP_NUM_THREADS={number of OpenMP threads}
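Because the default data set requires the process count to be a multiple of 20, a quick check before launching can save a failed run. The sketch below is an assumption-laden example: `check_num_processes` is a hypothetical helper, the values 40 and 8 are arbitrary, and the `ck run` line is printed rather than executed.

```shell
# check_num_processes: hypothetical guard; succeeds only if $1 is a positive
# multiple of $2 (default 20, matching the default data set).
check_num_processes() {
    procs="$1"
    partitions="${2:-20}"
    case "${procs}" in
        ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
    esac
    [ "${procs}" -gt 0 ] && [ $((procs % partitions)) -eq 0 ]
}

num_processes=40   # example value (assumption)
num_threads=8      # example value (assumption)
if check_num_processes "${num_processes}"; then
    echo "ck run program:seissol-netcdf --cmd_key=mpi --env.MPI_NUM_PROCESSES=${num_processes} --env.OMP_NUM_THREADS=${num_threads}"
else
    echo "MPI_NUM_PROCESSES must be a positive multiple of 20" >&2
fi
```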

Note that CK will automatically download a CK dataset package "dataset-seissol-sumatra-andaman-2004" from Zenodo, which requires around 700MB of free space.

You may need the following information about this mesh when customizing SeisSol execution:

fursin@velociti:~/CK/ck-scc18/program/seissol-netcdf/tmp$ /home/fursin/CK/local/env/6c8d46509d02e704/install/bin/ncdump -h 1003_topo30sec_wSplays_sim5_pumgenSM2.dtc1-v2-suma.20.nc

netcdf \1003_topo30sec_wSplays_sim5_pumgenSM2.dtc1-v2-suma.20 {
dimensions:
        dimension = 3 ;
        partitions = 20 ;
        elements = 263993 ;
        element_sides = 4 ;
        element_vertices = 4 ;
        vertices = 48428 ;
        boundaries = 11 ;
        boundary_elements = 2416 ;
variables:
        int element_size(partitions) ;
        int element_vertices(partitions, elements, element_vertices) ;
        int element_neighbors(partitions, elements, element_sides) ;
        int element_boundaries(partitions, elements, element_sides) ;
        int element_neighbor_sides(partitions, elements, element_sides) ;
        int element_side_orientations(partitions, elements, element_sides) ;
        int element_neighbor_ranks(partitions, elements, element_sides) ;
        int element_mpi_indices(partitions, elements, element_sides) ;
        int element_group(partitions, elements) ;
        int vertex_size(partitions) ;
        double vertex_coordinates(partitions, vertices, dimension) ;
        int boundary_size(partitions) ;
        int boundary_element_size(partitions, boundaries) ;
        int boundary_element_rank(partitions, boundaries) ;
        int boundary_element_localids(partitions, boundaries, boundary_elements) ;
}

Note that CK will create a tmp directory in the "program:seissol-netcdf" entry and will record SeisSol outputs and checkpoints there. You can find this directory as follows:

$ ck find program:seissol-netcdf
$ cd `ck find program:seissol-netcdf`/tmp

Furthermore, CK will compare the output with the reference results from the published paper using the following plugin:

Running SeisSol MPI with a small data set

Before running long simulations, you may want to test the workflow. You can do so by limiting simulation time as follows:

$ ck run program:seissol-netcdf --cmd_key=mpi \
         --env.MPI_NUM_PROCESSES=20 \
         --env.OMP_NUM_THREADS=8 \
         --env.LIMIT_SEISSOL_TIME=0.1

Using job managers

Slurm

You can run SeisSol via Slurm as follows:

$ ck run program:seissol-netcdf --cmd_key=mpi \
              --env.JOB_MANAGER=slurm \
              --env.MPI_NUM_PROCESSES=20 \
              --env.OMP_NUM_THREADS=8 \
              --env.SBATCH_NODES=20 \
              --env.SBATCH_TIME="00:05:00" \
              --env.SBATCH_MEM="100000" \
              --env.LIMIT_SEISSOL_TIME=0.1

Extra parameters:

  • SBATCH_JOB_NAME (default "ck-scc18")
  • SBATCH_TIME (default "00:05:00")
  • SBATCH_NODES (default 20)
  • SBATCH_NTASKS_PER_CORE (default 1)
  • SBATCH_NTASKS_PER_NODE (default 1)
  • SBATCH_CPU_PER_TASK (default 1)
  • SBATCH_PARTITION (default "normal")
  • SBATCH_CONSTRAINT (default "mc")
  • SBATCH_MEM (default 100000) - memory limit per node in MB
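When changing SBATCH_NTASKS_PER_NODE, the node count usually has to change with it. The sketch below computes the smallest node count that holds all MPI processes. It assumes one MPI process per Slurm task, which we have not verified against how this workflow passes these variables to Slurm. `nodes_needed` is a hypothetical helper.

```shell
# nodes_needed: hypothetical helper; smallest node count that can hold
# $1 MPI processes with $2 tasks per node (default 1, as in the list above).
nodes_needed() {
    procs="$1"
    per_node="${2:-1}"
    echo $(( (procs + per_node - 1) / per_node ))   # integer ceiling division
}

num_processes=20
tasks_per_node=1
nodes="$(nodes_needed "${num_processes}" "${tasks_per_node}")"

# Print the flags for review; append them to the ck run command shown above.
echo "--env.MPI_NUM_PROCESSES=${num_processes} --env.SBATCH_NTASKS_PER_NODE=${tasks_per_node} --env.SBATCH_NODES=${nodes}"
```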

Add the --clean flag to remove all SeisSol outputs and checkpoints (it deletes the "tmp" directory)!

You can then check the latest status of your job:

$ ck run program:seissol-netcdf --cmd_key=print-last-job-status

Finally, you can check the log of the last job:

$ ck run program:seissol-netcdf --cmd_key=print-last-job-log-and-validate-results

Flux

TBD

Customizing SeisSol workflow

Using different MPI libraries

You can install a stable OpenMPI package via CK:

$ ck install package:lib-openmpi-1.10.3-universal

You can also detect and register other MPI libraries as follows:

$ ck detect soft:lib.mpi

If the required MPI library is in an unusual path, you can help CK detect it by providing a search path as follows:

$ ck detect soft:lib.mpi --search_dirs={path to MPI installation}

Next time you build the SeisSol package, CK will ask you which MPI library to use:

$ ck install package:lib-seissol-scc18 --reuse_deps

Using Intel compilers and MPI

Note that you need to use Intel compilers and Intel MPI at the same time! We managed to compile and run SeisSol using the 2017 and 2018 versions, but encountered some run-time errors with the 2019 version!

Detecting multiple Intel compilers

If you would like to use Intel compilers in the CK program workflows but have not yet installed them, you can first download them from here. You can then automatically detect and register them via CK as follows:

$ ck detect soft --tags=compiler,icc

Note that if the Intel compiler is not found automatically, you can provide a path to the Intel installation as follows:

$ ck detect soft --tags=compiler,icc --search_dirs={INSTALLATION_PATH}

If automatic detection is too slow (on NFS), use the following command:

$ ck detect soft --tags=compiler,icc --full_path={full path to compilervars.sh}

Detecting Intel MPI library

Install the Intel MPI library following the instructions here or via apt:

1. wget https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS-2019.PUB
2. sudo apt-key add GPG-PUB-KEY-INTEL-SW-PRODUCTS-2019.PUB

3. sudo wget https://apt.repos.intel.com/setup/intelproducts.list -O /etc/apt/sources.list.d/intelproducts.list
4. sudo sh -c 'echo deb https://apt.repos.intel.com/mpi all main > /etc/apt/sources.list.d/intel-mpi.list'
5. sudo apt-get update
6. sudo apt-get install intel-mpi

7. ck detect soft --tags=lib,mpi,intel

Note that if this library is installed in an unusual path, you can help CK detect it as follows:

7. ck detect soft --tags=lib,mpi,intel --search_dirs={INSTALLATION_PATH}

If automatic detection is too slow (on NFS), use the following command:

$ ck detect soft --tags=lib,mpi,intel --full_path={full path to mpiicc}

Using stable dependencies

The main concept of CK is to automatically detect the dependencies already installed for a given workflow, and thus adapt the workflow to the latest, continuously evolving environments on diverse user machines!

While very flexible, this approach may sometimes select inappropriate or incompatible dependencies. In that case, we would also like to be able to pre-install and use stable (validated) dependencies. This can be done via spack and EasyBuild.

Installing dependencies via Spack

CK allows you to use (stable) packages installed by spack to build and customize SeisSol via CK.

If you already use spack, you can install required dependencies as follows:

$ spack install openmpi@1.10.7 %gcc
$ spack install netcdf@4.4.1 %gcc +mpi +dap -pic -shared ^openmpi@1.10.7 ^hdf5@1.10.4 +mpi +fortran -shared -pic

You can then use CK software detection plugins to register these packages for CK workflows:

$ ck detect soft:lib.mpi --extra_tags=vspack --extra_name="(spack)" --search_dir={PATH TO spack packages}
$ ck detect soft:lib.hdf5.static --extra_tags=vparallel,vmpi,vspack --extra_name="(spack)" --search_dir={PATH TO spack packages}
$ ck detect soft:lib.netcdf --extra_tags=vspack --extra_name="(spack)" --search_dir={PATH TO spack packages}

You can also update the following CK kernel variable to avoid specifying --search_dir all the time:

$ ck set kernel var.soft_search_dirs="{PATH to spack packages}"

You can then rebuild the SeisSol package as before, selecting the new spack packages when CK asks which dependencies to use.

We also provided sample scripts to install spack with dependencies needed by SeisSol, and then automatically detect them by CK to be used in the program workflow. You can find them using the following command:

$ cd `ck find script:install-seissol-deps-via-spack`

It has 2 scripts:

  • ./install-deps.sh (installing spack and deps)
  • ./detect-deps.sh (registering spack packages in CK)

Note that this is an experimental functionality!

Installing dependencies via EasyBuild

We also provided a way to reuse dependencies installed via EasyBuild.

If you have packages installed by EasyBuild, such as GCC, you can reuse them in a similar way to that described in the previous subsection:

$ ck detect soft:compiler.gcc --extra_tags=veasybuild --extra_name="(easybuild)" --search_dir={PATH TO EasyBuild packages}
$ ck detect soft:lib.mpi --extra_tags=veasybuild --extra_name="(easybuild)" --search_dir={PATH TO EasyBuild packages}
$ ck detect soft:lib.hdf5.static --extra_tags=vparallel,vmpi,veasybuild --extra_name="(easybuild)" --search_dir={PATH TO EasyBuild packages}
$ ck detect soft:lib.netcdf --extra_tags=veasybuild --extra_name="(easybuild)" --search_dir={PATH TO EasyBuild packages}

We successfully recompiled and ran SeisSol using GCC 6.4.0 from an EasyBuild package installed on "Piz Daint", following the EasyBuild installation notes:

$ export EASYBUILD_PREFIX={PATH where EasyBuild will be installed}

$ wget https://raw.githubusercontent.com/easybuilders/easybuild-framework/develop/easybuild/scripts/bootstrap_eb.py

$ python bootstrap_eb.py $EASYBUILD_PREFIX

$ export EASYBUILD_MODULES_TOOL=EnvironmentModulesC

$ module use $EASYBUILD_PREFIX/modules/all

$ module load EasyBuild

$ eb --module-syntax=Tcl foss-2018a.eb --robot --ignore-osdeps

$ ck detect soft:compiler.gcc --extra_tags=veasybuild --extra_name="(easybuild)" --search_dir=${EASYBUILD_PREFIX}/software

We also provided sample scripts to install EasyBuild and then automatically detect GCC by CK to be used in the program workflow. You can find them using the following command:

$ cd `ck find script:install-seissol-deps-via-eb`

It has 2 scripts:

  • ./install-deps.sh (installing EasyBuild and GCC)
  • ./detect-deps.sh (registering EasyBuild GCC in CK)

Note that this is an experimental functionality!

Using different platforms to run SeisSol

Using "SuperMUC Phase 2" platform

$ ck run program:seissol-netcdf --cmd_key=mpi --env.MPI_NUM_PROCESSES=<<processes>> --env.OMP_NUM_THREADS=54 --env.KMP_AFFINITY="compact,granularity=thread"

Using "Shaheen II" platform

$ ck run program:seissol-netcdf --cmd_key=mpi --env.MPI_NUM_PROCESSES=<<processes>> --env.OMP_NUM_THREADS=62 --env.KMP_AFFINITY="compact,granularity=thread"

Using "Cori" platform

$ ck run program:seissol-netcdf --cmd_key=mpi --env.MPI_NUM_PROCESSES=<<processes>> --env.OMP_NUM_THREADS=65 --env.KMP_AFFINITY="proclist =[2-66],explicit,granularity=thread"

Using "Piz Daint" platform

We managed to compile and run SeisSol (SCC18 branch) via CK but we did not yet optimize it.
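The three platform command lines above differ only in the thread count and the affinity string, so they can be kept in one lookup. The following sketch is illustrative: the function `platform_settings` and the platform keys (`supermuc2`, `shaheen2`, `cori`) are made-up names, the process count is an example value, and the thread and affinity values are copied verbatim from the commands above.

```shell
# platform_settings: hypothetical helper; prints "<threads> <affinity>" for a
# platform key. Keys are invented here; values are copied from the commands above.
platform_settings() {
    case "$1" in
        supermuc2) echo '54 compact,granularity=thread' ;;
        shaheen2)  echo '62 compact,granularity=thread' ;;
        cori)      echo '65 proclist =[2-66],explicit,granularity=thread' ;;
        *)         return 1 ;;
    esac
}

platform="cori"
num_processes=20   # example value (assumption)
if settings="$(platform_settings "${platform}")"; then
    threads="${settings%% *}"    # text before the first space
    affinity="${settings#* }"    # text after the first space
    # Print the command for review instead of running it.
    echo "ck run program:seissol-netcdf --cmd_key=mpi --env.MPI_NUM_PROCESSES=${num_processes} --env.OMP_NUM_THREADS=${threads} --env.KMP_AFFINITY=\"${affinity}\""
else
    echo "unknown platform: ${platform}" >&2
fi
```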

Changing other parameters

You can update the following parameters from bash:

$ export XDMFWRITER_ALIGNMENT=8388608
$ export XDMFWRITER_BLOCK_SIZE=8388608
$ export SEISSOL_CHECKPOINT_ALIGNMENT=8388608
$ export SEISSOL_CHECKPOINT_DIRECT=1
$ export ASYNC_MODE=THREAD
$ export ASYNC_BUFFER_ALIGNMENT=8388608

or via CK:

$ ck run program:seissol-netcdf ... --env.XDMFWRITER_ALIGNMENT=8388608 \
    --env.XDMFWRITER_BLOCK_SIZE=8388608 \
    --env.SEISSOL_CHECKPOINT_ALIGNMENT=8388608 \
    --env.SEISSOL_CHECKPOINT_DIRECT=1 \
    --env.ASYNC_MODE=THREAD \
    --env.ASYNC_BUFFER_ALIGNMENT=8388608
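The two forms above are equivalent lists of NAME=VALUE pairs, so the `--env.` flags can be generated from a single list. A minimal sketch, assuming a hypothetical helper `env_flags` (not a CK command) and printing the resulting command line instead of running it:

```shell
# env_flags: hypothetical helper that turns NAME=VALUE arguments into
# the --env.NAME=VALUE flags used above.
env_flags() {
    flags=""
    for pair in "$@"; do
        flags="${flags:+${flags} }--env.${pair}"
    done
    printf '%s\n' "${flags}"
}

extra="$(env_flags XDMFWRITER_ALIGNMENT=8388608 SEISSOL_CHECKPOINT_DIRECT=1 ASYNC_MODE=THREAD)"

# Print the command for review instead of running it.
echo "ck run program:seissol-netcdf --cmd_key=mpi ${extra}"
```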

Running SeisSol binary without CK

You can also try to run the SeisSol binary manually using the CK virtual environment as follows:

$ ck virtual env --tags=lib,seissol

> cd ${CK_ENV_LIB_SEISSOL_BIN}
> ls

> echo ${CK_ENV_LIB_SEISSOL_BINARY}
> echo ${CK_ENV_LIB_SEISSOL_BINARY_FULL}
> echo ${CK_ENV_LIB_SEISSOL_MAPLE}
> echo ${CK_ENV_LIB_SEISSOL_SET}
> echo ${CK_ENV_LIB_SEISSOL_SRC}

Contacting the CK community

If you still encounter problems, please feel free to get in touch with the CK community and we will help you fix them. Your feedback is very important, since the whole point of CK is to continuously and collaboratively improve all shared research workflows and components, thus gradually improving their stability and reproducibility across diverse platforms and environments!

Acknowledgments

We would like to thank colleagues from TUM, LLNL and UGent for very productive discussions and feedback. We are also thankful to CSCS for providing resources to test this CK workflow.

Future work

  • LLNL+dividiti: automate execution of SeisSol MPI via Flux
  • UGent+dividiti: check installation of dependencies via easybuild
  • dividiti (currently overbooked - need to find resources):
    • add public CK scoreboard to exchange results
    • automatically generate reproducible article based on this CK example
    • build and run latest SeisSol version with the new mesh structure and data sets via ck-graph-analytics repository
    • improve CK documentation with all APIs and a better guide for contributors
  • Test automatic generation of a reproducible article via CK based on this example
  • Add optimized versions from SCC18 participants