add one-node central submit script, additional README changes, content, and clarifications
jlnav committed Jul 12, 2021
1 parent eacc69c commit c19749c
Showing 2 changed files with 61 additions and 22 deletions.
56 changes: 34 additions & 22 deletions libensemble/tests/scaling_tests/ddmd/readme.md
@@ -16,61 +16,73 @@ $ conda create --name new_env python=3.8
$ conda activate new_env
```

# Install libEnsemble, DeepDriveMD, and other initial dependencies
Note: Set ``export PYTHONNOUSERSITE=1`` to keep Python from picking up packages
installed in ``.local``, which can conflict with those in the conda environment.
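
One way to apply and check the setting in the current shell is sketched below. The check relies on Python's standard ``site`` module and is skipped when no ``python`` is on the ``PATH``; it is an illustration, not part of the original instructions.

```shell
# Ignore packages installed under the per-user site directory (e.g. ~/.local).
export PYTHONNOUSERSITE=1

# Optional check: site.ENABLE_USER_SITE is False when the variable is honored.
if command -v python >/dev/null 2>&1; then
    python -c 'import site; print("user site enabled:", site.ENABLE_USER_SITE)'
fi
```

The variable only affects Python processes started after the export, so set it before running ``pip install`` or the test itself.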

# Install DeepDriveMD and other initial dependencies

We'll be running components of DeepDriveMD as tasks within libEnsemble. The MD components
require additional dependencies.

```
$ pip install libensemble
...
$ pip install git+https://github.com/DeepDriveMD/DeepDriveMD-pipeline.git
...
$ pip install git+https://github.com/braceal/MD-tools.git
...
$ pip install git+https://github.com/braceal/molecules.git
...
$ conda install scikit-learn mpi4py pandas numpy==1.20.3
```

# Install OpenMM from source
# Install OpenMM

The binaries of OpenMM available on conda-forge or other distribution sources
were compiled with a version of CUDA that is not supported on Swing's GPUs. Therefore,
we need to build OpenMM from source with the expected CUDA version (11.0).

Helpful pointers for installing OpenMM on Swing (or other systems): https://gist.github.com/lee212/4bbfe520c8003fbb91929731b8ea8a1e

Load the following modules:
1) gcc/9.2.0-r4tyw54
2) cuda/11.0.2-4szlv2t

The drivers on Swing's GPUs expect CUDA 11.0. Other versions from the other modules won't work for this purpose.
3) cmake

Obtain the source code from: https://github.com/openmm/openmm/releases/tag/7.5.1

Do the following:

```
$ mkdir build_openmm
$ mkdir install_openmm
$ conda install cython swig doxygen
...
$ cd build_openmm
$ ccmake -i ../openmm-7.5.1/
```

Follow the instructions from here: http://docs.openmm.org/7.1.0/userguide/library.html#compiling-openmm-from-source-code

Notes:

1) Use the ``cmake`` module for cmake: ``module load cmake; cmake -i ..``
2) See http://docs.openmm.org/7.1.0/userguide/library.html#other-required-software for instructions on compiling the Python API wrappers. SWIG and Doxygen will need to be downloaded and installed separately.
3) In the event you receive an error regarding ``CUDA_CUDA_LIBRARY`` being set to ``NOTFOUND``,
set it to ``/gpfs/fs1/soft/swing/spack-0.16.1/opt/spack/linux-ubuntu20.04-x86_64/gcc-9.2.0/cuda-11.0.2-4szlv2t/lib64/stubs``.
1) In the event you receive an error regarding ``CUDA_CUDA_LIBRARY`` being set to ``NOTFOUND``,
set it (under Advanced Options) to ``/gpfs/fs1/soft/swing/spack-0.16.1/opt/spack/linux-ubuntu20.04-x86_64/gcc-9.2.0/cuda-11.0.2-4szlv2t/lib64/stubs``. If the option doesn't persist, append the above to your ``PATH``.
2) The first time running ``ccmake``, the initial set of options may not be very large, but configure anyway. A subsequent set of options will appear for configuration. Make sure that ``CMAKE_INSTALL_PREFIX``, ``DOXYGEN_EXECUTABLE``, ``PYTHON_EXECUTABLE``, ``SWIG_EXECUTABLE``, and others are accurate.
4) A configuration warning stating ``Could NOT find OPENCL (missing: OPENCL_LIBRARY)`` can be ignored.
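
The options named in the notes can also be passed to ``cmake`` on the command line with ``-D`` instead of being edited in ``ccmake``. The sketch below only assembles and prints such a command. The directory layout (``build_openmm``, ``install_openmm``, and ``openmm-7.5.1`` side by side) and the use of ``command -v`` to locate the executables are assumptions, so check the printed values before running the command yourself.

```shell
# Assumed layout: run from build_openmm/, next to install_openmm/ and openmm-7.5.1/.
src_dir=../openmm-7.5.1
prefix="$(cd .. && pwd)/install_openmm"

# Locate the tools installed into the active conda environment (empty if missing).
python_exe="$(command -v python || true)"
swig_exe="$(command -v swig || true)"
doxygen_exe="$(command -v doxygen || true)"

# Print the configure command for review rather than running it.
echo cmake "$src_dir" \
    "-DCMAKE_INSTALL_PREFIX=$prefix" \
    "-DPYTHON_EXECUTABLE=$python_exe" \
    "-DSWIG_EXECUTABLE=$swig_exe" \
    "-DDOXYGEN_EXECUTABLE=$doxygen_exe"
```

``CUDA_CUDA_LIBRARY`` can be supplied the same way if the ``NOTFOUND`` error from note 1 appears.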

# Install libEnsemble

```
$ pip install libensemble
```

# Executing the test

OpenMM, libEnsemble, DeepDriveMD, and all other components must be installed first
into a conda environment. See above.
The test can be found in ``libensemble/libensemble/tests/scaling_tests/ddmd``,
wherever libEnsemble was installed.

Feel free to adjust ``'sim_max'`` or ``sim_specs['user']['sim_length_ns']`` to customize
Feel free to adjust ``MD_BATCH_SIZE``, ``'sim_max'``, or ``sim_specs['user']['sim_length_ns']`` to change
how long the routine runs.

Currently, ``swing_submit_central.sh`` is the only batch submission script known to work.
Adjust the account and number of workers within this file, then run ``sbatch`` on it
to submit ``run_libe_ddmd.py`` to the scheduler.
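
Submission and a quick queue check might look like the following. ``sbatch`` and ``squeue`` are standard Slurm commands; the guard is an addition that lets the snippet exit quietly on machines without Slurm or when run outside the test directory.

```shell
script=swing_submit_central.sh

if command -v sbatch >/dev/null 2>&1 && [ -f "$script" ]; then
    sbatch "$script"      # submit the job script to the scheduler
    squeue -u "$USER"     # list your queued and running jobs
else
    echo "sbatch or $script not found; run this from the ddmd test directory on Swing"
fi
```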

## Getting started locally

We recommend creating a new Python environment and installing each of the necessary
components by a process similar to that listed above for Swing.

Running the test locally should then be as simple as ``python run_libe_ddmd.py --comms local --nworkers N``
or ``mpiexec -n N python run_libe_ddmd.py``. You may need to lower ``sim_specs['user']['sim_length_ns']``
for the test to finish quickly.
@@ -0,0 +1,27 @@
#!/bin/bash -x
#SBATCH --job-name=libE-test
#SBATCH --account=STARTUP-USERNAME
#SBATCH --nodes=1
#SBATCH --gres=gpu:4
#SBATCH --time=00:45:00

# Make sure conda and environment are loaded and activated before sbatch

module load gcc
module load cuda/11.0.2-4szlv2t

export EXE=run_libe_ddmd.py
export NUM_WORKERS=4

export PYTHONNOUSERSITE=1

cmd="python $EXE --comms local --nworkers $NUM_WORKERS"

echo "The command is: $cmd"
echo "End Slurm script information."
printf 'All further output is from the process being run and not the script.\n\n'

$cmd

# Print the date when finished
echo "Finished at: $(date)"
