GROMACS GPU ACCELERATION USING OPENCL
- Port GROMACS to OpenCL
- Make the code faster in general
- GROMACS version: 5.0, which the GROMACS developers describe as a cleaned-up code base.
- OpenCL version: 1.1 to start with. Targeting OpenCL 1.1 also allows running the application on NVIDIA cards and comparing the performance of the CUDA and OpenCL implementations.
This project is still under development and no official release has been made yet. Known problems and limitations can be found here. Please report any new bug you encounter by opening a new issue and labeling it as "bug".
The current version works with NVIDIA GPUs and GCN-based AMD GPUs.
Make sure that you have the latest drivers installed.
Also check the Known Limitations chapter.
Check out the code from https://github.com/StreamComputing/gromacs.
The code here is a fork of GROMACS 5.0.
The latest sources can be found here: master.
Check the GROMACS website for how to build the project.
After building GROMACS, running a simulation requires several input files. Sample files can be found here.
Here are two examples:
- Download the archive gmxbench-3.0.tar.gz from the link above
- Copy the conf.gro, grompp.mdp and topol.top files from d.poly-ch2 to the folder containing the gmx binary
- Run the following commands:
  - gmx grompp
  - gmx mdrun
- No input filename is specified when running gmx, because the input files already have the default names
- No pdb file is present or used, so the first stage (gmx pdb2gmx) is skipped

- Download the archive ADH_bench_systems.tar.gz from the link above
- Copy all files from the adh_cubic folder to the folder containing the gmx binary
- Run the following commands:
  - gmx grompp -f rf_verlet.mdp
  - gmx mdrun
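Both examples lean on the default file names used by grompp and mdrun. As a rough sketch, the first example can be written with every file spelled out. The defaults shown in the comments (grompp.mdp, conf.gro, topol.top, topol.tpr) are the standard GROMACS defaults as far as we know, and the guard around the commands is only there so the snippet does nothing harmful when gmx is not on the PATH.

```shell
# Explicit form of the d.poly-ch2 example.
# Run from the folder that holds conf.gro, grompp.mdp and topol.top.
MDP=grompp.mdp   # assumed default for: gmx grompp -f
GRO=conf.gro     # assumed default for: gmx grompp -c
TOP=topol.top    # assumed default for: gmx grompp -p
TPR=topol.tpr    # assumed default for: gmx grompp -o and gmx mdrun -s

if command -v gmx >/dev/null 2>&1; then
    # Preprocess the inputs into a run file, then run the simulation.
    gmx grompp -f "$MDP" -c "$GRO" -p "$TOP" -o "$TPR"
    gmx mdrun -s "$TPR"
else
    echo "gmx not found on PATH; skipping run"
fi
```

The second example only overrides the parameter file (-f rf_verlet.mdp); the remaining inputs still fall back to their default names.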
For more details about the file formats, see the GROMACS website.
Build GROMACS with OpenCL support enabled
To build GROMACS with OpenCL support enabled, an OpenCL SDK must be installed and the following cmake flags must be set:
- OPENCL_INCLUDE_DIR - the OpenCL include directory
- OPENCL_LIBRARY - the OpenCL library
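A possible configure line is sketched below. The two SDK paths are hypothetical placeholders and must be replaced with the locations of your own OpenCL SDK; the -DGMX_GPU=ON and -DGMX_USE_OPENCL=ON flags are taken from the Unix build steps later on this page. The snippet only prints the command so that it can be reviewed first.

```shell
# Hypothetical SDK locations; adjust to where your OpenCL SDK is installed.
OPENCL_INCLUDE_DIR="${OPENCL_INCLUDE_DIR:-/opt/opencl/include}"
OPENCL_LIBRARY="${OPENCL_LIBRARY:-/opt/opencl/lib/libOpenCL.so}"

# Collect the cmake flags (this simple form assumes paths without spaces).
CMAKE_ARGS="-DGMX_GPU=ON -DGMX_USE_OPENCL=ON"
CMAKE_ARGS="$CMAKE_ARGS -DOPENCL_INCLUDE_DIR=$OPENCL_INCLUDE_DIR"
CMAKE_ARGS="$CMAKE_ARGS -DOPENCL_LIBRARY=$OPENCL_LIBRARY"

# Print the configure command for review.
echo "cmake .. $CMAKE_ARGS"
```

To configure for real, run the printed command from a build directory inside the source tree.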
Run GROMACS with OpenCL acceleration enabled
GROMACS loads and builds the OpenCL kernels at runtime. To do so, it needs to know the location of the OpenCL source files.
If you run the installed version, the path to the OpenCL files is defined automatically.
If you do not wish to install GROMACS but want to run the version built from sources, provide the path to the source tree containing the OpenCL kernels by setting the GMX_OCL_FILE_PATH environment variable to the full path of the GROMACS src folder.
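A minimal sketch of pointing a non-installed build at its kernel sources, using the GMX_OCL_FILE_PATH variable described in the environment variables section. The checkout location ($HOME/gromacs) and the helper variable GROMACS_SRC_ROOT are hypothetical; substitute your own source tree.

```shell
# Hypothetical checkout location; replace with your own source tree.
GROMACS_SRC_ROOT="${GROMACS_SRC_ROOT:-$HOME/gromacs}"

# GMX_OCL_FILE_PATH should hold the full path to the src folder of that tree.
export GMX_OCL_FILE_PATH="$GROMACS_SRC_ROOT/src"
echo "OpenCL kernel sources will be looked up under: $GMX_OCL_FILE_PATH"
```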
Caching options for building the kernels
Building an OpenCL program can take a significant amount of time. NVIDIA implements a mechanism to cache the result of the build, so only the first build of the OpenCL kernels takes longer and subsequent builds are very fast. AMD drivers, on the other hand, implement no caching, and building a program can be very slow. That is why we have started implementing our own caching. Caching for OpenCL kernel builds is enabled by default. To disable it, set the GMX_OCL_NOGENCACHE environment variable.
If you plan to modify the OpenCL kernels, you should disable all caching:
- add the GMX_OCL_NOGENCACHE environment variable and set it to 1
- for NVIDIA cards: add the CUDA_CACHE_DISABLE environment variable and set it to 1
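For a kernel development session, the two settings above could be applied in the current shell as follows. This is only a sketch; the variables last until the shell exits or they are unset.

```shell
# Disable the GROMACS-side cache of OpenCL kernel builds.
export GMX_OCL_NOGENCACHE=1

# Disable the NVIDIA driver's own compilation cache (only relevant on NVIDIA cards).
export CUDA_CACHE_DISABLE=1

echo "Kernel build caching disabled for this shell session"
```

With both set, every run rebuilds the kernels from the current sources, so startup will be slower.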
OpenCL Device Selection
The same option used to select CUDA devices or to define a mapping of GPUs to PP ranks can also be used for OpenCL devices: -gpu_id
See the manual for more details about the -gpu_id option.
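As an illustration, selecting a single device could look like the sketch below. The device id 0 is only an example value; which ids are valid depends on the devices detected on your machine. The guard keeps the snippet harmless when gmx is not on the PATH.

```shell
GPU_ID=0   # illustrative value: the first detected device

if command -v gmx >/dev/null 2>&1; then
    # Run the simulation on the selected OpenCL device.
    gmx mdrun -gpu_id "$GPU_ID"
else
    echo "gmx not found on PATH; would run: gmx mdrun -gpu_id $GPU_ID"
fi
```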
Environment Variables For OpenCL
Currently, several environment variables exist that help customize some aspects of the OpenCL version of GROMACS. They are mostly related to the runtime compilation of OpenCL kernels, but some also affect device selection.
- GMX_OCL_FILE_PATH : The full path to the GROMACS src folder. Useful when gmx is called from a folder other than the installation/bin folder.
- GMX_OCL_NOGENCACHE : Disable caching for OpenCL kernel builds.
- GMX_OCL_NOFASTGEN : Generates and compiles all algorithm flavors; otherwise only the flavor required for the simulation is generated and compiled.
- GMX_OCL_FASTMATH : Adds the option -cl-fast-relaxed-math to the compiler options (in the CUDA version this is enabled by default, and it is likely that the same will happen with the OpenCL version soon).
- GMX_OCL_DUMP_LOG: If defined, the OpenCL build log is always written to file. The file is saved in the current directory with the name OpenCL_kernel_file_name.build_status, where OpenCL_kernel_file_name is the name of the file containing the OpenCL source code (usually nbnxn_ocl_kernels.cl) and build_status is either SUCCEEDED or FAILED. If this environment variable is not defined, the default behavior is the following:
  - Debug build: the build log is always written to file
  - Release build: the build log is written to file only in case of errors
- GMX_OCL_VERBOSE: If defined, it enables verbose mode for OpenCL kernel build. Currently available only for NVIDIA GPUs. See GMX_OCL_DUMP_LOG for details about how to obtain the OpenCL build log.
- GMX_OCL_DUMP_INTERM_FILES: If defined, intermediate language code corresponding to the OpenCL build process is saved to file. Caching has to be turned off for this option to take effect (see GMX_OCL_NOGENCACHE).
  - NVIDIA GPUs: PTX code is saved in the current directory with the name device_name.ptx
  - AMD GPUs: .IL/.ISA files are created for each OpenCL kernel built. For details about where these files are created, check the AMD documentation for the -save-temps compiler option.
- GMX_OCL_DEBUG : Use in conjunction with GMX_OCL_FORCE_CPU or with an AMD device. It adds the debug flag (-g) to the compiler options.
- GMX_OCL_NOOPT : Disables optimizations. Adds the option -cl-opt-disable to the compiler options.
- GMX_OCL_FORCE_CPU : Forces the selection of a CPU device instead of a GPU. This exists only for debugging purposes. Do not expect GROMACS to function properly with this option on; it exists solely to make it easy to step through a kernel and see what is happening.
- GMX_OCL_NB_ANA_EWALD : Forces the use of analytical Ewald kernels. Equivalent of CUDA env var GMX_CUDA_NB_ANA_EWALD
- GMX_OCL_NB_TAB_EWALD: Forces the use of tabulated Ewald kernels. Equivalent of CUDA env var GMX_CUDA_NB_TAB_EWALD
- GMX_OCL_NB_EWALD_TWINCUT: Forces the use of twin-range cutoff kernel. Equivalent of CUDA env var GMX_CUDA_NB_EWALD_TWINCUT
- GMX_DISABLE_OCL_TIMING: Disables timing for OpenCL operations
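The variables above can be combined. The sketch below sets up a shell for inspecting a kernel build: it turns off the build cache, forces the build log to be written, and requests the intermediate files. Most of these variables are described as taking effect when defined, so the value 1 is just a convention chosen here.

```shell
# Caching must be off for GMX_OCL_DUMP_INTERM_FILES to take effect.
export GMX_OCL_NOGENCACHE=1

# Always write the OpenCL build log to a file in the current directory.
export GMX_OCL_DUMP_LOG=1

# Save intermediate code: PTX on NVIDIA GPUs, .IL/.ISA files on AMD GPUs.
export GMX_OCL_DUMP_INTERM_FILES=1

echo "OpenCL build diagnostics enabled; run gmx mdrun as usual"

# To return to the default behavior afterwards:
# unset GMX_OCL_NOGENCACHE GMX_OCL_DUMP_LOG GMX_OCL_DUMP_INTERM_FILES
```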
Build and run Gromacs with OpenCL on a Unix system
To build and run the latest OpenCL development build:
- switch to scdev branch
- mkdir build
- cd build
- cmake .. -DGMX_GPU=ON -DGMX_USE_OPENCL=ON -DCMAKE_BUILD_TYPE=Release
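- provide the path to the source tree with the OpenCL kernels by setting the GMX_OCL_FILE_PATH environment variable to the full path of the GROMACS src folder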
- cd bin
- download archive gmxbench-3.0.tar.gz from ftp://ftp.gromacs.org/pub/benchmarks
- copy conf.gro, grompp.mdp, topol.top files from d.poly-ch2 to the bin folder
- ./gmx grompp
- ./gmx mdrun
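The steps above can be strung together roughly as in the following sketch. Several details are assumptions and are marked as such in the comments: the SRC_ROOT variable and the run helper are hypothetical, the compile step (cmake --build .) is not listed above but the tree has to be compiled before bin/gmx exists, wget is assumed to be available, and the archive is assumed to unpack d.poly-ch2 into the current folder. The script only prints the commands unless DRY_RUN=0 is set.

```shell
# Dry-run by default: commands are printed, not executed. Set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical: root of the checked-out source tree (defaults to the current folder).
SRC_ROOT="${SRC_ROOT:-$PWD}"

run cd "$SRC_ROOT"
run git checkout scdev
run mkdir build
run cd build
run cmake .. -DGMX_GPU=ON -DGMX_USE_OPENCL=ON -DCMAKE_BUILD_TYPE=Release
run cmake --build .          # assumption: compile step, not listed in the steps above

# Point GROMACS at the OpenCL kernel sources of the non-installed tree.
export GMX_OCL_FILE_PATH="$SRC_ROOT/src"

run cd bin
run wget ftp://ftp.gromacs.org/pub/benchmarks/gmxbench-3.0.tar.gz
run tar xzf gmxbench-3.0.tar.gz
# Assumption: the archive unpacks d.poly-ch2 into the current folder.
run cp d.poly-ch2/conf.gro d.poly-ch2/grompp.mdp d.poly-ch2/topol.top .
run ./gmx grompp
run ./gmx mdrun
```

Reviewing the printed commands first and then re-running with DRY_RUN=0 keeps a long build from starting with a wrong path.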
A list of known limitations can be found here.
For details about tested configurations, see Tested Configurations.
Project planning is available here: Planning