LCM with Kokkos CUDA support on Shannon

Glen Hansen edited this page Nov 23, 2015 · 1 revision

Building Albany/LCM on Shannon with CUDA

This guide explains the build process for Albany/LCM with CUDA on the Shannon GPU cluster. The process is quite complicated, and many modules have not been updated to compile for GPUs.

Environment Setup

The first step is to set up the environment. All of the following commands can be added to a shell script or executed by hand before compiling. In my scripts I sourced an sh file called "" that I used for environment settings.

First, we need to load the required modules:

module purge
module add cmake/
module add openmpi/1.8.4/gnu/4.7.2/cuda/7.0.28
module add libmpc/1.0.1
module add nvcc-wrapper/gnu

Trilinos needs CMake version 2.8.11 in order to compile. Additionally, we want to use OpenMPI built with GCC version 4.7.2. While Shannon has newer versions of GCC, including 5.10, the version of CUDA installed on the cluster does not necessarily support them. MPC is a library required by Trilinos that is available on the cluster. Finally, nvcc-wrapper is a utility that wraps compiler calls and forwards them to either GCC or NVCC as appropriate.

Next we set our install prefix variable:

mkdir -p install
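The later steps all refer to $PREFIX, so the variable has to be exported as well as the directory created. A minimal sketch, assuming the install tree lives in an install/ directory under the current working directory (that location is an assumption; any writable path works):

```shell
# Assumption: everything is installed under ./install; change the path to suit.
export PREFIX="$PWD/install"
mkdir -p "$PREFIX"
echo "Installing into $PREFIX"
```

Using an absolute path keeps $PREFIX valid after the cd and pushd calls in the later steps.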

Next we need to export the compiler environment variables that are needed for building the various dependencies of Albany/LCM. First the compiler search paths:

export CPATH=$CPATH:$PREFIX/include
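Only the header search path is shown above. The dependencies installed into $PREFIX usually also need to be visible to the linker and at run time, so a fuller setup might look like the following sketch. The LIBRARY_PATH, LD_LIBRARY_PATH, and PATH lines are assumptions and are not taken from the original script.

```shell
# Assumption: PREFIX was exported earlier; fall back to ./install otherwise.
: "${PREFIX:=$PWD/install}"

# ${VAR:+$VAR:} expands to "$VAR:" only when VAR is non-empty. This avoids a
# leading ":" (GCC treats an empty CPATH entry as the current directory).
export CPATH="${CPATH:+$CPATH:}$PREFIX/include"

# Assumed additions: link-time and run-time library paths, plus installed tools.
export LIBRARY_PATH="${LIBRARY_PATH:+$LIBRARY_PATH:}$PREFIX/lib"
export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$PREFIX/lib"
export PATH="$PREFIX/bin:$PATH"
```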

Now we need to set which compilers should be used. We want to compile everything with mpicc except for the CUDA code. To do this we set the NVCC_WRAPPER_DEFAULT_COMPILER environment variable for the nvcc-wrapper script, which determines the compiler used for C and C++ files as opposed to CUDA files.

export CC=mpicc
export CXX=mpicxx
export FC=mpif90
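The exports above do not include the NVCC_WRAPPER_DEFAULT_COMPILER variable mentioned in the text. The sketch below sets it and then reports which of the configured compilers are actually on the PATH. The value g++ is an assumption (the value used in the original setup is not shown), so substitute whichever host compiler your nvcc-wrapper module expects.

```shell
# Assumption: g++ is the host compiler nvcc-wrapper should fall back to for
# non-CUDA sources. The value used in the original setup is not shown.
export NVCC_WRAPPER_DEFAULT_COMPILER=g++

# Report which of the configured compilers are actually on PATH.
for tool in "${CC:-mpicc}" "${CXX:-mpicxx}" "${FC:-mpif90}" \
            "$NVCC_WRAPPER_DEFAULT_COMPILER"; do
    if command -v "$tool" >/dev/null 2>&1; then
        echo "found:   $tool -> $(command -v "$tool")"
    else
        echo "missing: $tool"
    fi
done
```

A "missing" line usually means a module was not loaded; re-check the module add commands before going further.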

###Compiling Dependencies

The Trilinos packages we need for Albany require a few dependencies. This document will only show how to obtain the dependencies specifically required to build LCM.

Note: On Shannon you may not be able to use wget or git clone; if this is the case then you should download these packages and transfer them over using sftp or scp.


Boost is used throughout Trilinos. It's mostly a header-only library, so very little needs to be compiled. I've tested this with the latest version of Boost at the time of writing, 1.58.

tar -xjf boost_1_58_0.tar.bz2

Once you have downloaded and extracted it, cd into the directory and run the configuration script with the following arguments:

./ --with-libraries=system,program_options --prefix=$PREFIX

This configures Boost to build only the specified libraries. You can omit the --with-libraries argument if you want, but you will then build the entirety of Boost, which takes a long time. --prefix sets the installation directory. So if you set $PREFIX to install/ earlier, installing Boost will place the header files in install/include/, the library files in install/lib/, and so on.

Once configuring is done we can build boost and install it:

./b2 -j 8
./b2 install

This runs Boost's bjam build system with 8 parallel jobs and installs the result to $PREFIX.
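Before moving on, it can be worth confirming that the install landed where the later steps expect it. The helper below is a hypothetical convenience, not part of the original scripts. It checks for the Boost version header and the two libraries requested above.

```shell
# Assumption: PREFIX was exported earlier (and contains no whitespace);
# fall back to ./install otherwise.
: "${PREFIX:=$PWD/install}"

# check_installed LABEL PATTERN: report whether a file matching the glob
# PATTERN exists. Returns 0 if one is found, 1 otherwise.
check_installed() {
    label=$1
    pattern=$2
    for f in $pattern; do        # unquoted on purpose so the glob expands
        if [ -e "$f" ]; then
            echo "ok:      $label ($f)"
            return 0
        fi
    done
    echo "missing: $label ($pattern)"
    return 1
}

check_installed "Boost headers"         "$PREFIX/include/boost/version.hpp"       || true
check_installed "Boost.System"          "$PREFIX/lib/libboost_system.*"           || true
check_installed "Boost.Program_options" "$PREFIX/lib/libboost_program_options.*"  || true
```

The same helper can be reused after each of the following dependencies, for example with $PREFIX/include/zlib.h or $PREFIX/include/hdf5.h.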


Next we will obtain and build zlib:

tar -xzf zlib-1.2.8.tar.gz
pushd zlib-1.2.8
./configure --prefix=$PREFIX
make -j8
make install
popd


HDF5 is built similarly:

tar -xjf hdf5-1.8.15-patch1.tar.bz2
pushd hdf5-1.8.15-patch1
./configure --prefix=$PREFIX --enable-parallel
make -j8
make install
popd


NetCDF follows the same pattern:

tar -xzf netcdf-
pushd netcdf-
./configure --prefix=$PREFIX
make -j8
make install
popd


Shannon has the Intel MKL package installed, so you can use that for BLAS and LAPACK if you want. I'll show how to build the default LAPACK implementation from source instead, although it won't be as fast or as optimized as MKL.

tar -xzf lapack-3.5.0.tgz
pushd lapack-3.5.0
mkdir -p build
cd build
make -j8 install
popd
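The LAPACK commands above go straight from creating the build directory to make, so a CMake configure step is needed in between. A minimal sketch of what that step might look like follows. The flags are assumptions (the ones used in the original script are not shown), although CMAKE_INSTALL_PREFIX and CMAKE_BUILD_TYPE are standard CMake cache variables.

```shell
# Assumption: PREFIX was exported earlier (and contains no whitespace);
# fall back to ./install otherwise.
: "${PREFIX:=$PWD/install}"

# Assumed flags: install into $PREFIX and build with optimizations.
LAPACK_CMAKE_FLAGS="-DCMAKE_INSTALL_PREFIX=$PREFIX -DCMAKE_BUILD_TYPE=Release"

# Printed with echo so the line can be reviewed first. Drop the echo to run
# it from lapack-3.5.0/build, before "make -j8 install".
echo cmake $LAPACK_CMAKE_FLAGS ..
```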

###Building Trilinos

We are now finally ready to build Trilinos.

First, clone the git repository for the latest version of Trilinos. In my script I also run git pull to make sure the checkout is up to date.

git clone
pushd publicTrilinos
mkdir -p build
cd build

Now we run cmake on Trilinos similarly to how we did it for LAPACK. Trilinos has a lot of CMake cache variables so it is recommended to run this with a shell script. I will be breaking up the cmake command into several parts. Each line of the command needs the continuation symbol \ at the end, so don't forget that.

Note: I'm separating out the command into multiple parts for readability. The entire cmake command should be run at once.

cmake \
-DTPL_ENABLE_Thrust=On \

This part of the command enables the third-party libraries that we want to build with: MPI, CUDA, and Thrust. Thrust is necessary for Kokkos if CUDA is enabled. On Shannon, Thrust comes with CUDA, so you don't need to install it.

-DKokkos_ENABLE_Cuda_UVM=ON \

First, we do not enable all packages, which dramatically reduces compile times. Furthermore, some Trilinos packages conflict with each other or do not build at all, so enabling everything will probably break your build.

We also need to enable UVM for CUDA. Kokkos and some Trilinos libraries like Phalanx use UVM to assign data in Kokkos views.

To reduce memory use while compiling, and to avoid annoying other people on the cluster, we enable explicit template instantiation.

Disabling Trilinos development mode turns off pedantic compiling and (I think) warnings as errors. Trilinos produces a lot of warnings when building, so I recommend leaving development mode off, since enabling it may break your build.

Finally, it's necessary to disable asserting on missing packages. Trilinos contains references to packages that don't actually exist, so enabling this flag would break your build.

-DBoost_INCLUDE_DIRS=$PREFIX/include \
-DBoostLib_INCLUDE_DIRS=$PREFIX/include \

These flags just set the location of Boost. Shannon already has Boost installed, but it may be the wrong version. I recommend installing the version of Boost I specify in this guide.

-DTrilinos_ENABLE_Piro=ON \
-DTrilinos_ENABLE_Phalanx=ON \
-DTrilinos_ENABLE_ThyraTpetraAdapters=On \
-DTrilinos_ENABLE_STKIO=On \
-DTrilinos_ENABLE_STKMesh=On \
-DTrilinos_ENABLE_STKClassic=Off \
-DTrilinos_ENABLE_STKDoc_tests=Off \
-DTrilinos_ENABLE_Teko=Off \

First, we want to disable debug mode. If you want to enable it to get asserts you can do that and then also set -DCMAKE_BUILD_TYPE=Debug.

Next we enable and disable the specific packages that Albany/LCM requires. Piro, Phalanx, and ThyraTpetraAdapters are all required. The Albany makefile claims STKIO and STKMesh are not required, but they actually are. Do not enable all of STK, as this will break your build. Additionally, explicitly disable STKClassic and STKDoc_tests, as they will also break your build. Finally, for some reason Teko doesn't compile on Shannon, so disable it.


#Done with CMake command
make -j8 install

Finally, we set our install location and our source location. I like having my install location in one place; however, this can cause issues when you update Trilinos. Trilinos copies all of its headers into a flat directory in the prefix, and the build picks up the headers in the prefix location instead of the ones in the source directory. If you pull a newer Trilinos, the build will use the outdated headers and fail to compile unless you delete the install directory first.
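Since the command is shown in fragments, it may help to see the pieces assembled in one place. The sketch below combines the flags quoted in this guide with option names that match the prose. These option names are assumptions, because the values used in the original command are not shown: TPL_ENABLE_MPI, TPL_ENABLE_CUDA, Trilinos_ENABLE_ALL_PACKAGES, Trilinos_ENABLE_EXPLICIT_INSTANTIATION, Trilinos_ENABLE_DEVELOPMENT_MODE, Trilinos_ASSERT_MISSING_PACKAGES, Trilinos_ENABLE_DEBUG, and CMAKE_INSTALL_PREFIX. The run and configure_trilinos helpers are hypothetical names introduced only for illustration.

```shell
# Assumption: PREFIX was exported earlier; fall back to ./install otherwise.
: "${PREFIX:=$PWD/install}"

# run CMD...: print the command when DRY_RUN is non-empty, else execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# configure_trilinos SRC_DIR (hypothetical helper; call it from the build dir).
# Flags not quoted in the guide are assumed from the prose, see the lead-in.
configure_trilinos() {
    run cmake \
        -DTPL_ENABLE_MPI=ON \
        -DTPL_ENABLE_CUDA=ON \
        -DTPL_ENABLE_Thrust=On \
        -DTrilinos_ENABLE_ALL_PACKAGES=OFF \
        -DKokkos_ENABLE_Cuda_UVM=ON \
        -DTrilinos_ENABLE_EXPLICIT_INSTANTIATION=ON \
        -DTrilinos_ENABLE_DEVELOPMENT_MODE=OFF \
        -DTrilinos_ASSERT_MISSING_PACKAGES=OFF \
        -DBoost_INCLUDE_DIRS="$PREFIX/include" \
        -DBoostLib_INCLUDE_DIRS="$PREFIX/include" \
        -DTrilinos_ENABLE_DEBUG=OFF \
        -DTrilinos_ENABLE_Piro=ON \
        -DTrilinos_ENABLE_Phalanx=ON \
        -DTrilinos_ENABLE_ThyraTpetraAdapters=On \
        -DTrilinos_ENABLE_STKIO=On \
        -DTrilinos_ENABLE_STKMesh=On \
        -DTrilinos_ENABLE_STKClassic=Off \
        -DTrilinos_ENABLE_STKDoc_tests=Off \
        -DTrilinos_ENABLE_Teko=Off \
        -DCMAKE_INSTALL_PREFIX="$PREFIX" \
        "$1"
}

DRY_RUN=1   # clear this variable to execute cmake instead of printing it
configure_trilinos ..
```

Printing the command first makes it easy to spot a missing continuation or a stray space before committing to a long build.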

###Building Albany

Finally, we get to Albany.

Albany also uses CMake in a similar manner to Trilinos. First, we clone the repository:

git clone
pushd Albany
mkdir -p build
cd build

Now for the CMake command.

Note: I'm separating out the command into multiple parts again.

cmake \
-DTrilinos_DIR=$PREFIX \
-DCMAKE_CXX_FLAGS="-std=c++11" \

This sets the Trilinos directory for Albany, along with the CXX flags. You can add optimization flags and so on if you like. However, you need to set at least something, because otherwise Albany copies the Trilinos flags. This has some odd consequences, such as NVCC complaining about duplicate flags and failing to compile your code. We also disable debugging, but you can turn it on.


Next we enable LCM and disable other parts of Albany to speed up compile time. You can enable these if you want.


#End of CMake command
make -j8

Make sure you disable Demo PDES; there are some issues compiling those with CUDA on Shannon. I'm not sure whether 64-bit ints are used at all, so it may not be necessary to disable them. Finally, if you want floating-point exception checking, set ENABLE_CHECK_FPE.
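As with Trilinos, the Albany command is only shown in part. The following sketch assembles it from the two flags quoted above plus option names that match the prose. ENABLE_LCM, ENABLE_DEMO_PDES, ENABLE_64BIT_INT, and ENABLE_DEBUGGING are assumptions about the Albany option names, and their values here are guesses based on the description. ENABLE_CHECK_FPE is named in the text. The run and configure_albany helpers are hypothetical.

```shell
# Assumption: PREFIX was exported earlier; fall back to ./install otherwise.
: "${PREFIX:=$PWD/install}"

# run CMD...: print the command when DRY_RUN is non-empty, else execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# configure_albany SRC_DIR (hypothetical helper; call it from the build dir).
# Only Trilinos_DIR and CMAKE_CXX_FLAGS are quoted in the guide; the ENABLE_*
# options are assumed from the prose.
configure_albany() {
    run cmake \
        -DTrilinos_DIR="$PREFIX" \
        -DCMAKE_CXX_FLAGS="-std=c++11" \
        -DENABLE_DEBUGGING=OFF \
        -DENABLE_LCM=ON \
        -DENABLE_DEMO_PDES=OFF \
        -DENABLE_64BIT_INT=OFF \
        -DENABLE_CHECK_FPE=OFF \
        "$1"
}

DRY_RUN=1   # clear this variable to execute cmake instead of printing it
configure_albany ..
```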

I hope this guide was helpful in getting LCM compiled with Kokkos support on Shannon.
