Julia on HPC systems

The purpose of this repository is to document best practices for running Julia on HPC systems (i.e., "supercomputers"). At the moment, it collects information relevant to both supercomputer operators and users. There is no guarantee that the information here is permanent or up to date, nor that it is ordered or categorized in a particularly useful way.

For operators

Official Julia binaries vs. building from source

According to this Discourse post, the difference between compiling Julia from source with architecture-specific optimizations and using the official Julia binaries is negligible. Ludovic Räss confirmed this for an Nvidia DGX-1 system at CSCS, where no performance differences between a Spack-installed version and the official binaries were found either (April 2022).

Since installing from source (e.g., via Spack) can be cumbersome, the general recommendation is to go with the pre-built binaries unless benchmarks on the target system show a relevant difference.

  • This is also the current approach on NERSC's systems
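For illustration, a minimal sketch of installing the official binaries into a shared location could look as follows; the version and install prefix are examples only, and the download URL should be verified on julialang.org/downloads:

#!/usr/bin/env bash
# Sketch: install an official Julia binary tarball into a shared prefix.
# Version and prefix are examples; verify the URL on julialang.org/downloads.

JULIA_VERSION=1.7.2
PREFIX=/opt/julia/${JULIA_VERSION}

wget "https://julialang-s3.julialang.org/bin/linux/x64/${JULIA_VERSION%.*}/julia-${JULIA_VERSION}-linux-x86_64.tar.gz"
mkdir -p "${PREFIX}"
tar -xzf "julia-${JULIA_VERSION}-linux-x86_64.tar.gz" -C "${PREFIX}" --strip-components=1
"${PREFIX}/bin/julia" --version  # sanity check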

Last update: April 2022

Ensure correct libraries are loaded

When using Julia on a system with an environment-variable-based module system (such as Environment Modules or Lmod), the LD_LIBRARY_PATH variable might be filled with entries pointing to various packages and libraries. To avoid Julia loading one of these libraries instead of the ones it ships with, make sure that Julia's lib directory is always the first directory in LD_LIBRARY_PATH.

One possibility to achieve this is to create a wrapper shell script that modifies LD_LIBRARY_PATH before calling the Julia executable. Inspired by a script from UCL's Owain Kenway:

#!/usr/bin/env bash

# This wrapper makes sure the julia binary distributions picks up the GCC
# libraries provided with it correctly meaning that it does not rely on
# the gcc-libs version.

# Dr Owain Kenway, 20th of July, 2021
# Source: https://github.com/UCL-RITS/rcps-buildscripts/blob/04b2e2ccfe7e195fd0396b572e9f8ff426b37f0e/files/julia/julia.sh

location=$(readlink -f $0)
directory=$(readlink -f $(dirname ${location})/..)

export LD_LIBRARY_PATH=${directory}/lib/julia:${LD_LIBRARY_PATH}
exec ${directory}/bin/julia "$@"

Note that using readlink might not be optimal from a performance perspective if used in a massively parallel environment. Alternatively, hard-code the Julia path or set an environment variable accordingly.
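For example, a hard-coded variant of the wrapper (with a hypothetical installation prefix) avoids the readlink calls entirely:

#!/usr/bin/env bash
# Sketch: wrapper with a hard-coded Julia prefix (the path is an example).
JULIA_HOME=/opt/julia/1.7.2

export LD_LIBRARY_PATH="${JULIA_HOME}/lib/julia:${LD_LIBRARY_PATH}"
exec "${JULIA_HOME}/bin/julia" "$@"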

Also note that adjusting LD_LIBRARY_PATH like this does not seem to be a hard requirement, since it is not done universally (e.g., it is not necessary on NERSC's systems).

Last update: April 2022

Julia depot path

There is no clear consensus where the Julia depot folder (by default on Unix-like systems: ~/.julia) should be located. On some systems that have good I/O connectivity, it resides in the user's home directory, e.g., at NERSC. On other systems, e.g., at CSCS, it is put on a scratch file system. At the time of writing (April 2022), there does not seem to be reliable performance data available that could help to make a data-based decision.

If the depot path, which can be controlled via the JULIA_DEPOT_PATH variable, is located on a scratch/workspace file system with automatic deletion of unused files, there must be a mechanism (either provided by the operator or documented for users to set up themselves) that prevents these files from being deleted. In case multiple platforms share a single home directory, it might make sense to make the depot path platform dependent by setting the JULIA_DEPOT_PATH environment variable appropriately, e.g.,

prepend-path JULIA_DEPOT_PATH $env(HOME)/.julia/$platform

where $platform contains the current system name (source).
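Outside of a module file, the same idea can be sketched in plain shell, e.g., in a wrapper script or login profile; LMOD_SYSTEM_NAME is just one example of a variable that identifies the current platform and will differ between sites:

# Sketch: platform-dependent Julia depot path.
# LMOD_SYSTEM_NAME is an example; use whatever identifies the platform on your site.
platform="${LMOD_SYSTEM_NAME:-generic}"
export JULIA_DEPOT_PATH="${HOME}/.julia/${platform}"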

MPI.jl

On the NERSC systems, there is a pre-built MPI.jl for each programming environment, which is loaded through a settings module. More information on the NERSC module file setup can be found here.
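As a rough sketch, such a settings module boils down to pointing MPI.jl at the system MPI library and rebuilding the package. The exact mechanism depends on the MPI.jl version; the variables below apply to MPI.jl v0.19 and earlier, and the MPI path is only an assumed example (newer MPI.jl releases configure this via MPIPreferences.jl instead):

# Sketch: use the system MPI library with MPI.jl (<= v0.19). Paths are examples.
export JULIA_MPI_BINARY=system
export JULIA_MPI_PATH=/opt/cray/pe/mpich/default/ofi/gnu/9.1  # example/assumed path

# Rebuild MPI.jl so that the new settings take effect
julia -e 'using Pkg; Pkg.build("MPI"; verbose=true)'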

CUDA.jl

It seems to be generally advisable to set the environment variable

JULIA_CUDA_USE_BINARYBUILDER=false

in the module files when loading Julia on a system with GPUs. Otherwise, Julia will try to download its own BinaryBuilder.jl-provided CUDA stack, which is typically not what you want on a production HPC system. Instead, you should make sure that Julia finds the local CUDA installation by setting relevant environment variables (see also the CUDA.jl docs).
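In plain shell terms, the module setting plus a quick sanity check might look like the following; the CUDA path is an example, and which variables CUDA.jl inspects to locate a local toolkit is version dependent (see the CUDA.jl docs):

# Sketch: make CUDA.jl use the local CUDA toolkit instead of downloading one.
export JULIA_CUDA_USE_BINARYBUILDER=false
export CUDA_HOME=/usr/local/cuda-11.6  # example path to the local CUDA toolkit

# Verify that CUDA.jl picks up the local installation
julia -e 'using CUDA; CUDA.versioninfo()'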

Modules file setup

Johannes Blaschke provides scripts and templates to set up module files for Julia on some of NERSC's systems:
https://gitlab.blaschke.science/nersc/julia/-/tree/main/modulefiles

There are a number of environment variables that should be considered for being set through the module mechanism, e.g., JULIA_DEPOT_PATH and JULIA_CUDA_USE_BINARYBUILDER (see the respective sections above).

Easybuild resources

Samuel Omlin and colleagues from CSCS provide their Easybuild configuration files used for Piz Daint online at https://github.com/eth-cscs/production/tree/master/easybuild/easyconfigs/j/Julia. For example, there are configurations available for Julia 1.7.2 and for Julia 1.7.2 with CUDA support. Looking at these files also helps to decide which environment variables are useful to set.

Further resources

For users

HPC systems with Julia support

The following is an (incomplete) list of HPC systems that provide a Julia installation and/or support for using Julia to its users:

Center | System | Installation | Support | Interactive | Architecture | Accelerators | Documentation
CSCS | Piz Daint | yes | ? | yes | Intel Xeon Broadwell + Haswell | Nvidia Tesla P100 | 1
NERSC | Cori | yes | ? | ? | Intel Xeon Haswell | Intel Xeon Phi | 1
NERSC | Perlmutter | yes | yes | ? | AMD EPYC Milan | Nvidia Ampere A100 | 1, 2
PC², U Paderborn | Noctua 1 | yes | ? | yes | Intel Xeon Skylake | Intel Stratix 10 | 1
PC², U Paderborn | Noctua 2 | ? | ? | ? | AMD EPYC Milan | Nvidia Ampere A100, Xilinx Alveo U280 | 1

Nomenclature:

  • Center: The HPC center's name
  • System: The compute system's "marketing" name
  • Installation: Is there a pre-installed Julia configuration available?
  • Support: Is Julia officially supported on the system?
  • Interactive: Is interactive computing with Julia supported?
  • Architecture: The main CPU used in the system
  • Accelerators: The main accelerator (if anything) in the system
  • Documentation: Links to documentation for Julia users

Authors

Acknowledgments

These people have provided valuable input to this repository via private communication:

Disclaimer

Everything is provided as is and without warranty. Use at your own risk!
