Note: This repository is no longer maintained. More and up-to-date information on using Julia for HPC can be found on the new site at https://juliahpc.github.io/JuliaOnHPCClusters/, which includes the contents of this repository.
The purpose of this repository is to document best practices for running Julia on HPC systems (i.e., "supercomputers"). At the moment, information relevant to both supercomputer operators and users is collected here. There is no guarantee of permanence, that the information here is up to date, or that issues are usefully ordered or categorized.
According to this Discourse post, the difference between compiling Julia from source with architecture-specific optimization and using the official Julia binaries is negligible. This has been confirmed by Ludovic Räss for an Nvidia DGX-1 system at CSCS, where also no performance differences between a Spack-installed version and the official binaries were found (April 2022).
Since installing from source using, e.g., Spack, can sometimes be cumbersome, the general recommendation is to go with the pre-built binaries unless benchmarks on your system show a difference. This is also the current approach taken at NERSC, CSCS, and PC2.
In June 2022, a new Julia PR was created (JuliaLang/julia#45641) that aims to add PGO (profile-guided optimization) and LTO (link-time optimization) to the Julia Makefile. Depending on the test, compilation time improvements of up to 30% have been reported, so it might be worth checking out once merged. The performance of the compiled Julia code is unaffected though.
Last update: June 2022
When using Julia on a system that uses an environment-variable-based module system (such as modules or Lmod), the `LD_LIBRARY_PATH` variable might be filled with entries pointing to different packages and libraries. To avoid issues from Julia loading another library instead of the ones packaged with Julia, make sure that Julia's `lib` directory is always the first directory in `LD_LIBRARY_PATH`.
One possibility to achieve this is to create a wrapper shell script that modifies `LD_LIBRARY_PATH` before calling the Julia executable. Inspired by a script from UCL's Owain Kenway:
```shell
#!/usr/bin/env bash
# This wrapper makes sure the julia binary distributions picks up the GCC
# libraries provided with it correctly meaning that it does not rely on
# the gcc-libs version.
# Dr Owain Kenway, 20th of July, 2021
# Source: https://github.com/UCL-RITS/rcps-buildscripts/blob/04b2e2ccfe7e195fd0396b572e9f8ff426b37f0e/files/julia/julia.sh

# Resolve the real location of this script (following symlinks) and
# derive the Julia installation directory from it.
location=$(readlink -f "$0")
directory=$(readlink -f "$(dirname "${location}")/..")

# Put Julia's bundled libraries first in the search path, then start Julia.
export LD_LIBRARY_PATH="${directory}/lib/julia:${LD_LIBRARY_PATH}"
exec "${directory}/bin/julia" "$@"
```
Note that using `readlink` might not be optimal from a performance perspective if used in a massively parallel environment. Alternatively, hard-code the Julia path or set an environment variable accordingly.
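The alternatives mentioned above could look roughly like the following. This is a minimal sketch and not part of the original script: `JULIA_ROOT` is a hypothetical environment variable (for example, set by a module file), `/opt/julia` is a placeholder installation prefix, and the function names are illustrative only. The sketch also avoids a trailing colon when `LD_LIBRARY_PATH` is initially empty, since the dynamic linker interprets an empty entry as the current directory.

```shell
#!/usr/bin/env bash
# Sketch of the wrapper logic without readlink.
# JULIA_ROOT is a hypothetical variable; /opt/julia is a placeholder prefix.

# Print an LD_LIBRARY_PATH value that lists Julia's bundled libraries first.
# $1: Julia installation prefix, $2: current LD_LIBRARY_PATH (may be empty)
julia_ld_library_path() {
    local root="$1"
    local current="${2:-}"
    # ${current:+...} appends ":<current>" only if <current> is non-empty,
    # which avoids a trailing colon (an empty search path entry).
    printf '%s\n' "${root}/lib/julia${current:+:${current}}"
}

# Replace the current shell with Julia, using the adjusted search path.
run_julia() {
    local root="${JULIA_ROOT:-/opt/julia}"
    LD_LIBRARY_PATH="$(julia_ld_library_path "${root}" "${LD_LIBRARY_PATH:-}")"
    export LD_LIBRARY_PATH
    exec "${root}/bin/julia" "$@"
}
```

A wrapper script built on this sketch would end with the line `run_julia "$@"`.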
Also note that fixing the `LD_LIBRARY_PATH` variable does not seem to be a hard requirement, since it is not used universally (e.g., it is not necessary on NERSC's systems).
Last update: April 2022
Since the available file systems can differ significantly between HPC centers, it is hard to make a general statement about where the Julia depot folder (by default on Unix-like systems: `~/.julia`) should be placed (via `JULIA_DEPOT_PATH`).
Generally speaking, the file system hosting the Julia depot should have
- good (parallel) I/O
- no tight quotas
- read and write access
- no mechanism for the automatic deletion of unused files (or the depot should be excluded as an exception)
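Some of these properties can be checked from the shell before pointing `JULIA_DEPOT_PATH` at a directory. The following is a minimal sketch with a hypothetical helper name (`check_depot_dir`); it only tests that the directory can be created, read, and written, and reports the free space seen by `df`. Quotas and automatic purge policies are site-specific and cannot be detected this way.

```shell
#!/usr/bin/env bash
# check_depot_dir is a hypothetical helper, not part of Julia or any module system.
# $1: candidate depot directory
check_depot_dir() {
    local dir="$1"
    # Create the directory if needed; fail if that is not possible.
    mkdir -p "${dir}" 2>/dev/null || { echo "cannot create ${dir}" >&2; return 1; }
    # The depot must be readable and writable.
    if [ ! -r "${dir}" ] || [ ! -w "${dir}" ]; then
        echo "no read/write access to ${dir}" >&2
        return 1
    fi
    # Report available space in KiB (POSIX df output: line 2, column 4).
    df -Pk "${dir}" | awk 'NR == 2 { print "available KiB: " $4 }'
}
```

For example, `check_depot_dir "$HOME/.julia"` exits with status 0 if the default depot location is usable.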
On some systems, it resides in the user's home directory (e.g. at NERSC). On other systems, it is put on a parallel scratch file system (e.g. CSCS and PC2). At the time of writing (April 2022), there does not seem to be reliable performance data available that could help to make a data-based decision.
If multiple platforms (e.g., systems with different architectures) access the same Julia depot, for example because the file system is shared, it might make sense to create platform-dependent Julia depots by setting the `JULIA_DEPOT_PATH` environment variable appropriately, e.g., in a Tcl module file:

```tcl
prepend-path JULIA_DEPOT_PATH $env(HOME)/.julia/$platform
```

where `$platform` contains the current system name (source).
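Outside of a module file, the same idea can be expressed in a shell startup file. This is a minimal sketch under assumptions: the helper name `platform_depot` is hypothetical, and deriving the platform name from `uname` is only one possible choice (a site-provided system name may be more appropriate).

```shell
#!/usr/bin/env bash
# platform_depot is a hypothetical helper that builds a per-platform depot path.
# $1: platform name
platform_depot() {
    local platform="$1"
    printf '%s\n' "${HOME}/.julia/${platform}"
}

# Assumption: kernel name plus machine type identifies the platform,
# e.g., "Linux-x86_64". Replace this with your site's system name if available.
platform="$(uname -s)-$(uname -m)"

# Prepend the platform-specific depot, keeping any existing entries after it.
JULIA_DEPOT_PATH="$(platform_depot "${platform}")${JULIA_DEPOT_PATH:+:${JULIA_DEPOT_PATH}}"
export JULIA_DEPOT_PATH
```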
It is generally recommended to set

```shell
JULIA_MPI_BINARY=system
```

such that MPI.jl will always use a system MPI instead of the Julia artifact (i.e., MPI_jll.jl). For more configuration options see this part of the MPI.jl documentation.
Additionally, on the NERSC systems, there is a pre-built MPI.jl for each programming environment, which is loaded through a settings module. More information on the NERSC module file setup can be found here.
It seems to be generally advisable to set the environment variables

```shell
JULIA_CUDA_USE_BINARYBUILDER=false
JULIA_CUDA_USE_MEMORY_POOL=none
```

in the module files when loading Julia on a system with GPUs. Otherwise, Julia will try to download its own BinaryBuilder.jl-provided CUDA stack, which is typically not what you want on a production HPC system. Instead, you should make sure that Julia finds the local CUDA installation by setting relevant environment variables (see also the CUDA.jl docs). Disabling the memory pool is advisable to make CUDA-aware MPI work on multi-GPU nodes (see also the MPI.jl docs).
Johannes Blaschke provides scripts and templates to set up module files for Julia on some of NERSC's systems: https://gitlab.blaschke.science/nersc/julia/-/tree/main/modulefiles
There are a number of environment variables that are worth setting through the module mechanism:
- `JULIA_DEPOT_PATH`: Ensure depot path is on the correct file system
- `JULIA_MPI_BINARY`: Use system-provided MPI backend
- `JULIA_CUDA_USE_BINARYBUILDER`: Use system-provided CUDA stack
- `JULIA_CUDA_USE_MEMORY_POOL`: Make CUDA-aware MPI work
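On systems without a Julia module, the same variables can be exported from a shell startup file or job script. The following is a minimal sketch using the variable names and values given in this document; `SCRATCH` is a hypothetical site-specific variable pointing to a scratch file system, with `HOME` as a fallback. Newer versions of MPI.jl and CUDA.jl may use different configuration mechanisms, so check the documentation of the versions you use.

```shell
#!/usr/bin/env bash
# Assumption: SCRATCH is set by the site to a parallel scratch file system;
# fall back to the home directory if it is not defined.
export JULIA_DEPOT_PATH="${SCRATCH:-${HOME}}/.julia"

# Use the system-provided MPI instead of the Julia artifact.
export JULIA_MPI_BINARY=system

# Use the system-provided CUDA stack instead of downloading one.
export JULIA_CUDA_USE_BINARYBUILDER=false

# Disable the memory pool so that CUDA-aware MPI works on multi-GPU nodes.
export JULIA_CUDA_USE_MEMORY_POOL=none
```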
Samuel Omlin and colleagues from CSCS provide the EasyBuild configuration files they use for Piz Daint online at https://github.com/eth-cscs/production/tree/master/easybuild/easyconfigs/j/Julia. For example, there are configurations available for Julia 1.7.2 and for Julia 1.7.2 with CUDA support. Looking at these files also helps to decide which environment variables are useful to set.
- There is a lengthy discussion on the Julia Discourse about how to set up a centralized Julia installation. Some of it is already dated (probably), but it gives a good overview of some best practices and about approaches that work (and some which do not). In particular, the summary from CSCS is very helpful: https://discourse.julialang.org/t/how-does-one-set-up-a-centralized-julia-installation/13922/32
- NERSC's Johannes Blaschke has a nice repository set up with lots of scripts and helpful information on setting up Julia on Cori and Perlmutter: https://gitlab.blaschke.science/nersc/julia/-/tree/main
We maintain an (incomplete) list of HPC systems that provide a Julia installation and/or support for Julia users. For this, we use the following nomenclature:
- Center: The HPC center's name
- System: The compute system's "marketing" name
- Installation: Is there a pre-installed Julia configuration available?
- Support: Is Julia "officially" supported on the system, i.e., will Julia users be supported by HPC center staff if they have questions/problems?
- Interactive: Is interactive computing with Julia supported, i.e., can you run parallel jobs on the system interactively via, e.g., Jupyter notebooks?
- Architecture: The main CPU used in the system
- Accelerators: The main accelerator (if anything) in the system
- Documentation: Links to documentation for Julia users
Center | System | Installation | Support | Interactive | Architecture | Accelerators | Documentation |
---|---|---|---|---|---|---|---|
NeSI | Mahuika, Māui | ✅ | ✅ | ✅ | Intel Xeon Broadwell/Cascade Lake + AMD EPYC Milan | Nvidia Tesla P100, A100 | 1 |
Center | System | Installation | Support | Interactive | Architecture | Accelerators | Documentation |
---|---|---|---|---|---|---|---|
Carnegie Mellon College of Engineering | Arjuna, Hercules | ✅ | ✅ | ✅ | Intel Xeon+AMD EPYC Milan | Nvidia A100, Nvidia K80 | 1 |
Dartmouth College | Discovery | ✅ | ? | ✅ | Intel Xeon (various) + AMD EPYC 7532 | Nvidia V100 | 1 |
FASRC, Harvard U | Cannon | ✅ | ? | ✅ | Intel Xeon Cascade Lake | Nvidia V100, A100 | 1 |
HPC @ LLNL | various systems | ✅ | ? | ✅ | various processors | various GPUs | 1 |
NERSC | Cori | ✅ | ? | ? | Intel Xeon Haswell | Intel Xeon Phi | 1 |
NERSC | Perlmutter | ✅ | ✅ | ? | AMD EPYC Milan | Nvidia Ampere A100 | 1, 2 |
Open Science Grid | N/A | ❌ | ✅ | ? | Various | Various | 1 |
Perimeter Institute for Theoretical Physics | Symmetry | ✅ | ✅ | ✅ | AMD EPYC, Intel Xeon | Nvidia V100 | - |
Pittsburgh Supercomputing Center | Bridges-2 | ✅ | ✅ | ✅ | AMD EPYC, Intel Xeon | Nvidia V100 | 1 |
Princeton University | Several including Tiger | ✅ | ✅ | ✅ | Intel Xeon (Skylake + Broadwell) | Nvidia P100 | 1 |
There are a number of other HPC systems that have been reported to provide a Julia installation and/or Julia support, but lack enough details to be put on the list above:
- Various clusters at ANL
The contents of this repository are published under the MIT license (see LICENSE). Our main goal is to publicly curate information on using Julia on HPC systems, as a service from the community and for the community. Therefore, we are very happy to accept contributions from everyone, preferably in the form of a PR.
This repository is maintained by Michael Schlottke-Lakemper (RWTH Aachen University, Germany).
The following people have provided valuable contributions, either in the form of PRs or via private communication:
- Carsten Bauer (@carstenbauer)
- Alexander Bills (@abillscmu)
- Johannes Blaschke (@jblaschke)
- Valentin Churavy (@vchuravy)
- Steffen Fürst (@s-fuerst)
- Mosè Giordano (@giordano)
- C. Brenhin Keller (@brenhinkeller)
- Mirek Kratochvíl (@exaexa)
- Pedro Ojeda (@pojeda)
- Samuel Omlin (@omlins)
- Ludovic Räss (@luraess)
- Erik Schnetter (@eschnett)
- Dinindu Senanayake (@DininduSenanayake)
- Kjartan Thor Wikfeldt (@wikfeldt)
Everything is provided as is and without warranty. Use at your own risk!