Julia on HPC systems

Note: This repository is no longer maintained. More and up-to-date information on using Julia for HPC can be found on the new site at https://juliahpc.github.io/JuliaOnHPCClusters/, which includes the contents of this repository.

The purpose of this repository is to document best practices for running Julia on HPC systems (i.e., "supercomputers"). At the moment, information relevant to both supercomputer operators and users is collected here. There is no guarantee that the information here is permanent, up to date, or usefully ordered and categorized.

For operators

Official Julia binaries vs. building from source

According to this Discourse post, the difference between compiling Julia from source with architecture-specific optimization and using the official Julia binaries is negligible. Ludovic Räss confirmed this on an Nvidia DGX-1 system at CSCS, where no performance difference was found between a Spack-installed version and the official binaries either (April 2022).

Since installing from source (e.g., with Spack) can be cumbersome, the general recommendation is to use the pre-built binaries unless benchmarks on your system show a measurable difference. This is also the current approach taken at NERSC, CSCS, and PC2.

In June 2022, a new Julia PR was created (JuliaLang/julia#45641) that aims to add PGO (profile-guided optimization) and LTO (link-time optimization) to the Julia Makefile. Depending on the test, compilation time improvements of up to 30% have been reported, so it might be worth checking out once merged. The performance of the compiled Julia code is unaffected though.

Last update: June 2022

Ensure correct libraries are loaded

When using Julia on a system that uses an environment-variable based module system (such as modules or Lmod), the LD_LIBRARY_PATH variable might be filled with entries pointing to different packages and libraries. To avoid issues from Julia loading another library instead of the ones packaged with Julia, make sure that Julia's lib directory is always the first directory in LD_LIBRARY_PATH.

One possibility to achieve this is to create a wrapper shell script that modifies LD_LIBRARY_PATH before calling the Julia executable. Inspired by a script from UCL's Owain Kenway:

#!/usr/bin/env bash

# This wrapper makes sure the Julia binary distribution picks up the GCC
# libraries provided with it, so that it does not rely on the gcc-libs
# version.

# Dr Owain Kenway, 20th of July, 2021
# Source: https://github.com/UCL-RITS/rcps-buildscripts/blob/04b2e2ccfe7e195fd0396b572e9f8ff426b37f0e/files/julia/julia.sh

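# Resolve the installation prefix from this script's real location.
location=$(readlink -f "$0")
directory=$(readlink -f "$(dirname "${location}")/..")

# Put Julia's bundled libraries first; avoid a trailing ":" if the variable was empty.
export LD_LIBRARY_PATH="${directory}/lib/julia${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
exec "${directory}/bin/julia" "$@"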

Note that using readlink might not be optimal from a performance perspective if used in a massively parallel environment. Alternatively, hard-code the Julia path or set an environment variable accordingly.

Also note that fixing the LD_LIBRARY_PATH variable does not seem to be a hard requirement, since it is not used universally (e.g., it is not necessary on NERSC's systems).
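The alternative mentioned above, a wrapper that avoids readlink, could look like the following minimal sketch. It is not taken from any of the centers named here. The variable JULIA_ROOT, its fallback /opt/julia, and the helper name prepend_lib_dir are hypothetical and only serve as illustration.

```shell
#!/usr/bin/env bash
# Sketch of a wrapper that takes the install prefix from an environment
# variable instead of calling readlink. JULIA_ROOT and the fallback path
# /opt/julia are hypothetical names chosen for this example.

# prepend_lib_dir DIR LIST
# Print LIST (colon-separated) with DIR as its first entry, dropping empty
# entries and any other occurrence of DIR.
prepend_lib_dir() {
    local dir="$1"
    local list="$2"
    local result="$1"
    local entry
    local IFS=:
    for entry in $list; do
        if [ -n "$entry" ] && [ "$entry" != "$dir" ]; then
            result="$result:$entry"
        fi
    done
    printf '%s\n' "$result"
}

julia_root="${JULIA_ROOT:-/opt/julia}"
LD_LIBRARY_PATH=$(prepend_lib_dir "$julia_root/lib/julia" "${LD_LIBRARY_PATH:-}")
export LD_LIBRARY_PATH

# Hand over to the real binary only if it exists at the assumed location.
if [ -x "$julia_root/bin/julia" ]; then
    exec "$julia_root/bin/julia" "$@"
fi
```

Removing duplicates of the Julia lib directory keeps LD_LIBRARY_PATH from growing when the wrapper is called from a process that was itself started through the wrapper.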

Last update: April 2022

Julia depot path

Since the available file systems can differ significantly between HPC centers, it is hard to make a general statement about where the Julia depot folder (by default on Unix-like systems: ~/.julia) should be placed (via JULIA_DEPOT_PATH). Generally speaking, the file system hosting the Julia depot should have

good (parallel) I/O
no tight quotas
read and write access
no mechanism for the automatic deletion of unused files (or the depot should be excluded as an exception)

On some systems, it resides in the user's home directory (e.g. at NERSC). On other systems, it is put on a parallel scratch file system (e.g. CSCS and PC2). At the time of writing (April 2022), there does not seem to be reliable performance data available that could help to make a data-based decision.
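Of the criteria listed above, read and write access is the easiest to verify automatically. The following is a minimal sketch of such a check, not an established tool: the function name check_depot_dir and the probe file name are made up for this example, and I/O performance, quotas, and purge policies still need to be checked against the site's documentation.

```shell
#!/usr/bin/env bash
# check_depot_dir DIR
# Succeeds if DIR is an existing directory in which a file can actually be
# created and removed. Permission bits alone can be misleading on network
# file systems, so the function writes a small probe file.
check_depot_dir() {
    local dir="$1"
    local probe="$dir/.depot_write_probe.$$"
    [ -d "$dir" ] && [ -r "$dir" ] || return 1
    ( : > "$probe" ) 2>/dev/null || return 1
    rm -f "$probe"
}

# Only the first entry of JULIA_DEPOT_PATH is checked, since that is where
# Julia installs new packages. Falls back to the default ~/.julia.
depot="${JULIA_DEPOT_PATH:-$HOME/.julia}"
depot="${depot%%:*}"

if check_depot_dir "$depot"; then
    echo "depot is readable and writable: $depot"
else
    echo "depot is missing or not writable: $depot" >&2
fi
```

A check like this could run from a module file hook or a job prologue, so that a misplaced depot is reported before a large job starts.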

If multiple platforms (e.g., systems with different architectures) access the same Julia depot, for example because the file system is shared, it might make sense to create platform-dependent Julia depots by setting the JULIA_DEPOT_PATH environment variable appropriately, e.g.,

prepend-path JULIA_DEPOT_PATH $env(HOME)/.julia/$platform

where $platform contains the current system name (source).
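The line above uses module file (Tcl) syntax. Where the depot is configured from a shell profile or job script instead, an equivalent could look like the sketch below. SYSTEM_NAME is a hypothetical site-provided variable. The fallback to the CPU architecture reported by uname is an assumption for illustration and will not distinguish two systems with the same architecture.

```shell
#!/usr/bin/env bash
# Pick a platform label. SYSTEM_NAME is a hypothetical variable that a site
# might define; if it is unset, fall back to the CPU architecture.
platform="${SYSTEM_NAME:-$(uname -m)}"

depot="$HOME/.julia/$platform"

# Mirror prepend-path: the new entry goes first, existing entries are kept.
JULIA_DEPOT_PATH="$depot${JULIA_DEPOT_PATH:+:$JULIA_DEPOT_PATH}"
export JULIA_DEPOT_PATH

echo "JULIA_DEPOT_PATH=$JULIA_DEPOT_PATH"
```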

MPI.jl

It is generally recommended to set

JULIA_MPI_BINARY=system

so that MPI.jl always uses a system MPI instead of the Julia artifact (i.e., MPI_jll.jl). For more configuration options see this part of the MPI.jl documentation.

Additionally, on the NERSC systems, there is a pre-built MPI.jl for each programming environment, which is loaded through a settings module. More information on the NERSC module file setup can be found here.

CUDA.jl

It seems to be generally advisable to set the environment variables

JULIA_CUDA_USE_BINARYBUILDER=false
JULIA_CUDA_USE_MEMORY_POOL=none

in the module files when loading Julia on a system with GPUs. Otherwise, Julia will try to download its own BinaryBuilder.jl-provided CUDA stack, which is typically not what you want on a production HPC system. Instead, you should make sure that Julia finds the local CUDA installation by setting relevant environment variables (see also the CUDA.jl docs). Disabling the memory pool is advisable to make CUDA-aware MPI work on multi-GPU nodes (see also the MPI.jl docs).

Modules file setup

Johannes Blaschke provides scripts and templates to set up module files for Julia on some of NERSC's systems:
https://gitlab.blaschke.science/nersc/julia/-/tree/main/modulefiles

A number of environment variables are worth setting through the module mechanism:

JULIA_DEPOT_PATH: Ensure depot path is on the correct file system
JULIA_MPI_BINARY: Use system-provided MPI backend
JULIA_CUDA_USE_BINARYBUILDER: Use system-provided CUDA stack
JULIA_CUDA_USE_MEMORY_POOL: Make CUDA-aware MPI work
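For testing these settings before they go into a module file, the same variables can be exported from a shell. The sketch below uses the values recommended in the sections above. The depot location is only a placeholder, and SCRATCH is an assumed site variable. Whether a given MPI.jl or CUDA.jl version still honors these variable names is worth checking against the package documentation.

```shell
#!/usr/bin/env bash
# Depot location: placeholder only. SCRATCH is an assumed site variable;
# without it the sketch falls back to the home directory.
export JULIA_DEPOT_PATH="${SCRATCH:-$HOME}/.julia"

# Use the system MPI instead of the Julia artifact.
export JULIA_MPI_BINARY=system

# Use the local CUDA installation and disable the memory pool so that
# CUDA-aware MPI works on multi-GPU nodes.
export JULIA_CUDA_USE_BINARYBUILDER=false
export JULIA_CUDA_USE_MEMORY_POOL=none

# Show what a Julia process started from this shell would inherit.
env | grep '^JULIA_' | sort
```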

EasyBuild resources

Samuel Omlin and colleagues from CSCS provide the EasyBuild configuration files they use for Piz Daint online at https://github.com/eth-cscs/production/tree/master/easybuild/easyconfigs/j/Julia. For example, there are configurations available for Julia 1.7.2 and for Julia 1.7.2 with CUDA support. Looking at these files also helps to decide which environment variables are useful to set.

Further resources

There is a lengthy discussion on the Julia Discourse about how to set up a centralized Julia installation. Some of it is probably dated, but it gives a good overview of best practices and of approaches that work (and some that do not). In particular, the summary from CSCS is very helpful:
https://discourse.julialang.org/t/how-does-one-set-up-a-centralized-julia-installation/13922/32
NERSC's Johannes Blaschke has a nice repository set up with lots of scripts and helpful information on setting up Julia on Cori and Perlmutter:
https://gitlab.blaschke.science/nersc/julia/-/tree/main

For users

HPC systems with Julia support

We maintain an (incomplete) list of HPC systems that provide a Julia installation and/or support for using Julia for their users. For this, we use the following nomenclature:

Center: The HPC center's name
System: The compute system's "marketing" name
Installation: Is there a pre-installed Julia configuration available?
Support: Is Julia "officially" supported on the system, i.e., will Julia users be supported by HPC center staff if they have questions/problems?
Interactive: Is interactive computing with Julia supported, i.e., can you run parallel jobs on the system interactively via, e.g., Jupyter notebooks?
Architecture: The main CPU used in the system
Accelerators: The main accelerator (if any) in the system
Documentation: Links to documentation for Julia users

Australasia

Center	System	Installation	Support	Interactive	Architecture	Accelerators	Documentation
NeSI	Mahuika, Māui	✅	✅	✅	Intel Xeon Broadwell/Cascade Lake + AMD EPYC Milan	Nvidia Tesla P100, A100	1

Europe

Center	System	Installation	Support	Interactive	Architecture	Accelerators	Documentation
ARC, UCL	Myriad, Kathleen, Michael, Young	✅	✅	?	various Intel Xeon	various GPUs	1
CSC (EuroHPC)	LUMI	✅	✅	?	AMD EPYC Milan	AMD Radeon Instinct MI250X	1
CSCS	Piz Daint	✅	✅	✅	Intel Xeon Broadwell + Haswell	Nvidia Tesla P100	1
DESY IT	Maxwell	✅	?	✅	various AMD EPYC/Intel Xeon	various GPUs	1
HLRS	Hawk	✅	✅	✅	AMD EPYC Rome	Nvidia Tesla A100	1
HPC2N, Umeå U	Kebnekaise	✅	✅	?	Intel Xeon Broadwell + Skylake	Nvidia Tesla K80, Nvidia Tesla V100	1
IT4I (EuroHPC)	Karolina	✅	✅	✅	AMD EPYC Rome	Nvidia Ampere A100	1
IZUM (EuroHPC)	Vega	✅	✅	✅	AMD EPYC Rome	Nvidia Ampere A100	1
LuxProvide (EuroHPC)	MeluXina	✅	?	✅	AMD EPYC Rome	Nvidia Ampere A100-40	1, 2
PC2, U Paderborn	Noctua 1	✅	✅	✅	Intel Xeon Skylake	Intel Stratix 10 + consumer GPUs	1
PC2, U Paderborn	Noctua 2	✅	✅	✅	AMD EPYC Milan	Nvidia Ampere A100, Xilinx Alveo U280	1
ULHPC, U Luxembourg	Aion, Iris	✅	?	✅	AMD EPYC Rome + Intel Xeon Broadwell/Skylake	Nvidia Tesla V100	1
ZDV, U Mainz	MOGON II	✅	?	?	Intel Xeon Broadwell + Skylake	no	1
ZIB	HLRN-IV	✅	✅	?	Intel Cascade Lake AP	coming soon: Nvidia A100, Intel PVC	1

North America

Center	System	Installation	Support	Interactive	Architecture	Accelerators	Documentation
Carnegie Mellon College of Engineering	Arjuna, Hercules	✅	✅	✅	Intel Xeon + AMD EPYC Milan	Nvidia A100, Nvidia K80	1
Dartmouth College	Discovery	✅	?	✅	Intel Xeon (various) + AMD EPYC 7532	Nvidia V100	1
FASRC, Harvard U	Cannon	✅	?	✅	Intel Xeon Cascade Lake	Nvidia V100, A100	1
HPC @ LLNL	various systems	✅	?	✅	various processors	various GPUs	1
NERSC	Cori	✅	?	?	Intel Xeon Haswell	Intel Xeon Phi	1
NERSC	Perlmutter	✅	✅	?	AMD EPYC Milan	Nvidia Ampere A100	1, 2
Open Science Grid	N/A	❌	✅	?	Various	Various	1
Perimeter Institute for Theoretical Physics	Symmetry	✅	✅	✅	AMD EPYC, Intel Xeon	Nvidia V100	-
Pittsburgh Supercomputing Center	Bridges-2	✅	✅	✅	AMD EPYC, Intel Xeon	Nvidia V100	1
Princeton University	Several including Tiger	✅	✅	✅	Intel Xeon (Skylake + Broadwell)	Nvidia P100	1

Other HPC systems

A number of other HPC systems have been reported to provide a Julia installation and/or Julia support, but not enough details are known to put them on the list above:

Various clusters at ANL

License and contributing

The contents of this repository are published under the MIT license (see LICENSE). Our main goal is to publicly curate information on using Julia on HPC systems, as a service from the community and for the community. Therefore, we are very happy to accept contributions from everyone, preferably in the form of a PR.

Authors

This repository is maintained by Michael Schlottke-Lakemper (RWTH Aachen University, Germany).

The following people have provided valuable contributions, either in the form of PRs or via private communication:

Carsten Bauer (@carstenbauer)
Alexander Bills (@abillscmu)
Johannes Blaschke (@jblaschke)
Valentin Churavy (@vchuravy)
Steffen Fürst (@s-fuerst)
Mosè Giordano (@giordano)
C. Brenhin Keller (@brenhinkeller)
Mirek Kratochvíl (@exaexa)
Pedro Ojeda (@pojeda)
Samuel Omlin (@omlins)
Ludovic Räss (@luraess)
Erik Schnetter (@eschnett)
Dinindu Senanayake (@DininduSenanayake)
Kjartan Thor Wikfeldt (@wikfeldt)

Disclaimer

Everything is provided as is and without warranty. Use at your own risk!
