This issue was moved to a discussion.

"Computing tips" docs section for running on clusters with Slurm, Google Cloud, etc. #1045

Closed
glwagner opened this issue Oct 13, 2020 · 2 comments
Labels
documentation 📜 The sacred scrolls

Comments

@glwagner
Member

For example, when running on MIT's Satori cluster one must use a magic incantation to obtain useful output (and also to run 4 simulations on a single node):

#!/bin/bash

#SBATCH --job-name=eady
#SBATCH --output=slurm-eady-%j.out
#SBATCH --error=slurm-eady-%j.err
#SBATCH --time=12:00:00
#SBATCH --mem=0
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4
#SBATCH --gres="gpu:4" # GPUs per Node
#SBATCH --cpus-per-task=4

# Clear the environment from any previously loaded modules
module purge > /dev/null 2>&1

module load spack/0.1
module load gcc/8.3.0 # to get libquadmath
module load julia/1.4.1
module load cuda/10.1.243
module load openmpi/3.1.4-pmi-cuda
module load py-matplotlib/3.1.1

DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"

cd "$DIR/../Oceananigans/"

CUDA_VISIBLE_DEVICES=0 unbuffer julia --project run_small_eady_problem.jl --Nh 128 --Nz 96 --geostrophic-shear 0.25 --years 2.0 2>&1 | tee quarter_shear.out &
CUDA_VISIBLE_DEVICES=1 unbuffer julia --project run_small_eady_problem.jl --Nh 128 --Nz 96 --geostrophic-shear 0.1  --years 2.0 2>&1 | tee tenth_shear.out &
CUDA_VISIBLE_DEVICES=2 unbuffer julia --project run_small_eady_problem.jl --Nh 192 --Nz 96 --geostrophic-shear 0.25 --years 2.0 2>&1 | tee quarter_shear_hires.out &
CUDA_VISIBLE_DEVICES=3 unbuffer julia --project run_small_eady_problem.jl --Nh 192 --Nz 96 --geostrophic-shear 0.1  --years 2.0 2>&1 | tee tenth_shear_hires.out &

wait # block until all four background simulations have finished

(and explanations for each part of the script might be helpful)
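As a starting point for those explanations: `--gres="gpu:4"` requests four GPUs on the node, `CUDA_VISIBLE_DEVICES=N` restricts each process to a single device, the trailing `&` puts each run in the background so the runs proceed concurrently, and `unbuffer ... | tee file` makes output appear in the log as it is produced. The one-process-per-GPU pattern might be sketched as a loop like the one below. This is only an illustration, not the script used on Satori: `NUM_GPUS`, `SIM_CMD`, and `OUT_DIR` are hypothetical names, `SIM_CMD` stands in for the real `julia --project ...` command, and the sketch uses `wait` rather than a fixed `sleep` to keep the job alive until every run exits.

```shell
#!/bin/bash
# Sketch: launch one independent process per GPU on a single node,
# then wait for all of them. All variable names here are illustrative.

NUM_GPUS=4
# Stand-in for the real simulation command (e.g. a `julia --project ...` line).
SIM_CMD='echo "simulation on GPU $CUDA_VISIBLE_DEVICES"'
# Hypothetical log directory; a temporary directory keeps the sketch self-contained.
OUT_DIR=$(mktemp -d)

pids=""
gpu=0
while [ "$gpu" -lt "$NUM_GPUS" ]; do
    # Each process sees exactly one device, and runs in the background (&).
    CUDA_VISIBLE_DEVICES=$gpu sh -c "$SIM_CMD" > "$OUT_DIR/gpu_${gpu}.out" 2>&1 &
    pids="$pids $!"
    gpu=$((gpu + 1))
done

# Wait for every background run; remember whether any of them failed.
status=0
for pid in $pids; do
    wait "$pid" || status=1
done

echo "logs in $OUT_DIR, overall status $status"
```

Waiting on each process ID individually lets the batch job end as soon as the slowest run finishes, and a nonzero `status` flags a failed run, which a fixed-length `sleep` cannot do.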

@vchuravy
Collaborator

https://github.com/CliMA/ClimateMachine.jl/wiki/Satori-Cluster

@ali-ramadhan
Member

For Google Cloud Platform we could probably dig up some of the scripts we developed way back:

https://github.com/christophernhill/gcloudhacks
https://github.com/christophernhill/oceananiganshacks

@glwagner glwagner added the documentation 📜 The sacred scrolls label Nov 4, 2020
@CliMA CliMA locked and limited conversation to collaborators Mar 22, 2023
@glwagner glwagner converted this issue into discussion #2998 Mar 22, 2023
