Successfully running Isca on HPC using conda builds of libraries; looking for feedback #108

Closed
spencerahill opened this issue Feb 20, 2020 · 4 comments
Labels
infrastructure Isca infrastructure: installation, CI, HPC setups priority:medium Medium-priority task

spencerahill commented Feb 20, 2020

While porting Isca to Columbia University's Terremoto HPC cluster, I discovered that Terremoto's built-in netcdf-c library was broken. So, while awaiting a fix from Terremoto's helpful sysadmins, I got the crazy idea of installing all of the needed libraries with conda in my local directory instead. And it worked! I've run both the Held-Suarez and Frierson test cases successfully, the latter in full parallel on 8 nodes. This seems like it could be of interest, so I thought I'd share.

I've pasted the contents of the pertinent files below. Before that, my thoughts:

  1. This would seem to point toward a future in which Isca is as easily installed on any (linux-64) HPC cluster as (something like) conda install isca: all of the dependencies, including the C and Fortran libraries, MPI, and netCDF, are installable via a combination of conda and pip. Yes/no? (Assuming this works on other machines.)
  2. At the same time, perhaps there are optimizations in the built-in C and other libraries, or other good reasons why using local, conda-installed libraries for such HPC tasks is undesirable?

Anyways, Isca is great, and thanks for that!


My conda env:

sah2249@bake:~$ conda list --export
# This file may be used to create an environment using:
# $ conda create --name <env> --file <this file>
# platform: linux-64
_libgcc_mutex=0.1=conda_forge
_openmp_mutex=4.5=0_gnu
binutils_impl_linux-64=2.33.1=h53a641e_8
binutils_linux-64=2.33.1=h9595d00_16
bzip2=1.0.8=h516909a_2
ca-certificates=2019.11.28=hecc5488_0
certifi=2019.11.28=py38_0
curl=7.68.0=hf8cf82a_0
expat=2.2.9=he1b5a44_2
f90nml=1.1.2=pypi_0
gcc_impl_linux-64=7.3.0=hd420e75_5
gcc_linux-64=7.3.0=h553295d_16
gettext=0.19.8.1=hc5be6a0_1002
gfortran_impl_linux-64=7.3.0=hdf63c60_5
gfortran_linux-64=7.3.0=h553295d_16
hdf4=4.2.13=hf30be14_1003
hdf5=1.10.5=mpi_mpich_ha7d0aea_1004
isca=0.2=dev_0
jinja2=2.11.1=py_0
jpeg=9c=h14c3975_1001
krb5=1.16.4=h2fd8d38_0
ld_impl_linux-64=2.33.1=h53a641e_8
libblas=3.8.0=14_openblas
libcblas=3.8.0=14_openblas
libcurl=7.68.0=hda55be3_0
libedit=3.1.20170329=hf8c457e_1001
libffi=3.2.1=he1b5a44_1006
libgcc-ng=9.2.0=h24d8f2e_2
libgfortran-ng=7.3.0=hdf63c60_5
libgomp=9.2.0=h24d8f2e_2
liblapack=3.8.0=14_openblas
libnetcdf=4.7.3=mpi_mpich_h755db7c_1
libopenblas=0.3.7=h5ec1e0e_6
libpng=1.6.37=hed695b0_0
libssh2=1.8.2=h22169c7_2
libstdcxx-ng=9.2.0=hdf63c60_2
libuuid=2.32.1=h14c3975_1000
libxcb=1.13=h14c3975_1002
markupsafe=1.1.1=py38h516909a_0
mpi=1.0=mpich
mpich=3.3.2=hc856adb_0
ncurses=6.1=hf484d3e_1002
ncview=2.1.7=h8ec25ab_3
netcdf-fortran=4.5.2=mpi_mpich_hd560429_3
numpy=1.18.1=py38h95a1406_0
openssl=1.1.1d=h516909a_0
pandas=1.0.1=py38hb3f55d8_0
pip=20.0.2=py_2
pkg-config=0.29.2=h516909a_1006
pthread-stubs=0.4=h14c3975_1001
python=3.8.1=h357f687_2
python-dateutil=2.8.1=py_0
pytz=2019.3=py_0
readline=8.0=hf8c457e_0
setuptools=45.2.0=py38_0
sh=1.12.14=py38_1001
six=1.14.0=py38_0
sqlite=3.30.1=hcee41ef_0
tk=8.6.10=hed695b0_0
tqdm=4.42.1=py_0
udunits2=2.2.27.6=h4e0c4b3_1001
wheel=0.34.2=py_1
xarray=0.15.0=py_0
xorg-kbproto=1.0.7=h14c3975_1002
xorg-libice=1.0.10=h516909a_0
xorg-libsm=1.2.3=h84519dc_1000
xorg-libx11=1.6.9=h516909a_0
xorg-libxau=1.0.9=h14c3975_0
xorg-libxaw=1.0.13=h14c3975_1002
xorg-libxdmcp=1.1.3=h516909a_0
xorg-libxext=1.3.4=h516909a_0
xorg-libxmu=1.1.3=h516909a_0
xorg-libxpm=3.5.13=h516909a_0
xorg-libxt=1.1.5=h516909a_1003
xorg-xextproto=7.3.0=h14c3975_1002
xorg-xproto=7.0.31=h14c3975_1007
xz=5.2.4=h14c3975_1001
zlib=1.2.11=h516909a_1006
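
For anyone trying to rebuild this environment from the listing above, one likely wrinkle is that a few entries were not installed by conda. The following is a minimal sketch, assuming that the `pypi_0` and `dev_0` build strings mark pip-installed packages (here `f90nml` and the development install of `isca`) and that the remaining packages come from conda-forge. The function name and file names are hypothetical.

```shell
#!/bin/bash
# Keep only the lines of a `conda list --export` spec that conda can install.
# Assumption: build strings pypi_0 and dev_0 mark packages installed with pip,
# which `conda create --file` cannot resolve from a conda channel.
filter_conda_spec() {
    grep -v -E '=(pypi_0|dev_0)$'
}

# Hypothetical usage (file names are placeholders, channel is an assumption):
#   conda list --export > isca_conda_full.txt
#   filter_conda_spec < isca_conda_full.txt > isca_conda_spec.txt
#   conda create --name isca_conda --file isca_conda_spec.txt -c conda-forge
#   conda activate isca_conda
#   pip install f90nml    # then install Isca itself from a source checkout
```

Exact build strings from 2020 may no longer be available on the channel, in which case loosening the spec to `name=version` is one option.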

My Isca test run script:

(isca_conda) sah2249@bake:~/testing/isca-testing/held-suarez$ cat run_isca_terremoto.sh
#!/bin/bash -l

#SBATCH --account=apam
#SBATCH --partition=free
#SBATCH --job-name=held_suarez_test_case
#SBATCH --time=1:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=16
#SBATCH --cpus-per-task=1
#SBATCH --output=slurm_%j.out

echo Running on host `hostname`
echo Time is `date`
echo Directory is `pwd`

module purge
source $HOME/.bashrc
source $GFDL_BASE/src/extra/env/terremoto
conda activate isca_conda

python $GFDL_BASE/exp/test_cases/held_suarez/held_suarez_test_case.py

My terremoto env file:

(isca_conda) [first-try !?]sah2249@bake:~/Isca/src/extra/env$ cat terremoto
echo loadmodules for terremoto machine at Columbia University

module purge
module load shared
module load slurm/17.11.8

# Need to source the `conda.sh` file in order to activate a conda env from
# within a script.  See https://github.com/conda/conda/issues/7980#issuecomment-441358406.
source ${HOME}/miniconda3/etc/profile.d/conda.sh
conda activate isca_conda

export F90=mpifort
export CC=mpicc
export GFDL_MKMF_TEMPLATE=terremoto-conda
export LD_LIBRARY_PATH=${HOME}/miniconda3/envs/isca_conda/lib:${LD_LIBRARY_PATH}
export CPPFLAGS=-I${HOME}/miniconda3/envs/isca_conda/include
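# Quote the value so that the include flag is part of CFLAGS rather than
# being parsed as a second argument to `export`.
export CFLAGS="-D__IFC ${CPPFLAGS}"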
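
Since the env file relies on `mpifort` and `mpicc` coming from the conda environment rather than from a system module, a quick sanity check after activation may help when porting to another machine. This is a sketch only: `tool_in_prefix` is a hypothetical helper, and it assumes `CONDA_PREFIX` is set by `conda activate`.

```shell
#!/bin/bash
# Return 0 if the first match for tool $1 on PATH lives under prefix $2.
tool_in_prefix() {
    local tool_path
    tool_path=$(command -v "$1") || return 1
    case "$tool_path" in
        "$2"/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical usage after `conda activate isca_conda`:
#   for tool in mpifort mpicc nc-config nf-config; do
#       tool_in_prefix "$tool" "$CONDA_PREFIX" \
#           || echo "WARNING: $tool does not come from $CONDA_PREFIX"
#   done
```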

And my mkmf templates:

sah2249@bake:~/Isca/src/extra/python/isca/templates$ cat mkmf.template.terremoto-conda
# template for the Columbia University "Terremoto" machine using conda
# typical use with mkmf
# mkmf -t template.ifc -c" -Duse_libMPI -Duse_netCDF" path_names /usr/local/include
CPPFLAGS = -I${HOME}/miniconda3/envs/isca_conda/include
NETCDF_LIBS = -L${HOME}/miniconda3/envs/isca_conda/lib

# FFLAGS:
#  -cpp: Enable preprocessing of the Fortran source
#  -fcray-pointer: Enable the Cray pointer extension
#  -O2: Level 2 speed optimisations
#  -ffree-line-length-none: Allow arbitrarily long lines in free-form source
#  -fno-range-check: Disable compile-time range checking of constant expressions
#  -fdefault-real-8: 8 byte reals (compatibility for some parts of GFDL code)
#  -fdefault-double-8: 8 byte doubles (compat. with RRTM)
FFLAGS = $(CPPFLAGS) $(NETCDF_LIBS) -cpp -fcray-pointer \
          -O2 -ffree-line-length-none -fno-range-check \
          -fdefault-real-8 -fdefault-double-8

LDFLAGS = $(NETCDF_LIBS) -lhdf5 -lhdf5_hl -lhdf5_fortran -lhdf5hl_fortran \
           -lnetcdff -lnetcdf -lmpi
CFLAGS = -D__IFC $(CPPFLAGS)

FC = $(F90)
LD = $(F90)
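
Before compiling with a template like this one, it can be useful to confirm that every library named in LDFLAGS is actually present in the environment's lib directory. The sketch below assumes the libraries appear there as `lib<name>.so` or `lib<name>.a`; `missing_libs` is a hypothetical helper, not part of Isca or mkmf.

```shell
#!/bin/bash
# Print each library name that has neither lib<name>.so nor lib<name>.a
# under <prefix>/lib. Returns non-zero if any library is missing.
# Usage: missing_libs <prefix> <name>...
missing_libs() {
    local prefix lib status
    prefix=$1
    status=0
    shift
    for lib in "$@"; do
        if [ ! -e "$prefix/lib/lib$lib.so" ] && [ ! -e "$prefix/lib/lib$lib.a" ]; then
            echo "$lib"
            status=1
        fi
    done
    return "$status"
}

# Hypothetical usage with the libraries from the LDFLAGS line:
#   missing_libs "${HOME}/miniconda3/envs/isca_conda" \
#       hdf5 hdf5_hl hdf5_fortran hdf5hl_fortran netcdff netcdf mpi
```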

sit23 commented Feb 20, 2020

Hey Spencer - thanks very much for sending this through. I agree that it would be awesome to have the installation on other machines be this easy. Even if there is a performance hit (which maybe there isn't?), people will probably be willing to accept that for the ease.

I'll have a go at re-creating your setup on one of our machines and we can go from there.

And we're very glad you like Isca! Happy simulating!


sit23 commented Feb 20, 2020

In particular, I'll try and do a speed comparison of the built-in libraries we're using and the conda-installed ones you suggested above. Perhaps if you're able to do one too, that would be a good comparison across different machines?
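
One rough way to do such a comparison is to run the same test case once per library stack and compare the wall-clock times Slurm reports (for example via `sacct -j <jobid> --format=JobID,Elapsed`). The helpers below are a sketch, assuming elapsed times in the `[D-]HH:MM:SS` or `MM:SS` form; the function names are hypothetical.

```shell
#!/bin/bash
# Convert a Slurm-style elapsed time ([D-]HH:MM:SS or MM:SS) to seconds.
elapsed_to_seconds() {
    echo "$1" | awk -F'[-:]' '
        NF == 4 { print $1 * 86400 + $2 * 3600 + $3 * 60 + $4 }
        NF == 3 { print $1 * 3600 + $2 * 60 + $3 }
        NF == 2 { print $1 * 60 + $2 }'
}

# Print the ratio of two elapsed times (first over second) to two decimals.
# Fails if the second time is zero.
elapsed_ratio() {
    awk -v a="$(elapsed_to_seconds "$1")" -v b="$(elapsed_to_seconds "$2")" \
        'BEGIN { if (b > 0) printf "%.2f\n", a / b; else exit 1 }'
}

# Hypothetical usage: conda-library run time over system-library run time.
#   elapsed_ratio 00:12:00 00:10:00
```

A ratio above 1 would mean the first run was slower. A single pair of runs is noisy, so repeating each configuration a few times is advisable.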

@spencerahill
Author

Great! Glad to hear this is indeed of interest. CCing a few non-Isca folks who might also find it of note, @rabernat, @spencerkclark, @naomi-henderson

> Perhaps if you're able to do one too, that would be a good comparison across different machines?

Still waiting on the sysadmins to fix the netcdf build, but once that's done yes I can give this a shot.

@dennissergeev added the infrastructure (Isca infrastructure: installation, CI, HPC setups), priority:high (High-priority task), and priority:medium (Medium-priority task) labels and removed the priority:high label on May 6, 2020
@RuthG closed this as completed Apr 6, 2022

RuthG commented Apr 6, 2022

Thanks Spencer! We're having a spring clean so I've closed this with Denis' issue as the updated one.
