A project to run pyaerocom aeroval tasks in parallel on the Met Norway PPI infrastructure.
Standard installation is done via pip:
python -m pip install 'git+https://github.com/metno/pyaerocom-parallelization.git'
To install a branch other than main:
python -m pip install 'git+https://github.com/metno/pyaerocom-parallelization.git@<branch name>'
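After installing, it can be useful to confirm that the command line tools described further down in this README (aeroval_parallelize and pyaerocom_cachegen) are actually on the PATH of the current environment. The following is a minimal sketch; the helper name check_cli is hypothetical and not part of the package.

```shell
# Report whether a command is on PATH; returns non-zero if it is missing.
# check_cli is a hypothetical helper, not something this package provides.
check_cli() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1" >&2
        return 1
    fi
}

# The two entry points named in this README.
for cli in aeroval_parallelize pyaerocom_cachegen; do
    check_cli "$cli" || true
done
```

If either tool is reported as missing, the package was most likely installed into a different Python environment than the one currently active.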
- aeroval config files need to be on PPI
- an aeroval config file is parallelised by running each of its models as a separate job
- The whole parallelisation happens in five steps:
  - create cache files so that all parallel jobs always have a cache hit (for non-pyaro obs networks)
  - submit aeroval config files in parallel (one job per model)
  - assemble the json files (after all jobs have finished)
  - adjust variable and model order (since aeroval actively uses the order in the json files)
  - remove temporary data (currently omitted)
- As runtime environment, the aerotools modules are supported. The standard module used is
/modules/MET/rhel8/user-modules/fou-kl/aerotools/aerotools.conda
Please note that the module used needs to provide the command line interface of the parallelization (e.g. the aeroval_parallelize
command). If the group-provided modules are used, only those ending with .conda
can be used for the parallelization at the moment.
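The first four steps listed above can be mapped onto switches documented in the help screen further down. The sketch below only prints one command per step and does not run anything. The function name print_parallel_steps and all paths are hypothetical, and whether a plain aeroval_parallelize <cfg-file> call already chains some of these steps is not stated here, so treat the split into separate calls as an assumption.

```shell
# Print (do not run) one command per step for a given config file.
# Arguments: <cfg-file> <output directory> <path to menu.json> <input directories...>
# print_parallel_steps is a hypothetical helper for illustration only.
print_parallel_steps() {
    cfg=$1; outdir=$2; menu=$3; shift 3
    echo "aeroval_parallelize --cachegen-only $cfg"    # step 1: create cache files
    echo "aeroval_parallelize --nocache $cfg"          # step 2: submit the analysis jobs
    echo "aeroval_parallelize -c -o $outdir $*"        # step 3: assemble the json files
    echo "aeroval_parallelize --adjustall $cfg $menu"  # step 4: adjust variable and model order
}

# Hypothetical paths; the output directory is the experiment's output path.
print_parallel_steps /tmp/config/cfg.py /tmp/data/project/exp \
    /tmp/data/project/exp/menu.json /tmp/run_a /tmp/run_b
```

Printing the commands first makes it easy to check the paths before anything is submitted; the fifth step (removing temporary data) is left out because it is currently omitted.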
Getting started:
- load one of the aerotools modules ending with .conda;
at the time of this writing the following were available:
aerotools.conda
pya-v2024.03.conda
pya-v2024.03.NorESM.conda
pya-v2024.03.ratpm25pm10.conda
In order for them to work, the entire path has to be given. In the case of the preinstalled modules that path is
/modules/MET/rhel8/user-modules/fou-kl/aerotools/
If you want to use your own module, you have to make sure that this parallelization package has been installed there as well.
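Because the full module path is required and only modules ending with .conda are usable, a small helper can build the path and reject other names. This is a sketch only: aerotools_module_path and AEROTOOLS_MODULE_DIR are hypothetical names, and the commented module load call assumes the usual Environment Modules command is available on PPI.

```shell
# Directory of the preinstalled aerotools modules (taken from this README).
AEROTOOLS_MODULE_DIR="/modules/MET/rhel8/user-modules/fou-kl/aerotools"

# Print the full path for a module name; reject names not ending in .conda.
# aerotools_module_path is a hypothetical helper for illustration only.
aerotools_module_path() {
    case "$1" in
        *.conda) printf '%s/%s\n' "$AEROTOOLS_MODULE_DIR" "$1" ;;
        *) echo "module name must end with .conda: $1" >&2; return 1 ;;
    esac
}

# Example (on PPI, assuming the "module" command exists there):
#   module load "$(aerotools_module_path pya-v2024.03.conda)"
#   aeroval_parallelize -h
```

The same path can be passed to the -m switch of aeroval_parallelize or pyaerocom_cachegen.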
If everything went right, the command aeroval_parallelize -h
should give you the following output:
usage: aeroval_parallelize [-h] [-v] [-m MODULE] [--jsonrunscript JSONRUNSCRIPT] [--cfgvar CFGVAR]
[--tempdir TEMPDIR] [--json_basedir JSON_BASEDIR]
[--coldata_basedir COLDATA_BASEDIR] [--io_aux_file IO_AUX_FILE]
[--nocache] [--cachegen-only] [--queue QUEUE]
[--cache_queue CACHE_QUEUE] [--queue-user QUEUE_USER] [--dry-qsub]
[--qsub-id QSUB_ID] [--qsub-dir QSUB_DIR] [--cacheram CACHERAM]
[--anaram ANARAM] [--assemblyram ASSEMBLYRAM] [-o OUTDIR] [-c] [-a]
[--adjustmenujson] [--adjustheatmap]
files [files ...]
command line interface to aeroval parallelisation.
positional arguments:
files file(s) to read, directories to combine (if -c switch is used)
options:
-h, --help show this help message and exit
-v, --verbose switch on verbosity
-m MODULE, --module MODULE
environment module to use; defaults to /modules/MET/rhel8/user-modules/fou-
kl/aerotools/aerotools.conda
--jsonrunscript JSONRUNSCRIPT
script to run json config files; defaults to aeroval_run_json_cfg
--cfgvar CFGVAR variable that holds the aeroval config in the file(s) provided. Defaults to
CFG
--tempdir TEMPDIR directory for temporary files; defaults to /tmp
--json_basedir JSON_BASEDIR
set json_basedir in the config manually
--coldata_basedir COLDATA_BASEDIR
set coldata_basedir in the configuration manually
--io_aux_file IO_AUX_FILE
set io_aux_file in the configuration file manually
caching options:
options for cache file generation
--nocache switch off cache generation before running aeroval
--cachegen-only run the cache file generation only
queue options:
options for running on PPI
--queue QUEUE queue name to submit the jobs to; defaults to research-r8.q
--cache_queue CACHE_QUEUE
queue name to submit the caching jobs to; defaults to research-r8.q
--queue-user QUEUE_USER
queue user; defaults to jang
--dry-qsub do not submit to queue (all files created, but no submission)
--qsub-id QSUB_ID id under which the qsub commands will be run. Needed only for automation.
--qsub-dir QSUB_DIR directory under which the qsub scripts will be stored. defaults to
/lustre/storeB/users/jang/submission_scripts, needs to be on fs mounted by
all queue hosts.
--cacheram CACHERAM RAM usage [GB] for cache queue jobs (defaults to 30GB).
--anaram ANARAM RAM usage [GB] for analysis queue jobs (defaults to 30GB).
--assemblyram ASSEMBLYRAM
RAM usage [GB] for assembly queue jobs (defaults to 10GB.
data assembly:
options for assembly of parallelizations output
-o OUTDIR, --outdir OUTDIR
output directory for experiment assembly
-c, --combinedirs combine the output of a parallel runs; MUST INCLUDE <project
dir>/<experiment dir>!!
adjust variable and model order:
options to change existing order of variables and models
-a, --adjustall <aeroval cfgfile> <path to menu.json>; adjust order of all models/variables
to aeroval config file
--adjustmenujson <aeroval cfgfile> <path to menu.json>; adjust order of menu.json to aeroval
config file
--adjustheatmap <aeroval cfgfile> <path to glob_*_monthly.json>; adjust order of menu.json
to aeroval config file
__Example usages__:
submit jobs to queue; parameters as defaults:
aeroval_parallelize <cfg-file>
do not submit jobs to queue (dry-run):
aeroval_parallelize --dry-qsub <cfg-file>
submit jobs to queue; use special module:
aeroval_parallelize -m /modules/MET/rhel8/user-modules/fou-kl/aerotools/pya-v2024.03.conda <cfg-file>
run just the cache file generation:
aeroval_parallelize --cachegen-only <cfg-file>
set data directories and submit to queue:
aeroval_parallelize --json_basedir /tmp/data --coldata_basedir /tmp/coldata --io_aux_file /tmp/gridded_io_aux.py <cfg-file>
assemble aeroval data after a parallel run has been finished:
__The output directory needs to be the target experiment's output path!__
aeroval_parallelize -c -o <output directory> <input directories>
aeroval_parallelize -c -o ${HOME}/tmp/testing/IASI/ ${HOME}/tmpt39n2gp_*
adjust all variable and model orders to the one given in an aeroval config file:
aeroval_parallelize --adjustall <aeroval-cfg-file> <path to menu.json>
aeroval_parallelize --adjustall /tmp/config/cfg_cams2-82_IFS_beta.py /tmp/data/testmerge_all/IFS-beta/menu.json
Recommendation on the configuration file: It's recommended to use absolute paths in the config file. This makes sure that both the user and the system know exactly where all files are put.
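The same recommendation applies to directories passed on the command line, such as the values for --json_basedir and --coldata_basedir. A minimal sketch of a check that can be run before submitting is shown here; is_absolute_path is a hypothetical helper and the two example paths are for illustration only.

```shell
# Succeed only if the argument is an absolute path (starts with "/").
# is_absolute_path is a hypothetical helper, not part of this package.
is_absolute_path() {
    case "$1" in
        /*) return 0 ;;
        *)  return 1 ;;
    esac
}

# Classify two example values: one absolute, one relative.
for p in /tmp/data ./coldata; do
    if is_absolute_path "$p"; then
        echo "absolute: $p"
    else
        echo "relative: $p"
    fi
done
```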
run aeroval config file and do not submit jobs to queue (testing):
aeroval_parallelize --dry-qsub <cfg-file>
run aeroval config file on the default queue:
aeroval_parallelize <cfg-file>
set data directories and submit to queue:
aeroval_parallelize --json_basedir /tmp/data --coldata_basedir /tmp/coldata --io_aux_file /tmp/gridded_io_aux.py <cfg-file>
assemble aeroval data after a parallel run has been finished:
aeroval_parallelize -c -o <output directory> <input directories>
aeroval_parallelize -c -o ${HOME}/tmp ${HOME}/tmpt39n2gp_*
adjust all variable and model orders to the one given in an aeroval config file:
aeroval_parallelize --adjustall <aeroval-cfg-file> <path to menu.json>
aeroval_parallelize --adjustall /tmp/config/cfg_cams2-82_IFS_beta.py /tmp/data/testmerge_all/IFS-beta/menu.json
Part of this project is also a simple cache file generator for pyaerocom. After a standard installation of this package,
it is available as the command pyaerocom_cachegen
help screen
usage: pyaerocom_cachegen [-h] [--vars VARS [VARS ...]] [-o OBSNETWORKS [OBSNETWORKS ...]] [-v] [--tempdir TEMPDIR]
[-m MODULE] [-p] [--queue QUEUE] [--queue-user QUEUE_USER] [--qsub] [--qsub-id QSUB_ID]
[--qsub-dir QSUB_DIR] [--dry-qsub] [-s SUBMISSION_DIR] [-r RAM]
command line interface to pyaerocom cache file generator pyaerocom_cachegen.
options:
-h, --help show this help message and exit
--vars VARS [VARS ...]
variable name(s) to cache
-o OBSNETWORKS [OBSNETWORKS ...], --obsnetworks OBSNETWORKS [OBSNETWORKS ...]
obs networks(s) names to cache
-v, --verbose switch on verbosity
--tempdir TEMPDIR directory for temporary files; defaults to /tmp
-m MODULE, --module MODULE
environment module to use; defaults to /modules/MET/rhel8/user-modules/fou-
kl/aerotools/aerotools.conda
-p, --printobsnetworks
just print the names of the supported obs network
queue options:
options for running on PPI
--queue QUEUE queue name to submit the jobs to; defaults to research-r8.q
--queue-user QUEUE_USER
queue user; defaults to jang
--qsub submit to queue using the qsub command
--qsub-id QSUB_ID id under which the qsub commands will be run. Needed only for automation.
--qsub-dir QSUB_DIR directory under which the qsub scripts will be stored. defaults to
/lustre/storeB/users/jang/submission_scripts, needs to be on fs mounted by all queue hosts.
--dry-qsub create all files for qsub, but do not submit to queue
-s SUBMISSION_DIR, --submission-dir SUBMISSION_DIR
directory submission scripts
-r RAM, --ram RAM RAM usage [GB] for queue
__Example usages__:
start cache generation serially
pyaerocom_cachegen --vars concpm10 concpm25 -o EEAAQeRep.v2
dry run cache generation for queue job
pyaerocom_cachegen --dry-qsub --vars ang4487aer od550aer -o AeronetSunV3Lev2.daily
use special module at queue run (__full module path needed!__)
pyaerocom_cachegen -m /modules/MET/rhel8/user-modules/fou-kl/aerotools/pya-v2024.03.conda --vars ang4487aer od550aer -o AeronetSunV3Lev2.daily
start cache generation in parallel on the PPI queue
pyaerocom_cachegen --qsub --vars ang4487aer od550aer -o AeronetSunV3Lev2.daily
pyaerocom_cachegen --qsub --vars concpm10 concpm25 vmro3 concno2 -o EEAAQeRep.NRT
start cache creation on default queue
pyaerocom_cachegen --qsub --vars concpm10 concpm25 vmro3 concno2 -o EEAAQeRep.NRT
start cache creation serially
pyaerocom_cachegen --vars concpm10 concpm25 -o EEAAQeRep.v2
create all files necessary for queue submission, but don't submit to queue (testing)
pyaerocom_cachegen --dry-qsub --vars ang4487aer od550aer -o AeronetSunV3Lev2.daily
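When the same variables are needed for several obs networks, the dry-run commands can be generated in a loop and inspected before anything is submitted. The sketch below only prints the commands; print_cachegen_dry_runs is a hypothetical helper, and the variable and network names are the ones used in the examples above. Since -o also accepts several networks in one call, emitting one command per network is merely a design choice that keeps each job easy to inspect.

```shell
# Print one pyaerocom_cachegen dry-run command per obs network.
# Arguments: "<variables separated by spaces>" <obs network>...
# print_cachegen_dry_runs is a hypothetical helper for illustration only.
print_cachegen_dry_runs() {
    vars=$1; shift
    for network in "$@"; do
        echo "pyaerocom_cachegen --dry-qsub --vars $vars -o $network"
    done
}

print_cachegen_dry_runs "concpm10 concpm25" EEAAQeRep.v2 EEAAQeRep.NRT
```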