code norms
-
Adopt the NCO standards, the BASH shell style guide and the Python PEP 8 code style guide.
The following norms translate the above standards into practical, actionable items in rrfs-workflow.
-
Think big. The workflow will include complicated DA logic, different domains/resolutions (such as CONUS vs NA, 3km vs 12km), spinup and prod cycles, ensemble components, chemistry components, RTMA applications, etc.
-
The core of the workflow will only focus on the NCO naming convention for all existing operational products (such as gfs grib2 files, etc). Users may develop scripts that use hard or soft links to convert their own naming conventions to match the NCO standard.
-
Reduce the Python library dependencies as much as possible for the workflow so as to reduce long-term maintenance needs.
-
It is recommended to use only BASH or Python for scripting.
Every script file (bash or python) should be set as executable and be runnable directly from the command line as ./myscript.sh or ./myscript.py, so add one of the following shebangs as the first line of the script file:
#!/usr/bin/env bash
(the above is preferred, but #!/bin/bash is also acceptable)
or
#!/usr/bin/env python
-
One can source other bash files inside the ex-scripts
-
Put the following three lines in any runtime scripts at the beginning:
declare -rx PS4='+ $(basename ${BASH_SOURCE[0]:-${FUNCNAME[0]:-"Unknown"}})[${LINENO}]: '
set -x
date
This will show information about which line of which script file generates the output, as follows:
+ JRRFS_IC[31]: export pgmout=/lfs5/BMC/wrfruc/gge/nco/stmp/conus12km/1.0.1/rrfs.20240527/00/ic/OUTPUT.ic
+ JRRFS_IC[31]: pgmout=/lfs5/BMC/wrfruc/gge/nco/stmp/conus12km/1.0.1/rrfs.20240527/00/ic/OUTPUT.ic
+ JRRFS_IC[32]: /mnt/lfs5/BMC/wrfruc/gge/rrfsx/scripts/exrrfs_ic.sh
+ exrrfs_ic.sh[4]: cpreq='ln -snf'
+ exrrfs_ic.sh[5]: prefix=GFS
+ exrrfs_ic.sh[6]: cd /lfs5/BMC/wrfruc/gge/nco/stmp/conus12km/1.0.1/rrfs.20240527/00/ic
-
Use source instead of a dot for better readability. (There is no difference between source and dot in BASH.)
-
The ending of a J-job should be like this:
#
#----------------------------------------
# Execute the script.
#----------------------------------------
export pgmout="${DATA}/OUTPUT.${task_id}"
$SCRIPTSrrfs/exrrfs_da.sh
export err=$?; err_chk
if [ -e "$pgmout" ]; then
  cat $pgmout
fi
#
#----------------------------------------
# Remove the Temporary working directory
#----------------------------------------
cd ${DATAROOT}
[[ "${KEEPDATA^^}" == "NO" ]] && rm -rf ${DATA}
#
date
echo "JOB ${jobid:-} HAS COMPLETED NORMALLY!"
exit 0
-
It is preferred to use [[ instead of [. [[ is bash's extension to the [ command and has several enhancements that make it a better choice for scripts that target bash.
-
Use == instead of = for string comparison. Double-quote the strings to be compared. For example:
if [[ "${begin^^}" == "YES" ]]; then
-
Enforce base-10 arithmetic to avoid unexpected errors when dealing with zero-padded numbers such as "03", "003", "08", etc. (bash otherwise interprets a leading zero as octal):
if (( 10#${ENS_SIZE:-0} > 0 )); then
-
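The base-10 rule can be illustrated with a minimal sketch. The values of FHR and offset below are hypothetical and only stand in for zero-padded strings that a job card might pass in.

```bash
#!/usr/bin/env bash
# Hypothetical zero-padded inputs (illustration only).
FHR="08"
offset="03"

# Without the 10# prefix, bash reads a leading zero as octal, and "08" is not
# a valid octal number, so $(( FHR + 1 )) fails with "value too great for base".
# With 10#, both values are read as decimal.
FHRin=$(( 10#${FHR} + 10#${offset} ))
echo "FHRin=${FHRin}"

# printf restores the zero padding when a fixed-width string is needed.
FHRin_padded=$(printf "%03d" "${FHRin}")
echo "FHRin_padded=${FHRin_padded}"
```

The arithmetic result is a plain decimal integer, so any padding required for file names has to be added back explicitly.
-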
Use ${cpreq} to copy files/directories that are required for a job to function.
-
All scripts under workflow/sideload will NOT go into NCO operations. They are used to make tweaks, mimic ecflow job cards, and provide flexibility for community users.
-
The following environment variables should always be available to every task per the NCO standard:
HOMErrfs, EXPDIR, CDATE, PDY, cyc, COMROOT, DATAROOT, VERSION, MACHINE, NET, RUN, TAG
Examples for CDATE, PDY, cyc (NOTE: cyc is an exception and is all lower case):
CDATE=2024052703
PDY=20240527
cyc=03
-
All tasks have input and output data streams, from/to either com or umbrella.
-
The working directory should be defined by the ${DATA} variable. In NCO, a working directory is removed immediately after a job completes successfully (it is kept automatically if the job fails). Users can set KEEPDATA=YES to keep working directories for debugging purposes.
In rrfs-workflow, users can further choose to keep data for individual tasks by setting, for example, KEEPDATA_jedivar=YES.
-
In scripts and config files, apart from a few exceptions due to NCO practices (such as cyc), a variable whose name starts with an upper-case letter is assumed to be exported to sub-shells (such as LBC_OFFSET_HRS, COMINgfs, etc.), while an all-lower-case name means a temporary variable that is visible only to the script defining it and will not go into sub-shells.
-
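As a small sketch of the naming rule above, the snippet below launches a child bash process (standing in for an ex-script started from a J-job) and checks which variables it can see. The name lbc_tmp and both values are hypothetical.

```bash
#!/usr/bin/env bash
export LBC_OFFSET_HRS=6   # upper case: exported, visible to child scripts
lbc_tmp=6                 # lower case (hypothetical name): not exported

# The child process inherits only the exported variable.
seen_by_child=$(bash -c 'echo "${LBC_OFFSET_HRS:-unset} ${lbc_tmp:-unset}"')
echo "child sees: ${seen_by_child}"
```

The child reports the exported value and "unset" for the lower-case variable, which is the behavior the naming convention relies on.
-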
Use -s instead of -f to check that a file exists and is not size zero.
-
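The difference between -f and -s can be seen with a zero-size file. This is a minimal sketch using a temporary directory; the file name is made up for illustration.

```bash
#!/usr/bin/env bash
workdir=$(mktemp -d)
empty_file="${workdir}/empty.grib2"   # hypothetical file name
touch "${empty_file}"                 # exists, but has zero size

# -f only checks that a regular file exists; -s also requires size > 0.
empty_passes_f="no"
[[ -f "${empty_file}" ]] && empty_passes_f="yes"
empty_passes_s="no"
[[ -s "${empty_file}" ]] && empty_passes_s="yes"

echo "INFO: -f says ${empty_passes_f}, -s says ${empty_passes_s}"
rm -rf "${workdir}"
```

A truncated or failed download typically leaves exactly this kind of empty file, which -f would accept and -s rejects.
-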
Catch and handle the return code in all cases (running executables, wgrib2, python, ush scripts, utilities, etc.).
-
Correctly label output information as INFO, WARNING or FATAL ERROR.
-
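One possible way to combine return-code handling with labeled messages is sketched below. The helper check_rc is hypothetical and is not part of rrfs-workflow or the NCO utilities (J-jobs use err_chk for this purpose); true and false stand in for real executables.

```bash
#!/usr/bin/env bash
# Hypothetical helper: label the outcome of a step based on its return code.
check_rc() {
  local rc=$1 step=$2
  if (( rc != 0 )); then
    echo "FATAL ERROR: ${step} failed with return code ${rc}"
    return "${rc}"
  fi
  echo "INFO: ${step} completed successfully"
}

true                                   # stand-in for a step that succeeds
msg_ok=$(check_rc $? "step_ok")
echo "${msg_ok}"

false                                  # stand-in for a step that fails
msg_bad=$(check_rc $? "step_bad")
rc_bad=$?                              # return code propagated by check_rc
echo "${msg_bad}"
```

In a real runtime script the failing branch would normally stop the job rather than continue.
-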
Use ${NDATE} to find previous or future cycles/dates; the date command is only used to output a formatted string.
-
A workflow calls a J-job, a J-job calls an ex-script, and an ex-script calls scripts/executables under ush/exec respectively.
-
Use nouns for task names, and avoid verbs as much as possible.
-
rrfs-workflow adopts a config cascade and a resource environment variable cascade so that one can optionally fine-tune settings for individual tasks.
For example, to get the walltime setting for a spinup forecast job, the workflow checks the following variables in this order:
WALLTIME_FCST_SPINUP, WALLTIME_FCST, WALLTIME
until a variable is defined. Check this link for more information. -
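The cascade order described above can be expressed in bash with nested default expansions. This is only a sketch of the lookup order with hypothetical values; the actual lookup in rrfs-workflow may be implemented differently (for example, in the Python workflow generator).

```bash
#!/usr/bin/env bash
# Hypothetical settings: the most specific variable is left undefined.
unset WALLTIME_FCST_SPINUP
WALLTIME_FCST="01:30:00"
WALLTIME="00:30:00"

# Most specific first; fall through until a defined variable is found.
walltime="${WALLTIME_FCST_SPINUP:-${WALLTIME_FCST:-${WALLTIME:-}}}"
echo "INFO: walltime=${walltime}"
```

Here the spinup-specific variable is undefined, so the forecast-level value is picked up before the generic one.
-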
rrfs-workflow uses the powerful but, at the same time, intuitive Python language to generate the rocoto workflow (and potentially ecflow or cylc workflows in the future) directly. This enables us to handle complicated DA/workflow logic and still keep things simple and manageable.
-
Use 4 spaces for indentation in Python and 2 spaces in BASH scripts. Avoid using TABs.
-
In bash, double quotes and single quotes function differently while in Python there are no differences.
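A short illustration of the bash side of this difference, using the cyc example value from earlier on this page:

```bash
#!/usr/bin/env bash
cyc="03"

double_quoted="cycle ${cyc}"   # double quotes: ${cyc} is expanded
single_quoted='cycle ${cyc}'   # single quotes: the text is kept literally

echo "${double_quoted}"
echo "${single_quoted}"
```

The first echo prints the expanded value, while the second prints the characters ${cyc} unchanged.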
-
In config files, don't forget to add "export" for any variables that will be exported to sub-shells.
-
To be safe, put a space after (( and before )) in BASH:
FHRin=$((10#${FHR}+10#${offset}) # This is wrong (a closing parenthesis is missing), but may not be easy to debug
FHRin=$(( 10#${FHR} + 10#${offset} )) # Adding spaces helps reduce such bugs
-
Spaces are needed before and after ==:
if [[ "${TYPE}"=="IC" ]] || [[ "${TYPE}"=="ic" ]]; then # This is wrong: without spaces the test is a single non-empty string and is always true
if [[ "${TYPE}" == "IC" ]] || [[ "${TYPE}" == "ic" ]]; then # This is correct
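The effect of the missing spaces can be checked directly. In this sketch TYPE is set to a hypothetical value that matches neither "IC" nor "ic", so a correct test should be false.

```bash
#!/usr/bin/env bash
TYPE="lbc"   # hypothetical value that should NOT match "IC"

# Without spaces, bash sees one non-empty word ("lbc==IC"), which is always true.
no_space_result="false"
if [[ "${TYPE}"=="IC" ]]; then
  no_space_result="true"
fi

# With spaces, == is a real comparison operator.
with_space_result="false"
if [[ "${TYPE}" == "IC" ]]; then
  with_space_result="true"
fi

echo "INFO: no spaces -> ${no_space_result}, with spaces -> ${with_space_result}"
```

The version without spaces silently takes the wrong branch and produces no error message, which is why this mistake is hard to spot.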
-
export WALLTIME_UPP=${WALLTIME_UPP:"00:50:00"} # This is wrong. Be sure to use :- instead of : only (a lone : starts a substring expansion, not a default value).
-
Use ${var} to reference a variable instead of $var. This is more robust and avoids situations where $var may cause trouble (for example, $var_old is read as the variable var_old, while ${var}_old is not).
-
Python's f-string uses {var} to reference a variable while BASH uses ${var}.
-
In Bash, use the double parentheses (( )) instead of [[ ]] for arithmetic operations and comparisons. The former is more intuitive and less error-prone. For example, use if (( ${num1} > ${num2} )); then and avoid if [[ "${num1}" -gt "${num2}" ]]; then.
More examples of (( )):
if (( ${num1} == ${num3} && ${num1} < ${num2} )); then
if (( ${num1} != ${num2} )); then
if (( ${num1} < ${num2} )); then
if (( ${num1} <= ${num2} )); then
if (( ${num1} >= ${num2} )); then
-
Test whether a variable is empty:
if [[ -z ${cycles} ]]; then # this is not right
if [[ -z "${cycles}" ]]; then # this is correct
-
Any file/directory preprocessing work or postprocessing work should be done in an ex-script or ush-script, not in a Python script. Python scripts should NOT be used to replace anything that can be easily done in BASH, such as linking files, copying files, creating directories, etc. Python scripts are only used to fulfill a function which is almost impossible in BASH. Python scripts accept an input (a variable or an input namelist) from BASH scripts and generate outputs (return texts or an output file) for BASH scripts to continue.
-
TRUE/FALSE, true/false, True/False and YES/NO:
(1). Use true/false in the exp setup file and all config files, following the BASH rule.
(2). Use True/False for Python logical variables, following the Python rule.
(3). Use TRUE/FALSE when passing environment variables from job cards (or rrfs.xml) to runtime scripts.
(4). YES/NO only applies to the KEEPDATA variable, which is an NCO rule.
(5). In bash scripts, use ${var^^} to convert inputs to upper case and then compare against TRUE/FALSE or YES/NO.
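Rule (5) can be sketched as follows. DO_SPINUP is a hypothetical variable name used only for illustration; KEEPDATA is the NCO variable mentioned above. The ${var^^} expansion requires bash 4 or later.

```bash
#!/usr/bin/env bash
# Hypothetical inputs in mixed case, as a user might type them.
DO_SPINUP="True"
KEEPDATA="yes"

# Convert to upper case before comparing, so true/True/TRUE all match.
if [[ "${DO_SPINUP^^}" == "TRUE" ]]; then
  spinup_state="on"
else
  spinup_state="off"
fi

# KEEPDATA is the one variable compared against YES/NO.
if [[ "${KEEPDATA^^}" == "NO" ]]; then
  keep_state="remove"
else
  keep_state="keep"
fi

echo "INFO: spinup=${spinup_state}, working directory=${keep_state}"
```

Normalizing at the comparison site keeps the scripts tolerant of whichever spelling the config file, Python code, or job card uses.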