Coding Style

Gratien D'haese edited this page Jun 21, 2018 · 89 revisions

Why

Relax-and-Recover is written in Bash (at least bash version 3 is needed), a language that can be used in various styles. We like to make it easier for everybody to understand the Relax-and-Recover code and subsequently to contribute fixes and enhancements.

Here is a collection of coding hints that should help to get a more consistent code base.

Don't be afraid to contribute to Relax-and-Recover even if your contribution does not fully match all this coding hints. Currently large parts of the Relax-and-Recover code are not yet in compliance with this coding hints. This is an ongoing step by step process. Nevertheless try to understand the idea behind this coding hints so that you know how to break them properly (i.e. "learn the rules so you know how to break them properly").

The overall idea behind this coding hints is:

Make yourself understood

Make yourself understood to enable others to fix and enhance your code properly as needed.

From this overall idea the following coding hints are derived.

For the fun of it an extreme example what coding style should be avoided:

#!/bin/bash
for i in `seq 1 2 $((2*$1-1))`;do echo $((j+=i));done

Try to find out what that code is about - it does a useful thing.

Code must be easy to read

  • Variables and functions must have names that explain what they do, even if it makes them longer. Avoid too short names, in particular do not use one-letter-names (like a variable named i - just try to 'grep' for it over the whole code to find code that is related to i). In general names should consist of two parts, a generic part plus a specific part to make them meaningful. For example dev is basically meaningless because there are so many different kind of device-like thingies. Use names like boot_dev or even better boot_partition versus bootloader_install_device to make it unambiguous what each thingy actually is about. Use different names for different things so that others can 'grep' over the whole code and get a correct overview what actually belongs to a particular name.
  • Introduce intermediate variables with meaningful names to tell what is going on.
    For example instead of running commands with obfuscated arguments like
    rm -f $( ls ... | sed ... | grep ... | awk ... )
    which looks scaring (what the heck gets deleted here?) better use
    foo_dirs="..."
    foo_files=$( ls $foo_dirs | sed ... | grep ... )
    obsolete_foo_files=$( echo $foo_files | awk ... )
    rm -f $obsolete_foo_files 
    that tells the intent behind (regardless whether or not that code is the best way to do it - but now others can easily improve it).
  • Use functions to structure longer programs into code blocks that can be understood independently.
  • Avoid || and && one-liners. Prefer proper if-then-else-fi blocks.
    Exceptions are simple do-or-die statements like
    COMMAND || Error "meaningful error message"
    and only if it aids readability compared to a full if-then-else-fi clause.
  • Use $( COMMAND ) instead of backticks `COMMAND`
  • Use spaces when possible to aid readability like
    output=( $( COMMAND1 OPTION1 | COMMAND2 OPTION2 ) )
    instead of output=($(COMMAND1 OPTION1|COMMAND2 OPTION2))
    In particular avoid bash arithmetic evaluation and expansion
    without spaces as in result=$(((foo-bar)*baz))
    but prefer readability over compression when possible
    result=$(( ( foo - bar ) * baz ))

Code should be easy to understand

Do not only tell what the code does (i.e. the implementation details) but also explain what the intent behind is (i.e. why) to make the code maintainable.

  • Provide meaningful comments that tell what the computer should do and also explain why it should do it so that others understand the intent behind so that they can properly fix issues or adapt and enhance it as needed.
  • If there is a GitHub issue or another URL available for a particular piece of code provide a comment with the GitHub issue or any other URL that tells about the reasoning behind current implementation details.

Here the initial example so that one can understand what it is about:

#!/bin/bash
# output the first N square numbers
# by summing up the first N odd numbers 1 3 ... 2*N-1
# where each nth partial sum is the nth square number
# see https://en.wikipedia.org/wiki/Square_number#Properties
# this way it is a little bit faster for big N compared to
# calculating each square number on its own via multiplication
N=$1
if ! [[ $N =~ ^[0-9]+$ ]] ; then
    echo "Input must be non-negative integer." 1>&2
    exit 1
fi
square_number=0
for odd_number in $( seq 1 2 $(( 2 * N - 1 )) ) ; do
    (( square_number += odd_number )) && echo $square_number
done

Now the intent behind is clear and now others can easily decide if that code is really the best way to do it and easily improve it if needed.

Try to care about possible errors

By default bash proceeds with the next command when something failed. Do not let your code blindly proceed in case of errors because that could make it hard to find the root cause of a failure when it errors out somewhere later at an unrelated place with a weird error message which could lead to false fixes that cure only a particular symptom but not the root cause.

  • In case of errors better abort than to blindly proceed.
  • At least test mandatory conditions before proceeding. If a mandatory condition is not fulfilled abort with Error "meaningful error message", see 'Relax-and-Recover functions' below.
  • Preferably in new scripts use set -ue to die from unset variables and unhandled errors and use set -o pipefail to better notice failures in a pipeline. When leaving the script restore the Relax-and-Recover default bash flags and options with
    apply_bash_flags_and_options_commands "$DEFAULT_BASH_FLAGS_AND_OPTIONS_COMMANDS"
    
    see [usr/sbin/rear] (https://github.com/rear/rear/blob/master/usr/sbin/rear).
  • TODO Use set -eu and set -o pipefail also in existing scripts, see [make rear working with ''set -ue -o pipefail"] (https://github.com/rear/rear/issues/700).

Maintain backward compatibility

Implement adaptions and enhancements in a backward compatible way so that your changes do not cause regressions for others.

  • One same Relax-and-Recover code must work on various different systems. On older systems as well as on newest systems and on various different Linux distributions.
  • Preferably use simple generic functionality that works on any Linux system. Better very simple code than oversophisticated (possibly fragile) constructs. In particular avoid special bash version 4 features (Relax-and-Recover code should also work with bash version 3).
  • When there are incompatible differences on different systems distinction of cases with separated code is needed because it is more important that the Relax-and-Recover code works everywhere than having generic code that sometimes fails.

Dirty hacks welcome

When there are special issues on particular systems it is more important that the Relax-and-Recover code works than having nice looking clean code that sometimes fails. In such special cases any dirty hacks that intend to make it work everywhere are welcome. But for dirty hacks the above listed coding hints become mandatory rules:

  • Provide explanatory comments that tell what a dirty hack does together with a GitHub issue or any other URL that tell about the reasoning behind the dirty hack to enable others to properly adapt or clean up a dirty hack at any time later when the reason for it had changed or gone away.
  • Try as good as you can to foresee possible errors or failures of a dirty hack and error out with meaningful error messages if things go wrong to enable others to understand the reason behind a failure.
  • Implement the dirty hack in a way so that it does not cause regressions for others.

For example a dirty hack like the following is perfectly acceptable:

# FIXME: Dirty hack to make it work
# on "FUBAR Linux version 666"
# where COMMAND sometimes inexplicably fails
# but always works after at most 3 attempts
# see https://github.com/rear/rear/issues/1234567
# Retries have no bad effect on systems
# where the first run of COMMAND works.
COMMAND || COMMAND || COMMAND || Error "COMMAND failed."

Another perfectly acceptable dirty hack is to try several commands with same intent like:

# FIXME: Dirty hack to make it work
# both on older and newer systems where
# COMMAND1 is provided only on newer systems
# so that COMMAND2 is called as fallback for
# older systems that do not provide COMMAND1
# see https://github.com/rear/rear/issues/1234568
# COMMAND2 has no bad effect on newer systems
# that provide only COMMAND1 but not COMMAND2.
COMMAND1 || COMMAND2 || Error "Neither COMMAND1 nor COMMAND2 worked."

Character encoding

Use only traditional (7-bit) ASCII charactes. In particular do not use UTF-8 encoded multi-byte characters.

  • Non-ASCII characters in scripts may cause arbitrary unexpected failures on systems that do not support other locales than POSIX/C. During "rear recover" only the POSIX/C locale works (the ReaR rescue/recovery system has no support for non-ASCII locales) and /usr/sbin/rear sets the C locale so that non-ASCII characters are invalid in scripts. Have in mind that basically all files in ReaR are scripts. E.g. also /usr/share/rear/conf/default.conf and /etc/rear/local.conf are sourced (and executed) as scripts.
  • English documentation texts do not need non-ASCII characters. Using non-ASCII characters in documentation texts makes it needlessly hard to display the documentation correctly for any user on any system. When non-ASCII characters are used but the user does not have the exact right matching locale set on his system arbitrary nonsense can happen, cf. https://en.opensuse.org/SDB:Plain_Text_versus_Locale
  • The plain UTF-8 character encoding is compatible with ASCII but setting LANG to en_US.UTF-8 is not ASCII compatible, see this mail https://lists.opensuse.org/opensuse-packaging/2017-11/msg00006.html that reads (excerpt):
Setting LANG to en_US.UTF-8 is a horrible idea for scripts
...
collating order and ctypes get in the way
as it's not ASCII compatible

Text layout

  • Indentation with 4 blanks, no tabs.
  • Block level statements in same line.

Example:

while CONDITION1 ; do
    COMMAND1
    if CONDITION2 ; then
        COMMAND2
    else
        COMMAND3
    fi
    COMMAND4
done

Variables

  • Curly braces only where really needed:
    $FOO instead of ${FOO}, but ${FOO:-default_foo}.
  • All variables that are used in more than a single script must be all-caps:
    $FOO instead of $foo or $Foo.
  • Variables that are used only locally should be lowercased and should be marked with local like:
    local foo="default_value"

Functions

  • Use the function keyword to define a function.
  • Function names should be lower case, words separated by underline (_).

Relax-and-Recover functions

Use the available Relax-and-Recover functions when possible instead of re-implementing basic functionality again and again. The Relax-and-Recover functions are implemented in various lib/*-functions.sh files.

  • is_true and is_false:
    See lib/global-functions.sh how to use them.
    For example instead of using
    if [[ ! "$FOO" =~ ^[yY1] ]] ; then
    use
    if ! is_true "$FOO" ; then
    Note that ! is_true is not the same as is_false. Read the documentation how to use those functions.
  • mathlib_calculate:
    For mathematical calculations use mathlib_calculate() unless strict integer calculation is required, see lib/global-functions.sh how to use it.
    For example instead of using bash arithmetic
    result=$(( 2**32 * 2**32 ))
    that fails because numbers get too big for bash use
    result=$( mathlib_calculate "2^32 * 2^32" )

test, [, [[, ((

  • Use [[ where it is required (e.g. for pattern matching or complex conditionals) and [ or test everywhere else.
  • (( is the preferred way for numeric comparison, variables don't need to be prefixed with $ there.

Paired parentheses

Use paired parentheses for case patterns so that editor commands (like '%' in 'vi') that check for matching opening and closing parentheses work everywhere in the code.

Example:

case WORD in
    (PATTERN1)
        COMMAND1
        ;;
    (PATTERN2)
        COMMAND2
        ;;
    (*)
        COMMAND3
        ;;
esac

Return early, return often

All scripts in usr/share/rear/ are sourced via the Source() function so that you can return early if the actual work of a script is not needed and you may also return as soon as all needed work is done.

Example:

this_script_name="$( basename ${BASH_SOURCE[0]} )"
# Check if this script needs to be run:
if ! CONDITION1 ; then
    Log "Skipping '$this_script_name' - not needed because ..."
    return
fi
# Do the actual work:
if CONDITION2 ; then
    # Do all what needs to be done in case of CONDITION2:
    ...
    return
fi
# Do all what needs to be done in the default case:
...

Usually this way it is easier to understand what the script actually does compared to nested conditions like

this_script_name="$( basename ${BASH_SOURCE[0]} )"
# Check if this script needs to be run:
if CONDITION1 ; then
    # Do the actual work:
    if CONDITION2 ; then
        # Do all what needs to be done in case of CONDITION2:
        ...
    else
        # Do all what needs to be done in the default case:
        ...
    fi
else
    Log "Skipping '$this_script_name' - not needed because ..."
fi

What to do with stdin, stdout, and stderr

While usr/sbin/rear is running both stdout and stderr are redirected into ReaR's log file, cf. https://github.com/rear/rear/issues/885

The original stdin, stdout, and stderr file descriptors when usr/sbin/rear was launched are saved as fd6, fd7, and fd8 respectively, cf. https://github.com/rear/rear/pull/1391#issuecomment-311040948

To let messages appear on the user's terminal and to get user input from the user's keyboard wherefrom usr/sbin/rear was launched either use fd6 (for stdin) and fd7 or fd8 (for stdout or stderr) directly or better use ReaR's intended functions for user input and user output in usr/share/rear/lib/_input-output-functions.sh like LogPrint, LogPrintError, LogUserOutput, and UserInput.

The output on the user's terminal is neither meant for logging nor for debugging. Unless usr/sbin/rear was launched with '-v' or '-d' or '-D' there should be no output on the user's terminal. When usr/sbin/rear was launched with '-v' or '-d' or '-D' the output on the user's terminal is only meant as generic high-level information what is going on while usr/sbin/rear is running. To display such information use LogPrint "meaningful message" which also logs the message in ReaR's log file (to error out use Error "meaningful error message").

All what could be useful for later debugging in case of issues must appear in ReaR's log file. In particular any verbose command stdout output must not appear on the user's terminal but redirected into ReaR's log file.

Careful with tools that may (sometimes) prompt for user input!

Some tools prompt for user input like an explicit user confirmation response ('yes' or 'y') under this or that special circumstances. When stdout output is redirected into ReaR's log file those tool's prompt becomes invisible for the user and the tool waits endlessly for user input. Usually one can specify a 'force' option so that the tool does not prompt for user input. Otherwise one must call the tool with the original stdin, stdout, and stderr file descriptors when usr/sbin/rear was launched fd6, fd7, and fd8 as needed like

COMMAND ... 0<&6 1>&7 2>&8

It should be possible to run ReaR unattended

For user dialogs that are implemented in ReaR via the UserInput function the user can predefine automated input values, see the USER_INPUT variables description in usr/share/rear/conf/default.conf

For tools that prompt for an explicit user confirmation response ('yes' or 'y') under this or that special circumstances it should be possible to run ReaR with redirected stdin for example to the output of /usr/bin/yes like

exec 0< <( yes )

or even with input speed throttling and 'y'/'yes' toggling like

exec 0< <( while true ; do echo 'y' ; sleep 1 ; echo 'yes' ; sleep 1 ; done )

Accordingly tools that must run interactively (like an interactive backup restore program that cannot run automated) must be called with the original stdin, stdout, and stderr file descriptors when usr/sbin/rear was launched

COMMAND ... 0<&6 1>&7 2>&8

This way no COMMAND output is in ReaR's log file but it is more important that things work well for the user than having everything in the log and it is still possible to have have something meaningful in the log like

if COMMAND ... 0<&6 1>&7 2>&8 ; then
    Log "COMMAND finished with zero exit code"
else
    Log "COMMAND finished with non-zero exit code $?"
fi

See also

Relax-and-Recover Development

How to contribute to Relax-and-Recover

Google Shell Style Guide

Filenames and Pathnames in Shell: How to do it Correctly

Bash scripting quirks & safety tips

pure bash bible