
Shellmath and runtime efficiency


Now for a few words about runtime efficiency, an important secondary goal of shellmath.

In his Advanced Bash-Scripting Guide, Mendel Cooper lays down a series of bash scripting optimization guidelines. In the interest of achieving top performance, we've tried to follow these principles in shellmath.

Avoiding call-outs to external programs takes some unconventional maneuvers. The key idea behind shellmath is that we can eliminate call-outs to calculator programs if we trouble ourselves to string-parse numeric values down to their integer and fractional parts, apply the "double-parentheses" arithmetic operators to those parts, and recombine the results, all with the help of some string manipulation to touch up the rough edges. So it comes down to string manipulation, and we need to do it without cut, sed, grep, and the like. Our list of ingredients starts with the printf builtin, the += concatenation operator, and vanilla in-place string construction. Add the regex-match operator =~ and its unsung companion, the shell variable BASH_REMATCH. Finally, toss in the quick, on-the-fly string operators (${s/x/y}, ${s##x}, and the like), and it turns out we can do all we need to do. Check the source code to see how I managed with these tools, stretching their limits in some places where necessary.
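
Here is a toy sketch of that recipe using only the ingredients named above. The function name, the fixed four-digit scale, and the non-negative operands are simplifications for illustration; shellmath's real code is far more general.

    # A toy adder built from builtins alone. Hypothetical names;
    # shellmath's actual API and parsing are more general than this.
    _toy_add ()
    {
        local intSum fracSum

        # Parse each operand into integer and fractional parts with =~
        # and BASH_REMATCH -- no cut, sed, or grep
        [[ $1 =~ ^([0-9]+)\.([0-9]{4})$ ]] || return 1
        local int1=${BASH_REMATCH[1]} frac1=${BASH_REMATCH[2]}
        [[ $2 =~ ^([0-9]+)\.([0-9]{4})$ ]] || return 1
        local int2=${BASH_REMATCH[1]} frac2=${BASH_REMATCH[2]}

        # Apply double-parentheses arithmetic to the parts; the 10# prefix
        # keeps leading zeros from being read as octal
        ((fracSum = 10#$frac1 + 10#$frac2))
        ((intSum = 10#$int1 + 10#$int2 + fracSum/10000))
        ((fracSum %= 10000))

        # Recombine with printf, zero-padding the fractional part
        printf "%d.%04d\n" "$intSum" "$fracSum"
    }

    _toy_add 1.0500 2.9700    # prints 4.0200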

(Incidentally, an older project of mine, fasder, employs these and other techniques to eliminate subprograms in order to speed up the popular productivity tool fasd on the Cygwin-over-Windows platform.)

Another key optimization in shellmath is its use of global storage to capture function results as side effects, so that callers do not need to subshell those functions with $(...) just to read their output. A few small wrapper functions provide a safe, semi-encapsulated means to achieve this. But because they fulfill such a vital function, these wrappers introduce new common execution paths that must themselves be scrutinized for optimality.
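
In outline, the trick looks something like the following. This is a hedged sketch with made-up names; see shellmath's source for its actual wrappers.

    # Sketch of returning a value through global storage. The names
    # __demo_storage, _demo_setReturnValue, etc. are illustrative only.
    declare -a __demo_storage

    _demo_setReturnValue ()
    {
        __demo_storage[0]=$1
    }

    _demo_getReturnValue ()
    {
        # printf -v assigns to the variable named by the caller,
        # with no subshell and no eval
        printf -v "$1" '%s' "${__demo_storage[0]}"
    }

    _demo_square ()
    {
        _demo_setReturnValue $(($1 * $1))
    }

    # The caller's view: no $(...) command substitution, hence no fork
    _demo_square 12
    _demo_getReturnValue answer
    echo "$answer"    # prints 144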

Fortunately, shellmath is small enough that we can feasibly make the common execution paths fast without resorting to a full-fledged profiler. Besides, profilers for bash are in short supply (if any exist at all); a few basic profiling techniques can be found here. In the case of shellmath, an ad hoc analysis with call-stack tracing identified the aforementioned small wrapper functions as the highest-frequency code paths. A final performance squeeze was obtained by minimizing calls to eval in those places, batching the multi-value case into a single eval:

    # Accumulate one big "name1=value1 name2=value2 ..." string, where
    # ${!_i} is the name of the i-th requested variable, then pay for a
    # single eval rather than one eval per returned value
    for ((_i=1; _i<=$#; _i++)); do
        evalString+=${!_i}="${__shellmath_storage[_i]}"" "
    done

    eval "$evalString"

and by rewriting the single-iteration case as a one-liner function.
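
That one-liner might look something like this, continuing the illustrative names from the earlier sketch; the actual shellmath function differs in its details.

    # Single-return fast path: no loop, no string building, one eval.
    # Hypothetical name; storage index 0 assumed for illustration.
    # The single quotes are safe because the stored values are numeric.
    _demo_getReturnValueOne ()
    {
        eval "$1='${__demo_storage[0]}'"
    }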

Timing experiments on faster_e_demo.sh have shown that the add and divide APIs cost about .0015ms on average on my laptop, running MinGW over Windows 10. Several speedups to multiplication were also implemented, cutting its initial execution time by about 60%.
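
If you'd like to reproduce such measurements, a loop like the following is one way to go about it. This is a sketch only: it assumes bash 5+ for $EPOCHREALTIME and an add entry point named _shellmath_add (check the source for the exact name), and its one awk call-out sits outside the timed region so as not to pollute the measurement.

    # Rough timing harness -- a sketch, not part of shellmath
    source shellmath.sh

    n=10000
    t0=$EPOCHREALTIME
    for ((i = 0; i < n; i++)); do
        _shellmath_add 1.25 2.50
    done
    t1=$EPOCHREALTIME

    # The awk call-out happens once, outside the timed loop
    awk -v a="$t0" -v b="$t1" -v n="$n" \
        'BEGIN { printf "%.6f ms per call\n", (b - a) * 1000 / n }'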

As always, the costs of program startup, code "sourcing," and latency in the bash interpreter are beyond our control.

The truly adventurous and the morbidly curious could profile the bash source code. As bash is written in C, there are good tools for that. In the spirit of all good mathematicians, I'll leave that as an exercise for the reader.
