Skip to content

Shellmath and arbitrary precision arithmetic

Michael Wood edited this page Feb 26, 2021 · 4 revisions

TL;DR:

Are you interested in an arbitrary-precision calculator for bash? Shellmath is finite-precision but stretches the limits of FP by rescaling its inputs under certain conditions. This technique is essentially a first big step toward arbitrary precision, and the door is open for further enhancements. If any of this sounds interesting I would be happy to hear from you.

Introduction

Shellmath was conceived as a proof-of-concept for the fact that floating-point arithmetic can in fact be implemented in the bash shell exclusively. It relies on built-in operations and shell commands alone, eschewing external tools of all kinds.

Arbitrary-precision arithmetic (hereinafter "AP") is well-known and widely implemented but stands outside the principal goals of this project. What follows is a conceptual outline of one possible approach to AP in shellmath.

Internal representation of AP numbers

In keeping with common practice, we will store AP numbers as arrays of ordinary integers each representing digits in some large base b that is yet small enough as to be "digestible" by the shell's built-in arithmetic operators. Additionally, we will further decrease b just enough that each digit-cell can store temporary information relating to numerical overflow so as to gain some flexibility in how we ultimately address that issue.

Addition, subtraction and multiplication

Addition and subtraction are no more challenging to implement for AP numbers than for finite-precision decimals. Each boundary between AP digits is like a decimal point.

Fourth-grade multiplication requires us to apply the distributive law to all pairs of ("big") digits in the multiplier and the multiplicand. We obtain a set of partial products, we adjust them to account for place value, and we add to obtain the full product. With addition implemented first, this reduces to calling out to our addition API.

Special challenges of (long) division

Division is the most challenging operation to implement. The familiar fourth-grade long division algorithm computes partial quotients digit-by-digit, working left-to-right. To determine the correct digit at any step, one applies a poor man's predictor-corrector routine: estimate the digit and compute a remainder. If the remainder is not between zero and the divisor, adjust the estimate and recalculate. Move forward to the next digit and repeat. We will follow this model.

In AP arithmetic the aforesaid prediction and correction are complicated by the fact that the partial quotients' numerators and denominators can be too large for us to divide them directly through the built-in division operator. We will not attempt to call our own division routine here because recursion is hard to implement and can be slow. Instead we will estimate the partial quotient with quotients of much smaller numbers.

A numerical example

Suppose we are working in base b = 10^3 = 1000. Consider a quotient Q that begins

                                     _______________________
         Q =        779 080 421 ...  )  123 456 789 332 ... 

Let d1 denote the first digit of Q. d1 will be placed above 332, the fourth digit of the dividend. This digit is the integer floor of the partial quotient

                123 456 789 332
         PQ =   ---------------  ,
                  779 080 421 

whose terms might be too large for the built-in division operator. We will estimate PQ with round numbers easily reduced to smaller terms. Round up and down as appropriate to see that

         123 456 000 000            123 457 000 000
         ---------------  <  PQ  <  ---------------  .
           780 000 000                779 000 000

Reducing,

         123 456            123 457
         -------  <  PQ  <  -------  .
           780                779

Since the terms in this reduction will never be larger than two digits in base b, we can evaluate these bounds with single calls to the built-in division operator as long as b is reasonably small; specifically, b*b < INT_MAX. Doing this we find that

         d1 = floor(PQ) = 158 .

If the upper and lower bounds on PQ are ill-conditioned (e.g. the denominators are 3 and 2 instead of 780 and 779), the interval they represent can be wide. The digit d1 can then be found by binary search, or the technique can be retried by rounding and cancelling on base-10 alignment rather than base b.

Concluding remarks

AP would be the icing on the cake for shellmath. Finite precision is baked into the current design; substantial tearing up of the floorboards and significant greenfield development would be required to make AP work. In addition, many existing test cases would be rendered obsolete and new, more difficult ones would arise and demand our attention. A version-2.0 designation would be appropriate for such an undertaking.

Please let me know if this subject piques your interest! I must decline to undertake a shellmath-2.0 project on my own, but I will consider collaborations and would love to hear about your own explorations in this area.