Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[NUMERICAL] VS Algorithm causes lots of overflow / underflow for high degrees #263

Closed
ravipra opened this issue Jun 17, 2021 · 7 comments · Fixed by #264
Closed

[NUMERICAL] VS Algorithm causes lots of overflow / underflow for high degrees #263

ravipra opened this issue Jun 17, 2021 · 7 comments · Fixed by #264

Comments

@ravipra
Copy link

ravipra commented Jun 17, 2021

Hi,

Thank you for the very useful package.

Been able to use it for the past one year successfully. But getting "invalid value encountered" runtime warnings today.

The errors are:

/home/user/.local/lib/python3.9/site-packages/bezier/hazmat/curve_helpers.py:256
: RuntimeWarning: invalid value encountered in multiply 
  result += binom_val * lambda2_pow * nodes[:, [index]] 
/home/user/.local/lib/python3.9/site-packages/bezier/hazmat/curve_helpers.py:256
: RuntimeWarning: invalid value encountered in add 
  result += binom_val * lambda2_pow * nodes[:, [index]] 
/home/user/.local/lib/python3.9/site-packages/bezier/hazmat/curve_helpers.py:256
: RuntimeWarning: overflow encountered in multiply 
  result += binom_val * lambda2_pow * nodes[:, [index]] 
/home/user/.local/lib/python3.9/site-packages/bezier/hazmat/curve_helpers.py:257
: RuntimeWarning: invalid value encountered in multiply 
  result *= lambda1 

The code I am using is:

import bezier
import numpy as np
import sys

def bezier_values(vals):
  inds = list(range(len(vals)))
  nodes = np.asfortranarray([inds, vals])
  curve = bezier.Curve(nodes, len(vals)-1)

  def f(ind):
    s_val = ind/(len(inds)-1)
    return curve.evaluate(s_val).tolist()[1][0]

  return list(map(f, inds))

lines = []
for line in sys.stdin:
  lines.append(line)

vals = list(map(int, lines))
b_vals = bezier_values(vals)


Attaching the data file (data.txt) that I am using.

Could you please let me know what I am doing wrong?

Thank you.

Ravi
data.txt

@ravipra
Copy link
Author

ravipra commented Jun 17, 2021

I am using arch Linux and installed it as:

BEZIER_NO_EXTENSION=true python  -m pip install --upgrade bezier --no-binary=bezier
Defaulting to user installation because normal site-packages is not writeable
Requirement already satisfied: bezier in ./.local/lib/python3.9/site-packages (2021.2.12)
Requirement already satisfied: numpy>=1.20.1 in ./.local/lib/python3.9/site-packages (from bezier) (1.20.1)

@dhermes
Copy link
Owner

dhermes commented Jul 1, 2021

Thanks for filing! It is making me think about the trade-offs of using the (faster) VS Algorithm vs. using the (standard) de Casteljau algorithm. (A related issue cropped up in #156, fixed in 5768824 by using a float64 / c_double instead of an int32 / c_int. Using twice as many bits for a stack variable is essentially free.)

The primary issue with the implementation (the VS Algorithm) is that it is computing large powers of the input (s^k) and large binomial coefficients (n C k). For a very high degree (here the degree is 1021), this can cause underflow for s^k (when 0 < s < 1) and overflow for n C k.

I'm still digging into the specifics of how / where this fails, but I'll likely implement a fix that switches from the VS Algorithm to the de Casteljau algorithm if the degree exceeds a fixed number (probably 54).


The implementation computes binomial coefficients iteratively via:

binom_val = (binom_val * (degree - index + 1)) / index

However, this approach has 4 types of accuracy issues:

  • The intermediate computation binom_val * (degree - index + 1) cannot be computed exactly (i.e. requires round off)
  • The actual value n C k / binom_val cannot be represented exactly
  • The intermediate computation binom_val * (degree - index + 1) overflows
  • The actual value n C k / binom_val overflows

Using the script below, the following binomial coefficients are the earliest such instances of a problem for a 64-bit floating point value (double) and a 32-bit signed integer (int / int32):

Inexact computation (float64):        55 C 26 = 3560597348629860 != 3560597348629859.5
Inexact representation (float64):     57 C 25 = 9929472283517787 != 9929472283517788.0
Overflow in computation (int32):      30 C 15 = 155117520 != -131213634
Overflow in representation (int32):   34 C 16 = 2203961430 != -2091005866
Overflow in computation (float64):    1021 C 496
Overflow in representation (float64): 1030 C 500

Script to determine these values:

import fractions

import numpy as np


def computed_exactly(n):
    b_np = np.float64(1.0)
    b_fr = fractions.Fraction(1)
    for k in range(1, n):
        b_np = (b_np * (n + 1 - k)) / k
        b_fr = (b_fr * (n + 1 - k)) / k
        if b_fr.denominator != 1:
            raise ValueError("Non-integer", b_fr)
        if b_np != b_fr:
            return n, k, b_np, b_fr.numerator

    return None


def represented_exactly(n):
    b_fr = fractions.Fraction(1)
    for k in range(1, n):
        b_fr = (b_fr * (n + 1 - k)) / k
        if b_fr.denominator != 1:
            raise ValueError("Non-integer", b_fr)
        b_np = np.float64(b_fr.numerator)
        if b_np != b_fr:
            return n, k, b_np, b_fr.numerator

    return None


def computed_overflow_int32(n):
    b_np = np.int32(1)
    b_fr = fractions.Fraction(1)
    for k in range(1, n):
        b_fr = (b_fr * (n + 1 - k)) / k
        if b_fr.denominator != 1:
            raise ValueError("Non-integer", b_fr)

        try:
            b_np = (b_np * np.int32(n + 1 - k)) // np.int32(k)
        except FloatingPointError as exc:
            if exc.args != ("overflow encountered in int_scalars",):
                raise
            machine_limits_int32 = np.iinfo(np.int32)
            work_too_hard = np.int64(b_np) * np.int64(n + 1 - k)
            above = work_too_hard - machine_limits_int32.max - 1
            if not (0 < above < machine_limits_int32.max):
                raise ValueError("Overflow too large")
            overflow_numerator = above + machine_limits_int32.min
            overflow_value = overflow_numerator // np.int64(k)
            return n, k, overflow_value, b_fr.numerator

        if not isinstance(b_np, np.int32):
            raise TypeError("Non-int32", b_np)
        if b_np != b_fr:
            return n, k, b_np, b_fr.numerator

    return None


def represented_overflow_int32(n):
    b_fr = fractions.Fraction(1)
    for k in range(1, n):
        b_fr = (b_fr * (n + 1 - k)) / k
        if b_fr.denominator != 1:
            raise ValueError("Non-integer", b_fr)

        b_np = np.int32(b_fr.numerator)
        if b_np != b_fr:
            return n, k, b_np, b_fr.numerator

    return None


def computed_overflow(n):
    b_np = np.float64(1.0)
    for k in range(1, n):
        try:
            b_np = (b_np * (n + 1 - k)) / k
        except FloatingPointError as exc:
            if exc.args != ("overflow encountered in double_scalars",):
                raise
            return n, k

        if np.isinf(b_np):
            return n, k

    return None


def represented_overflow(n):
    b_fr = fractions.Fraction(1)
    for k in range(1, n):
        b_fr = (b_fr * (n + 1 - k)) / k
        if b_fr.denominator != 1:
            raise ValueError("Non-integer", b_fr)

        try:
            b_np = np.float64(b_fr.numerator)
        except OverflowError as exc:
            if exc.args != ("int too large to convert to float",):
                raise
            return n, k

        if np.isinf(b_np):
            return n, k

    return None


def main():
    np.seterr(all="raise")

    # Find the first instance where `n C k` cannot be computed exactly as a
    # float64.
    for n in range(1, 128):
        mismatch = computed_exactly(n)
        if mismatch is None:
            continue

        n, k, numerical, exact = mismatch
        print(
            "Inexact computation (float64):        "
            f"{n} C {k} = {exact} != {numerical}"
        )
        break

    # Find the first instance where `n C k` cannot be represented exactly as a
    # float64.
    for n in range(1, 128):
        mismatch = represented_exactly(n)
        if mismatch is None:
            continue

        n, k, numerical, exact = mismatch
        print(
            "Inexact representation (float64):     "
            f"{n} C {k} = {exact} != {numerical}"
        )
        break

    # Find the first instance where computing `n C k` overflows as an int32.
    for n in range(1, 128):
        mismatch = computed_overflow_int32(n)
        if mismatch is None:
            continue

        n, k, numerical, exact = mismatch
        print(
            "Overflow in computation (int32):      "
            f"{n} C {k} = {exact} != {numerical}"
        )
        break

    # Find the first instance where `n C k` overflows as an int32.
    for n in range(1, 128):
        mismatch = represented_overflow_int32(n)
        if mismatch is None:
            continue

        n, k, numerical, exact = mismatch
        print(
            "Overflow in representation (int32):   "
            f"{n} C {k} = {exact} != {numerical}"
        )
        break

    # Find the first instance where computing `n C k` overflows as a float64.
    for n in range(1, 2048):
        mismatch = computed_overflow(n)
        if mismatch is None:
            continue

        n, k = mismatch
        print(f"Overflow in computation (float64):    {n} C {k}")
        break

    # Find the first instance where `n C k` overflows as a float64.
    for n in range(1, 2048):
        mismatch = represented_overflow(n)
        if mismatch is None:
            continue

        n, k = mismatch
        print(f"Overflow in representation (float64): {n} C {k}")
        break


if __name__ == "__main__":
    main()

@dhermes
Copy link
Owner

dhermes commented Jul 2, 2021

As I dig a wee bit more, note that the 4 warnings you mentioned only show up on the FIRST instance of the warning, but if we convert all NumPy warnings to exceptions via np.seterr(all="raise"), then all 1022 inputs fail.

  • invalid value encountered in multiply occurs for index 0 (line 256)
  • underflow encountered in multiply occurs for indices 1, 2, ..., 248 (line 254)
  • invalid value encountered in add occurs for indices 249, 250, ..., 1018 (line 256)
  • overflow encountered in multiply occurs for indices 1019, 1020, 1021 (line 256)

UPDATES: (I'll post the instrumented code at some point).

  • For s = 0/1021, the invalid value occurs when doing binom_val * lambda2_pow (for index = 496)
  • For s = 1/1021 through 248/1021, the invalid value occurs when doing lambda2_pow *= lambda2 (for progressively larger index values as s increases; index = 103 up to 501)
  • For s = 249/1021 through 1018/1021, the invalid value occurs when doing the addition to update result += ..., i.e. the error happens when doing result + FOO on the RHS; 100% of these fail when index = 501
  • For s = 1019/1021 through 1021/1021, the invalid value occurs when doing the multiplication (...) * nodes[:, [index]] where the part elided is the product (with higher operator precedence) binom_val * lambda2_pow; these occur at index = 489, 485, 485

@dhermes dhermes changed the title RuntimeWarning: invalid value encountered ... [NUMERICAL] VS Algorithm causes lots of overflow / underflow for high degrees Jul 6, 2021
@dhermes
Copy link
Owner

dhermes commented Jul 6, 2021

@ravipra Another note (unrelated to my findings above), you can vectorize and massively speed up the computation. Instead of

  def f(ind):
    s_val = ind/(len(inds)-1)
    return curve.evaluate(s_val).tolist()[1][0]

  return list(map(f, inds))

create an array of the s-values and evaluate them all at once:

s_values = np.linspace(0.0, 1.0, len(inds))
evaluated = curve.evaluate_multi(s_values)

@dhermes
Copy link
Owner

dhermes commented Jul 7, 2021

This is the implementation I'll use. It ports quite nicely to Fortran and attempts to "optimize" the computation by using contiguous memory.

def de_casteljau(nodes, s_values):
    dimension, columns = nodes.shape
    (num_s,) = s_values.shape
    degree = columns - 1

    # TODO: Determine if it's possible to do broadcast multiply in Fortran
    #       order below to avoid allocating all of `lambda{1,2}`.
    lambda1 = np.empty((dimension, num_s, columns - 1), order="F")
    lambda2 = np.empty((dimension, num_s, columns - 1), order="F")
    workspace = np.empty((dimension, num_s, columns - 1), order="F")
    lambda1_1d = 1.0 - s_values  # Probably not worth allocating this
    for i in range(num_s):
        lambda1[:, i, :] = lambda1_1d[i]
        lambda2[:, i, :] = s_values[i]
        workspace[:, i, :] = (
            lambda1_1d[i] * nodes[:, :degree] + s_values[i] * nodes[:, 1:]
        )

    for i in range(degree - 1, 0, -1):
        workspace[:, :, :i] = (
            lambda1[:, :, :i] * workspace[:, :, :i]
            + lambda2[:, :, :i] * workspace[:, :, 1 : (i + 1)]
        )

    # NOTE: This returns an array with `evaluated.flags.owndata` false, though
    #       it is Fortran contiguous.
    return workspace[:, :, 0]

@dhermes
Copy link
Owner

dhermes commented Jul 8, 2021

@ravipra I have fixed this in #264 and hope to get a release out soon with the fix.

@dhermes
Copy link
Owner

dhermes commented Jul 8, 2021

@ravipra Here is the "fix" in action with your inputs:

In [1]: import bezier

In [2]: import numpy as np

In [3]: with open("./issue-263/data.txt", "r") as file_obj:
   ...:     raw_data = file_obj.read()
   ...: 

In [4]: data_lines = raw_data.strip().split("\n")

In [5]: values = list(map(int, data_lines))

In [6]: indicies = list(range(len(values)))

In [7]: nodes = np.asfortranarray([indicies, values])

In [8]: curve = bezier.Curve.from_nodes(nodes)

In [9]: curve
Out[9]: <Curve (degree=1021, dimension=2)>

In [10]: s_values = np.linspace(0.0, 1.0, len(indicies))

In [11]: evaluated = curve.evaluate_multi(s_values)

In [12]: evaluated.max()
Out[12]: 6650.24031866094

In [13]: evaluated.min()
Out[13]: -2374.7425879401285

In [14]: evaluated
Out[14]: 
array([[ 0.00000000e+00,  1.00000000e+00,  2.00000000e+00, ...,
         1.01900000e+03,  1.02000000e+03,  1.02100000e+03],
       [-3.00000000e+00,  3.72447859e+03,  5.40495424e+03, ...,
         4.41543898e+02,  4.33333153e+02,  4.32000000e+02]])

In [15]: bezier.__version__
Out[15]: '2021.2.13.dev1'

but also note that the de Casteljau algorithm is much less efficient here

In [16]: %time _ = bezier.hazmat.curve_helpers.evaluate_multi_vs(nodes, 1.0 - s_values, s_values)
.../site-packages/bezier/hazmat/curve_helpers.py:259: RuntimeWarning: overflow encountered in multiply
  result += binom_val * lambda2_pow * nodes[:, [index]]
.../site-packages/bezier/hazmat/curve_helpers.py:260: RuntimeWarning: invalid value encountered in multiply
  result *= lambda1
.../site-packages/bezier/hazmat/curve_helpers.py:259: RuntimeWarning: invalid value encountered in multiply
  result += binom_val * lambda2_pow * nodes[:, [index]]
.../site-packages/bezier/hazmat/curve_helpers.py:259: RuntimeWarning: invalid value encountered in add
  result += binom_val * lambda2_pow * nodes[:, [index]]
CPU times: user 37.7 ms, sys: 2.68 ms, total: 40.4 ms
Wall time: 38.7 ms

In [17]: %time _ = bezier.hazmat.curve_helpers.evaluate_multi_de_casteljau(nodes, 1.0 - s_values, s_values)
CPU times: user 5.17 s, sys: 1.51 s, total: 6.69 s
Wall time: 6.7 s

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants