nla-digbench tasks freire{1,2}.c are flawed #1118

MartinSpiessl · 2020-09-20T19:44:51Z

The programs freire{1,2}.c in c/nla-digbench are clearly flawed since they neglect the limited floating point precision.

As an example, let's look at this concretized version of freire1.c:

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

int main() {
    int r;
    double a, x;
    a = 12345678901234567;
    x = a / 2.0;
    r = 0;
    long long i = 0;

    while (1) {
        i++;
        printf("i:\t\t\t%lld\n",i);
        printf("a:\t\t\t%f\n",a);
        printf("x:\t\t\t%f\n",x);
        printf("r:\t\t\t%d\n",r);
        printf("r*r - a - r + 2*x:\t%f\n",r*r - a - r + 2*x);
        printf("(2*x-a)+(r*r-r):\t%f\n\n",(2*x-a)+(r*r-r));
        assert((int)(r*r - a - r + 2*x) == 0);
        //assert((int)((2*x-a)+(r*r-r))==0);

        if (!(x > r))
            break;
        x = x - r;
        r = r + 1;
    }

    assert((int)(r*r - a - r + 2*x) == 0);
    return 0;
}

After compilation this will show us that the assert inside the loop can be violated after only 4 rounds (I introduced i to track numbers of rounds and also added some printf-calls for debugging:

i:                      1
a:                      12345678901234568.000000
x:                      6172839450617284.000000
r:                      0
r*r - a - r + 2*x:      0.000000
(2*x-a)+(r*r-r):        0.000000

i:                      2
a:                      12345678901234568.000000
x:                      6172839450617284.000000
r:                      1
r*r - a - r + 2*x:      0.000000
(2*x-a)+(r*r-r):        0.000000

i:                      3
a:                      12345678901234568.000000
x:                      6172839450617283.000000
r:                      2
r*r - a - r + 2*x:      0.000000
(2*x-a)+(r*r-r):        0.000000

i:                      4
a:                      12345678901234568.000000
x:                      6172839450617281.000000
r:                      3
r*r - a - r + 2*x:      -2.000000
(2*x-a)+(r*r-r):        0.000000

a.out: main.c:28: main: Assertion `(int)(r*r - a - r + 2*x) == 0' failed.
Aborted (core dumped)

The initial value for a that I chose here leads to the assertion violation in just 4 steps. I was not able to get any less than 4 steps, sometimes it goes into the thousands. I also showed that by reordering of the terms in the assumption (grouping by type) we get a slightly better numeric stability, but even then the assertion would eventually be violated. This is inevitable because of the limited precision of IEEE floating-point values.

@nguyenthanhvuh was aware that CPAchecker complained here since both files contain the following comment in the beginning:

Note: for some reason using cpa was able to disprove these
cpa.sh -kInduction -setprop solver.solver=z3 freire1.c`

Indeed, these counterexamples are probably valid as shown via my concrete example above. The problem here is that this specific algorithm of Freire was never stated for use with floating-point variables or even signed integers:

http://www.pedrofreire.com/sqrt/sqrt1.en.html

register unsigned int x, r;

/* … */
x = (x+1) >> 1;    /* init x */
for( r=0; x>r; x-=r++ );

In order to fix these tasks, we would need to change the types to integers and make sure that none of the operations on the way overflow. Since the algorithm in freire1.c will calculate the next bigger square, i.e. (floor(sqrt(a))+1)^2, we would have to limit the maximum value of a such that this value is still below INT_MAX. Or just some value that is sufficiently lower than INT_MAX if we are lazy 😉
For negative values of a the algorithm will immediately break and the assertions actually hold, so we would not need to change the integer types to be unsigned instead of signed in order for the program to be correct (w.r.t. the assertions).
We could probably also find constraints on the floating-point values of a that would allow us to fix these tasks while retaining the floating-point variables. It is always nice to have tasks that actually use floating-point variables, though they are intrinsically extremely hard to verify.

The text was updated successfully, but these errors were encountered:

ndkimhao · 2020-09-27T22:59:30Z

Hi, @MartinSpiessl, the double type can't represent exactly the number 12345678901234567, so all calculations that followed the statement a = 12345678901234567 would not be accurate.

I think we can limit the value of the a variable to resolve this problem. I did a quick binary search and found the following bounds for Freire1 and 2:

Freire1:

current code: a <= 2147441940
replace int r by double r: a <= 9007199254740993 = 2^53 + 1

Freire2:
a <= 538359860

MartinSpiessl · 2020-09-28T08:25:50Z

Hi, @MartinSpiessl, the double type can't represent exactly the number 12345678901234567, so all calculations that followed the statement a = 12345678901234567 would not be accurate.

Yeah, I didn't care to much whether this particular value can be represented. It will just lead to a specific value in a that can be represented. Even if we set a to a double that can be represented, there is no guarantee that intermediate calculation results can still be represented accurately if that double value is high enough.

Thanks for checking out the bounds! I tried the first solution for freire1.c with CPAchecker but I still found a violation, even if I set a<=1. I looked at the model and found that -inf was assigned to a. So I guess we would have to post a lower bound for a as well to exclude this extreme value (which is of course a double value, so I do not think this is an error in CPAchecker, __VERIFIER_nondet_double should be allowed to output (any) NaN and inf value).

With a lower bound of 0 or -2147441940 CPAchecker runs into a timeout instead, which is good news, since this means the solver cannot easily detect a problem anymore. There might still be one, but that would be the other verifiers job to prove or disprove.

Having this as a verification task is therefore very interesting. Can you make a PR for this? Maybe also add a variant that only uses integers, the comparison would be especially nice.
I will look over the PR, test the fixes with CPAchecker again and would approve if the violations cannot be found anymore.

ndkimhao · 2020-09-28T15:01:01Z

We have created the integer version of freire1.c and freire2.c recently. I will also check the bounds for them and make a PR.

MartinSpiessl added issue with benchmark C Task in language C labels Sep 20, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

nla-digbench tasks freire{1,2}.c are flawed #1118

nla-digbench tasks freire{1,2}.c are flawed #1118

MartinSpiessl commented Sep 20, 2020

ndkimhao commented Sep 27, 2020 •

edited

Loading

MartinSpiessl commented Sep 28, 2020

ndkimhao commented Sep 28, 2020

nla-digbench tasks freire{1,2}.c are flawed #1118

nla-digbench tasks freire{1,2}.c are flawed #1118

Comments

MartinSpiessl commented Sep 20, 2020

ndkimhao commented Sep 27, 2020 • edited Loading

MartinSpiessl commented Sep 28, 2020

ndkimhao commented Sep 28, 2020

ndkimhao commented Sep 27, 2020 •

edited

Loading