bug in tests/cunit/test_darray_async_many.c #1867

rjdave · 2021-03-26T18:55:22Z

Hello,

As mentioned in #1862, the cunit test test_darray_async_many fails for me on multiple systems. I have narrowed this down and possibly found a bug, at least when using the Intel C compiler with any flags other than plain -g. This test was added in 2.2.2a and has remained largely unchanged since then. In tests/cunit/test_darray_async_many.c, the following lines generate the data that is later compared to expected values:

#ifdef _NETCDF4
    unsigned char my_data_ubyte[LAT_LEN] = {my_rank * 10, my_rank * 10 + 1};
    unsigned short my_data_ushort[LAT_LEN] = {my_rank * 1000, my_rank * 1000 + 1};
    unsigned int my_data_uint[LAT_LEN] = {NC_MAX_SHORT + my_rank * 10, NC_MAX_SHORT + my_rank * 10 + 1};
    long long my_data_int64[LAT_LEN] = {NC_MAX_INT + my_rank * 10, -NC_MAX_INT + my_rank * 10};
    unsigned long long my_data_uint64[LAT_LEN] = {NC_MAX_INT64 + my_rank * 10,
                                                  NC_MAX_INT64 + my_rank * 10 + 1};
#endif /* _NETCDF4 */

my_data_int64 is the entry that is causing me problems. The expected values are defined on these lines:

    long long expected_int64[LAT_LEN * LON_LEN] = {-2147483639LL, -2147483637LL, -2147483629LL,
                                                   -2147483627LL, -2147483619LL, -2147483617LL};

This test relies on int overflow/wraparound to get the values that end in 9. This works when only the -g flag is given. As soon as you add optimizations (including the default -g -O2), it no longer works, and the calculated values in my_data_int64 become 2147483657, -2147483637, 2147483667, -2147483627, 2147483677, and -2147483617. You can see that adding my_rank * 10 no longer wraps to a negative value but simply adds 10, 20, and 30 to NC_MAX_INT (2147483647). I have only tested the default -g -O2 and -O3. I have NOT tested -O1.

I have not been able to find an explanation for this so far, but it seems that with optimizations enabled the code no longer treats NC_MAX_INT as a regular int and behaves as if it were 2147483647LL instead of just 2147483647.


edwardhartnett · 2021-03-27T01:26:04Z

What should we do about it?

rjdave · 2021-04-05T17:25:17Z

Is this test meant to test overflow math or just int64 capabilities? If it is only meant to test int64, maybe different arithmetic or different numbers could be used to check that both positive and negative values beyond the range of a regular int can be stored. Maybe something like:

    long long my_data_int64[LAT_LEN] = {2147483647LL + my_rank * 10, -2147483648LL - my_rank * 10};

edwardhartnett · 2021-05-12T17:08:15Z

This test is only to test int64, and overflow is accidental. I will try to see if I can verify this on my system, and then fix it...

edwardhartnett · 2021-05-12T17:50:34Z

OK, I cannot reproduce this problem. test_darray_async_many works fine for me, even when I use CFLAGS='-O2'.

Does this problem occur for you on the most recent release? Are you using GNU compilers?

rjdave · 2021-05-12T18:07:51Z

Yes, the problem occurs with 2.5.3 and 2.5.4 (I have not tried compiling any other versions).

I am using the Intel compilers, and it happens at least as far back as version 17. I get the same behavior with versions 17, 19, and 20 of the Intel compilers, and only when optimizations are enabled. I have not tried the GNU compilers.

edwardhartnett · 2021-05-12T18:29:54Z

OK, I will change the data line as you suggest...

rjdave · 2021-05-12T20:36:06Z

Just to be clear, changing the arithmetic as I suggested should result in:

   long long expected_int64[LAT_LEN * LON_LEN] = {2147483657LL, -2147483658LL, 2147483667LL,
                                                  -2147483668LL, 2147483677LL, -2147483678LL};

I have tested this with the Intel compiler with -g and -O3 and it works for me. I have not tested it with any other compilers.

edwardhartnett · 2021-05-13T13:34:32Z

OK, I have made this change. PR up shortly...

edwardhartnett self-assigned this May 12, 2021

edwardhartnett added the bug label May 12, 2021

edwardhartnett added this to To do in PIO v2.5.5 via automation May 12, 2021

rjdave mentioned this issue May 12, 2021

vpath build fails #1862

Open

edwardhartnett mentioned this issue May 13, 2021

fixed overflow in test tests/cunit/test_darray_async_many.c #1876

Merged

edwardhartnett closed this as completed in #1876 May 13, 2021

PIO v2.5.5 automation moved this from To do to Done May 13, 2021
