Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug in tests/cunit/test_darray_async_many.c #1867

Closed
rjdave opened this issue Mar 26, 2021 · 8 comments · Fixed by #1876
Closed

bug in tests/cunit/test_darray_async_many.c #1867

rjdave opened this issue Mar 26, 2021 · 8 comments · Fixed by #1876
Assignees
Labels
Projects

Comments

@rjdave
Copy link

rjdave commented Mar 26, 2021

Hello,

As mentioned in #1862, cunit test test_darray_async_many fails for me on multiple systems. I have narrowed this down and possibly found a bug...at least when using Intel C compiler with anything but -g for flags. This test was added in 2.2.2a and has remained largely unchanged since then. In tests/cunit/test_darray_async_many.c, the following lines generate the data that is later compared to expected values:

#ifdef _NETCDF4
    unsigned char my_data_ubyte[LAT_LEN] = {my_rank * 10, my_rank * 10 + 1};
    unsigned short my_data_ushort[LAT_LEN] = {my_rank * 1000, my_rank * 1000 + 1};
    unsigned int my_data_uint[LAT_LEN] = {NC_MAX_SHORT + my_rank * 10, NC_MAX_SHORT + my_rank * 10 + 1};
    long long my_data_int64[LAT_LEN] = {NC_MAX_INT + my_rank * 10, -NC_MAX_INT + my_rank * 10};
    unsigned long long my_data_uint64[LAT_LEN] = {NC_MAX_INT64 + my_rank * 10,
                                                  NC_MAX_INT64 + my_rank * 10 + 1};
#endif /* _NETCDF4 */

my_data_int64 is the entry that is causing me problems. The expected values are defined on these lines:

    long long expected_int64[LAT_LEN * LON_LEN] = {-2147483639LL, -2147483637LL, -2147483629LL,
                                                   -2147483627LL, -2147483619LL, -2147483617LL};

This test is relying on overflow/wraparound math to get the values that end in 9. This works when only the -g flag is given. As soon as you add optimizations (including the default -g -O2) this no longer works and the calculated values in my_data_int64 become 2147483657, -2147483637, 2147483667, -2147483627, 2147483677, and -2147483617. You can see that the addition of my_rank * 10 no longer results in negatives but simply adds 10, 20, and 30 to NC_MAX_INT (2147483647). I have only tested the default -g -O2 and -O3. I have NOT tested -O1.

I have not been able to find an explanation of this so far but it seems that the optimization flags make it so the code no longer sees NC_MAX_INT as a regular integer and assumes it is 2147483647LL instead of just 2147483647.

@edwardhartnett
Copy link
Collaborator

What should we do about it?

@rjdave
Copy link
Author

rjdave commented Apr 5, 2021

Is this test meant to test overflow math or just int64 capabilities? If it is only meant to test int64 maybe different algebra or different numbers could be used that would test that both positive and negative values beyond regular int can be stored. Maybe something like:

long long my_data_int64[LAT_LEN] = {2147483647LL + my_rank * 10, -2147483648LL - my_rank * 10};

@edwardhartnett
Copy link
Collaborator

This test is only to test int64, and overflow is accidental. I will try to see if I can verify this on my system, and then fix it...

@edwardhartnett edwardhartnett self-assigned this May 12, 2021
@edwardhartnett edwardhartnett added this to To do in PIO v2.5.5 via automation May 12, 2021
@edwardhartnett
Copy link
Collaborator

OK, I cannot reproduce this problem. test_darray_async_many works fine for me, even when I use CFLAGS='-O2'.

Does this problem occur for you on the most recent release? Are you using GNU compilers?

@rjdave
Copy link
Author

rjdave commented May 12, 2021

yes, the problem occurs with 2.5.3 and 2.5.4 (I have not tried compiling any other versions).

I am using the Intel compilers and it happens at least as far back as version 17. I get the same behavior with version 17, 19 and 20 of the Intel compilers and only when activating optimizations. I have not tried the GNU compilers.

@edwardhartnett
Copy link
Collaborator

OK I will change the data line as you suggest...

@rjdave
Copy link
Author

rjdave commented May 12, 2021

Just to be clear, changing the arithmetic as I suggested should result in:

   long long expected_int64[LAT_LEN * LON_LEN] = {2147483657LL, -2147483658LL, 2147483667LL,
                                                  -2147483668LL, 2147483677LL, -2147483678LL};

I have tested this with the Intel compiler with -g and -O3 and it works for me. I have not tested it with any other compilers.

@edwardhartnett
Copy link
Collaborator

OK, I have made this change. PR up shortly...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
No open projects
Development

Successfully merging a pull request may close this issue.

2 participants