
Incorrect Output with -O3 Optimization on MIPS64 Due to Complex Type Casting and Nested Loop Structure #70495

Open
gyuminb opened this issue Oct 27, 2023 · 2 comments


gyuminb commented Oct 27, 2023

Description:

Compiling the PoC below for MIPS64 with Clang-18 produces a different result at -O3 than at -O0, -O1, and -O2. The issue seems to be related to complex type casting, especially involving _Bool, and to the nested loop structure.

Environment:

  • Compiler: Clang-18
  • Target Architecture: MIPS64
  • Optimization Level: observed only at -O3.

PoC:

#include <stdio.h>

short short_val = 1;
long long int flag_val = 0x0123456789abcdefLL;
_Bool bool_flag = 0;
unsigned long long int ull_result = 0ULL;
short short_array[18][18];
int int_array[18][18][18];
_Bool bool_array[15][15];
long long int result_array[15];
unsigned long long int subtraction_const = 18446744073709526850ULL;

void init() {
    for (size_t i = 0; i < 18; ++i) 
        for (size_t j = 0; j < 18; ++j) {
            short_array[i][j] = (short)-24751; // Becomes 18446744073709526865 (2^64 - 24751) when sign-extended and converted to unsigned long long.
            for (size_t k = 0; k < 18; ++k) 
                int_array[i][j][k] = 1;
        }
    
    for (size_t i = 0; i < 15; ++i) 
        for (size_t j = 0; j < 15; ++j)
            bool_array[i][j] = 1;

    for (size_t i = 0; i < 15; ++i) 
        result_array[i] = (i % 2 == 0) ? 1LL : 0LL;
}

#define max(a,b) \
    ({ __typeof__ (a) _a = (a); \
       __typeof__ (b) _b = (b); \
       _a > _b ? _a : _b; })

int main() {
    init();
    for (_Bool flag = 0; flag < 1; flag += (_Bool)1) 
    {
        for (int i = 0; i < 12; i += 2) 
        {
            for (int j = 0; j < 13; j += 3) 
            {
                
                for (unsigned long long int k = 0; k < ((unsigned long long int) max(((_Bool) int_array[i][0][0] ? (int) short_array[i][0] : 1), ((_Bool) int_array[0][0][i] ? (int) short_array[i][0] : int_array[i][0][0]))) - subtraction_const/*15*/; k++)
                {
                    if (((_Bool) flag_val))
                    {
                        bool_flag = (_Bool) max((bool_flag), ((_Bool) short_val));
                        ull_result = (unsigned long long int) (max((int) bool_array[0][i], 0) % (int) int_array[0][j][k]);
                    }
                }
            } 
        } 
        result_array [flag] = ((long long int) ((int) ((flag_val))));
    }
    printf("result_array [0]: %lx\n", result_array [0]);  
}

Expected Behavior:

The output for result_array[0] should remain consistent across different optimization levels.

Observed Behavior:

At -O3, the output for result_array[0] differs from the value produced at -O0, -O1, and -O2.

Analysis:

The bug appears to stem from a combination of factors:

  1. Complex type casting, notably involving the _Bool type.
  2. The nested loop structure, particularly the for loop containing an intricate calculation for its boundary condition.
  3. A series of assignments and type castings leading up to the assignment of result_array[flag].

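When these factors interact under -O3, the compiler might be making incorrect assumptions or optimizations that lead to the observed discrepancy. The -O3 output is flag_val unchanged (upper 32 bits intact), so the (int) truncation and the sign extension back to long long appear to have been dropped.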

Steps to Reproduce:

  1. Compile the PoC code using Clang-18 for MIPS64 with -O3.
  2. Execute the compiled binary.
  3. Notice the different output for result_array[0] as compared to other optimization levels.

Evidence:

Outputs from various optimization levels:

-O0, -O1, -O2 Output:
result_array [0]: ffffffff89abcdef

-O3 Output:
result_array [0]: 123456789abcdef

Conclusion:

The -O3 optimization level in Clang-18 for MIPS64 introduces a distinct behavior in the provided PoC, which is not observed at lower optimization levels. Given the complexity of the code, especially around type casting and loop structures, there seems to be a misoptimization issue in the compiler. This inconsistency raises concerns about the reliability of high-level optimizations in certain scenarios and warrants a thorough investigation.


llvmbot commented Oct 27, 2023

@llvm/issue-subscribers-backend-mips

Author: None (gyuminb)


Contributor

wzssyqa commented Feb 6, 2024

Interesting. Yet another bug about sign extension.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104914
