Description:
When the provided PoC is compiled for ARM64 with Clang-18, a store through a multi-level pointer dereference appears to be mishandled by the optimizer. The program's output depends on the optimization level, on the data pattern written, and on the structure of the adjacent printf calls. For some data patterns, the issue is observed at every level from O1 to O3. However, when the two identical printf calls before and after the problematic line are replaced with two distinct ones, the issue appears only at O3. This suggests that the optimization is influenced not just by the data pattern but also by the presence and structure of the adjacent printf calls.
Environment:
- Compiler: Clang-18
- Target Architecture: ARM64
- Optimization Level: The issue is noticeable at O1, O2, and O3, depending on the data pattern used. For patterns like 0x123456789abcdeff, it can be observed from O1 to O3, but for patterns like 0x1234567fffffffff, it appears only at O3.
- OS: Ubuntu 22.04.2
PoC:
#include <stdio.h>
#include <stdint.h>

struct StructA {
    uint32_t val1;
    const int8_t val2;
    uint64_t val3;
    uint16_t val4;
};

union UnionB {
    uint32_t u_val1;
    struct StructA s_val;
    uint32_t u_val2;
    int32_t u_val3;
    int32_t u_val4;
    uint64_t u_val5;
};

static union UnionB main_union = {1UL};
/* Pointer chain that ultimately targets the first 4 bytes of main_union. */
static uint32_t *ptr_val1 = &main_union.s_val.val1;
static uint32_t **double_ptr = &ptr_val1;
static uint32_t ***triple_ptr = &double_ptr;

int main() {
    printf("main_union.u_val5: %lx\n", main_union.u_val5);
    uint32_t **local_double_ptr = &ptr_val1;
    uint64_t local_val = 0x123456789abcdeffLL;
    uint64_t *local_ptr = &main_union.u_val5;
    /* Write the full 64-bit pattern into the union. */
    (*local_ptr) = local_val;
    (triple_ptr = &local_double_ptr);
    /* Problematic line: 32-bit store of 0 over the start of the union. */
    (***triple_ptr) = 0UL;
    printf("main_union.u_val5: %lx\n", main_union.u_val5);
    return 0;
}
Expected Behavior:
The value of main_union.u_val5 after the pointer dereference operation should be the same at every optimization level.
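The expected value can be worked out by hand. The following is a minimal sketch, assuming a little-endian target (so that val1 overlays the low 32 bits of u_val5) and taking the O0 output as the reference result. The helper names expected_after_store and check_observed are hypothetical and are not part of the PoC.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the value u_val5 should hold after the 32-bit
 * store of 0 to val1, assuming val1 overlays the low 32 bits of u_val5
 * (little-endian layout). */
static uint64_t expected_after_store(uint64_t pattern) {
    return pattern & ~UINT64_C(0xffffffff);
}

/* Hypothetical check: aborts if an observed value differs from the
 * value computed above for the same data pattern. */
static void check_observed(uint64_t pattern, uint64_t observed) {
    assert(observed == expected_after_store(pattern));
}
```

Under these assumptions, the pattern 0x123456789abcdeff yields 0x1234567800000000, which matches the O0 output, and the pattern 0x1234567fffffffff yields 0x1234567f00000000.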
Observed Behavior:
The value of main_union.u_val5 changes depending on the optimization level, the data pattern, and the structure of the adjacent printf calls.
Analysis:
The optimizer seems to overlook the (***triple_ptr) = 0UL; operation. The discrepancy in output, depending on the structure of the printf calls and the data pattern, indicates a misoptimization during compilation. Notably, when the structure of the printf statements is changed or a data pattern with repeating digits is used, the issue appears only at the O3 optimization level. This shows that the bug is sensitive to both the data pattern and the surrounding code structure.
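The analysis rests on the store through triple_ptr and the later read of u_val5 touching the same bytes. The sketch below checks that layout for the PoC's types. The type definitions are copied from the PoC. The helper val1_offset is hypothetical and exists only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Type definitions copied from the PoC. */
struct StructA {
    uint32_t val1;
    const int8_t val2;
    uint64_t val3;
    uint16_t val4;
};

union UnionB {
    uint32_t u_val1;
    struct StructA s_val;
    uint32_t u_val2;
    int32_t u_val3;
    int32_t u_val4;
    uint64_t u_val5;
};

/* Hypothetical helper: byte offset of s_val.val1 from the start of the union. */
static size_t val1_offset(void) {
    return offsetof(union UnionB, s_val) + offsetof(struct StructA, val1);
}

/* Both members start at offset 0, so the 4-byte store to val1
 * overwrites the first 4 bytes of the 8-byte u_val5. */
static_assert(offsetof(union UnionB, u_val5) == 0,
              "u_val5 starts at the beginning of the union");
static_assert(offsetof(union UnionB, s_val) + offsetof(struct StructA, val1) == 0,
              "s_val.val1 starts at the beginning of the union");
static_assert(sizeof(uint32_t) < sizeof(uint64_t),
              "the store covers only part of u_val5");
```

Because the two members share their starting address, a correct compilation cannot leave u_val5 unchanged after the store unless those first four bytes were already zero.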
Steps to Reproduce:
- Compile the PoC code using Clang-18 on ARM64 with various optimization levels (O1, O2, and O3).
- Execute the compiled binary.
- Observe the inconsistent behavior dependent on optimization level, data patterns, and printf structure.
Evidence:
The following output showcases the behavior for various optimization levels:
O0 Output:
main_union.u_val5: 1
main_union.u_val5: 1234567800000000
O1 Output:
main_union.u_val5: 1
main_union.u_val5: 123456789abcdeff
O2 Output:
main_union.u_val5: 1
main_union.u_val5: 123456789abcdeff
O3 Output:
main_union.u_val5: 1
main_union.u_val5: 123456789abcdeff
What's intriguing is that when we replace the two identical printf calls before and after the problematic line with two distinct printf calls, such as:
printf("Before main_union.u_val5: %lx\n", main_union.u_val5);
and
printf("After main_union.u_val5: %lx\n", main_union.u_val5);
the issue only manifests at the O3 optimization level.
Conclusion:
Across optimization levels O1 to O3, there is clear evidence of a bug that likely results from incorrect compiler optimization. The specific scenarios under which it emerges, especially when the printf structure or the data pattern is altered, further underline the unpredictable nature of this issue. This bug requires attention to ensure consistent and correct behavior across all optimization levels.