Clang's Optimization Introduces Unexpected Sign Extension in RISC-V Bit-Field Operations #68855

Closed
gyuminb opened this issue Oct 12, 2023 · 1 comment · Fixed by #69015
Labels: confirmed (Verified by a second party), llvm:SelectionDAG (SelectionDAGISel as well), miscompilation

gyuminb commented Oct 12, 2023

Environment:

  • Compiler: Clang-18
  • Target Architecture: RISC-V
  • Optimization Level: -O1, -O2, -O3
  • OS: Ubuntu 22.04.2

Summary:

When compiling code that combines bit-field access with integer casts, Clang for the RISC-V architecture produces an unexpected result at optimization levels -O1, -O2, and -O3. The result deviates from what the C language standard requires and does not occur at -O0.

Steps to Reproduce:

  1. Compile the provided source code with Clang targeting RISC-V architecture.
  2. Use optimization levels -O1, -O2, or -O3.
  3. Execute the compiled binary.

Expected Result:

resultValue1: ffff
resultValue2: 0

Actual Result:

resultValue1: ffffffff
resultValue2: 0

Source Code to Reproduce:

#include <stdio.h>

typedef struct {
    unsigned int bitField : 13;
} CustomStruct;

unsigned int resultValue1 = 0;
short resultValue2 = 0;
CustomStruct customArray[2] = {{0U} , {0U}};

int main()
{
    resultValue1 = (unsigned int) ((unsigned short) (~(customArray[0].bitField)));
    printf("resultValue1: %x\n", resultValue1);

    resultValue2 = (short) (customArray[1].bitField);   
    printf("resultValue2: %x\n", resultValue2);
    return 0;
}

Observation:

customArray[0].bitField is a 13-bit unsigned bit-field holding 0. Because int can represent every value of a 13-bit field, the integer promotions convert it to int before the ~ operator is applied. The result of ~0 is therefore -1 with all bits of the int set, not only the low 13 bits (0x1FFF).

Casting this value to (unsigned short), which is 16 bits (2 bytes) on this target, reduces it modulo 2^16 and should give 0xFFFF.

Further casting this value to (unsigned int) zero-extends it, so the value should stay 0xFFFF. This is the expected behavior as per the C language standard for type casting.

However, in the provided code, while this is the case without optimization (-O0), with optimization the value unexpectedly becomes 0xFFFFFFFF. It seems that after the cast to unsigned short, the extension to unsigned int isn't carried out correctly, possibly sign-extending rather than zero-extending the value.

This suggests an issue with either a specific implementation of the RISC-V architecture or with this version of the Clang compiler. Because the result deviates from the behavior standard C requires, a compiler bug is the more probable cause.

Additional Information:

  • https://godbolt.org/z/Pv3Gaacv9
  • The issue seems to stem from the slli and srli instructions used in succession in the optimized versions, resulting in sign-extension.

Recommendation:

Please verify the behavior observed using the provided Godbolt link and investigate the underlying cause in the Clang compiler for RISC-V. It's essential to ensure consistent behavior across optimization levels and adherence to the C language standard.

@github-actions github-actions bot added the clang Clang issues not falling into any other category label Oct 12, 2023
@EugeneZelenko EugeneZelenko added backend:RISC-V and removed clang Clang issues not falling into any other category labels Oct 12, 2023
llvmbot commented Oct 12, 2023

@llvm/issue-subscribers-backend-risc-v

@dtcxzyw dtcxzyw added miscompilation confirmed Verified by a second party labels Oct 12, 2023
@dtcxzyw dtcxzyw self-assigned this Oct 12, 2023
dtcxzyw added a commit that referenced this issue Oct 13, 2023
When narrowing logic ops (OR/XOR) with a constant rhs, `DAGCombiner` fixes up the constant rhs node.
This is incorrect when the lhs is also a constant. For example, it incorrectly replaces `xor OpaqueConstant:i64<8191>, Constant:i64<-1>` with `xor (and OpaqueConstant:i64<8191>, Constant:i64<65535>), Constant:i64<-1>`.

Fixes #68855.
@EugeneZelenko EugeneZelenko added llvm:SelectionDAG SelectionDAGISel as well and removed backend:RISC-V labels Oct 13, 2023