Skip to content

Conversation

wenju-he
Copy link
Contributor

Skip the transform that replaces:
%a = trunc i64 %x to i16
%b = sext i16 %a to i32
with
%a = trunc i64 %x to i32
%b = shl i32 %a, 16
%c = ashr exact i32 %b, 16
when (pre-trunc) source type is wider than the final destination type. Modern architectures typically have efficient sign-extend instruction. It is preferable to preserve the sext for this case.

Resolves #116019

…ofitable

Skip the transform that replaces:
  %a = trunc i64 %x to i16
  %b = sext i16 %a to i32
with
  %a = trunc i64 %x to i32
  %b = shl i32 %a, 16
  %c = ashr exact i32 %b, 16
when (pre-trunc) source type is wider than the final destination type.
Modern architectures typically have efficient sign-extend instruction.
It is preferable to preserve the sext for this case.

Resolves llvm#116019
@wenju-he wenju-he requested a review from nikic as a code owner September 24, 2025 10:10
@wenju-he wenju-he requested review from Copilot and removed request for nikic September 24, 2025 10:10
@llvmbot llvmbot added llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:transforms labels Sep 24, 2025
@llvmbot
Copy link
Member

llvmbot commented Sep 24, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Wenju He (wenju-he)

Changes

Skip the transform that replaces:
%a = trunc i64 %x to i16
%b = sext i16 %a to i32
with
%a = trunc i64 %x to i32
%b = shl i32 %a, 16
%c = ashr exact i32 %b, 16
when (pre-trunc) source type is wider than the final destination type. Modern architectures typically have efficient sign-extend instruction. It is preferable to preserve the sext for this case.

Resolves #116019


Full diff: https://github.com/llvm/llvm-project/pull/160483.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp (+12-3)
  • (modified) llvm/test/Transforms/InstCombine/sext.ll (+11)
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp
index 9ca8194b44f8f..ea029e47730bf 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp
@@ -1528,7 +1528,17 @@ Instruction *InstCombinerImpl::visitSExt(SExtInst &Sext) {
   }
 
   // Try to extend the entire expression tree to the wide destination type.
-  if (shouldChangeType(SrcTy, DestTy) && canEvaluateSExtd(Src, DestTy)) {
+  bool shouldExtendExpression = true;
+  Value *X = nullptr;
+  // Do not extend expression in the trunc + sext pattern when destination type
+  // is narrower than original (pre-trunc) type: modern architectures typically
+  // provide efficient sign-extend instruction, so preserving the sext is
+  // preferable here.
+  if (match(Src, m_Trunc(m_Value(X))))
+    if (X->getType()->getScalarSizeInBits() > DestBitSize)
+      shouldExtendExpression = false;
+  if (shouldExtendExpression && shouldChangeType(SrcTy, DestTy) &&
+      canEvaluateSExtd(Src, DestTy)) {
     // Okay, we can transform this!  Insert the new expression now.
     LLVM_DEBUG(
         dbgs() << "ICE: EvaluateInDifferentType converting expression type"
@@ -1548,8 +1558,7 @@ Instruction *InstCombinerImpl::visitSExt(SExtInst &Sext) {
                                       ShAmt);
   }
 
-  Value *X;
-  if (match(Src, m_Trunc(m_Value(X)))) {
+  if (X) {
     // If the input has more sign bits than bits truncated, then convert
     // directly to final type.
     unsigned XBitSize = X->getType()->getScalarSizeInBits();
diff --git a/llvm/test/Transforms/InstCombine/sext.ll b/llvm/test/Transforms/InstCombine/sext.ll
index c72614d526036..92748034f52ab 100644
--- a/llvm/test/Transforms/InstCombine/sext.ll
+++ b/llvm/test/Transforms/InstCombine/sext.ll
@@ -320,6 +320,17 @@ define i10 @test19(i10 %i) {
   ret i10 %d
 }
 
+define i32 @test20(i64 %i) {
+; CHECK-LABEL: @test20(
+; CHECK-NEXT:    [[A:%.*]] = trunc i64 [[I:%.*]] to i16
+; CHECK-NEXT:    [[B:%.*]] = sext i16 [[A]] to i32
+; CHECK-NEXT:    ret i32 [[B]]
+;
+  %a = trunc i64 %i to i16
+  %b = sext i16 %a to i32
+  ret i32 %b
+}
+
 define i32 @smear_set_bit(i32 %x) {
 ; CHECK-LABEL: @smear_set_bit(
 ; CHECK-NEXT:    [[TMP1:%.*]] = shl i32 [[X:%.*]], 24

Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR improves instruction combining by preventing a potentially unprofitable transformation of trunc+sext patterns. When the source type is wider than the destination type, it preserves the original trunc+sext sequence instead of converting it to trunc+shl+ashr, since modern architectures have efficient sign-extend instructions.

  • Adds logic to detect when extending expressions would be counterproductive
  • Modifies the visitSExt function to skip transformation when source is wider than destination
  • Adds a test case to verify the new behavior

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp Implements the optimization logic to avoid unprofitable trunc+sext transformations
llvm/test/Transforms/InstCombine/sext.ll Adds test case verifying that i64->i16->i32 pattern preserves trunc+sext

Comment on lines 1532 to 1538
Value *X = nullptr;
// Do not extend expression in the trunc + sext pattern when destination type
// is narrower than original (pre-trunc) type: modern architectures typically
// provide efficient sign-extend instruction, so preserving the sext is
// preferable here.
if (match(Src, m_Trunc(m_Value(X))))
if (X->getType()->getScalarSizeInBits() > DestBitSize)
Copy link

Copilot AI Sep 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable X is being reused for two different purposes. In the first block (lines 1532-1539), it's used to check the transformation condition, while later at line 1561 it's used to continue the existing trunc optimization. This creates unclear control flow since X's value depends on whether the match succeeded earlier. Consider using a separate variable name for the first check to make the code more readable.

Suggested change
Value *X = nullptr;
// Do not extend expression in the trunc + sext pattern when destination type
// is narrower than original (pre-trunc) type: modern architectures typically
// provide efficient sign-extend instruction, so preserving the sext is
// preferable here.
if (match(Src, m_Trunc(m_Value(X))))
if (X->getType()->getScalarSizeInBits() > DestBitSize)
Value *TruncSrc = nullptr;
// Do not extend expression in the trunc + sext pattern when destination type
// is narrower than original (pre-trunc) type: modern architectures typically
// provide efficient sign-extend instruction, so preserving the sext is
// preferable here.
if (match(Src, m_Trunc(m_Value(TruncSrc))))
if (TruncSrc->getType()->getScalarSizeInBits() > DestBitSize)

Copilot uses AI. Check for mistakes.


Value *X;
if (match(Src, m_Trunc(m_Value(X)))) {
if (X) {
Copy link

Copilot AI Sep 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable X is being reused for two different purposes. In the first block (lines 1532-1539), it's used to check the transformation condition, while later at line 1561 it's used to continue the existing trunc optimization. This creates unclear control flow since X's value depends on whether the match succeeded earlier. Consider using a separate variable name for the first check to make the code more readable.

Copilot uses AI. Check for mistakes.

@wenju-he wenju-he requested a review from dtcxzyw September 24, 2025 10:12
@wenju-he wenju-he requested a review from leewei05 September 24, 2025 10:18
@wenju-he wenju-he requested a review from nikic September 24, 2025 11:51
@@ -320,6 +320,17 @@ define i10 @test19(i10 %i) {
ret i10 %d
}

define i32 @test20(i64 %i) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

InstCombine doesn't change this test even without your patch (i16 is a legal type in this file).
Please change i16 to another illegal integer type or move the test into a separate file with a suitable data layout.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

InstCombine doesn't change this test even without your patch (i16 is a legal type in this file). Please change i16 to another illegal integer type or move the test into a separate file with a suitable data layout.

thanks @dtcxzyw, I didn't notice that.
You mean i16 is not a legal type in llvm/test/Transforms/InstCombine/sext.ll, right?
I moved the test into a new file llvm/test/Transforms/InstCombine/trunc-sext.ll with native integer widths set to 16,32 and 64, and DL.isLegalInteger returns true for them.

@wenju-he wenju-he requested a review from dtcxzyw September 30, 2025 11:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:transforms

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[InstCombine] sext(trunc(x)) should not be transformed to ashr(shl(trunc(x)))) when there is no chance to fold

3 participants