
JIT: Transform 'CONST - x' to 'xor x, CONST' when possible#126529

Open
BoyBaykiller wants to merge 6 commits into dotnet:main from BoyBaykiller:transform-const-x-to-xor

Conversation

@BoyBaykiller
Contributor

@BoyBaykiller BoyBaykiller commented Apr 4, 2026

Given any CONST - x and range info for x obtained from IntegralRange::ForNode, see if we can transform it to x ^ CONST. These changes motivate improvements in IntegralRange::ForNode and RangeOps (#126553). Working on this also uncovered a missing GT_NOT case in assertion prop, now added to enable VN-based folding.

Found by looking at diffs from #126442

Example:

int Byte255(byte x)
{
    return 255 - x;
}
;; ------ BASE
G_M64250_IG02:  ;; offset=0x0000
       movzx    rax, dl
       neg      eax
       add      eax, 255
						;; size=10 bbWeight=1 PerfScore 0.75
G_M64250_IG03:  ;; offset=0x000A
       ret      
						;; size=1 bbWeight=1 PerfScore 1.00

;; ------ DIFF
G_M64250_IG02:  ;; offset=0x0000
       movzx    rax, dl
       xor      eax, 255
						;; size=8 bbWeight=1 PerfScore 0.50
G_M64250_IG03:  ;; offset=0x0008
       ret      
						;; size=1 bbWeight=1 PerfScore 1.00
int IntMin1(int x)
{
    return -1 - x;
}
;; ------ BASE
G_M50835_IG02:  ;; offset=0x0000
       mov      eax, edx
       neg      eax
       dec      eax
						;; size=6 bbWeight=1 PerfScore 0.75
G_M50835_IG03:  ;; offset=0x0006
       ret      
						;; size=1 bbWeight=1 PerfScore 1.00
;; ------ DIFF
G_M50835_IG02:  ;; offset=0x0000
       mov      eax, edx
       not      eax
						;; size=4 bbWeight=1 PerfScore 0.50
G_M50835_IG03:  ;; offset=0x0004
       ret      
						;; size=1 bbWeight=1 PerfScore 1.00

@github-actions github-actions bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Apr 4, 2026
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Apr 4, 2026
@dotnet-policy-service
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@BoyBaykiller BoyBaykiller changed the title JIT: Transform 'CONST - x' -> 'xor x, CONST' when possible JIT: Transform 'CONST - x' to 'xor x, CONST' when possible Apr 4, 2026
@BoyBaykiller BoyBaykiller marked this pull request as ready for review April 4, 2026 10:27
@BoyBaykiller
Contributor Author

There is one type of regression where GetRangeFromAssertions doesn't handle XOR and we fail to remove a useless JTRUE. I will fix that in a different PR.

@saucecontrol
Member

saucecontrol commented Apr 5, 2026

As a generalization, the rule is that you can implement subtraction as xor any time you can be sure there won't be a borrow, because binary subtraction is defined as an xor combined with borrow propagation.

This is always true when the left operand is a bit pattern of all ones and the right operand is less than or equal to the left. So naturally, it works for subtraction of any integral type if left has all bits set.

It also happens to work when left has all but the MSB set, because in that case, the only borrow could come from beyond the MSB. This is why the optimization works for int.MaxValue even if subtraction is of type uint (meaning right could be larger).

This also means, for example, that with a byte subtraction, 127 - x can always be implemented as xor, provided you then zero-extend the byte, because any borrow will propagate to all the high bits of the register.

The reason I mention this is that if you have a range assertion that can prove right is smaller for any reason, it can apply there too. We have examples where this optimization is done manually in the libraries code:

// LZCNT returns index starting from MSB, whereas BSR gives the index from LSB.
// 31 ^ BSR here is equivalent to 31 - BSR since the BSR result is always between 0 and 31.
// This saves an instruction, as subtraction from constant requires either MOV/SUB or NEG/ADD.
return 31 ^ (int)X86Base.BitScanReverse(value);

@BoyBaykiller
Contributor Author

BoyBaykiller commented Apr 5, 2026

Thanks for the explanation! I was just about to look for the more general case to use in #126553.
Clang seems to catch it: https://godbolt.org/z/e8PjefYKx

Unfortunately I am not sure there is a standard way to get general range information (or an LE assertion) in morph. And it's only run if fgGlobalMorph...

As for handling smaller types: I skipped that because I still need to see what's the proper & easy way to see through this IR that we create for small types and get the actual type as defined in the C# source:

CAST      int <- ubyte <- int
*  LCL_VAR   int    V01 arg1          (last use)

@saucecontrol
Member

saucecontrol commented Apr 6, 2026

what's the proper & easy way to see through this IR that we create for small types

In IL, you'll always have a cast back to the small type after an arithmetic operation, because IL doesn't define those operations for anything smaller than int32. Within JIT, small types may be flagged as normalize on load or normalize on store, so JIT will force zero/sign extension as appropriate there.

Unfortunately I am not sure there is a standard way to get general range information (or LE assertion) in morph.

I'm not terribly familiar with that part of the code, but you can fetch assertions from morph. Here's an example that might lead you to the right parts.

// Small-typed arguments and aliased locals are normalized on load. Other small-typed locals are
// normalized on store. If this is one of the former, insert a narrowing cast on the load.
// ie. Convert: var-short --> cast-short(var-int)
//
if (fgGlobalMorph && lclNode->OperIs(GT_LCL_VAR) && varDsc->lvNormalizeOnLoad() &&
    /* TODO-ASG: delete this zero-diff quirk */ lclNode->CanCSE())
{
    var_types lclVarType = varDsc->TypeGet();
    // Assertion prop can tell us to omit adding a cast here. This is useful when the local is a small-typed
    // parameter that is passed in a register: in that case, the ABI specifies that the upper bits might be
    // invalid, but the assertion guarantees us that we have normalized when we wrote it.
    if (optLocalAssertionProp &&
        optAssertionIsSubrange(lclNode, IntegralRange::ForType(lclVarType), apLocal) != NO_ASSERTION_INDEX)

And you can see some example assertions that could be interesting for this optimization if only they were more specific:

case NI_AVX2_LeadingZeroCount:
case NI_AVX2_TrailingZeroCount:
case NI_AVX2_X64_LeadingZeroCount:
case NI_AVX2_X64_TrailingZeroCount:
case NI_X86Base_PopCount:
case NI_X86Base_X64_PopCount:
    // Note: No advantage in using a precise range for IntegralRange.
    // Example: IntCns = 42 gives [0..127] with a non-precise range, [42,42] with a precise range.
    return {SymbolicIntegerValue::Zero, SymbolicIntegerValue::ByteMax};

I'm not following exactly why the comment says there's no advantage to a more specific range, though.

* use 'IntegralRange::ForNode' for somewhat better range info (could be improved still)
* move to fgOptimizeAddition
@BoyBaykiller
Contributor Author

I've generalized the transformation to catch all cases, given CNS - x and range info for x.
I've switched to IntegralRange::ForNode to attain this range info, but as you said it's somewhat coarse. Apparently there was once a TODO to make it more precise: #106982. I think this should be revisited.

In theory the case you showed would now be handled naturally if the range for BitScanReverse were precise.

Also in order to handle this byte subtract case you mentioned:

byte Byte127(byte x)
{
    return (byte)(127 - x);
}

I need to detect the outer byte cast so I know the range is in [0, 255], which I don't know how to do properly in fgOptimizeAddition. However, 255 - x works.

@BoyBaykiller
Contributor Author

@EgorBo PTAL

If this gets merged, we can justify adding a general RangeOps::Xor to rangecheck (#126553), as that fixes some regressions from this PR.

I think it would also be nice to revisit making IntegralRange able to handle arbitrary ranges instead of just "symbolic constants". With a precise range the JIT would automatically use XOR here, and this code could be simplified to a plain subtract:

// LZCNT returns index starting from MSB, whereas BSR gives the index from LSB.
// 31 ^ BSR here is equivalent to 31 - BSR since the BSR result is always between 0 and 31.
// This saves an instruction, as subtraction from constant requires either MOV/SUB or NEG/ADD.
return 31 ^ (int)X86Base.BitScanReverse(value);

Copilot AI left a comment
Contributor

Pull request overview

This PR improves CoreCLR JIT morph optimizations by recognizing when CONST - x (canonicalized as ADD(NEG(x), CONST)) can be safely rewritten as XOR(x, CONST) using IntegralRange::ForNode range information, and it enables additional VN-based folding in assertion propagation.

Changes:

  • Add an ADD(NEG(x), CONST) => XOR(x, CONST) transform in fgOptimizeAddition guarded by an IntegralRange-based validity check.
  • Update optVNBasedFoldCurStmt to include GT_NOT among the node kinds eligible for VN-based folding.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
src/coreclr/jit/morph.cpp Adds a range-checked transform to rewrite certain constant subtractions into XOR for better codegen.
src/coreclr/jit/assertionprop.cpp Enables VN-based folding for GT_NOT nodes by adding it to the foldable opcode set.

Comment on lines +10438 to +10439
uint64_t mask = UINT64_MAX >> BitOperations::LeadingZeroCount(lo ^ hi);
mask = lo | mask;

Copilot AI Apr 8, 2026


mask computation can invoke undefined behavior when lo == hi (i.e., lo ^ hi == 0): BitOperations::LeadingZeroCount(0) returns 64, so UINT64_MAX >> 64 is an invalid shift in C++. This can miscompile the JIT itself and may incorrectly allow/deny the transform for singleton ranges. Handle the lo == hi case explicitly (e.g., treat the range mask as just lo, or otherwise ensure the shift count is < 64).

Suggested change
uint64_t mask = UINT64_MAX >> BitOperations::LeadingZeroCount(lo ^ hi);
mask = lo | mask;
uint64_t diff = lo ^ hi;
uint64_t mask = lo;
if (diff != 0)
{
    mask |= UINT64_MAX >> BitOperations::LeadingZeroCount(diff);
}

@BoyBaykiller
Contributor Author

@EgorBo


DEBUG_DESTROY_NODE(op1);
auto isSubToXorValid = [=](uint64_t cns, IntegralRange range) {
Member

why do we need a single-use lambda here?

Contributor Author

@BoyBaykiller BoyBaykiller Apr 10, 2026

Just a personal style choice. We will use the same lambda in RangeOps::Xor, so this makes the concept clear and allows us to easily factor it out into a shared function later.

{
    add->ChangeOper(GT_XOR);
    add->gtOp1 = op1->gtGetOp1();
    return fgMorphTree(add);
Member

why the path below just returns the changed node while this calls morph on it?

Contributor Author

We may produce XOR(x, -1) which can be further transformed into NOT(x)

uint64_t cns = (uint64_t)op2->AsIntConCommon()->IntegralValue();
if (isSubToXorValid(cns, range))
{
    add->ChangeOper(GT_XOR);
Member

I assume you need to pass PRESERVE_VN here

Contributor Author

Yeah, ok. I saw enough cases of similar transformations in morph where there was no PRESERVE_VN, so I wasn't sure. Also, should I be using ChangeOper or SetOper? @EgorBo

@@ -6118,6 +6037,7 @@ Compiler::fgWalkResult Compiler::optVNBasedFoldCurStmt(BasicBlock* block,
case GT_LSH:
case GT_RSH:
case GT_RSZ:
case GT_NOT:
Member

how is this related to this PR?

Member

I was trying to check the diffs to find out what exactly this transformation helps with, and from my understanding this change adds additional diffs.

Contributor Author

This PR can produce NOT (from XOR(x, -1)), and that exposed the fact that GT_NOT was being ignored by VN-based (constant) folding. The 29009 libraries_tests_no_tiered_compilation diffs contain an example case:
[screenshot of the diff]

And adding this one line fixes the regression. Although if you really want, I can do it in a separate PR.

add->gtOp2 = op1->AsOp()->gtGetOp1();
if (op2->IsIntegralConst())
{
    // ADD(NEG(x), CONST) => XOR(x, CONST)
Member

what makes it legal if ADD (or NEG?) are in checked context again?

Contributor Author

will fix


return add;
// This mask is an OR of all numbers in [lo, hi]
Member

My understanding is that all we need to check is:

(KnownBits(x) & ~CONST) == 0

Contributor Author

what would KnownBits(x) be?
