[AMD64][APX] Introduce the remaining NCI instructions - CTEST, CFCMOV by Ruihan-Yin · Pull Request #127536 · dotnet/runtime

Ruihan-Yin · 2026-04-28T22:56:12Z

Introduce the renaming APX-NCI instructions: CTESTcc and CFCMOVcc

CTESTcc:

with CTEST, now CCMP reg, 0 can be optimized to CTEST reg, reg, which is self contained and 1-byte smaller, this optimization comes with the PR as well

CFCMOVcc:

There is no CFCMOV optimization in JIT at the moment.

dotnet-policy-service · 2026-04-28T22:57:53Z

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR extends the x64 JIT’s APX support by introducing new conditional instructions (CTEST and CFCMOV), updating EVEX/APX encoding paths in the emitter, and adding targeted emitter unit tests.

Changes:

Add APX instruction definitions for CFCMOV and CTEST (and related instruction-range markers).
Update emitter logic to recognize/encode these new conditional instructions and related EVEX fields (e.g., DFV/NF/ND handling).
Add AMD64 emitter unit tests covering CFCMOV and CTEST encodings and integrate them into the unit test dispatcher.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 7 comments.

Show a summary per file

File	Description
src/coreclr/jit/instrsxarch.h	Adds APX instruction definitions/ranges for CFCMOV + CTEST under TARGET_AMD64.
src/coreclr/jit/emitxarch.h	Updates emitter APIs and adds helpers to classify APX conditional instructions.
src/coreclr/jit/emitxarch.cpp	Implements new APX conditional classification and extends EVEX/APX emission logic (ND/NF/DFV, prefix/layout handling).
src/coreclr/jit/codegenxarch.cpp	Adds CCMP-to-CTEST optimization for compare-with-zero and introduces emitter unit tests for CFCMOV/CTEST.
src/coreclr/jit/codegenlinear.cpp	Wires new “cfcmov” and “ctest” unit test sections into genEmitterUnitTests().
src/coreclr/jit/codegen.h	Declares new emitter unit test entry points and tightens AMD64-only helpers.

Copilot · 2026-04-28T23:00:44Z

+INST3(cfcmovg,          "cfcmovg",          IUM_WR, 0x00004F,    BAD_CODE,    0x00004F,                                  1C,           2X,    INS_TT_NONE,    Reads_OF | Reads_SF | Reads_ZF | INS_Flags_Has_NDD | INS_Flags_Has_NF)
+#define LAST_CFCMOV_INSTRUCTION INS_cfcmovg
+#define FIRST_CTEST_INSTRUCTION INS_ctesto
+INST3(ctesto,            "ctesto",            IUM_RD, 0x000084,    0x00008F6,   BAD_CODE,                                  1C,           2X,    INS_TT_NONE,    Writes_OF | Writes_SF | Writes_ZF | Writes_CF | INS_FLAGS_Has_Sbit)


ctesto uses a different MI encoding value (0x00008F6) than the rest of the CTEST variants (0x00000F6). Since the intent (per emitter logic) appears to be that these variants share the same base encoding and differ only by the SCC field, this inconsistency is very likely a typo and can break emission for ctesto immediate forms. Make ctesto’s MI encoding consistent with the other CTEST instructions (or document why it must differ).

Suggested change

INST3(ctesto, "ctesto", IUM_RD, 0x000084, 0x00008F6, BAD_CODE, 1C, 2X, INS_TT_NONE, Writes_OF | Writes_SF | Writes_ZF | Writes_CF | INS_FLAGS_Has_Sbit)

INST3(ctesto, "ctesto", IUM_RD, 0x000084, 0x00000F6, BAD_CODE, 1C, 2X, INS_TT_NONE, Writes_OF | Writes_SF | Writes_ZF | Writes_CF | INS_FLAGS_Has_Sbit)

Corrected,

to provide more details, there are 2 equivalent forms for the MI form for CTEST

There is no semantical difference between the two, but for consistency, we use /0 variant.

Copilot · 2026-04-28T23:00:44Z

+        if (intConst->IconValue() == 0)
+        {
+            // ctest reg, reg is 1-byte shorter encoding than ccmp reg, 0.
+            instruction ctestIns = (instruction)(ccmpIns + FIRST_CTEST_INSTRUCTION - FIRST_CCMP_INSTRUCTION);


Deriving ctestIns via instruction-enum arithmetic is brittle (it silently depends on the relative ordering and contiguity of CCMP and CTEST instruction enums). Prefer an explicit mapping helper (e.g., derive insCC once and map to the corresponding CTEST instruction via a small table/switch) so future instruction table edits don’t break this transformation.

Suggested change

instruction ctestIns = (instruction)(ccmpIns + FIRST_CTEST_INSTRUCTION - FIRST_CCMP_INSTRUCTION);

// Derive the matching ctest instruction explicitly from the same condition

// instead of relying on the relative enum layout of CCMP/CTEST instructions.

instruction ctestIns = JumpKindToCtest(condDesc.jumpKind1);

added an assert to make sure the offset between FIRST_CCMP_INSTRUCTION and FIRST_CTEST_INSTRUCTION is not changed. This is a simply, 1:1 mapping, probably not necessarily to use a switch table.

But I am open to change the design if other reviewers are inclined to enhance the robustness with the switch table.

Co-authored-by: Copilot <copilot@github.com>

Copilot

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

Co-authored-by: Copilot <copilot@github.com>

Copilot

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.

Copilot · 2026-04-30T19:35:45Z

+            assert((FIRST_CTEST_INSTRUCTION - FIRST_APX_INSTRUCTION) == 32);
+            instruction ctestIns = (instruction)(ccmpIns + FIRST_CTEST_INSTRUCTION - FIRST_CCMP_INSTRUCTION);


This peephole relies on a hard-coded ‘32’ offset and an assert that references FIRST_APX_INSTRUCTION (even though the conversion uses FIRST_CCMP_INSTRUCTION). To make the mapping more robust/clear, prefer a compile-time check (static_assert) on (FIRST_CTEST_INSTRUCTION - FIRST_CCMP_INSTRUCTION) and compute the offset once from those two symbols (removing the magic number and the unrelated FIRST_APX reference).

Suggested change

assert((FIRST_CTEST_INSTRUCTION - FIRST_APX_INSTRUCTION) == 32);

instruction ctestIns = (instruction)(ccmpIns + FIRST_CTEST_INSTRUCTION - FIRST_CCMP_INSTRUCTION);

constexpr int ccmpToCtestOffset = FIRST_CTEST_INSTRUCTION - FIRST_CCMP_INSTRUCTION;

static_assert(ccmpToCtestOffset == 32);

instruction ctestIns = static_cast<instruction>(ccmpIns + ccmpToCtestOffset);

Fix proposed here: #127536 (comment)

Copilot · 2026-04-30T19:35:46Z

                    dst += emitOutputByte(dst, 0x66);
                }
+
+                if (IsCFCMOV(ins))
+                {
+                    // XArch-APX-TODO: JIT does not emit sub-32b CMOV, whether to use 16b operands in CFCMOV is to be
+                    // determined.
+                    break;
+                }


For EA_2BYTE, this emits an operand-size prefix (0x66) and then breaks for CFCMOV, leaving the emitter to continue with a partially-applied size handling. If CFCMOV truly must not be encoded for 16-bit operands, it would be safer to assert/noway_assert before emitting the prefix (or otherwise prevent reaching this case), rather than emitting bytes and then bailing out mid-handling.

Updated the comments.
The reason to break here is that CFCMOV does not follow the normal encoding rule that

A = (B | 0x01) A: opcode for 16b/32b/64b B: opcode for 8b

The handling here is to mostly reuse the emitting infra but make sure CFCMOV gets the right opcode.

Co-authored-by: Copilot <copilot@github.com>

Ruihan-Yin · 2026-04-30T22:43:43Z

Failures look unrelated.

Hi, @tannergooding @kg, please take a look at this PR, thanks :)

introduce the remaining NCI instructions -CTEST, CFCMOV

d0d4999

Copilot AI review requested due to automatic review settings April 28, 2026 22:56

github-actions Bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Apr 28, 2026

dotnet-policy-service Bot added the community-contribution Indicates that the PR has been added by a community member label Apr 28, 2026

Copilot AI reviewed Apr 28, 2026

View reviewed changes

This was referenced Apr 29, 2026

Android arm32 device not found (armeabi-v7a architecture unavailable) #125440

Open

Assertion failed: new_time >= loop->time in Component Governance Detection dotnet/dnceng#6510

Open

Ruihan-Yin and others added 2 commits April 29, 2026 12:04

Formatting.

849f26e

resolving comments.

c181556

Co-authored-by: Copilot <copilot@github.com>

Ruihan-Yin marked this pull request as ready for review April 30, 2026 17:51

Copilot AI review requested due to automatic review settings April 30, 2026 17:51

Copilot AI reviewed Apr 30, 2026

View reviewed changes

Comment thread src/coreclr/jit/emitxarch.cpp

Comment thread src/coreclr/jit/emitxarch.cpp Outdated

Comment thread src/coreclr/jit/emitxarch.cpp Outdated

Comment thread src/coreclr/jit/emitxarch.cpp Outdated

resolve comments

c958535

Co-authored-by: Copilot <copilot@github.com>

Copilot AI review requested due to automatic review settings April 30, 2026 19:30

formatting

e45ac57

Copilot AI reviewed Apr 30, 2026

View reviewed changes

resolve comments.

ffa4778

Co-authored-by: Copilot <copilot@github.com>

	INST3(ctesto, "ctesto", IUM_RD, 0x000084, 0x00008F6, BAD_CODE, 1C, 2X, INS_TT_NONE, Writes_OF \| Writes_SF \| Writes_ZF \| Writes_CF \| INS_FLAGS_Has_Sbit)
	INST3(ctesto, "ctesto", IUM_RD, 0x000084, 0x00000F6, BAD_CODE, 1C, 2X, INS_TT_NONE, Writes_OF \| Writes_SF \| Writes_ZF \| Writes_CF \| INS_FLAGS_Has_Sbit)

-            instruction ctestIns = (instruction)(ccmpIns + FIRST_CTEST_INSTRUCTION - FIRST_CCMP_INSTRUCTION);
+            // Derive the matching ctest instruction explicitly from the same condition
+            // instead of relying on the relative enum layout of CCMP/CTEST instructions.
+            instruction ctestIns = JumpKindToCtest(condDesc.jumpKind1);

		assert((FIRST_CTEST_INSTRUCTION - FIRST_APX_INSTRUCTION) == 32);
		instruction ctestIns = (instruction)(ccmpIns + FIRST_CTEST_INSTRUCTION - FIRST_CCMP_INSTRUCTION);

-            assert((FIRST_CTEST_INSTRUCTION - FIRST_APX_INSTRUCTION) == 32);
-            instruction ctestIns = (instruction)(ccmpIns + FIRST_CTEST_INSTRUCTION - FIRST_CCMP_INSTRUCTION);
+            constexpr int ccmpToCtestOffset = FIRST_CTEST_INSTRUCTION - FIRST_CCMP_INSTRUCTION;
+            static_assert(ccmpToCtestOffset == 32);
+            instruction ctestIns = static_cast<instruction>(ccmpIns + ccmpToCtestOffset);

Conversation

Ruihan-Yin commented Apr 28, 2026

Uh oh!

dotnet-policy-service Bot commented Apr 28, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

Ruihan-Yin Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

Ruihan-Yin Apr 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Copilot AI Apr 30, 2026

Choose a reason for hiding this comment

Uh oh!

Ruihan-Yin Apr 30, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 30, 2026

Choose a reason for hiding this comment

Uh oh!

Ruihan-Yin Apr 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Ruihan-Yin commented Apr 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Ruihan-Yin Apr 29, 2026 •

edited

Loading

Ruihan-Yin Apr 30, 2026 •

edited

Loading