Skip to content

Implement breakpoint disassembly support for Intel APX #114120

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 28, 2025

Conversation

BruceForstall
Copy link
Contributor

Teach the amd64 breakpoint disassembler about APX, specifically the REX2 and extended EVEX encodings.

Update the tools to work with newer versions of gcc/gdb, such as handling new gdb output format in the parsing regular expressions.

Due to these newer versions, there are differences in the non-APX tables, apparently due to gcc/gdb bug fixes and improvements (e.g., supporting instructions previously unsupported).

Note that the APX code is untested due to lack of APX hardware. Also, the Windows SDK CONTEXT record does not define APX extended GPR (eGPR) registers yet, so accessing those registers is disabled.

The tables were generated using the following versions of gcc/gdb on Ubuntu 24.04.2 LTS, in WSL2:

gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0
GNU gdb (Ubuntu 15.0.50.20240403-0ubuntu1) 15.0.50.20240403-git

Details:

  1. Change the "createOpcodes.cpp" tool to generate more varieties of possible instructions, to include encoding forms for REX2 and extended EVEX.
  2. Change createOpcodes to always generate 16 bytes of codes. Previously, the parser looked for a "58" followed by "59 pop" to indicate the end of an instruction sequence. This failed in various cases. Note that the longest legal x86 instruction sequence is 15 bytes, as defined by the architecture.
  3. Update the parser and table generation tool (Amd64InstructionTableGenerator.cs) to be able to parse REX2 and extended EVEX instructions, and generate a new EVEX table for EVEX map 4.
  4. The parser was updated to handle new gdb disassembly formats, such as different whitespace usage (spaces versus tabs), and using the "BCST" tag to indicate EVEX embedded broadcast.
  5. The native walker was updated to understand the new tables, including when to use them (thus, it needs to recognize REX2 and extended EVEX formats).
  6. Fixed bugs in existing AVX-512 (EVEX) handling of b, L'L, and pp bits: they were being read from the wrong prefix byte.
  7. There seem to be a couple existing bugs in NativeWalker::Decode which I annotated but did not feel confident fixing: a. the loop to read and process instruction prefixes only reads a single prefix. Thus, a case like 0x66 (operand size) followed by 0x40 (REX) improperly assumes the REX byte is the instruction opcode. b. if the instruction opcode (after the prefix) is 0xcc, DebuggerController::GetPatchedOpcode() is called to read the actual opcode, but it uses the wrong address to do so.

Contributes to #112588

@BruceForstall BruceForstall added area-Diagnostics-coreclr apx Related to the Intel Advanced Performance Extensions (APX) labels Apr 1, 2025
Copy link
Contributor

Tagging subscribers to this area: @tommcdon
See info in area-owners.md if you want to be subscribed.

Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements support for APX breakpoint disassembly by extending the decoder to recognize new instruction encoding forms such as REX2 and extended EVEX (map Evex_4). It also updates regular expressions for modern gcc/gdb disassembly output and revises instruction sample parsing and opcode table generation.

  • Added new encoding flags and map (Evex_4) for extended EVEX.
  • Updated regex patterns and sample parsing logic to accommodate different disassembly formats.
  • Modified debug logging conditions and updated opcode handling for REX2 and EVEX instructions.
Files not reviewed (2)
  • src/coreclr/debug/ee/amd64/gen_amd64InstrDecode/createOpcodes.cpp: Language not supported
  • src/coreclr/debug/ee/amd64/walker.cpp: Language not supported

@@ -1035,7 +1116,7 @@ private void AddOpCode(Map map, int opCodeExt, bool reg, int modrmReg, string ru
else
{
string oldstring = null;
if (Debug.debug)
if (true) // Debug.debug
Copy link
Preview

Copilot AI Apr 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The unconditional debug logging in AddOpCode may lead to excessive log output in production; consider using a conditional check (e.g., Debug.debug) or removing debug statements for production builds.

Suggested change
if (true) // Debug.debug
if (Debug.debug)

Copilot uses AI. Check for mistakes.

@BruceForstall
Copy link
Contributor Author

@BruceForstall
Copy link
Contributor Author

Note on testing: I have done no testing on this. I don't expect to be able to do any testing on the APX impact of this change for quite some time, until we have more available APX hardware. I think we should do whatever testing is appropriate to ensure this doesn't regress (and in fact, improves) existing x64 scenarios. Note that there are some bug fixes here for the existing AVX-512 support.

@tommcdon
Copy link
Member

tommcdon commented Apr 3, 2025

@BruceForstall Thanks for the work on debugger support for Intel APX! Please feel free to reach out to me offline if any assistance is needed with validation.

Copy link
Member

@tommcdon tommcdon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. It's a big change and so we should do some validation.


BYTE prefix = *ip;
if (prefix == 0xcc)
{
prefix = (BYTE)DebuggerController::GetPatchedOpcode(m_ip);
prefix = (BYTE)DebuggerController::GetPatchedOpcode(m_ip); // REVIEW: change `m_ip` to `ip`?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it is critical, but in general since we have a const BYTE *ip = m_ip defined, it makes sense to switch to ip.

ip++;
// REVIEW: it looks like a bug that we don't loop here looking for additional
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good observation! Please feel free to make a second PR changing to while(true) - it looks non-intentional that we would break out of the loop here.

@anthonycanino
Copy link
Contributor

Note on testing: I have done no testing on this. I don't expect to be able to do any testing on the APX impact of this change for quite some time, until we have more available APX hardware. I think we should do whatever testing is appropriate to ensure this doesn't regress (and in fact, improves) existing x64 scenarios. Note that there are some bug fixes here for the existing AVX-512 support.

Thanks for this work Bruce.

Regarding validation, what is typically done for the disassembler? Perhaps we can help evaluate.

Teach the amd64 breakpoint disassembler about APX, specifically the
REX2 and extended EVEX encodings.

Update the tools to work with newer versions of gcc/gdb, such as
handling new gdb output format in the parsing regular expressions.

Due to these newer versions, there are differences in the non-APX
tables, apparently due to gcc/gdb bug fixes and improvements (e.g.,
supporting instructions previously unsupported).

Note that the APX code is untested due to lack of APX hardware. Also,
the Windows SDK CONTEXT record does not define APX extended GPR (eGPR)
registers yet, so accessing those registers is disabled.

The tables were generated using the following versions of gcc/gdb on
Ubuntu 24.04.2 LTS, in WSL2:
```
gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0
GNU gdb (Ubuntu 15.0.50.20240403-0ubuntu1) 15.0.50.20240403-git
```

Details:
1. Change the "createOpcodes.cpp" tool to generate more varieties of
   possible instructions, to include encoding forms for REX2 and
   extended EVEX.
2. Change createOpcodes to always generate 16 bytes of codes.
   Previously, the parser looked for a "58" followed by "59 pop" to
   indicate the end of an instruction sequence. This failed in various
   cases. Note that the longest legal x86 instruction sequence is 15
   bytes, as defined by the architecture.
3. Update the parser and table generation tool
   (Amd64InstructionTableGenerator.cs) to be able to parse REX2 and
   extended EVEX instructions, and generate a new EVEX table for EVEX
   map 4.
4. The parser was updated to handle new gdb disassembly formats, such as
   different whitespace usage (spaces versus tabs), and using the "BCST"
   tag to indicate EVEX embedded broadcast.
5. The native walker was updated to understand the new tables, including
   when to use them (thus, it needs to recognize REX2 and extended EVEX
   formats).
6. Fixed bugs in existing AVX-512 (EVEX) handling of `b`, `L'L`, and
   `pp` bits: they were being read from the wrong prefix byte.
7. There seem to be a couple existing bugs in `NativeWalker::Decode`
   which I annotated but did not feel confident fixing:
   a. the loop to read and process instruction prefixes only reads a
   single prefix. Thus, a case like 0x66 (operand size) followed by
   0x40 (REX) improperly assumes the REX byte is the instruction opcode.
   b. if the instruction opcode (after the prefix) is 0xcc,
   `DebuggerController::GetPatchedOpcode()` is called to read the actual
   opcode, but it uses the wrong address to do so.
@BruceForstall
Copy link
Contributor Author

/ba-g unrelated failures

@BruceForstall BruceForstall merged commit 29e3af6 into dotnet:main Apr 28, 2025
88 of 93 checks passed
@BruceForstall BruceForstall deleted the APXBreakpoints branch April 28, 2025 20:42
@github-actions github-actions bot locked and limited conversation to collaborators May 29, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
apx Related to the Intel Advanced Performance Extensions (APX) area-Diagnostics-coreclr
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants