
[Auto-Sync] Generate general instruction encoding format #2152

Closed
Rot127 opened this issue Aug 31, 2023 · 1 comment


Rot127 commented Aug 31, 2023

Generating certain instruction attributes from the td files is sometimes not possible, because those attributes are simply not defined.

Often those attributes are encoded in single flag bits, and the authors of the td file simply treated them as part of the opcode, which makes them inaccessible.

E.g. instead of defining a single instruction with a variable for a flag bit F, they simply defined two instructions: one with the flag bit hard-coded to F = 1 and one with it hard-coded to F = 0.
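To illustrate what that looks like from the outside, here is a minimal sketch. The two 8-bit patterns and the names `FOO_F0`, `FOO_F1` and `hidden_flag_bits` are made up for this example and do not come from any real td file. The hidden flag only shows up as the position where the two fixed patterns disagree.

```python
# Hypothetical 8-bit encodings, written MSB first.
# '0'/'1' are hard-coded bits, '?' marks operand bits.
FOO_F0 = "1010?0??"  # variant with the flag bit hard-coded to 0
FOO_F1 = "1010?1??"  # variant with the flag bit hard-coded to 1


def hidden_flag_bits(a: str, b: str) -> list:
    """Return bit positions (0 = LSB) that are fixed in both patterns but differ."""
    if len(a) != len(b):
        raise ValueError("patterns must have the same width")
    width = len(a)
    return [
        width - 1 - i
        for i, (x, y) in enumerate(zip(a, b))
        if x != "?" and y != "?" and x != y
    ]


flag_positions = hidden_flag_bits(FOO_F0, FOO_F1)
```

Nothing in either definition names the flag, so a generator that only sees the two records has no attribute to emit for it.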

This is a problem because fixing the td files for our use case can be enormously time-consuming and would require many changes to the disassembler and the asm-writer, neither of which is likely to be accepted upstream.

This means we have to put a lot of effort into retrieving this information some other way. Currently we mostly scan the asm-string where possible, or patch the generated files. Neither approach is nice.

Examples:

  • AArch64 post-index detection (string searching).
  • ARM vector data type (the mnemonic suffixes .u32, .i16 etc. of certain vector instructions) (patching ARMGenAsmWriter.inc).
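For reference, the string-searching workaround amounts to roughly the following. This is only a sketch of the idea, not the code actually used. The function name `is_post_index` is hypothetical, and it assumes the usual AArch64 syntax where a post-indexed access writes the offset after the closing bracket (`[x1], #8`), while a pre-indexed one keeps it inside and appends `!` (`[x1, #8]!`).

```python
def is_post_index(op_str: str) -> bool:
    """Guess post-indexing from operand text.

    Assumption: the memory operand is the last bracketed operand, and a
    post-indexed access has ", <offset>" after its closing bracket.
    """
    close = op_str.rfind("]")
    if close == -1:
        return False  # no memory operand at all
    rest = op_str[close + 1:].strip()
    return rest.startswith(",")


post = is_post_index("x0, [x1], #8")
pre = is_post_index("x0, [x1, #8]!")
plain = is_post_index("x0, [x1]")
```

The check depends entirely on how the asm-writer formats operands, which is exactly why it is fragile.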

A possible way to avoid this is by generating the encoding format of the instruction.

Luckily, the encoding format is either already defined (e.g. for PPC or Hexagon) or roughly matches the base class of an instruction.

In LLVM, instructions follow an inheritance hierarchy. Usually the class at the top represents all instructions with a certain encoding format.

Knowing those encoding formats tells us the positions of the relevant bits.
This way, if a certain attribute is not defined in the td files, we can simply test the corresponding bits of the instruction bytes.
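A rough sketch of how a generated encoding format could be used, under some assumptions: the table layout, the field name `idx` and the helper names are hypothetical, and the bit positions are modeled on the AArch64 load/store register (immediate, pre/post-indexed) class, where bits 11:10 are `01` for post-index and `11` for pre-index. The caller is assumed to already know the instruction belongs to this format.

```python
# Field name -> (high bit, low bit), both inclusive.
# Illustrative layout only; "idx" is a made-up name for bits 11:10.
LDST_IMM_IDX = {
    "size": (31, 30),
    "opc": (23, 22),
    "imm9": (20, 12),
    "idx": (11, 10),
    "Rn": (9, 5),
    "Rt": (4, 0),
}


def extract_field(insn: int, fmt: dict, name: str) -> int:
    """Return the value of the named bit field of a 32-bit instruction word."""
    hi, lo = fmt[name]
    mask = (1 << (hi - lo + 1)) - 1
    return (insn >> lo) & mask


def is_post_indexed(insn: int) -> bool:
    """Only meaningful for instructions known to use the LDST_IMM_IDX format."""
    return extract_field(insn, LDST_IMM_IDX, "idx") == 0b01


LDR_POST = 0xF8408420  # ldr x0, [x1], #8
LDR_PRE = 0xF8408C20   # ldr x0, [x1, #8]!
post_result = is_post_indexed(LDR_POST)
pre_result = is_post_indexed(LDR_PRE)
```

With a table like this per format, the post-index check no longer depends on the asm-string at all.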


Rot127 commented Apr 22, 2024

Closed due to: #2045 (comment)

Rot127 closed this as completed Apr 22, 2024