Update RISCV-64 sleigh files to support vector, bit manipulation, and crypto extensions #5778
To fix build errors:
I expect to fill in some gaps in this PR shortly. Scalar crypto extensions were skipped even though vector crypto extensions were added. openssl can use RISCV scalar crypto AES extension instructions but not (yet?) the vector crypto extensions. I also hope to add minimalist pcode semantics to allow decompilation of the simplest GCC-14 RISCV builtin intrinsic vector function examples - as used in Ghidra developers will have some serious design questions to thrash out when GCC-14 autovectorization support lands some time next year.
# Thead semi's extensions currently recognized by binutils objdump
# and documented in https://github.com/T-head-Semi/thead-extension-spec/releases/download/2.0.0/xthead-2022-09-05-2.0.0.pdf
@include "riscv.xthead.sinc"
This should be guarded by ifdef, with the define in a new xthead slaspec file.
That's reasonable. Does the new xthead slaspec file get named in riscv.ldefs so the user can invoke it, or do you suggest we generalize riscv.opinion to look for the Tag_RISCV_arch ELF attribute, recognize the current composition of extensions, and set finer-grain inclusion tags?
Tag_RISCV_arch: "rv64i2p1_m2p0_a2p1_f2p2_d2p2_v1p0_zicsr2p0_zifencei2p0_zmmul1p0_zve32f1p0_zve32x1p0_zve64d1p0_zve64f1p0_zve64x1p0_zvl128b1p0_zvl32b1p0_zvl64b1p0_xtheadfmemidx1p0"
Apparently binutils 2.41 concatenates all of the march extensions passed to gcc, so the tail end of this attribute reads something like:
"This binary requires an ISA supporting:"
- x - a vendor specific Instruction Set Architecture module not currently part of a proposed standard profile
- thead - the lower case vendor name publishing the extension set
- fmemidx - the extension set module name
- 1p0 - the version of this extension set, likely 1.0
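A minimal sketch of how an importer might split the attribute string above into individual extensions. The names `Extension` and `parse_riscv_arch` are hypothetical and not part of Ghidra or binutils; the only assumption about the format is what the example shows, namely underscore-separated tokens that each end in `<major>p<minor>`. Splitting a vendor name such as `thead` from its module name is not attempted, since the string carries no delimiter between them.

```python
import re
from typing import List, NamedTuple, Tuple


class Extension(NamedTuple):
    name: str     # e.g. "zve32f" or "xtheadfmemidx"
    major: int
    minor: int
    vendor: bool  # True when the name starts with "x"


def parse_riscv_arch(arch: str) -> Tuple[int, List[Extension]]:
    """Split a Tag_RISCV_arch string into (xlen, extensions)."""
    tokens = arch.strip().strip('"').split("_")
    base = re.fullmatch(r"rv(32|64|128)(.+)", tokens[0])
    if base is None:
        raise ValueError(f"unrecognized base ISA in {tokens[0]!r}")
    xlen = int(base.group(1))
    tokens[0] = base.group(2)  # "rv64i2p1" -> "i2p1"

    extensions = []
    for token in tokens:
        # Lazy name match: digits inside names such as "zvl128b" stay in the
        # name, and only the trailing "<major>p<minor>" is read as the version.
        m = re.fullmatch(r"([a-z][a-z0-9]*?)(\d+)p(\d+)", token)
        if m is None:
            raise ValueError(f"unrecognized extension token {token!r}")
        name = m.group(1)
        extensions.append(
            Extension(name, int(m.group(2)), int(m.group(3)), name.startswith("x"))
        )
    return xlen, extensions


EXAMPLE_ARCH = (
    "rv64i2p1_m2p0_a2p1_f2p2_d2p2_v1p0_zicsr2p0_zifencei2p0_zmmul1p0_"
    "zve32f1p0_zve32x1p0_zve64d1p0_zve64f1p0_zve64x1p0_zvl128b1p0_"
    "zvl32b1p0_zvl64b1p0_xtheadfmemidx1p0"
)

xlen, extensions = parse_riscv_arch(EXAMPLE_ARCH)
vendor_names = [ext.name for ext in extensions if ext.vendor]
```

With the attribute quoted above, `vendor_names` holds the single vendor entry `xtheadfmemidx`, which is the kind of signal an opinion file or importer could use to pick a language variant.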
Composing standard and vendor extension profiles raises lots of Ghidra import design questions. The binutils team has apparently already addressed these questions - do their answers work for Ghidra?
Update: These design decisions may lead to significant refactoring or code bloat.
- Does every known RISCV-based CPU get its own slaspec file and 5 MB sla file in the Ghidra distribution?
  - even this doesn't always work with chips containing heterogeneous cores
- Should Ghidra scan user-specific directories for additional compiled sla and slaspec files for any RISCV CPU extension combinations individuals find useful?
- Should the Ghidra build-time decisions that generate slaspec files move to run-time actions that generate temporary sla files after parsing the import's ELF attributes?
- Is a new set of runtime ifdef statements allowed in build-time sla files to enable specific extensions at run time?
Personal opinion:
- ratified extensions with non-conflicting opcode codepoints should continue to be included in the baseline 32 bit and 64 bit slaspec and sla files
- Ghidra does not want to be responsible for recognition of proprietary extensions, but these will surely exist. Searching user directories for slaspec and sla 'plugins' should be enabled just as java and python user directories are enabled.
- The baseline ELF importer may be extended to expose ELF attributes on imports, such as the Tag_RISCV_arch file attribute containing aggregate extensions used in each compilation unit.
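One way to picture the ifdef guard suggested in review, combined with the opinion above, is a small generator that emits a variant spec with defines for vendor extensions only. This is a sketch under stated assumptions: the symbol `XTHEAD`, the file name `riscv.base.sinc`, the `KNOWN_VENDORS` table, and the function `variant_slaspec` are all hypothetical. Only `riscv.xthead.sinc` and the Sleigh preprocessor directives `@define`, `@ifdef`, `@endif`, and `@include` are taken as given.

```python
from typing import Iterable

# Hypothetical table of vendor prefixes that have a guarded include.
KNOWN_VENDORS = ("thead",)

# What the guarded include might look like inside the shared spec text.
# XTHEAD is an assumed symbol name, not one defined by the PR.
GUARDED_INCLUDE = (
    "@ifdef XTHEAD\n"
    '@include "riscv.xthead.sinc"\n'
    "@endif\n"
)


def variant_slaspec(base_file: str, extension_names: Iterable[str]) -> str:
    """Return the text of a variant spec that enables vendor extensions."""
    guards = []
    for name in extension_names:
        if not name.startswith("x"):
            continue  # ratified extensions stay in the baseline files
        for vendor in KNOWN_VENDORS:
            symbol = "X" + vendor.upper()
            if name[1:].startswith(vendor) and symbol not in guards:
                guards.append(symbol)
    lines = [f'@define {symbol} "1"' for symbol in guards]
    lines.append(f'@include "{base_file}"')
    return "\n".join(lines) + "\n"


example = variant_slaspec("riscv.base.sinc", ["i", "v", "xtheadfmemidx"])
```

The sketch only produces text; whether such a file is shipped in the distribution, found in a user directory, or generated at import time is exactly the open design question listed above.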
THead extensions are now collected into separate slaspec files, which are now referenced in
@thixotropist I stubbed out some risc-v packed simd instructions and implemented some thead instructions here to fix some issues I had with my work, feel free to cherry-pick the commits if you want.
@madushan1000: Those look good - I'll be happy to cherry-pick them into the branch. Do you have any suggestions for RISCV integration tests to add to https://github.com/thixotropist/ghidra_import_tests? It's currently very weak in 32 bit and microcontroller exemplars, as I've been leaning towards linux-capable 64 bit examples.
This sdk I'm working with has a bunch of rv32 examples: https://github.com/bouffalolab/bouffalo_sdk/tree/master/examples
…a Readme concerning ISA extensions
Fix store sizes for several instructions (My apologies for not noticing the pull request earlier)
I just tried the latest version of thixotropist:isa_ext on the BL808 BootROM, which is E907 (rv32 thead), and so far everything looks okay. I still need to play with it more, but it was enough for everything I needed. Thanks everyone for this effort, I hope it will be possible to merge it at some point.
This is a very large PR that has had some very active periods of development. I've been waiting to make sure that it was complete and stable enough before reviewing it. Looking through it now though, it looks like it should be ready for a review.
Thanks for the comments - I'd like to see this merged too. The PR may still be in triage because it implicitly makes a lot of design decisions regarding pcode op typing and emulation, as well as ISA extension handling. The developers may need more discussion - public and internal - before they are willing to go down that path. The current state of the PR is stable. There are some newer RISCV ISA extensions for fractional floating point ops and saturating math - I don't plan on adding these to the existing PR, so it can be reviewed as is.
As a discussion example, what does the Ghidra community want to see in the decompiler window when working with functions like:
// compiled with RISCV march=rv64gcv, -O3, and -ffast-math
void test_1_ref(unsigned long long *in, unsigned long long *out, unsigned int size)
{
    int i;
    int upper_index = size - 1;
    for (i=0; i < size; i++) {
        out[i] = in[upper_index - i];
    }
}
SIMD or vector extensions can turn simple loops over structures into something not so simple. Type inference, compilable C extraction, and emulation in general are all abandoned with the design approach used in this PR.
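To make the "not so simple" point concrete, here is a small Python model of how a vectorizer might strip-mine the loop above. It is an illustration only, not the code GCC actually emits: the parameter `vlmax` (elements per vector register) and both function names are assumptions, and the load, in-register reversal, and store steps stand in for whatever vector instructions the compiler chooses.

```python
from typing import List


def reverse_copy_scalar(src: List[int]) -> List[int]:
    """Direct transcription of test_1_ref: out[i] = in[upper_index - i]."""
    size = len(src)
    out = [0] * size
    upper_index = size - 1
    for i in range(size):
        out[i] = src[upper_index - i]
    return out


def reverse_copy_strip_mined(src: List[int], vlmax: int = 4) -> List[int]:
    """Same result, processed in strips of at most vlmax elements."""
    size = len(src)
    out = [0] * size
    i = 0
    while i < size:
        vl = min(vlmax, size - i)        # elements handled in this pass
        start = size - i - vl            # strip is taken from the tail of src
        strip = src[start:start + vl]    # models a unit-stride vector load
        strip = strip[::-1]              # models an in-register reversal
        out[i:i + vl] = strip            # models a unit-stride vector store
        i += vl
    return out


data = list(range(10))
same = reverse_copy_strip_mined(data, vlmax=4) == reverse_copy_scalar(data)
```

Even in this toy form the decompiler-facing question is visible: the one-line loop body becomes a length computation, an address calculation from the tail, a permutation, and a store, and the original `out[i] = in[upper_index - i]` relationship is no longer stated anywhere directly.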
Is there anything we can do to help @GhidorahRex with this review? For instance
The associated semantics are only suggestive of the action taken by these instructions
@jobermayr is correct. I've added guards as suggested, with additional guards sensitive to quad floating point support. I'll commit these changes shortly. In general, I've used single-precision registers for half-precision FP ops and double-precision registers for quad-precision FP ops as interim semantics. This raises more general questions:
…ecision FP * Document RISCV datatests * Remove unneeded fp register definition
Add several RISCV Instruction Set extensions to Ghidra, following discussion #5744. This pull request tracks the tip of the binutils testsuite for vector, bit manipulation, and crypto instructions. You can verify the content by importing sample binaries from https://github.com/thixotropist/ghidra_import_tests. Import the RISCV-64 gas test suite, assemble to binary, then iterate on the Ghidra sleigh files until Ghidra and objdump give essentially the same disassembled output.
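The "essentially the same" comparison can be sketched as a small listing diff. This is a hypothetical helper, not part of ghidra_import_tests: it assumes both disassemblies have already been reduced to a mapping from address to instruction text, and that differences in case and spacing are the only ones worth ignoring.

```python
from typing import Dict, List, Tuple


def normalize(text: str) -> str:
    """Lower-case, collapse whitespace, and drop spaces after commas."""
    collapsed = " ".join(text.lower().split())
    return collapsed.replace(", ", ",")


def diff_listings(
    reference: Dict[int, str], candidate: Dict[int, str]
) -> List[Tuple[int, str, str]]:
    """Return (address, expected, actual) for every instruction that differs."""
    mismatches = []
    for address in sorted(reference):
        expected = normalize(reference[address])
        actual = normalize(candidate.get(address, ""))
        if expected != actual:
            mismatches.append((address, expected, actual))
    return mismatches


# Illustrative data only; the second candidate entry models an undecoded word.
objdump_listing = {0: "vsetvli a5,a2,e64,m1,ta,ma", 4: "vle64.v v1,(a0)"}
ghidra_listing = {0: "VSETVLI  a5, a2, e64, m1, ta, ma", 4: "??"}
remaining = diff_listings(objdump_listing, ghidra_listing)
```

Each entry left in `remaining` points at an instruction whose sleigh definition still needs work, which matches the iterate-until-equal workflow described above.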
The sleigh files do not yet include pcode semantics. Recent updates to GCC-14 and libssl using RISCV vector and crypto extensions may give us sample binaries to work with, to see what pcode semantics actually add value with complex instructions like these.