Getting the address of a varnode (aka instruction operand) #4606

kkaempf · 2022-09-19T18:42:30Z

(rephrased to better match sleigh terminology)

I'm working on a processor description for VAX and would need to get the address of an instruction operand.

VAX has one-byte opcodes followed by operands with variable (1 to 5 bytes) length.

Examples (not exact mnemonics)

one-byte opcode, two one-byte operands

00000000: 90 01 50 - MOVE.B S^1, R0

one-byte opcode, one two-byte operand, one four-byte operand

00000000: 90 CF 34 12 E0 78 56 34 12 - MOVE.B (PC+0x1234), (R0 + 0x12345678)

Example 2 is the problem. The first operand ("CF 34 12") is PC-relative, it computes PC+0x1234, where PC is right after the final "12" value. In the example above, that would result in 0x1238.
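The arithmetic for example 2 can be checked with a small sketch. It assumes the standard VAX operand specifier layout (high nibble = addressing mode, low nibble = register, register 0xF = PC) and only handles the three plain displacement modes. The function name `pc_relative_target` and its parameters are made up for illustration and are not part of Ghidra or sleigh.

```python
# Displacement sizes (in bytes) for the VAX displacement addressing modes,
# keyed by the high nibble of the operand specifier byte.
DISPLACEMENT_SIZE = {0xA: 1, 0xC: 2, 0xE: 4}  # byte, word, longword


def pc_relative_target(code: bytes, operand_offset: int, inst_start: int = 0) -> int:
    """Return the effective address of a PC-relative displacement operand.

    operand_offset is the position of the operand specifier byte,
    counted from the start of the instruction (hypothetical parameter).
    """
    specifier = code[operand_offset]
    mode, reg = specifier >> 4, specifier & 0x0F
    if reg != 0xF or mode not in DISPLACEMENT_SIZE:
        raise ValueError(f"not a PC-relative displacement operand: {specifier:#04x}")
    size = DISPLACEMENT_SIZE[mode]
    first = operand_offset + 1  # first displacement byte
    displacement = int.from_bytes(code[first:first + size], "little", signed=True)
    # PC points right after the last displacement byte of *this* operand.
    pc = inst_start + first + size
    return pc + displacement


example = bytes.fromhex("90CF3412E078563412")
print(hex(pc_relative_target(example, operand_offset=1)))  # prints 0x1238
```

Note that the function needs `operand_offset` as an input, which is exactly the value the question is about.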

Problem

To compute PC-relative offsets correctly, I need to know the operand's memory address. However, neither inst_start nor inst_next is usable here:

I can't use inst_start because the operand might be second and I don't know the size of the first operand.
I can't use inst_next because the operand might be first and I don't know the size of the second operand.
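To make the dependency concrete, here is a rough sketch of how operand positions fall out of the sizes of the preceding operands. It covers only the addressing modes that appear in the two examples (short literal, register, and byte/word/longword displacement). `operand_size` and `operand_offsets` are hypothetical helper names, not Ghidra or sleigh functions.

```python
def operand_size(specifier: int) -> int:
    """Size in bytes of one operand, including its specifier byte."""
    mode = specifier >> 4
    if mode <= 0x3:          # short literal (S^): value is in the specifier
        return 1
    if mode == 0x5:          # register mode
        return 1
    displacement = {0xA: 1, 0xC: 2, 0xE: 4}  # byte, word, longword displacement
    if mode in displacement:
        return 1 + displacement[mode]
    raise NotImplementedError(f"addressing mode {mode:#x} is not covered by this sketch")


def operand_offsets(code: bytes, operand_count: int, opcode_size: int = 1):
    """Return (offset of each operand from instruction start, instruction length)."""
    offsets = []
    offset = opcode_size
    for _ in range(operand_count):
        offsets.append(offset)
        offset += operand_size(code[offset])
    return offsets, offset


print(operand_offsets(bytes.fromhex("900150"), 2))              # prints ([1, 2], 3)
print(operand_offsets(bytes.fromhex("90CF3412E078563412"), 2))  # prints ([1, 4], 9)
```

In the second example the second operand starts at offset 4 only because the first operand is three bytes long, so neither the start (0) nor the end (9) of the instruction identifies an operand's position on its own.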

Are there any other options ?

Example for NationalSecurityAgency/ghidra#4606: it should compute the EA after the W^ offset has been read (PC at 0x04). Signed-off-by: Klaus Kämpf <kkaempf@gmail.com>

gtackett · 2022-09-23T13:12:58Z

Wow! Implementing the VAX instruction set in Ghidra sounds like a very large task, certainly well beyond my level. But as a longtime VMS user and hobbyist, I'd sure love to see the end product.

(Likewise for the PDP-11, if anyone is interested in taking up that architecture.)

kkaempf · 2022-09-23T13:51:51Z

> Wow! Implementing the VAX instruction set in Ghidra sounds like a very large task, certainly well beyond my level. But as a longtime VMS user and hobbyist, I'd sure love to see the end product.

😊

> (Likewise for the PDP-11, if anyone is interested in taking up that architecture.)

It's on my list (now that I have some basic understanding of sleigh 😆 )

GhidorahRex · 2022-09-23T17:09:38Z

Since this is static within the instruction, I think context might work well here. Use something like op_addr = (0,3) noflow

For the opcode, use [ op_addr = 1; ]

Then for each operand, you know how big it is, so you can increment op_addr each time: [ rel_addr = inst_start+op_addr; op_addr = op_addr + bytes; ] with bytes being however many bytes are consumed by the current operand.

kkaempf · 2022-09-23T18:03:44Z

Thanks, something similar was my initial attempt.

During parsing, op_addr was computed correctly, but it seems as if the final computation is done after the instruction is completely matched. So every operand came out with the same op_addr (the one set by the last operand).

However, I didn't specify noflow - need to check this out.

Thanks again ! Will report here 😉

GhidorahRex · 2022-09-23T18:37:58Z

You may need to add a globalset as well? There are possibly some other shenanigans that need to be employed, but I'm pretty confident this can be done with just context.

GhidorahRex · 2022-09-29T14:27:22Z

Were you able to get this to work?

kkaempf · 2022-10-03T06:45:29Z

Sorry, I was out last week.

> For the opcode, use [ op_addr = 1; ]

Not clear where to use this, as the opcode is a field (and I can't add disassembly actions to it, can I ? 🤔)

I tried setting the offset (aka op_addr) for each instruction in a branch. Resetting the op_addr for each instruction works that way, but makes disassembly like 10 times slower :-(

kkaempf · 2022-10-03T12:05:13Z

> For the opcode, use [ op_addr = 1; ]

> Not clear where to use this, as the opcode is a field (and I can't add disassembly actions to it, can I ? 🤔)

Solved this with a non-visible operand

op_code: epsilon is epsilon [ op_addr = 1; ] { export epsilon; }

Works nicely, as the op_addr value gets reset when I add ..; op_code; .. to the bit pattern section.

However, when computing operands, every operand gets the final op_addr value (after all operands are parsed) instead of the value at the respective operand position.
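The symptom can be pictured with a toy model. This is not Ghidra's actual code. It only assumes, as the observations in this thread suggest, that the counter is advanced while the instruction is being matched and that operand values are read afterwards. Both function names are made up for illustration.

```python
def values_from_shared_counter(operand_sizes, opcode_size=1):
    """Toy model of the symptom: one counter shared by all operands."""
    counter = opcode_size
    # Pass 1: match the whole instruction, advancing the shared counter.
    for size in operand_sizes:
        counter += size
    # Pass 2: compute operand values; each one reads the same final counter.
    return [counter for _ in operand_sizes]


def values_from_per_operand_snapshot(operand_sizes, opcode_size=1):
    """What is wanted instead: each operand keeps the value seen at its own position."""
    snapshots = []
    counter = opcode_size
    for size in operand_sizes:
        counter += size
        snapshots.append(counter)  # remembered per operand, not overwritten
    return snapshots


sizes = [2, 2, 2]  # three two-byte operands after a one-byte opcode
print(values_from_shared_counter(sizes))        # prints [7, 7, 7]
print(values_from_per_operand_snapshot(sizes))  # prints [3, 5, 7]
```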

kkaempf · 2022-10-21T16:54:24Z

I now created a minified VAX processor description to visualize the problem better.

I use lifting-bits disassembler with this binary:

81 af 00 af 00 af 00

It should disassemble to

ADDB3 B^0x3, B^0x5, B^0x7

but doesn't.

Each operand disassembles to the same value. 😞
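As a cross-check of the expected output, the following sketch walks the seven test bytes by hand. It assumes every operand is a byte-displacement PC-relative specifier (0xAF followed by one signed displacement byte), which holds for this binary only. `byte_displacement_targets` is a hypothetical helper, not part of any disassembler.

```python
def byte_displacement_targets(code: bytes, operand_count: int, inst_start: int = 0):
    """Resolve consecutive B^ (byte displacement, PC-relative) operands."""
    targets = []
    offset = 1  # skip the one-byte opcode
    for _ in range(operand_count):
        if code[offset] != 0xAF:
            raise ValueError(f"expected specifier 0xAF at offset {offset}")
        displacement = int.from_bytes(code[offset + 1:offset + 2], "little", signed=True)
        offset += 2  # specifier byte plus displacement byte
        # PC is the address right after this operand, not after the instruction.
        targets.append(inst_start + offset + displacement)
    return targets


code = bytes.fromhex("81af00af00af00")
targets = byte_displacement_targets(code, 3)
print("ADDB3 " + ", ".join(f"B^{t:#x}" for t in targets))
# prints ADDB3 B^0x3, B^0x5, B^0x7
```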

kkaempf · 2022-10-31T11:38:43Z

I've now tried all kinds of combinations of context, noflow, globalset etc. All give the same result: when exporting the result, I get the final value (after all operands have been processed) and not the intermediate ones.

This doesn't come as a surprise to me since ghidra has to process all operands twice. Once for computing inst_next and then again for computing the disassembled values (which might include inst_next).

kkaempf · 2022-10-31T11:50:44Z

I've solved it now by introducing an operand_offset variable.

(Adding printf statements to Ghidra pointed me to the right places, especially by showing that ParserWalker's value retrieval functions were called twice: once reading a 4-byte value to match against the disassembler spec, and once reading correctly sized values to compute the correct disassembly values.)

See f9a8788 for the C++ part and ecc24c7 for the Java part.

operand_offset is modeled like inst_start but with a different getValue() implementation:

inst_start has

Address addr = walker.getAddr();
return addr.getAddressableWordOffset();

operand_offset has

return walker.getOffset(-1);

This works nicely and fixes the issue at hand.

kkaempf · 2023-03-21T16:25:30Z

Will #4812 be considered now ? 🥺

ryanmkurtz · 2023-03-21T16:35:38Z

Sorry, I didn't realize this was tied to those PRs.

jbglaw · 2023-06-27T18:11:39Z

I found this issue because I searched for "VAX". I'm interested in this! However, I don't yet have any clue about Ghidra or Java. Is there any way to help with VAX support?

Just to add a comment: The VAX ISA does have one-byte and two-byte opcodes. So relying on them as one-byte long would be wrong. And then there's a ton of addressing modes. Plus the oddity of the CASE* instructions. Alas... I really would love to help here. It would be great to have something that helps to dissect machine ROMs or system binaries.

kkaempf · 2023-06-28T08:47:04Z

Hey @jbglaw , Ghidra VAX support is (mostly) done - except for #4812 😞 .

If you want to build from source, check out the vintage branch at https://github.com/kkaempf/ghidra-vintage

I'm also maintaining RPM packages for openSUSE Tumbleweed.

kkaempf · 2023-06-28T08:51:29Z

> I found this issue because I searched for "VAX". I'm interested in this! However, I don't yet have any clue about Ghidra or Java. Is there any way to help with VAX support?

Please check out and contribute to https://github.com/kkaempf/ghidra-vax 😉

> Just to add a comment: The VAX ISA does have one-byte and two-byte opcodes. So relying on them as one-byte long would be wrong. And then there's a ton of addressing modes. Plus the oddity of the CASE* instructions.

This all should be working in ghidra.vax

> Alas... I really would love to help here. It would be great to have something that helps to dissect machine ROMs

I'm already working on ROMs and I'd be happy to collaborate on http://ghidra-server.org/

jbglaw · 2023-06-28T09:23:52Z

Well, I just requested an account on ghidra-server.org. Let's see.

OTOH, as I'm a 100% newbie to Ghidra, my first step should be to get it running. Source builds seem to be not too trivial with Debian as it's missing build tools (at least in the requested version.) And then there's that one outstanding patch. Are there chances those will be merged? At least it doesn't look as if it would break anything else.

kkaempf · 2023-06-28T11:45:53Z

> Well, I just requested an account on ghidra-server.org. Let's see.

🤞🏻

> OTOH, as I'm a 100% newbie to Ghidra, my first step should be to get it running. Source builds seem to be not too trivial with Debian as it's missing build tools (at least in the requested version.)

If you're not afraid of downloading binaries (like gradle) on your machine, building should be as simple as

gradle \
  -Dfile.encoding=UTF-8 \
  --project-prop finalRelease=true \
  buildNatives_linux64

This will give you a .tar file which you can extract locally and start ghidra from there.

> And then there's that one outstanding patch. Are there chances those will be merged?

Ghidra (the project) is generally slow in merging outside contributions :-/

> At least it doesn't look as if it would break anything else.

Certainly not. It's just exposing a value that is already tracked internally.

jbglaw · 2023-06-28T12:07:44Z

gradle is the issue here. But I think I'll give it a try in a Docker container. Maybe wrap a script around it to have a nice recipe for getting the final tarball.

jbglaw · 2023-06-28T12:08:22Z

So let's hope that this other PR is merged, and thereafter maybe the VAX CPU description. I'll try to get it working locally. :)

jbglaw · 2023-06-29T09:58:36Z

Successfully built Ghidra (plain upstream sources, though with buildGhidra instead of buildNatives_linux64). The resulting ZIP file contains a working Ghidra. Next step is to pull in your patch and the VAX CPU description.

kkaempf changed the title ~~Getting the address of a varnode (aka instruction parameter)~~ Getting the address of a varnode (aka instruction operand) Sep 20, 2022

GhidorahRex added Type: Question Further information is requested Feature: Sleigh Feature: Processor/Unsupported labels Sep 23, 2022

GhidorahRex self-assigned this Sep 23, 2022

kkaempf added a commit to kkaempf/ghidra-vintage that referenced this issue Sep 23, 2022

[wip] Introduce inst_middle to address NationalSecurityAgency#4606

fd4bfe8

Signed-off-by: Klaus Kämpf <kkaempf@gmail.com>

kkaempf linked a pull request Oct 21, 2022 that will close this issue

prevent out-of-bounds access in findSymbol() #4681

Open

kkaempf linked a pull request Dec 11, 2022 that will close this issue

Introduce operand offset (C++ and Java) #4812

Open

ryanmkurtz closed this as completed Mar 21, 2023

ryanmkurtz reopened this Mar 21, 2023

ryanmkurtz removed the Type: Question Further information is requested label Mar 21, 2023

GhidorahRex assigned caheckman and unassigned GhidorahRex Jun 28, 2023

ryanmkurtz removed the Feature: Processor/Unsupported label Jan 23, 2024
