Skip to content

Conversation

@jchv
Copy link
Contributor

@jchv jchv commented Jun 8, 2024

About a month ago you mentioned on Hacker News that adding COFF support should be easy and fairly self-contained. You were right.

This is an implementation of the delinker exporter that outputs the Microsoft COFF format. It's still very rough around the edges but I was able to use it to successfully delink and relink an i386 Windows executable. Interestingly, with some careful delinking, I can take a program compiled by MSVC for Windows, delink it to ELF and relink it with glibc to make it run on Linux, and delink it to COFF and relink it with MinGW to make it run on Windows again. 🤯

Feel free to leave a lot of comments. I can try to address comments, though I may take a bit depending on how busy I am. Not sure if you want to merge this upstream but I'm eager to get delinking into the world of Windows reverse engineering as it seems extremely promising for a wide variety of reverse engineering tasks.

Some notes:

  • Only tested for i386; would love to make sure at least AMD64 and AArch64 work. If I could track down old WinNT ports to other architectures, I could try to get some of those working, too.
  • Not tested with UNIX COFF. I don't have a toolchain to try it with. I think it is very plausible that we could make this exporter work for other COFF variants.
  • I exported out some code that seemed sharable between different exporters. I'm not sure if I put them in good places. I am not a Java programmer :)

@boricj
Copy link
Owner

boricj commented Jun 8, 2024

Ooh neat!

Interestingly, with some careful delinking, I can take a program compiled by MSVC for Windows, delink it to ELF and relink it with glibc to make it run on Linux, and delink it to COFF and relink it with MinGW to make it run on Windows again. 🤯

Yeah, delinking tends to do that to one's mind. I don't know what it would do to a CS101 teacher's mind, but I expect either some pretty fireworks or a one-way trip to a psychiatric hospital.

Feel free to leave a lot of comments. I can try to address comments, though I may take a bit depending on how busy I am. Not sure if you want to merge this upstream but I'm eager to get delinking into the world of Windows reverse engineering as it seems extremely promising for a wide variety of reverse engineering tasks.

I definitely want to merge this once it's ready. I expected that object file formats and ISAs to be mostly orthogonal, which is why I went for exporting object files rather than assembly when designing my tooling. This PR proves my theory.

Overall the code itself looks fairly good for a first draft. I'll add code comments later today, but I have some general observations:

  • I'm using conventional commits to generate release notes automatically with Git Changelog. The COFF exporter code should be in a commit titled feat(COFF): relocatable object file exporter, similar to this one.
  • I try to keep commits self-contained and the history clean, so code moved from the ELF exporter to src/main/java/ghidra/app/util/ should be separated into multiple refactor: move xxx class out of ElfRelocatableObjectExporter commits. Similarly, the COFF x86 relocation mapper should be in its own commit, like this one and this one. Ideally, the commit adding the COFF exporter should only contain the COFF exporter itself and no other changes.
  • I'd like to have some COFF tests to ensure some level of regression testing. I found out that it's way too easy to silently break the delinker in non-obvious ways when making changes, so this is the only way I've found to stop my Ghidra extension from collapsing on itself.
  • I have CICD running on this repository and the PR needs to pass it successfully in order to merge it.

I can deal with these issues myself if you need (and in particular I'll handle the tests myself, I need to make some adjustments to the test harness to support COFF anyways). This is the first ever non-trivial PR I'm receiving on one of my repositories, so this is kind of uncharted territory for me. I'll try to make the process as smooth as possible.

Some notes:

  • Only tested for i386; would love to make sure at least AMD64 and AArch64 work. If I could track down old WinNT ports to other architectures, I could try to get some of those working, too.
  • Not tested with UNIX COFF. I don't have a toolchain to try it with. I think it is very plausible that we could make this exporter work for other COFF variants.
  • I exported out some code that seemed sharable between different exporters. I'm not sure if I put them in good places. I am not a Java programmer :)
  • I expect that adding x86_64 would be easy since I already have a x86 relocation synthesizer and CISC ISAs are easy to analyze. RISC ISAs are trickier because the lack of pointer-sized immediate constants requires building them out of multiple instructions (HI16/LO16). Branch delay slots are a nightmare to deal with, but I've pulled it off with MIPS. I don't know about EPIC or VLIW ISAs, so who knows what pitfalls lurk inside Itanium...
  • One COFF variant that might be interesting to look into would be DJGCC, used to build programs that run on top of MS-DOS extenders (it's not a requirement at all for this PR). I don't know enough about the other COFF variants to have an opinion on them.
  • There's probably more code that can be shared, but it's one of those things that are obvious only in hindsight. So far I've mostly worried about supporting multiple ISAs than multiple object file formats. At any rate, I can deal with that at a later time.

@boricj boricj mentioned this pull request Jun 8, 2024
@jchv
Copy link
Contributor Author

jchv commented Jun 8, 2024

I'm rebasing right now. As for the CI failure... Apparently, Ghidra prior to a very recent version had an annoying typo in the CoffSymbolSectionNumber code, so I'll probably just replace N_UNDEF with zero for now.

Copy link
Owner

@boricj boricj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fairly good so far, here's the list of changes needed before I can merge this.

@boricj
Copy link
Owner

boricj commented Jun 8, 2024

I'm rebasing right now. As for the CI failure... Apparently, Ghidra prior to a very recent version had an annoying typo in the CoffSymbolSectionNumber code, so I'll probably just replace N_UNDEF with zero for now.

Yeah, at some point I'll need to revisit the range of Ghidra versions I should support, going all the way back to 10.3 probably no longer makes sense now. I'm fine with dropping N_UNDEF for now.

@boricj boricj self-assigned this Jun 8, 2024
@jchv jchv force-pushed the coff branch 3 times, most recently from 0feeaa5 to b543dc1 Compare June 8, 2024 20:03
@jchv
Copy link
Contributor Author

jchv commented Jun 8, 2024

OK, tried to address all of that. Just ran through it and I think I got the last of it. I'd be interested in getting this working with MIPS and other systems too, though yeah I'd favor keeping it out of the scope of this particular PR, since it's already pretty large.

Let me know if there's anything else you spot before merging. I checked with Organize Imports and spotless and I think all of the formatting issues are resolved.

@jchv
Copy link
Contributor Author

jchv commented Jun 8, 2024

It looks like tests are passing, but I think because its running from a fork PR it is running into some kind of permission issue? Not 100% sure.

@jchv jchv requested a review from boricj June 8, 2024 20:09
@boricj
Copy link
Owner

boricj commented Jun 8, 2024

I'd be interested in getting this working with MIPS and other systems too, though yeah I'd favor keeping it out of the scope of this particular PR, since it's already pretty large.

Sure, the MIPS relocation type mapper can come later, it's not necessary for x86 operation. My ELF object file exporter can tolerate a missing relocation mapper (no relocation would be emitted in that case), I won't require that from the COFF object file exporter before merging it.

Let me know if there's anything else you spot before merging. I checked with Organize Imports and spotless and I think all of the formatting issues are resolved.

It should be fine. I've seen some opportunities to harmonize and refactor code with the ELF exporter, but I won't require these to be addressed before merging. In fact, I will probably wait until I have a third object file exporter before doing that, it's still not obvious what would be actually good candidates for refactoring.

It looks like tests are passing, but I think because its running from a fork PR it is running into some kind of permission issue? Not 100% sure.

The tests do pass, they only cover the existing ELF exporter and relocation synthesizers, nothing covers the COFF exporter yet. It seems that the test report artifacts can't be created inside the GitHub Actions run itself. It looks like I'll have to set up a split workflow on my end to fix this. I'll ping you when it's done (I'll need that PR rebased on top of master for a linear history before merging, but there's no need to do that as long as I haven't sorted this out).

@boricj boricj force-pushed the master branch 2 times, most recently from f8ee82a to b6e21aa Compare June 8, 2024 21:43
@boricj
Copy link
Owner

boricj commented Jun 8, 2024

You can try rebasing your PR on top of the master branch. I think it's sorted out, but the runs on the PR won't use the updated workflows on my master branch as-is.

@jchv
Copy link
Contributor Author

jchv commented Jun 8, 2024

Rebased. Looks like the workflow still needs to be approved though.

FWIW, if you want to merge this some other way outside the normal GitHub PR flow, feel free.

@boricj
Copy link
Owner

boricj commented Jun 8, 2024

Looks like everything's sorted out. I'll merge the PR tomorrow (it's getting a bit late here), after playing with the COFF exporter on my side for a bit.

@boricj
Copy link
Owner

boricj commented Jun 9, 2024

I rebased and merged it into master.

Fixing the N_UNDEF constant ultimately led me to upgrade the extension to target Ghidra 11.1, which required a bunch of changes. If you need to work with older Ghidra versions I can add the COFF exporter on the v0.3.x branch, but I'd rather work on just the master branch if I can help it.

I'll look into adding some COFF tests and maybe add the MIPS relocation type mapper myself.

@boricj boricj closed this Jun 9, 2024
@jchv
Copy link
Contributor Author

jchv commented Jun 9, 2024

Oh, that's cool, thanks! I'm using NixOS unstable which is currently on Ghidra 11.0.3 still, so I think I'll need to wait a couple weeks most likely. That said, it's not too big of a deal since in the meantime I can continue using my local version.

@jchv jchv deleted the coff branch June 9, 2024 15:17
@gynt
Copy link

gynt commented Jun 10, 2024

This is so cool, nice!

@boricj boricj mentioned this pull request Jul 2, 2024
13 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants