Goblin panics when reading certain kinds of UPXed binaries #36
|
Hello and thanks for the bug report! Sorry this crate didn't work out for you. Pretty sure this is a dup of #28. Should be relatively straightforward to fix, just been busy. The panic is just because it unwraps an option; I believe it's one of the last remaining places that unwraps in a parsing scenario. Unfortunately I definitely don't have time to generate the binaries. From the log you posted though, I'm not sure what's going on with them, but it looks like the magic numbers are bad, or offsets in the binary are bad, which makes me wary that the binaries are bad; would have to see actual stuff tho |
|
Oh, you're trying to parse binaries compressed with this? https://en.wikipedia.org/wiki/UPX Yea, from what I understand, that modifies the on-disk image, which is completely unsupported by goblin, probably forever. I would have to essentially write a UPX decompressor, which isn't really goblin's job. |
A UPX-compressed binary still has to present a certain minimum file structure to the OS to remain executable and leave the first icon resource uncompressed so Windows can display it. My goal is simply to heuristically generate launcher menu entries from a recursive filesystem traversal. That means that I only need best-effort support of the following functionality:
(And given that I don't remember MZ having space for an icon resource while the extra PE structures are wasted bloat in a DOS-only binary, I'm assuming that the DOS (MZ) vs. Windows (PE) distinction should still be externally visible without having to speak UPX.)
If you've got a place I can upload them to, I can offer you a zip of them.
I don't have the time to boot up my Windows XP retro-PC to test them on genuine Windows right now, but Wine doesn't have a problem running all of the panicking Windows ones except the 64-bit one (which only fails because my default Wine prefix is 32-bit for greater compatibility with old games).
I also took the liberty of re-testing the DOS ones that gave
(Every single binary except the |
|
I very much would like to see and test on these binaries, please.
Is emailing a possibility - m4b dot github dot io at gmail dot com ? Also firefox added some 1GB upload thing. Otherwise I don't know how to internet very well :/
Interesting; definitely need to see these binaries.
IIRC, I thought PE binaries were DOS binaries; every "PE" binary has a DOS header, which has a pointer to the PE header if it's PE, and doesn't if it's just DOS. This is the entire optional header part of the parsing, so it should be easy to parse a DOS header and recognize a DOS-only file, iiuc |
|
Actually I just checked the code, and it assumes it's a PE binary by always assuming the PE pointer is valid; porting to parse DOS files should work without much effort |
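For anyone following along, the DOS-vs-PE check described above can be sketched in a few lines of std-only Rust. This is not goblin's code: `ExeKind` and `classify_exe` are hypothetical names, and the logic is deliberately simplified (it only looks at the "MZ" magic, the `e_lfanew` pointer at offset 0x3c, and the signature that pointer lands on). A robust detector would probably want extra sanity checks on old DOS headers, where the bytes at 0x3c may not be a real pointer at all.

```rust
/// Kinds of executable this sketch can tell apart (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum ExeKind {
    /// "MZ" header with no recognised new-style header behind it.
    Dos,
    /// "MZ" header whose e_lfanew points at a "PE\0\0" signature.
    Pe,
    /// "MZ" header whose e_lfanew points at an "NE" signature.
    Ne,
    /// No "MZ" magic at offset 0.
    NotMz,
}

/// Offset of the 32-bit little-endian `e_lfanew` field in the DOS header.
const E_LFANEW_OFFSET: usize = 0x3c;

/// Classify a file image without ever indexing out of bounds.
pub fn classify_exe(bytes: &[u8]) -> ExeKind {
    if bytes.get(..2) != Some(&b"MZ"[..]) {
        return ExeKind::NotMz;
    }
    // A file too short to hold e_lfanew is treated as plain DOS.
    let new_header = match bytes.get(E_LFANEW_OFFSET..E_LFANEW_OFFSET + 4) {
        Some(raw) => u32::from_le_bytes([raw[0], raw[1], raw[2], raw[3]]) as usize,
        None => return ExeKind::Dos,
    };
    // Fetch `len` bytes at the new-header offset, or None if the pointer is bad.
    let sig = |len: usize| {
        new_header
            .checked_add(len)
            .and_then(|end| bytes.get(new_header..end))
    };
    if sig(4) == Some(&b"PE\0\0"[..]) {
        ExeKind::Pe
    } else if sig(2) == Some(&b"NE"[..]) {
        ExeKind::Ne
    } else {
        // Invalid or meaningless pointer: fall back to DOS instead of failing.
        ExeKind::Dos
    }
}
```

The point of the design is that an invalid PE pointer degrades to "just DOS" rather than being trusted.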
I'm doubtful. When I had to send a project to my professor, I wound up getting fed up trying to outwit GMail's "nothing which might be virus-like" detector and just resorting to the release upload functionality on a private BitBucket repo. Give me a sec and I'll see if I can remember how to get Dropbox to generate a share URL. (I haven't touched it since they killed off the Public folder functionality which meant I didn't need to either use the Nautilus file manager or log into their website to construct a valid share URL for something new.)
Yes. All EXEs must have the (Though the more elegant applications like the QEMM97 installer would have both a fully-functional DOS application and a fully-functional Windows application within the same EXE file, similar to how ISO9660 images can contain both an ISO9660 filesystem for DOS and an HFS filesystem for MacOS and they can share as many or as few files as desired.) This delphiDabbler article has probably the best explanation I've seen for how the different Microsoft EXE formats relate to each other and how to tell them apart and parse them (including a flowchart). |
|
Here. I can't guarantee this'll be up until the end of time, but it's only half a megabyte, so I'll leave it in my Dropbox until some point in the indefinite future when I've forgotten why it was there and delete it. https://www.dropbox.com/s/cxrgj4eyfkb69n8/test_exes.zip?dl=0 It contains
|
|
@ssokolow Awesome, got it! Thanks so much for uploading it; investigating the binaries now, in particular: Seems the KERNEL32 import entry rva can't be transformed into an offset into the binary:
Now I need to figure out why |
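For context, here is a rough sketch of what "transforming an RVA into an offset" usually means for a PE file: find the section whose address range contains the RVA and add the distance into that section to its raw file pointer. The `Section` struct and `rva_to_offset` function are made up for illustration and are not goblin's API; the sketch also assumes that only the part of a section backed by raw file data has a file offset, so it returns `None` for anything else instead of panicking.

```rust
/// The subset of a PE section header needed for address translation
/// (hypothetical struct, not goblin's type).
#[derive(Debug, Clone, Copy)]
pub struct Section {
    /// RVA at which the section is mapped in memory.
    pub virtual_address: u32,
    /// File offset of the section's data on disk.
    pub pointer_to_raw_data: u32,
    /// Number of bytes of section data actually present on disk.
    pub size_of_raw_data: u32,
}

/// Translate an RVA to a file offset, or None if no section's on-disk
/// data covers it (for example a section whose raw size is zero).
pub fn rva_to_offset(rva: u32, sections: &[Section]) -> Option<u32> {
    sections.iter().find_map(|s| {
        // Distance into the section; None if the RVA lies before it.
        let delta = rva.checked_sub(s.virtual_address)?;
        if delta < s.size_of_raw_data {
            s.pointer_to_raw_data.checked_add(delta)
        } else {
            None
        }
    })
}
```

A caller can then turn the `None` case into a parse error rather than unwrapping it.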
|
Be aware that Give me a few minutes to figure out which of my PlayOnLinux prefixes are 64-bit and I'll pop open a console in one of them and confirm it there. |
|
OK, confirmed. Both (Correction, one of the two EXEs.) |
|
It's ok, I think I fixed it. Going to push now, onto master, and test some more. The KERNEL32 entry doesn't have an import lookup table, only an address table. I believe this means it requests an import from an absolute address from kernel32 iirc; since iirc windows builds all system binaries with absolute values. And this is not a dup of #28 |
…ping when constructing synthetic imports et. al. add massive amounts of debugging for those dark times. add name() to section.
|
I think this commit fixes the issue, which was that import lookup tables are essentially optional. @ssokolow would you mind verifying on your end? Or if not that's cool too, otherwise will close, thanks so much for the info and files. I believe the magic errors are due to not properly parsing just a DOS file, which would be a simple PR iiuc. I'm not sure about the offset errors coming from the other binaries though, that's unusual |
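To illustrate the "import lookup tables are essentially optional" point, a parser can fall back to the import address table when the lookup table RVA is zero. The sketch below is an assumption about how such a fallback could look, not the contents of the actual commit; `ImportDirectoryEntry` and `thunk_table_rva` are hypothetical names, and the struct only carries the fields needed for the example.

```rust
/// Trimmed-down import directory entry (hypothetical struct for illustration).
#[derive(Debug, Clone, Copy)]
pub struct ImportDirectoryEntry {
    /// RVA of the import lookup table (OriginalFirstThunk); may be 0.
    pub import_lookup_table_rva: u32,
    /// RVA of the DLL name string.
    pub name_rva: u32,
    /// RVA of the import address table (FirstThunk).
    pub import_address_table_rva: u32,
}

impl ImportDirectoryEntry {
    /// Pick the table to read thunks from: prefer the lookup table,
    /// fall back to the address table, and report None if both are absent.
    pub fn thunk_table_rva(&self) -> Option<u32> {
        if self.import_lookup_table_rva != 0 {
            Some(self.import_lookup_table_rva)
        } else if self.import_address_table_rva != 0 {
            Some(self.import_address_table_rva)
        } else {
            None
        }
    }
}
```

Returning `Option` keeps the "neither table present" case visible to the caller instead of hiding it behind an unwrap.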
|
I've never looked into how to reference a git repo from |
|
NP, no rush, I do think it's been resolved though. (And for the record it would be:
[dependencies]
goblin = { git = "https://github.com/m4b/goblin" }
and then you might need to do I need to remove map the unwraps to ok_or in exports, and then i believe PE is completely panic free, fwiw. |
OK, I've just tested commit bd68660 and here's what I found:
That line was a bit difficult to parse, but I assume you meant I ran a quick ripgrep search...
...to look for low-hanging fruit and then tossed out the lines which were in There are still a few other places where only human perfection during refactoring is protecting against denial-of-service attacks via intentionally malformed EXEs:
|
|
Thanks for the list! So I need to check, but most of those are acceptable unwraps, e.g. in debug printers or implementations that state they are allowed to panic. The ones used in PE export are not, specifically unwrapping the option. Those will be easy to fix. As for binaries, thank you! I don't like committing binaries into goblin but I might make a goblin_binaries repo and have it as a submodule for optional tests |
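As a generic illustration of the unwrap-to-`ok_or` change discussed in this thread (not the actual goblin export code), the pattern looks roughly like this. `ParseError`, `read_u32`, `field_or_panic`, and `field_or_error` are all hypothetical names invented for the sketch.

```rust
/// Hypothetical error type standing in for a parser's real error enum.
#[derive(Debug, PartialEq, Eq)]
pub enum ParseError {
    Malformed(&'static str),
}

/// Hypothetical helper: read a little-endian u32 at `offset`,
/// returning None if the slice is too short.
fn read_u32(bytes: &[u8], offset: usize) -> Option<u32> {
    let end = offset.checked_add(4)?;
    let raw = bytes.get(offset..end)?;
    Some(u32::from_le_bytes([raw[0], raw[1], raw[2], raw[3]]))
}

/// Panicking style: a truncated input takes the whole program down.
pub fn field_or_panic(bytes: &[u8]) -> u32 {
    read_u32(bytes, 0).unwrap()
}

/// Fallible style: the same Option is mapped to an error and propagated.
pub fn field_or_error(bytes: &[u8]) -> Result<u32, ParseError> {
    let value = read_u32(bytes, 0).ok_or(ParseError::Malformed("field truncated"))?;
    Ok(value)
}
```

With the second form, a malformed EXE becomes an `Err` the caller can log and skip.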
No problem. :) I just hope that some day, someone with more time than I have decides to implement my idea for a "find all non-whitelisted panics" static analyzer to do that sort of thing properly.
Not ideal, but OK. (It's nice to be able to use such things in your logging or error reporting facilities without having to wrap them in a panic-catcher. Having your handler for unexpected errors unexpectedly error is not nice.)
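For reference, the kind of "panic-catcher" wrapper I mean would look something like the sketch below, using `std::panic::catch_unwind` from the standard library. `parse_without_unwinding` is a hypothetical helper name, and this only works when the program is built with unwinding panics; with `panic = "abort"` there is nothing to catch.

```rust
use std::panic;

/// Run a possibly-panicking closure and turn a panic into an Err
/// carrying the panic message (hypothetical helper for illustration).
pub fn parse_without_unwinding<T, F>(parse: F) -> Result<T, String>
where
    F: FnOnce() -> T + panic::UnwindSafe,
{
    panic::catch_unwind(parse).map_err(|payload| {
        // Panic payloads are usually a &str or a String.
        if let Some(msg) = payload.downcast_ref::<&str>() {
            msg.to_string()
        } else if let Some(msg) = payload.downcast_ref::<String>() {
            msg.clone()
        } else {
            "unknown panic".to_string()
        }
    })
}
```

It works, but the default panic hook still prints to stderr, which is part of why I'd rather the library simply return a `Result`.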
I hope you mean Introducing new "never use this on untrusted input" API surface without good reason never sits well with me.
Again, happy to help. Eventually, when I have time to write a GUI for easily breaking apart Git repos with full history preserved, I'm thinking I might offer up a repo for everyone which just contains the most generally useful collection of test binaries I can think to generate. |
|
Thanks. Once I have time to go through and make a list of the functions to avoid calling to ensure panic-safety, including in my debug logging, I think this is what I'll wind up using in my project. |
|
Just for the record, I hadn't realized how much |
|
The parser uses no unsafe code except in PE array accesses which are easily verified to be safe |


I was investigating my options for parsing EXE files to determine what environment to auto-fill in my experimental game launcher (ie. DOSBox, Wine, Wine+qemu-user, Mono, etc.) and I managed to trigger some panics in goblin.
Here's a backtrace which appears to represent all of the panics:
While this renders it unsuitable for my project (the mere fact that Goblin is capable of dying at an unwrap (when the other PE parser I've tried so far simply used Result to indicate a parse failure) indicates that using it in my project would cause me more worry than simply writing my own MZ/NE/PE parser with Nom), I thought you'd want to know so you can fix the problem for others.
If you want to re-create my test binaries, the source materials are in the test_exes folder of ssokolow/game_launcher and build.sh contains instructions for the simplest, easiest way to install the requisite packages on a *buntu Linux 14.04 LTS machine like mine.
To reiterate what build.sh says, all compilers are optional, so producing just the binaries which caused panics here should only require apt-get install upx-ucl mingw-w64 and then downloading and unpacking OpenWatcom.