Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PE32 reassembly issues #24

Closed
Trass3r opened this issue Mar 18, 2021 · 12 comments
Closed

PE32 reassembly issues #24

Trass3r opened this issue Mar 18, 2021 · 12 comments

Comments

@Trass3r
Copy link

Trass3r commented Mar 18, 2021

Following up on #11 (comment).

error A2008: syntax error : pushal
error A2008: syntax error : popal
error A2152: coprocessor register cannot be first operand : fdivr ST(0)

Should be pushad/popad and fdivr ST(0),ST(0) (D8 F8).

And there are false positives where it identified RGB data in the resource section as references, I'm not sure if that is even possible:

_RSRC SEGMENT
ALIGN 16
...
          BYTE 077H
          BYTE 077H
          BYTE 077H
          BYTE 000H
          DWORD $L_6b6b6b
          DWORD $L_5f5f5f
          DWORD $L_535353
          BYTE 047H
          BYTE 047H
          BYTE 047H
          BYTE 000H

Also there were issues with name mangling. I had to use ml /link /entry:_EntryPoint instead of /entry:__EntryPoint (it then still resulted in warning LNK4216: Exported entry point __EntryPoint though).

Then it generated import declarations like

EXTERN __imp__CreateFileA:PROC
EXTERN _CreateFileA:PROC

while C:\Program Files (x86)\Windows Kits\10\Lib\10.0.18362.0\um\x86\kernel32.Lib contains stdcall decorated

_CreateFileA@28 __imp__CreateFileA@28
test.obj : error LNK2001: unresolved external symbol __imp__CreateFileA
test.obj : error LNK2001: unresolved external symbol _CreateFileA
@itsmattkc
Copy link

fsubr has the same issue. Instruction is disassembled as fsubr st(0) instead of fsubr st(0), st(0).

@itsmattkc
Copy link

It's worth noting that a lot of these issues appear to be exclusive to 32-bit binaries. When I try to reassemble the ex1 example, it works perfectly with when the executable is compiled 64-bit and reassembled with ml64. However when it's compiled 32-bit and reassembled with ml, that's when errors like error LNK2001: unresolved external symbol appear.

Upon investigation this would make sense because the exports are genuinely different between 32-bit and 64-bit Windows libraries. Here is some of the output of dumpbin /exports for the 32 and 64 bit versions (respectively) of kernel32.lib:
image

I've also had issues on some disassemblies produced by ddisasm (after manually correcting the unresolved external symbol issues) where ml will simply hang indefinitely. This does not occur with a program as simple as ex1, but on some larger disassemblies, ml appears to enter some kind of infinite loop, using CPU cycles but never exiting. I have a sneaking suspicion this is related to floating point instructions too (due to the nature of the binaries that trigger this and other reports of MASM having stability problems with incorrect float instructions), but this is really just a theory.

@kwarrick
Copy link
Contributor

kwarrick commented Jul 19, 2021

@Trass3r

I had to use ml /link /entry:_EntryPoint instead of /entry:__EntryPoint (it then still resulted in warning LNK4216: Exported entry point __EntryPoint though).

That second warning should be fixed now, but the entry point with a single underscore is still necessary as @itsmattkc shows 32-bit and 64-bit export symbols are handled differently.

@itsmattkc

Upon investigation this would make sense because the exports are genuinely different between 32-bit and 64-bit Windows libraries.

Yes, this is something I am still unhappy about, but it appears impossible to provide a symbol name in the assembly for 32-bit that link.exe will resolve correctly.

We have sidestepped the issue by adding a --generate-import-libs command-line argument that will generate .DEF file and call lib.exe to create a .LIB file in your working directory for each import. These will satisfy the linker.

$ cd ddisasm/examples/ex1
$ cl /nologo ex.c
$ ddisasm --asm out.asm --generate-import-libs ex.exe
$ ls *.lib
KERNEL32.lib
$ ml out.asm /link /subsystem:console /entry:_EntryPoint

That should fix all the unresolved external symbol errors.

@kwarrick
Copy link
Contributor

PUSHAL, POPAL, FDIVR, and FSUBR should all be printed correctly now.

@Trass3r
Copy link
Author

Trass3r commented Jul 20, 2021

What about the resource section issue?
And which commit fixed the instructions?

@kwarrick
Copy link
Contributor

kwarrick commented Jul 20, 2021

Can you provide a sample binary for the resource section issue?

The instructions are fixed in gtirb-pprinter:

GrammaTech/gtirb-pprinter@70ecde0
GrammaTech/gtirb-pprinter@701ed49

@kwarrick
Copy link
Contributor

The warning LNK4216: Exported entry point __EntryPoint problem was fixed here:
GrammaTech/gtirb-pprinter@c6af84b

@Trass3r
Copy link
Author

Trass3r commented Jul 27, 2021

We have sidestepped the issue by adding a --generate-import-libs command-line argument that will generate .DEF file and call lib.exe to create a .LIB file in your working directory for each import. These will satisfy the linker.

I see, nice idea. But it does not work on Linux / in the docker container of course: ERROR: Unable to find lib.exe.

@kwarrick
Copy link
Contributor

Yes, that is annoying. With your suggestion of UASM though, we should be able to use uasm and LLVM's lld-link on Linux. That will let us disassemble, reassemble, and link Windows PE without any MSVC dependencies.

I'll give an update when that lands in gtirb-pprinter.

@Trass3r
Copy link
Author

Trass3r commented Aug 3, 2021

Can you provide a sample binary for the resource section issue?

Maybe with an .rc file like:

testrsrc RCDATA
{
	0x77,
	0x77,
	0x77,
	0x00,
	0x6b,
	0x6b,
	0x6b,
	0x00,
	0x5f,
	0x5f,
	0x5f,
	0x00,
	0x53,
	0x53,
	0x53,
	0x00,
	0x47,
	0x47,
	0x47,
	0x00
}

Just with data that looks like an actual .text section address, like 0x401000 or something.

@kwarrick
Copy link
Contributor

kwarrick commented Aug 3, 2021

@Trass3r

Oh, this is clearer to me now. I overlooked the fact the original false positive you showed was in the _RSRC section.

Please correct me if I am wrong, but we should never be symbolizing data in resource data. Right?

@Trass3r
Copy link
Author

Trass3r commented Aug 3, 2021

I think so cause it's pure external data, wouldn't know how to put a reference in there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants