Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PE code cave disassembly #327

Closed
mwilliams31 opened this issue Apr 4, 2019 · 13 comments

Comments

@mwilliams31
Copy link

commented Apr 4, 2019

Multiple ShadowHammer samples decrypt their shellcode via a function called from ___crtExitProcess(). The screenshot below is MD5 cdb0a09067877f30189811c7aea3f253.
exit_proc

SUB_0051b908 cannot be viewed because it lies within a 511-byte code cave within the .text section. Ghidra sets the .text section boundary based on the VirtualSize (0x0011A801) as shown in the screenshot below. The RawSize of the .text section is 0x0011AA00.
ghidra_mem_map

For reference, IDA Pro's segmentation is shown below. It appears to treat everything up to the next section as part of the .text section and attempts to disassemble it.
ida_mem_map

Expanding the .text section in Ghidra's memory map does not cause it to re-read the file's bytes and undefined bytes are displayed. Outside of modifying the sample's PE header, is there a way to force Ghidra to read and disassemble these bytes?

@mwilliams31 mwilliams31 added the question label Apr 4, 2019

@ryanmkurtz

This comment has been minimized.

Copy link
Collaborator

commented Apr 4, 2019

No, there is currently not a way to force that. You could write a script that does it though! You'd have to reread in the file because Ghidra doesn't keep around the bytes it didn't load into the address space.

So the extra bytes at the end of the section in the file are actually code and not just all zeros? Do you know if that's common?

@ryanmkurtz ryanmkurtz self-assigned this Apr 4, 2019

@mwilliams31

This comment has been minimized.

Copy link
Author

commented Apr 4, 2019

Thanks for the quick reply Ryan. Correct, the executable bytes at the end of the .text section (0x401000-0x51b800) start at file offset 0x11ac02 (VA: 0x51b802) and end at offset 0x11add5 (VA: 0x5ab9d5). The remaining raw section bytes are NULLs. This type of code execution is uncommon but the technique (code caving) has been around for some time.

@ryanmkurtz

This comment has been minimized.

Copy link
Collaborator

commented Apr 4, 2019

No need to read the bytes from the file then...you could just initialize your new memory block to 0 when you create it. This would be a pretty simple script to write.

@mwilliams31

This comment has been minimized.

Copy link
Author

commented Apr 4, 2019

Any thought about changing the way Ghidra loads sections to account for executable code in caves for executable sections?

@ryanmkurtz

This comment has been minimized.

Copy link
Collaborator

commented Apr 4, 2019

We'll have to discuss how to best handle this generically. I'll leave this open as an enhancement request.

@ryanmkurtz ryanmkurtz added enhancement and removed question labels Apr 4, 2019

@0x6d696368

This comment has been minimized.

Copy link

commented Apr 8, 2019

I propose (regardless of this issue) the following:
On the import dialog (where you can select Format, Language, etc.) should be an option to manually adjust the PE header and section fields.

Here is how I currently import the ShadowHammer binary:

  1. Import the PE binary as Format: Binary
  2. Run PE annotation analysis.
  3. Change the header fields. In this case set .text length (both virtual and raw) to 0x11b000.
  4. Ctrl+A to select the whole binary.
  5. "Extract and Import ...".
  6. On that import dialog select Format: PE

Hope that helps.

A reasonable improvement would also be to mimic the platform loader, i.e., load the binary like Windows would load it. In this case load the PE code cave despite it not being within a section and hence not being loaded.
Isn't there some alignment for PE sections? It is quite possible that Windows follows that alignment but Ghidra doesn't hence not loading the code.

Anyway just my 2 cent.

@0x6d696368

This comment has been minimized.

Copy link

commented Apr 8, 2019

So the extra bytes at the end of the section in the file are actually code and not just all zeros? Do you know if that's common?

The extra bytes are code. It's a small shellcode. It is basically the whole malicious part of the binary.
I assume it was placed in that code cave with good reason. It's a malware technique.

Not sure how common, because IDA automatically will include the code outside the .text section into the analysis. But if you are dealing with malware things like that are to be expected and a good tool should handle them.

@recvfrom

This comment has been minimized.

Copy link

commented Apr 9, 2019

From running cdb0a09067877f30189811c7aea3f253 with a debugger, I confirmed that the extra bytes get loaded into memory. I'm wondering if this happens when the VirtualSize aligned to the SectionAlignment becomes greater than SizeOfRawData, as @0x6d696368 hinted at.

From a set of 10,000 "clean" PE files I downloaded from VirusTotal a few months back, there were ~7,100 with an executable section where VirtualSize < SizeOfRawData and SizeOfRawData < Aligned(VirtualSize, SectionAlignment), but only 67 had extra bytes that weren't all zeroes. Expanding this search to include data sections as well, there were ~2,300 where the extra data wasn't all zeroes. By far, the biggest culprit was the rsrc section.

I'd be in favor of having Ghidra load as many bytes as the Windows loader does, as @0x6d696368 suggested. It will take more research to identify the exact logic that it uses, though.

@0x6d696368

This comment has been minimized.

Copy link

commented Apr 9, 2019

I looked further into this and here is the bug:

int rawDataSize = sections[i].getSizeOfRawData();

int virtualSize = sections[i].getVirtualSize();

It does not do the required alignment as per https://docs.microsoft.com/en-us/windows/desktop/debug/pe-format

What needs to happen:

  • SectionAlignment (in IMAGE_OPTIONAL_HEADER32):
The alignment (in bytes) of sections when they are loaded into  memory. It must be greater than or equal to FileAlignment. The default  is the page size for the architecture.
  • FileAlignment (in IMAGE_OPTIONAL_HEADER32):
The alignment factor (in bytes) that is used to align the raw data  of sections in the image file. The value should be a power of 2 between  512 and 64 K, inclusive. The default is 512. If the SectionAlignment is  less than the architecture's page size, then FileAlignment must match  SectionAlignment.
  • VirtualSize:
The total size of the section when loaded into memory. If this value  is greater than SizeOfRawData, the section is zero-padded. This field  is valid only for executable images and should be set to zero for object  files.
  • SizeOfRawData
The size of the section (for object files) or the size of the  initialized data on disk (for image files). For executable images, this  must be a multiple of FileAlignment from the optional header. If this is  less than VirtualSize, the remainder of the section is zero-filled.  Because the SizeOfRawData field is rounded but the VirtualSize field is  not, it is possible for SizeOfRawData to be greater than VirtualSize as  well. When a section contains only uninitialized data, this field should  be zero.

The code at

int rawDataSize = sections[i].getSizeOfRawData();

could be changed to:

int rawDataSize = ( (sections[i].getSizeOfRawData() + fileAlignment - 1) / fileAlignment) * fileAlignment)

And the code at

int virtualSize = sections[i].getVirtualSize();

could be changed to:

int virtualSize = ( (sections[i].getVirtualSize() + sectionAlignment - 1) / sectionAlignment) * sectionAlignment)

I think that would fix the problem.

@ryanmkurtz can you implement the above fix? If not I can try to setup a build environment this weekend and implement it.

@recvfrom

This comment has been minimized.

Copy link

commented Apr 9, 2019

Since SizeOfRawData is used to know how many bytes exist in the section on disk, I can't imagine we'd want to increase this value and then read in more bytes than the PE says that that section contains, in the case where it's not already aligned to FileAlignment. We should see what the Windows loader does in this case.

For one binary in my test set, the SectionAlignment is zero, which would trigger a divide-by-zero error in this alignment code. We should guard against hitting that case, and also see what the Windows loader does with that (maybe it uses the architecture page size instead, for instance).

@d-millar

This comment has been minimized.

Copy link
Collaborator

commented Apr 9, 2019

OK, so @0x6d696368's analysis is right on the money. I either overlooked this when I was doing our malware PE triage or someone yanked the code. Note that in FileHeader.processSections, lines 313-315, we're doing the right thing, i.e. modifying sizeOfRawData using PortableExecutable.computeAlignment to permanently modify the value based on fileAlignment. We don't have the requisite fix for virtualSize however, namely using sectionAlignment to modify its value. Probably, processSections is the wrong place to do this. Note the extra logic in PeLoader.getVirtualSize. I think the alignment needs to be done after all the other logic there. Also, a little sketchy having this logic done (potentially) over and over in a getter method. I know, it's only called by PeLoader.processMemoryBlocks and is a private method, but....

I would recommend maybe renaming getVirtualSize, adding the logic to re-align based on sectionAlignment, and calling SectionHeader.setVirtualSize to modify the value permanently a la the code in processSections. Also, per @recvfrom's comment, we need a zero check on alignment in PortableExecutable.computeAlignment.

I suppose it would also make sense to create some JUnits to check this stuff out. When I originally started fuzzing these files, I relied heavily on Ange Albertini's (aka corkami) PE documentation. He has numerous files exhibiting aberrant PE behavior, which would make for good test cases. Note though, some of them are likely to trip AV sensors.

@0x6d696368

This comment has been minimized.

Copy link

commented Apr 9, 2019

Since SizeOfRawData is used to know how many bytes exist in the section on disk, I can't imagine we'd want to increase this value

I guess it depends. That's why it would be nice to have a manual override. This would make my workaround of import as RAW, then modify header, then "Extract and Import ..." obsolete. And would also allow for fixing corrupted PE headers.

For one binary in my test set, the SectionAlignment is zero, which would trigger a divide-by-zero error in this alignment code. We should guard against hitting that case, and also see what the Windows loader does with that (maybe it uses the architecture page size instead, for instance).

Yes. That's bad.

Anyone know any Windows loader research that already answers all the questions of how Windows handles PE headers that are out of spec?

I suppose it would also make sense to create some JUnits to check this stuff out. When I originally started fuzzing these files, I relied heavily on Ange Albertini's (aka corkami) PE documentation. He has numerous files exhibiting aberrant PE behavior, which would make for good test cases. Note though, some of them are likely to trip AV sensors.

+1 for unit tests. Radare2 has some good regression tests.

0x6d696368 pushed a commit to 0x6d696368/ghidra that referenced this issue Apr 12, 2019
0x6d696368
fix issue NationalSecurityAgency#327: Align virtualSize while importi…
…ng PE files; fixed division by zero if PE has fileAlignment = 0

This fixes all issues of NationalSecurityAgency#327.
The PE loading now behaves almost like the Windows nativ loader. The only difference is that we truncate the uninitialized data. The line to remove to gain full Windows PE loader behaviour (at least to the extend I observed it) is noted in a comment.
@0x6d696368

This comment has been minimized.

Copy link

commented Apr 12, 2019

@d-millar Thanks for the hints into the code.

I prepared PR #418

It aligns virtualSize in processSections. This is analogous to the alignment of sizeOfRawData.
I then truncate virtualSize to the maximum of virtualSize or sizeOfRawData so the analysis doesn't end up with lots of uninitialized memory areas. While this truncation isn't 1:1 what the Windows PE loader does, it doesn't lose any data. The only thing that could happen is that data references to these uninitialized memory areas can not be resolved.
We can debate whether to remove the truncation or not.

Regarding doing exactly what the Windows PE loader does this is it now. Ghidra however loads a lot of binaries that Windows simply rejects as "invalid PE files", e.g. I also fixed the div by 0 in the PortableExecutable.computeAlignment (treating it as no alignment at all), which means that Ghidra loads those PEs as well. Windows simply won't recognize them as valid so we can't check how Windows would load them.

Anyway, anyone any hints how to make a .jar with only the changes to place into Ghidra/patch/? So I can give this fix to people that can't compile from source without having to send the full base.jar.

0x6d696368 pushed a commit to 0x6d696368/ghidra that referenced this issue Apr 26, 2019
0x6d696368 pushed a commit to 0x6d696368/ghidra that referenced this issue Apr 26, 2019

@ryanmkurtz ryanmkurtz added this to the 9.1 milestone May 31, 2019

@ghidra1 ghidra1 closed this in 69f8247 Jun 3, 2019

ghidra1 pushed a commit that referenced this issue Jun 3, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.