Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix parsing PE with zero raw_data_size of section #396

Merged
merged 7 commits into from
Oct 27, 2024
Merged

Conversation

ideeockus
Copy link
Contributor

  1. Sometimes I need to parse PE header without parsing DOS header + DOS stub

  2. Seems that finding offset by rva is not implemented correctly and breaks down on some PE files. See 'cannot map exception_rva (0x5e000) into offset' when parsing PE32+ library #307

Copy link
Owner

@m4b m4b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems like there are two changes, one adding a new public function, and the other fixing a bug in the rva helper function, yes? need to understand why you need the new pub function, but the other fix probably is ok, though i would prefer a test showing how it fixes a bug

src/pe/header.rs Outdated
@@ -569,6 +569,29 @@ impl Header {
optional_header,
})
}

pub fn parse_without_dos(bytes: &[u8]) -> error::Result<Self> {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs documentation if pub; i also want to understand what's different here and why this can't be the original function for parsing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@m4b Hello! This bug occurs when parsing file AmdCleanupUtility https://www.amd.com/en/resources/support-articles/faqs/GPU-601.html. Should I add .exe file to tests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need to understand why you need the new pub function

Do you mean error on DosHeader::parse(&bytes)? should be handled inside original Header::parse?
And try to parse without DosHeader in case of Error?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I'm asking is why is a new function being added, when does the user call it, and why can't we handle this transparently inside of the regular parse function? Ideally I'd prefer not to add another pub function if possible; maybe it is not possible, I don't know

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can handle this transparently inside of the regular parse function, but

  1. users may expect that parse returns error when DOS stub is missing
  2. there are existing tests that check invalid stub will not parse
  3. when it is known that we have not DOS stub in buffer, it makes sense to use parse_without_dos

Copy link
Contributor Author

@ideeockus ideeockus Jul 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just added docstrings for this two functions, is it ok now?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok thank you for clarifying when this happens, and why to use; i've suggested grammar for the documentation of parse_without_dos

Copy link
Owner

@m4b m4b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for your patience! this is the last round, to summarize:

  1. documentation grammar fixes I suggested
  2. making the private impl function would be nice

and then this is ready to go, thank you!

src/pe/header.rs Outdated
@@ -569,6 +570,30 @@ impl Header {
optional_header,
})
}

/// Parses PE header from the given bytes that not contains DOS stub, default DosHeader will be used
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

grammar: Parses PE header from the given bytes, a default DosHeader and DosStub are generated, and any malformed header or stub is ignored.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for suggestion, fixed

src/pe/utils.rs Outdated
@@ -64,6 +64,8 @@ fn section_read_size(section: &section_table::SectionTable, file_alignment: u32)

if virtual_size == 0 {
read_size
} else if size_of_raw_data == 0 {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm a little worried about this; this function is used all over goblin, can you add a comment why this check is done, and why it returns the virtual_size in this case? also there are no tests added to verify anything here

Copy link
Contributor Author

@ideeockus ideeockus Jul 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

analogically to virtual_size == 0 -> used read_size, if read_size == 0 -> used virtual_size. i saw pefile does this and there are no errors in parsing the AMD sample, so i added this check too and it works

I think this is not normal, but can happen on different types of modified files, so parser should handle this.
(also found doc about PE malformations https://www.virusbulletin.com/uploads/pdf/conference_slides/2007/CaseySheehanVB2007.pdf)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also added test + sample for this case: took file from https://www.amd.com/en/resources/support-articles/faqs/GPU-601.html + packed with upx.

is it ok if i add this file to tests?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea what the license of that binary is 😅 so probably let's not add it for now; also, it seems to parse just fine for me without this patch? e.g.:

cargo run --example rdr -- ~/Downloads/amdcleanuputility.exe

has no issue, neither does bingrep

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parse amdcleanuputility.exe itself is fine. But if binary will packed with upx, parsing goes broken, while binary can be run properly.

Ok, test removed from PR

src/pe/header.rs Outdated
@@ -569,6 +569,29 @@ impl Header {
optional_header,
})
}

pub fn parse_without_dos(bytes: &[u8]) -> error::Result<Self> {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok thank you for clarifying when this happens, and why to use; i've suggested grammar for the documentation of parse_without_dos

src/pe/header.rs Outdated
@@ -543,6 +543,7 @@ pub struct Header {
}

impl Header {
/// Parses PE header from the given bytes
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you for adding documentation here; can you add the part about how we will fail if dos stub or header is wrong?

e.g.:

Parses PE header from the given bytes; this will fail if the DosHeader or DosStub is malformed or missing in some way

src/pe/header.rs Outdated
Comment on lines 577 to 595
let mut offset = dos_header.pe_pointer as usize;
let signature = bytes.gread_with(&mut offset, scroll::LE).map_err(|_| {
error::Error::Malformed(format!("cannot parse PE signature (offset {:#x})", offset))
})?;
let coff_header = CoffHeader::parse(bytes, &mut offset)?;
let optional_header = if coff_header.size_of_optional_header > 0 {
let opt_hdr = bytes.pread::<optional_header::OptionalHeader>(offset)?;
Some(opt_hdr)
} else {
None
};

Ok(Header {
dos_header,
dos_stub: DosStub::default(),
signature,
coff_header,
optional_header,
})
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should have said this before, but all of this is basically identical, it might be nice to make a private function:

fn parse_impl(bytes: &[u8]), header: DosHeader, stub: DosStub) -> Self {
  // highlighted section
}

and then the two pub functions call this respectively (your new function supply Default to the header and stub args)

@ideeockus ideeockus requested a review from m4b September 2, 2024 07:28
Copy link
Owner

@m4b m4b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok this looks great, sorry for the delay! If you just delete the binary (and test), we can merge this; we can even put it on a minor release of 0.9 if you want, since it has no breaking changes. There is another change upcoming that will need to rebase against this, or you rebase against them, i'd prefer the first, since we can minor release it if want, and also your PR has been outstanding longer :)

@m4b
Copy link
Owner

m4b commented Oct 27, 2024

@ideeockus just fyi, in the next breaking release 0.10, the pe::Header will have a lifetime parameter, just curious, will this be ok for your usecases? If people need owned versions I was thinking of some ways to allow this, but if it's not readily needed, I won't think too hard about it

@m4b m4b merged commit f401d18 into m4b:master Oct 27, 2024
6 checks passed
@m4b
Copy link
Owner

m4b commented Oct 27, 2024

ok published in 0.9.2, thanks for your contribution!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants