PE imphash does not match YARA, VirusTotal, pefile #299
Comments
The code below produces the same imphash as the other tools for the sample file above. I suggest reviewing the get_imphash code to ensure it follows the standard defined by Mandiant here, as there may be discrepancies between LIEF and the other tools:
Note that I get inconsistent results when testing with another file:
Looking further into this, it appears that the discrepancy comes from LIEF having a more up-to-date ordinal table mapping than pefile: https://github.com/lief-project/LIEF/tree/master/src/PE/utils/ordinals_lookup_tables. Here is pefile's table for reference: https://github.com/erocarrera/pefile/tree/master/ordlookup. I'm not sure how to resolve this without getting someone from Mandiant involved to provide guidance.
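To illustrate how the ordinal table alone can change the result, here is a minimal sketch of an imphash-style calculation in Python. It is not LIEF's or pefile's actual code: the `imphash` helper, the `(library, symbol)` input shape, and both sample tables are hypothetical, and the `ord<N>` fallback for unmapped ordinals is an assumption about how pefile-style implementations behave.

```python
import hashlib

# Extensions stripped from the library name before hashing (assumed, per the
# commonly described imphash recipe).
STRIPPED_EXTENSIONS = ("dll", "ocx", "sys")


def imphash(imports, ordinal_table):
    """Hash imports given as (library, name_or_ordinal) pairs, in table order.

    ordinal_table is a hypothetical mapping {(library, ordinal): name}.
    """
    parts = []
    for library, symbol in imports:
        lib = library.lower()
        stem, dot, ext = lib.rpartition(".")
        if dot and ext in STRIPPED_EXTENSIONS:
            lib = stem
        if isinstance(symbol, int):
            # Import by ordinal: resolve through the table, or fall back to
            # a synthetic "ord<N>" name when the table has no entry.
            name = ordinal_table.get((library.lower(), symbol), f"ord{symbol}")
        else:
            name = symbol
        parts.append(f"{lib}.{name.lower()}")
    # MD5 is used only as a fingerprint here, not for security.
    return hashlib.md5(",".join(parts).encode()).hexdigest()


# Illustrative import list: one import by name, one by ordinal.
sample_imports = [("KERNEL32.dll", "ExitProcess"), ("WS2_32.dll", 115)]

# Two illustrative tables: one lacks the ordinal, the other maps it.
older_table = {}
newer_table = {("ws2_32.dll", 115): "WSAStartup"}

print(imphash(sample_imports, older_table))
print(imphash(sample_imports, newer_table))
```

The two printed digests differ even though the input file is the same, which matches the behaviour described above: any tool with a different ordinal table will disagree on files that import by ordinal and agree on files that import only by name.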
Hello,
@romainthomas No problem. Based on some private conversations I've had, I believe the best way to move forward is to treat LIEF's imphash calculation as its own implementation of the imphash spec. VirusTotal, YARA, and pefile may be using their own variations of the imphash spec, and any changes among them will break backward compatibility. I'd suggest that if users really want the other variants, they can implement them themselves (YARA's ordinals can be found here, pefile's ordinals can be found here, no clue what VirusTotal uses but I think it may be some version of pefile). I'd also suggest we leave this issue open in case any folks from Mandiant would like to add their thoughts.
@jshlbrd that seems reasonable. though, i'd recommend that we document clearly that LIEF imphash != pefile imphash != XXX imphash. chatting with people internally, it sounds like there are no plans to further tweak the algorithm. i think the feeling is that the algorithm works well as-is, and though updates could be made to the ordinal mapping, the algorithm is still deterministic. practically speaking, if this mapping is updated, then everyone that relies on the implementation must re-index their dataset. regarding what we use and to quote a colleague:
Ok, sounds good to me.
sample attached
Describe the bug
The imphash calculated by lief.PE.get_imphash() does not match the imphash calculated by other tools. Here's an example:
To Reproduce
Compare the output of lief.PE.get_imphash() to that of the other tools mentioned above.
Expected behavior
The imphash output of lief.PE.get_imphash() matches other tools commonly used in the industry.
Environment (please complete the following information):