Instruction bytes change between versions of the executable #1

KulaGGin · 2022-02-16T19:56:27Z

Sigmaker generates a correct byte pattern, but the byte sequence for the instruction itself changes between versions. In my case, the byte sequence for the instruction mov rdx, rcx changed from 48 89 CA to 48 8B D1:

And so when I generate a signature in the version on the left, Sigmaker generates 48 89 CA C1 E8 04 pattern, and it won't find it in the version on the right.

Don't know how often it happens in the wild.

An obvious fix would be to replace all bytes with ?? for instructions that can be represented by different byte sequences. I'm not sure whether it's easy, or even possible, to just ask some assembler "can this instruction be represented by multiple byte sequences?"
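For the register-to-register case in this report, the second encoding can be derived mechanically. x86 has paired opcodes that differ only in the direction bit (89 is MOV r/m, r and 8B is MOV r, r/m), so when ModRM selects two registers, flipping that bit and swapping the reg and rm fields yields the same instruction. The sketch below assumes 64-bit mode, no legacy prefixes, and an illustrative (not exhaustive) opcode table; `alternate_encoding` is a hypothetical helper, not part of Sigmaker.

```python
from typing import Optional

# Opcodes that exist in both "op r/m, r" and "op r, r/m" forms and differ only
# in bit 1 (the direction bit): ADD, OR, ADC, SBB, AND, SUB, XOR, CMP, MOV.
# Illustrative table for 32/64-bit operands only; not exhaustive.
DIRECTION_OPCODES = {
    0x01, 0x03, 0x09, 0x0B, 0x11, 0x13, 0x19, 0x1B, 0x21, 0x23,
    0x29, 0x2B, 0x31, 0x33, 0x39, 0x3B, 0x89, 0x8B,
}


def alternate_encoding(insn: bytes) -> Optional[bytes]:
    """Return the other encoding of a register-to-register instruction.

    Handles only [REX] opcode ModRM with mod == 11 (assumes 64-bit mode,
    where 0x40-0x4F is a REX prefix). Returns None for anything else.
    """
    if not insn:
        return None
    has_rex = 0x40 <= insn[0] <= 0x4F
    start = 1 if has_rex else 0
    if len(insn) != start + 2:
        return None
    opcode, modrm = insn[start], insn[start + 1]
    if opcode not in DIRECTION_OPCODES or (modrm >> 6) != 0b11:
        return None

    reg = (modrm >> 3) & 7
    rm = modrm & 7
    out = bytearray()
    if has_rex:
        rex = insn[0]
        r_bit = (rex >> 2) & 1   # extends the reg field
        b_bit = rex & 1          # extends the rm field
        # Swap REX.R and REX.B because reg and rm trade places.
        out.append((rex & 0xFA) | (b_bit << 2) | r_bit)
    out.append(opcode ^ 0x02)                # flip the direction bit
    out.append(0xC0 | (rm << 3) | reg)       # swap reg and rm
    return bytes(out)


if __name__ == "__main__":
    print(alternate_encoding(bytes.fromhex("4889CA")).hex())  # 488bd1
```

A signature generator could use a check like this to decide which instructions deserve wildcards, although it only covers this one family of equivalent encodings.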


kweatherman · 2022-02-16T20:24:25Z

To be clear, when you say "changes" do you mean between an incremental/updated version of the same target executable?

KulaGGin · 2022-02-16T20:28:55Z

> To be clear, when you say "changes" do you mean between an incremental/updated version of the same target executable?

Yes.

kweatherman · 2022-02-16T21:26:37Z

Therein lies the rub.
I know you do research and whatnot into signaturing, binary diffing, etc., so you might have thought about a lot of this already.

To do this procedurally/programmatically, I see a few possible solutions (more like research directions), depending on the use case.
AFAIK the two main use cases for game hacking (game exploiting, modding, etc.) are:

1. To grab new offsets from one executable module to another for updates.
2. To get offsets to functions and/or data dynamically at runtime.

For the first case, assuming one is still in IDA, we have the luxury of having all the disassembler data.
If one has HexRays, there is some kind of intermediate representation (IR) available.
Otherwise, one could perhaps use mcsema or similar to lift the code into LLVM.
And/or go as far as using more advanced matching techniques based on call graphs, etc.
Then develop a custom signature format that carries more detail, to enable some sort of fuzzy matching.
IDA does internally have a classification (from decode_insn()) where it basically lumps a bunch of x86 opcodes into a type like NN_call or NN_jmp, etc. Maybe this, combined with saving unique operand values, would be enough for a fuzzy matching system.
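A minimal sketch of what such an instruction-level signature might look like, assuming the code has already been decoded into (class, operands) records by some disassembler. The `Insn` and `SigItem` types and the string class names are hypothetical stand-ins for IDA's instruction types; no IDA API is used here.

```python
from typing import List, NamedTuple, Optional, Sequence


class Insn(NamedTuple):
    kind: str        # coarse instruction class, e.g. "mov", "call", "jmp"
    operands: tuple  # decoded operand values (registers, immediates, ...)


class SigItem(NamedTuple):
    kind: str
    operands: Optional[tuple] = None  # None means "any operands"


def matches_at(code: Sequence[Insn], pos: int, sig: Sequence[SigItem]) -> bool:
    """True if every signature item matches the instruction at its position."""
    if pos + len(sig) > len(code):
        return False
    for insn, item in zip(code[pos:pos + len(sig)], sig):
        if insn.kind != item.kind:
            return False
        if item.operands is not None and insn.operands != item.operands:
            return False
    return True


def find_signature(code: Sequence[Insn], sig: Sequence[SigItem]) -> List[int]:
    """Return the indices of all instructions where the signature matches."""
    return [i for i in range(len(code) - len(sig) + 1) if matches_at(code, i, sig)]


if __name__ == "__main__":
    # 48 89 CA and 48 8B D1 both decode to the same record, so the match
    # does not depend on which encoding the compiler picked.
    decoded = [Insn("mov", ("rdx", "rcx")), Insn("shr", ("eax", 4)), Insn("ret", ())]
    sig = [SigItem("mov", ("rdx", "rcx")), SigItem("shr", ("eax", 4))]
    print(find_signature(decoded, sig))
```

Because matching happens on decoded instructions rather than raw bytes, an encoding change alone would not break the signature. The cost is that the target has to be disassembled first.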

For the 2nd use case, since we are probably aiming for speed and don't want to take the time to do disassembly, dynamic code analysis, etc., we're more restricted in the solutions available.

The goal for a good tool is to automate as much of these things as possible.
I think this falls in the realm of statistical analysis over a corpus of before-and-after binary updates: learning how the compiler(s) of interest vary the binary code in response to code changes, etc., so that this could somehow be automated. Then, while making a signature, the tool would know where and when to replace some bytes with ?? wildcards, since it could predict which parts of instructions are likely to change.
With a setup for proper automated feature extraction, this could perhaps be modeled as a machine learning problem.
It's probably a decent-sized research project just to see if this can even be done (plus one would probably learn a lot in the process).

Practically, and for the time being, until research is done on one of these solutions:
What I do, and what I've found others do when asked about their use cases, is manually tweak the signatures, loosening them up with more wildcards.
They compare byte by byte what still matches, replace the bytes that differ with wildcards, and pray that the signature is still unique in the end.
In your example, the only byte that matches between the two encodings is 0x48, so that part would need to be "48 ?? ??" to still match.
If the signature is no longer unique after the change(s), one can probably, hopefully, extend the signature with more bytes until it is again.
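The manual loosening described above can be sketched in a few lines: compare the old pattern with the bytes found at the same spot in the new version, wildcard every position that differs, then check that the result still matches exactly once. The helper names are hypothetical, and the sketch assumes the matching location in the new binary has already been found by hand.

```python
from typing import List, Optional

Pattern = List[Optional[int]]  # None stands for a ?? wildcard


def parse_pattern(text: str) -> Pattern:
    """Parse 'AA ?? BB' style text into a list of byte values and wildcards."""
    return [None if tok in ("?", "??") else int(tok, 16) for tok in text.split()]


def format_pattern(pattern: Pattern) -> str:
    return " ".join("??" if b is None else f"{b:02X}" for b in pattern)


def find_all(data: bytes, pattern: Pattern) -> List[int]:
    """Return every offset in data where the pattern matches."""
    n = len(pattern)
    return [
        i for i in range(len(data) - n + 1)
        if all(p is None or data[i + j] == p for j, p in enumerate(pattern))
    ]


def loosen(pattern: Pattern, new_bytes: bytes) -> Pattern:
    """Wildcard each position where the pattern disagrees with new_bytes.

    Assumes new_bytes has the same length as the pattern.
    """
    return [p if p is not None and p == b else None
            for p, b in zip(pattern, new_bytes)]


if __name__ == "__main__":
    old_sig = parse_pattern("48 89 CA C1 E8 04")
    new_bytes = bytes.fromhex("488BD1C1E804")  # same code in the updated build
    print(format_pattern(loosen(old_sig, new_bytes)))  # 48 ?? ?? C1 E8 04
```

Uniqueness can then be checked with `len(find_all(module_bytes, loosened)) == 1`, where `module_bytes` is the raw code section of the target. If the count is higher, the pattern needs more bytes appended.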
