Fix: extract correct mnemonic on IDA Pro #19
Issue
IDA's GetDisasm() may not be an unambiguous way to extract an instruction mnemonic.
No-operand Mnemonic
GetDisasm() concatenates the comment with the disassembly. On no-operand mnemonics, the comment is concatenated without any separator, whether it's an automatic or a manual comment. For instance:
retn ; I'm out!
GetDisasm() returns "retn; I'm out!"
se_blr # Branch unconditionally
GetDisasm() returns "se_blr# Branch unconditionally"
In jvd/ida/ida_utils.py, extracting the mnemonic with GetDisasm(head).split()[0] produces an extra character in those cases, i.e. retn; and se_blr# respectively, which is incorrect.
Intel Prefixes
Intel prefixes, such as rep in rep stosd and lock in lock xadd op1, op2, are also returned by GetDisasm(). jvd thus extracts the first word, which is the prefix, rather than the mnemonic. Other processors might have such prefixes as well.
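Both failure modes can be reproduced without IDA by applying the same split()[0] expression to the strings quoted above. This is only a minimal sketch: the sample strings are the GetDisasm() outputs reported in this description, and first_word is a hypothetical helper name standing in for the original extraction.

```python
# Disassembly strings as reported above for GetDisasm(); no IDA needed.
samples = [
    "retn; I'm out!",
    "se_blr# Branch unconditionally",
    "rep stosd",
    "lock xadd op1, op2",
]


def first_word(disasm: str) -> str:
    """Mimic the original extraction: first whitespace-separated token."""
    return disasm.split()[0]


for line in samples:
    # Show what the original code would have stored as the mnemonic.
    print(repr(line), "->", repr(first_word(line)))
```

The first two samples yield retn; and se_blr# (comment marker glued on), and the last two yield rep and lock (prefix instead of mnemonic).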
Proposed Fix
Keep the mnemonic from print_insn_mnem(head), collected three lines above.
In the above examples, retn and se_blr are properly returned.
In prefix cases, the fix returns the actual mnemonic only, without the prefix. It seems more semantically correct to return the mnemonic alone than the prefix alone, as the original code does. It is not clear, though, whether the prefix should be returned as part of:
a) mnemonic
b) operands
c) something else
d) none of the above
The proposed fix goes for d), since the prefix is not returned at all.
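If one of the other options is preferred later, the prefix could probably be recovered by comparing the two strings. The sketch below is not part of this fix. split_prefix is a hypothetical helper, and it assumes that disasm is the text returned by GetDisasm(head), that mnem is the value returned by print_insn_mnem(head), and that any prefix appears as whitespace-separated words before the mnemonic.

```python
from typing import Tuple


def split_prefix(disasm: str, mnem: str) -> Tuple[str, str]:
    """Return (prefix, mnem) for one instruction.

    disasm: full disassembly text (assumed to come from GetDisasm(head)).
    mnem:   bare mnemonic (assumed to come from print_insn_mnem(head)).
    """
    leading = []
    for token in disasm.split():
        if mnem and token.startswith(mnem):
            # Reached the mnemonic, possibly with a comment marker glued on.
            return " ".join(leading), mnem
        leading.append(token)
    # Mnemonic not found in the text: report no prefix rather than guess.
    return "", mnem


print(split_prefix("rep stosd", "stosd"))
print(split_prefix("retn; I'm out!", "retn"))
```

With this approach the mnemonic itself always comes from print_insn_mnem(), so the glued comment marker never leaks into it; the caller can then decide whether to store the prefix with the mnemonic, with the operands, or separately.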