Relative calls lifting #4

tathanhdinh · 2019-03-07T15:48:33Z

Describe the bug
The LowIR of relative calls may not be correct.

To Reproduce
The instruction

e8 e9 fa ff ff        call -0x517

at the concrete address 0 is lifted into

-------------ISMark (0, 5)-------------
T_0:I64 := 0xFFFFFFFFFFFFFAEE:I64
T_1:I64 := 0x5:I64
RSP := (RSP - 0x8:I64)
[RSP] := T_1:I64
RIP := T_0:I64
-------------IEMark (5)-------------

Expected behavior
The instruction is a relative call, so the target PC should be relative to the next instruction, i.e. RIP := RIP - 0x517 + 5 (where 5 is the size of the instruction). Moreover, the value being pushed into the stack should be calculated relatively with RIP, i.e. [RSP] = RIP + 5.

Environment:

Windows 7 64 bit
.NET Core version: 2.2.200
B2R2 version: master:HEAD

Additional context
PR #5 is a dirty fix where B2R2 lifts this instruction to:

-------------ISMark (0, 5)-------------
T_0:I64 := (RIP + 0xFFFFFFFFFFFFFAF3:I64)
T_1:I64 := (RIP + 0x5:I64)
RSP := (RSP - 0x8:I64)
[RSP] := T_1:I64
RIP := T_0:I64
-------------IEMark (5)-------------

Thoughts
B2R2 currently computes directly the target using the current "concrete" address of the instruction and the containing basic block (then there is no need to use a symbolic PC).

In this call instruction, B2R2 pushes the returned "concrete PC" into the stack (e.g. [RSP] := 0x5) as well as compute the next "concrete PC".

The PR #5 makes B2R2 symbolizes the PC, then [RSP] := PC + 0x5. I think it would be a "more expected" behaviour in situations where we need a symbolique PC (e.g. when we need to formally prove something which should always be true regardless of the value of PC).

PS. The issue is just a personal point of view and may not a bug but I cannot set the label myself, sorry.

The text was updated successfully, but these errors were encountered:

sangkilc · 2019-03-08T02:03:45Z

Thanks for the report. In fact this is not a bug, the target address computation is based on the concrete address of the instruction itself as you indicated. Do you have any specific reason why RIP should appear in the lifted IR? IMHO, it will complicate the formula anyways.

tathanhdinh · 2019-03-08T16:31:23Z

Thanks you for the response.
There're cases where we need a symbolique RIP, for example in the snippet:

0x0        e8 fb ff ff ff        call -0x5
0x5        58                    pop rax

which is more or less mov rax, rip.

Because of RIP is always concrete, then it's not possible to prove rax == rip in LowIR; and it has effects on optimization and caching.

For example, in implementing a DBI we may get several the same basic block at different addresses, a method to accelerate DBI is to optimize this basic block and put the result into a cache. Given the basic block (well, it's not because of call and ret but just an example)

0x0        53                    push rbx
0x1        c3                    ret
0x2        e8 f4 ff ff ff        call -0xc
0x7        58                    pop rax
0x8        48 31 d8              xor rax, rbx

an optimizer would be able to know that rax = 0 at the end of the block; but it's not possible yet in current B2R2.

sangkilc · 2019-03-08T21:44:07Z

0x0        e8 fb ff ff ff        call -0x5
0x5        58                    pop rax

From your example above, the call instruction will jump to itself, and it will go infinite loop. So I think you wanted to say something like:

0x0        e8 00 00 00 00        call 0x5
0x5        58                    pop rax

Now, this is effectively the same asmov rax, 0, where 0 is the address of the first instruction. If the first instruction is at 0x42, then the above snippet will be lifted as mov rax, 0x42, and so on.

We still know that rax will be always the same as the address of the first instruction (call instr), and an optimizer can benefit from it regardless of the fact that we have the symbolic RIP or not. Let me know if I am missing something.

tathanhdinh · 2019-03-09T08:45:00Z

Thanks you for the response.

...the call instruction will jump to itself, and it will go infinite loop. So I think you wanted to say something like:

I feel ashamed :(, your example is correct, this is indeed call 0x5 (or call 0 under Intel's assembly).

Yes, the optimizer will reduce this snippet into mov rax, 0x5 (instead of mov rax, rip + 0x5). But that's the problem that I want to say about: if we have many blocks of:

call 0x5
pop rax

at different base addresses, then we can not reduce them into a general formula rax = rip + 0x5: the only way we can do is to enumerate one by one so that the optimizer will re-optimize each block with corresponding base rip.

sangkilc · 2019-03-09T11:16:00Z

Sorry but I am not still sure about "reducing multiple of those call-pop pairs".

Suppose we have consecutive call-pop pairs like this:

0  call 5
5  pop rax
6  call 11
11 pop rax
12 call 17
17 pop rax

Now, we can translate this into

mov rax, 5
mov rax, 11
mov rax, 17

Now, with what you have mentioned, we would have:

mov rax, rip
mov rax, rip
mov rax, rip

But this is wrong, because RIP will change as you convert the instruction to mov rax, rip. The address of the instruction will change, and the corresponding RIP will also change. This means, after executing the above three instructions, we will have different a value in rax. So, to me, it is more accurate to use concrete addresses.

tathanhdinh · 2019-03-09T13:26:19Z

Thanks you for the response, I really appreciate that.

Sorry for my bad explanation, this is not what I mean in saying replace the "call-pop" by the single mov rax, rip + 0x5, and your example

mov rax, rip
mov rax, rip
mov rax, rip

is indeed a very good example to say that it will not work.

What I mean is by symbolizing RIP, we get a general formula rax = base_address + 0x5. Though "call-pop" is a too trivial example to see the effect, we can imagine a more sophisticated basic block where the result on of registers depending on the base address, and there are many copies of it with different base addresses.

An emulator (using LowIR as internal IR and optimizing with symbolic PC) will have chance to directly compute the result without replaying each concrete basic block.

sangkilc · 2019-03-09T22:03:39Z

Thank you for the explanation. I now see what you mean. I think such a global optimization can definitely benefit, but we may not change our design decision at this point for several reasons:

Having explicit RIP for every relative branch instructions makes ASTs unnecessarily big. It is just one more node for one instruction, but overall there are too many relative branches in a program.
The optimization you mentioned is difficult to implement in reality, and we are not sure if the optimization is critical for the efficiency of an analyzer or an emulator.
Making RIP explicit is never difficult, but requires some carefully modification on the entire front-end of our system.

So all in all, If we see more needs on this, we may resurrect this issue and the PR, but for now, we will not handle this issue. Hope this makes sense to you, and thanks for the fruitful discussion.

FileInfo.EntryPoint now returns 0UL when there is no entry point. See GitLab #4

tathanhdinh added the bug Something isn't working label Mar 7, 2019

tathanhdinh mentioned this issue Mar 7, 2019

[Intel] Fix relative call lifting #5

Closed

sangkilc assigned sijung07 Mar 7, 2019

sangkilc added question Further information is requested and removed bug Something isn't working labels Mar 8, 2019

sangkilc closed this as completed Mar 9, 2019

sangkilc added a commit that referenced this issue Mar 23, 2019

[BinFile] Fix EntryPoint semantics

6d2deb5

FileInfo.EntryPoint now returns 0UL when there is no entry point. See GitLab #4

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Relative calls lifting #4

Relative calls lifting #4

tathanhdinh commented Mar 7, 2019 •

edited

Loading

sangkilc commented Mar 8, 2019

tathanhdinh commented Mar 8, 2019 •

edited

Loading

sangkilc commented Mar 8, 2019

tathanhdinh commented Mar 9, 2019 •

edited

Loading

sangkilc commented Mar 9, 2019

tathanhdinh commented Mar 9, 2019 •

edited

Loading

sangkilc commented Mar 9, 2019

Relative calls lifting #4

Relative calls lifting #4

Comments

tathanhdinh commented Mar 7, 2019 • edited Loading

sangkilc commented Mar 8, 2019

tathanhdinh commented Mar 8, 2019 • edited Loading

sangkilc commented Mar 8, 2019

tathanhdinh commented Mar 9, 2019 • edited Loading

sangkilc commented Mar 9, 2019

tathanhdinh commented Mar 9, 2019 • edited Loading

sangkilc commented Mar 9, 2019

tathanhdinh commented Mar 7, 2019 •

edited

Loading

tathanhdinh commented Mar 8, 2019 •

edited

Loading

tathanhdinh commented Mar 9, 2019 •

edited

Loading

tathanhdinh commented Mar 9, 2019 •

edited

Loading