MacOS Classic - Analyze dataflow and preserved registers #573

gbody · 2018-02-13T13:37:19Z

Looking at some of the procedures that get defined after Analyze Dataflow, it appears to me that some of the registers that are pushed at the beginning and pulled at the end of the procedure are being included in the procedure signature. Should registers that are pushed at the beginning and pulled again before exiting be listed among the preserved registers for the procedure once Analyze Dataflow has completed?

Pre Analyze Dataflow

Entry code pushing registers to stack
void _DATAINIT()
{
_DATAINIT_entry:
l0010FC02:
a7 = fp
a5 = a5world
a7 = a7 - 0x04
Mem0[a7:word32] = a4
a7 = a7 - 0x04
Mem0[a7:word32] = a3
a7 = a7 - 0x04
Mem0[a7:word32] = a2
a7 = a7 - 0x04
Mem0[a7:word32] = a1
a7 = a7 - 0x04
Mem0[a7:word32] = a0
a7 = a7 - 0x04
Mem0[a7:word32] = d7
a7 = a7 - 0x04
Mem0[a7:word32] = d6
a7 = a7 - 0x04
Mem0[a7:word32] = d5
a7 = a7 - 0x04
Mem0[a7:word32] = d4
a7 = a7 - 0x04
Mem0[a7:word32] = d3
a7 = a7 - 0x04
Mem0[a7:word32] = d2
a7 = a7 - 0x04
Mem0[a7:word32] = d1

=> Procedure processing

Exit code pulling registers from stack
l0010FC48:
d1 = Mem0[a7:word32]
a7 = a7 + 0x04
d2 = Mem0[a7:word32]
a7 = a7 + 0x04
d3 = Mem0[a7:word32]
a7 = a7 + 0x04
d4 = Mem0[a7:word32]
a7 = a7 + 0x04
d5 = Mem0[a7:word32]
a7 = a7 + 0x04
d6 = Mem0[a7:word32]
a7 = a7 + 0x04
d7 = Mem0[a7:word32]
a7 = a7 + 0x04
a0 = Mem0[a7:word32]
a7 = a7 + 0x04
a1 = Mem0[a7:word32]
a7 = a7 + 0x04
a2 = Mem0[a7:word32]
a7 = a7 + 0x04
a3 = Mem0[a7:word32]
a7 = a7 + 0x04
a4 = Mem0[a7:word32]
a7 = a7 + 0x04
return
_DATAINIT_exit:
}

Post Analyze dataflow
word32 _DATAINIT(word32 d0, word32 d1, word32 a4, ptr32 & d1Out, ptr32 & d7Out, ptr32 & a3Out, ptr32 & a4Out)
{
_DATAINIT_entry:
l0010FC02:
word32 d0_121
word32 a7_120 = fp - 0x0030
branch Mem0[0x0010FDB4:word16] == 0x01 l0010FC16
l0010FC12:
d0_121 = -0x01
goto l0010FC48
l0010FC16:
word32 a3_88 = a5world - Mem0[0x0010FDB0:word32]
ZEROBUFFER(d1, dwLoc3C, Mem0[0x0010FDB0:word32], a3_88)
Mem101[fp - 0x3C:word32] = 0x0010FDB0 + Mem0[0x0010FDB8:word32]
Mem104[fp - 0x40:word32] = a3_88
uncompress_world(dwArg00, dwArg04)
Mem111[fp - 0x3C:word32] = 0x0010FDB0 + Mem104[0x0010FDBC:word32]
Mem114[fp - 0x40:word32] = a3_88
Mem117[fp - 0x44:word32] = a5world
relocate_world(dwArg00, dwArg04, dwArg08)
a7_120 = fp - 0x38
d0_121 = 0x00
l0010FC48:
word32 a7_56 = a7_120 + 0x04
word32 d1_55
*d1Out = Mem0[a7_120:word32]
word32 d7_67
*d7Out = Mem0[a7_56 + 0x0014:word32]
word32 a3_75
*a3Out = Mem0[a7_56 + 0x0024:word32]
word32 a4_77
*a4Out = Mem0[a7_56 + 0x0028:word32]
return d0_121
_DATAINIT_exit:
}

uxmal · 2018-02-15T08:28:38Z

Not having had time to look at this closely yet, my first suspicion is that Reko is not correctly handling the sequence:

relocate_world(dwArg00, dwArg04, dwArg08)
a7_120 = fp - 0x38

causing it to consider the stack to be unbalanced, which in turn forces it to treat the register assignments as generating output values, rather than as restoring the registers to the values they had on entry to the procedure.
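To make the suspected effect concrete, here is a minimal Python sketch. It is not Reko's analysis, and the names `classify` and `sp_error` are hypothetical. It replays the twelve pushes from the _DATAINIT prologue, then pops them back with an assumed error in the tracked value of a7. The 8-byte error in the usage lines matches the gap between `fp - 0x38` and `fp - 0x0030` in the listing, but which call introduces it is an assumption here.

```python
# Push order taken from the _DATAINIT prologue in the listing above.
SAVED = ["a4", "a3", "a2", "a1", "a0", "d7",
         "d6", "d5", "d4", "d3", "d2", "d1"]

def classify(sp_error):
    """Replay prologue and epilogue with a7 mis-tracked by sp_error bytes.

    Hypothetical model: returns (preserved, outputs, final_sp), where
    final_sp is relative to the frame pointer on entry.
    """
    mem = {}   # stack slot offset -> name of the register saved there
    sp = 0
    for reg in SAVED:          # a7 = a7 - 4; Mem[a7] = reg
        sp -= 4
        mem[sp] = reg
    sp -= sp_error             # the analysis believes a7 is this much lower
    preserved, outputs = [], []
    for reg in reversed(SAVED):    # reg = Mem[a7]; a7 = a7 + 4
        value = mem.get(sp)        # None models a slot with unknown contents
        (preserved if value == reg else outputs).append(reg)
        sp += 4
    return preserved, outputs, sp

balanced = classify(0)
unbalanced = classify(8)
print(balanced[1], balanced[2])            # registers seen as outputs, final sp
print(len(unbalanced[1]), unbalanced[2])
```

With no error, every register is read back from its own slot and can be treated as preserved. With any misalignment, each pop reads a foreign or unknown slot, so none of them looks restored. The sketch does no liveness analysis, so it flags all twelve registers rather than only the four that appear in the signature.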

gbody · 2018-02-15T13:54:07Z

@uxmal looking at it further, it appears to me that it might be having problems with the call to ZEROBUFFER. The caller pushes a pointer to A5World minus the word32 buffer size at (a4), then pushes the word32 buffer size at (a4), then calls ZEROBUFFER. ZEROBUFFER is interesting in that it pulls the return address off the stack into a0, then pulls the size/length of the buffer, then the address of the buffer. It then clears the buffer and returns via jmp.l (a0). The RTL code generated is a call to a0 followed by a return.
Is it this handling of pulling the return address and jmp.l (a0) that is throwing the stack off?
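The calling sequence described above can be sketched as a small Python simulation. This is only an illustration of the description, not the actual ZEROBUFFER code. The function name `zerobuffer`, the buffer address 0x2000 and the size 16 are made up. The return address 0x0010FC22 is the instruction following the bsr in the listing below.

```python
def zerobuffer(stack, memory):
    """Model of the described callee: pops its own return address and both
    arguments, clears the buffer, and returns the address it jumps to."""
    a0 = stack.pop()            # return address pushed by bsr
    size = stack.pop()          # buffer size, pushed last by the caller
    addr = stack.pop()          # buffer address, pushed first by the caller
    for i in range(size):
        memory[addr + i] = 0
    return a0                   # jmp.l (a0)

# Caller side, mirroring the pushes before "bsr ZEROBUFFER".
memory = {0x2000 + i: 0xFF for i in range(16)}   # made-up buffer contents
stack = ["saved d1"]                             # whatever was on top before
depth_before = len(stack)
stack.append(0x2000)        # move.l a3,-(a7)   (buffer address, made up)
stack.append(16)            # move.l (a4),-(a7) (buffer size, made up)
stack.append(0x0010FC22)    # bsr pushes the return address
target = zerobuffer(stack, memory)
```

After the call the stack is back to `depth_before` entries: the callee has removed 12 bytes (return address plus two arguments), where an ordinary rts would remove only 4. An analysis that assumes the ordinary case would end up 8 bytes off.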

This should probably be a separate question (issue). Looking at the RTL code that sets up the parameters prior to calling ZEROBUFFER, there is a reference to a5, which is the A5World pointer address. Previously you talked about a structure that points to A5World. The actual pointer to A5World should be an offset from the start of the A5World memory region that points to the A5World zero-offset address. This pointer would then be used for all references relative to A5World.
Is this how it works, or is there something hidden under the hood to make it work?

Assembly section from _DATAINIT
0010FC16 26 4D movea.l a5,a3
0010FC18 97 D4 suba.l (a4),a3
0010FC1A 2F 0B move.l a3,-(a7)
0010FC1C 2F 14 move.l (a4),-(a7)
0010FC1E 61 00 01 4C bsr ZEROBUFFER
0010FC22 20 2C 00 08 move.l $0008(a4),d0
0010FC26 48 74 08 00 pea (a4,d0)
0010FC2A 2F 0B move.l a3,-(a7)
0010FC2C 61 00 00 2E bsr uncompress_world

RTL code for above assembly
l0010FC16:
a3 = a5
a3 = a3 - Mem0[a4:word32]
CVZNX = cond(a3)
a7 = a7 - 0x04
v22 = a3
Mem0[a7:word32] = v22
CVZN = cond(v22)
v23 = Mem0[a4:word32]
a7 = a7 - 0x04
v24 = v23
Mem0[a7:word32] = v24
CVZN = cond(v24)
call ZEROBUFFER (retsize: 4;)
d0 = Mem0[a4 + 0x08:word32]
CVZN = cond(d0)
a7 = a7 - 0x04
Mem0[a7:word32] = a4 + d0
a7 = a7 - 0x04
v25 = a3
Mem0[a7:word32] = v25
CVZN = cond(v25)
call uncompress_world (retsize: 4;)

uxmal · 2018-04-20T00:39:28Z

Indeed, the jmp.l (a0) is throwing Reko for a loop. What's going on here is that the return address is being "reified" and treated much like the way RISC machines use their link registers. Currently Reko doesn't explicitly manage the return address (the continuation, in computer science speak). I'm going to have to think about how to make continuations explicit in the codebase.
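One way to picture "explicit continuations" is to track the return address as an ordinary value, so that an indirect jump through a register holding it can be classified as a return. The Python sketch below is a hypothetical toy, not a design for Reko. The tuple-based instruction format and the name `classify_exit` are invented for illustration, and the registers d0 and a1 in the usage lines are assumptions, since the thread only names a0.

```python
RETURN_ADDRESS = "return_address"

def classify_exit(instrs):
    """Walk a straight-line toy instruction list and classify how it exits.

    Returns (kind, delta), where delta is the net change to a7 in bytes.
    """
    stack = [RETURN_ADDRESS]    # on entry the return address is on top
    regs = {}
    delta = 0
    for ins in instrs:
        op = ins[0]
        if op == "pop":
            # Popping past the return address reads caller-pushed arguments.
            regs[ins[1]] = stack.pop() if stack else "caller_arg"
            delta += 4
        elif op == "push":
            stack.append(regs.get(ins[1], ins[1] + "_entry"))
            delta -= 4
        elif op == "rts":
            ok = bool(stack) and stack[-1] == RETURN_ADDRESS
            return ("return" if ok else "unknown", delta + 4)
        elif op == "jmp":
            ok = regs.get(ins[1]) == RETURN_ADDRESS
            return ("return" if ok else "indirect jump", delta)
    return ("falls through", delta)

# Shape of ZEROBUFFER as described in the thread (d0 and a1 are assumed).
zerobuffer_like = [("pop", "a0"), ("pop", "d0"), ("pop", "a1"), ("jmp", "a0")]
print(classify_exit(zerobuffer_like))
print(classify_exit([("rts",)]))
```

Under this model the ZEROBUFFER-shaped sequence is a return that moves a7 by 12 bytes, while a plain rts moves it by 4. A jmp through a register that never received the return address stays an ordinary indirect jump.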

uxmal · 2019-11-10T09:51:25Z

I'm going to have to look closer at this tomorrow. It's curious that this has regressed since none of the logic in the data flow analysis "should" affect the symbol generation which happens much earlier in the decompilation process.

gbody mentioned this issue Nov 9, 2019

MacOS Classic Analysis development, An internal error occurred. Offset must be non-negative. #792

Closed

uxmal closed this as completed in 69513c7 Apr 18, 2023
