Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The big cheap fix #446

Merged
4 commits merged into from Nov 6, 2018
Merged

The big cheap fix #446

4 commits merged into from Nov 6, 2018

Conversation

ghost
Copy link

@ghost ghost commented Oct 30, 2018

Generally when reversing, a bottom-up approach is used. The problem with render.cpp is that it uses two languages, which makes this approach very difficult to achieve the desired results. Therefore, this approach is more of a top-down sort of technique.

The idea is that we now have a binary exact version of this file (barring the order of the initial push/pop), so the inlined code can be integrated and will retain the same output with much more ease than trying to guess what registers the C compiler is going to hog. Once the inlined code replaces the base assembly code, the rest can be converted to C since we know the assembly is already correct.

At the same time, the project will still retain the pure C version for newer compilers, which can't produce binary parity anyway...

Edit: also, I discovered some dead code thanks to the raw ASM approach. There appears to be four different calls to draw transparent tiles (leftover from the demo when tiles were drawn differently). This code wasn't present in the current project.

@ghost
Copy link
Author

ghost commented Nov 1, 2018

Ok so this may or may not be necessary. Apparently you can use the __forceinline prefix so that the asm gets inlined no matter what. BIG sigh of relief. So now it's pretty much back to square one: add real asm to everything until the C code uses the correct registers, with constant crashing in the mean time ;)

Edit: I'll go ahead and keep this here for now, as it reduces the project from 30% to 24% difference. But it shouldn't be merged until the dilemma is solved.

@ghost
Copy link
Author

ghost commented Nov 5, 2018

Alright guys, the sado-master piece is finished! I found a dirty trick that let's using both asm and C possible, by hooking the variables as they would be called into the inlined function. I.e.

__asm {
  xor edx,edx
  xor ecx,ecx
  mov esi,src
  mov edi,dst
}
__forceinline void __usercall draw_cel_piece(edx, ecx, esi, edi)

While this doesn't bring the project down to 24% like dumping the whole thing in asm does, I think this is good enough ;)
26% change, 40099 insertions(+), 37222 deletions(-)

@ghost ghost merged commit 1324082 into diasurgical:nightly Nov 6, 2018
@ghost ghost deleted the CheapSkateFix branch November 6, 2018 03:47
@ghost
Copy link
Author

ghost commented Nov 6, 2018

Since there are no objections I merged this one. Everything was tested thoroughly to ensure each CEL type is drawn correctly. I also tested the registers to make sure there were enough free. EAX is the only reserved one, the rest can be used when future assembly functions are inlined.

I'm pretty sure that the inlined functions were all __fastcall consisting of one argument:
__forceinline void __fastcall draw_cel(int width<ECX>)
The rest are NOT passed to the stack, because they are already in the same registers ahead of time. The problem is I don't know the proper way to assign variables to the correct register. Apparently you can do this in C:

register BYTE *dst;
register int width;

But it still doesn't guarantee they will be as follows:

EDI -> destination buffer
ESI -> source buffer
EBX -> light table for XLAT
ECX -> loop count (fastcall fixes this)
EBP -> temp C loop
EDX/EAX -> temp use

This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant